All times in Central Standard Time (GMT-6).
Across the three days for ISPD 2025, we have 3 keynotes, 18 accepted papers, 12 invited talks, one panel on Monday with 4 panelists, 4 speakers with longer talks for Professor Jason Cong's commemorative session, and finally the ISPD 2025 contest results.
Papers are available at the ACM Digital Library.
Videos are available through the links in the program below or at the ISPD 2025 YouTube channel.
18:00 - 20:00: Welcome Reception
Location: Holiday Inn Austin Midtown, Austin, Texas
9:00 - 9:10: Opening [slides] [video]
9:10 - 10:00: Keynote
Chair: Gracieli Posser (Cadence)
"Towards Designing and Deploying Ising Machines", Sachin Sapatnekar (University of Minnesota) [abstract] [slides] [video]
Abstract: Today, NP-complete or NP-hard combinatorial problems are often solved on classical computers, using heuristics with no optimality guarantees or approximation algorithms with loose optimality bounds. Ising computation provides a new paradigm for solving these problems using networks of coupled oscillators. In contrast with traditional Ising machines that use supercooled chips, recent approaches have proposed the use of coupled CMOS ring oscillators, reducing the power dissipation of these systems by several orders of magnitude. This talk will overview the Ising model, discuss the challenges of building CMOS Ising machines, including issues related to layout and timing, and point to directions that are helping deploy these methods to solve ever-larger combinatorial problems.
10:00 - 10:20: Break
10:20 - 11:40: Placement and DTCO
Chair: Jerrica (Jhih-Rong) Gao (Cadence)
1. "GOALPlace: Begin with the End in Mind", Anthony Agnesina, Rongjian Liang, Geraldo Pradipta, Anand Rajaram and Haoxing Ren (NVIDIA) [abstract] [slides] [video]
Abstract: Co-optimizing placement with congestion is integral to achieving high-quality designs. This paper presents GOALPlace, a learning-based approach to improving placement congestion by controlling cell density. It efficiently learns from an EDA tool’s post-route optimized results and uses an empirical Bayes technique to adapt the target to a specific placer’s solutions, effectively beginning with the end in mind. Our method enhances correlation with the tool’s router and timing-opt engine, while solving placement globally without expensive incremental congestion estimation and mitigation methods. A statistical analysis with hierarchical netlist clustering establishes the importance of density and the potential for an adequate cell density target across placements. Our experiments show that our method, when integrated into an academic GPU-accelerated global placer, consistently produces macro and standard cell placements that match or exceed the quality of commercial tools. Our empirical Bayes methodology also shows a substantial quality improvement over leading academic mixed-size placers, achieving up to 10× fewer design rule check (DRC) violations, a 5% decrease in wirelength, and a 30% and 60% reduction in worst and total negative slack (WNS/TNS).
2. "Invited: Scaling Standard Cell Layout Using Track Height Compression and Design Technology Co-optimization", Chung-Kuan Cheng (UCSD CSE) [abstract] [slides] [video]
Abstract: Moore’s law scaling is approaching physical limits, as indicated by the technology roadmap. Recent standard cell layout reductions rely on track height compression, which increases pin density and routing congestion. To address these challenges, design technology co-optimization (DTCO) was introduced. This paper explores how much track height can be compressed and how DTCO features can sustain layout scaling. To support this exploration, we developed an SMT-based cell synthesis tool that integrates gear ratio, M1 metal grid offset, local-interconnect source–drain (LISD) merging, adjustable gate cut lengths, double-height architecture with pass-throughs, and various power delivery options. In our exploration, we compress horizontal track numbers from four to two. Our synthesis tool enables flexible gear ratio options through a graph-based data structure that allocates vertical routing resources at a smaller pitch than the contacted poly pitch. For double-height architecture, we allow pass-through options to alleviate routing congestion. The empirical results identify critical design strategies to meet scaling demands and overcome pin density challenges. Overall, the study demonstrates the design options that lead to potential scaling capability in the near future.
3. "Invited: Physical Design Challenges on Design Technology Co-optimization", Taewhan Kim (Seoul National University) [abstract] [slides] [video]
Abstract: Design technology co-optimization (DTCO) is the process of optimizing design and process technology together to enhance performance, power efficiency, chip utilization, and manufacturing cost/yield. As device pitch scaling slows down due to the physical limits beyond 7nm node, DTCO becomes an important enabling method for the continuation of transistor density scaling in advanced process nodes. Through DTCO, we are able to evaluate technologies, design rules, and cell architectures using block-level PPA (performance, power, area) analysis, which greatly helps semiconductor fabs reduce cost and shorten time-to-market in advanced process development with substantial architectural innovation. The parameters that DTCO targets to evaluate include design rules (e.g., gate poly pitch, M1 pitch, side-to-top spacing rule, via-enclosure rule), cell architectures (e.g., single-row or multi-row-height, 2D or 1D M1, preference of metal direction), and technologies (e.g., Fin-FET, Nanosheet-FET, Complementary-FET), which are collectively called DTCO parameters. A complete DTCO should carry out two roles: (1) analyzing DTCO parameters to get a comprehensive overview of how DTCO parameters interact with each other and influence design implementations and (2) optimizing DTCO parameters to identify an optimal combination of DTCO parameter values that yields the design implementations with optimal PPA. For efficiency, ML (machine-learning) based modeling for DTCO parameter analysis and optimization would be essential. Evidence of DTCO effectiveness can be found in industry [1] and academia [2]: TSMC applied DTCO to improve the clock frequency of HPC (Arm processor) by 9% on N3 node with no power budget increase [1], while IMEC claimed in 2019 that DTCO and STCO (system technology co-optimization) would significantly improve the chip performance as the process node advances [2]. (In the EDA context, an example of complet
4. "LEGALM: Efficient Legalization for Mixed-Cell-Height Circuits with Linearized Augmented Lagrangian Method", Jing Mai, Chunyuan Zhao, Jinwei Chen, Zuodong Zhang, Zhixiong Di, Yibo Lin, Runsheng Wang and Ru Huang (Peking University, Southwest Jiaotong University) [abstract] [slides] [video]
Abstract: Advanced technologies increasingly adopt mixed-cell-height circuits due to their superior power efficiency, compact area usage, enhanced routability, and improved performance. However, the complex constraints of modern circuit design, including routing challenges and fence region constraints, increase the difficulty of mixed-cell-height legalization. In this paper, we introduce LEGALM, a state-of-the-art mixed-cell-height legalizer that can address routability and fence region constraints more efficiently. We propose an augmented Lagrangian formulation coupled with a block gradient descent method that offers a novel analytical perspective on the mixed-cell-height legalization problem. To further enhance efficiency, we develop a series of GPU-accelerated kernels and a triple-fold partitioning technique with minor quality overhead. Experimental results on ICCAD-2017 and modified ISPD-2015 benchmarks show that our approach significantly outperforms current state-of-the-art legalization algorithms in both quality and efficiency.
11:40 - 11:50: Break
11:50 - 12:30: Acceleration
Chair: Tsung-Wei Huang (University of Wisconsin at Madison)
1. "Cypress: VLSI-Inspired PCB Placement with GPU Acceleration", Niansong Zhang, Anthony Agnesina, Noor Shbat, Yuval Leader, Zhiru Zhang and Haoxing Ren (Cornell University, NVIDIA) (Best Paper) [abstract] [slides] [video]
Abstract: The scale of printed circuit board (PCB) designs has increased significantly, with modern commercial designs featuring more than 10,000 components. However, the placement process heavily relies on manual efforts that take weeks to complete, highlighting the need for automated PCB placement methods. The challenges of PCB placement arise from its flexible design space and limited routing resources. Existing automated PCB placement tools have achieved limited success in quality and scalability. In contrast, very large-scale integration (VLSI) placement methods have proven to scale to designs with millions of cells and to deliver high-quality results. Therefore, we propose Cypress, a scalable, GPU-accelerated PCB placement method inspired by VLSI. It incorporates tailored cost functions, constraint handling, and optimized techniques adapted for PCB layouts. In addition, there is an increasing demand for realistic and open-source benchmarks to (1) enable meaningful comparisons between tools and (2) establish performance baselines to track progress in PCB placement technology. To address this gap, we present a PCB benchmark suite synthesized from real commercial designs. We evaluate our method against state-of-the-art commercial and academic PCB placement tools with the benchmark suite. Our approach demonstrates a 1–5.9× higher routability on the proposed benchmarks. For fully routed designs, Cypress achieves 1–19.7× shorter routed track lengths. With GPU acceleration, Cypress delivers up to 492.3× speedup in run time. Finally, we demonstrate scalability to real commercial designs, a capability unmatched by existing tools.
2. "GPU-Accelerated Inverse Lithography Towards High Quality Curvy Mask Generation", Haoyu Yang and Haoxing Ren (NVIDIA) [abstract] [slides] [video]
Abstract: Inverse Lithography Technology (ILT) has emerged as a promising solution for photo mask design and optimization. Relying on multi-beam mask writers, ILT enables the creation of free-form curvilinear mask shapes that enhance printed wafer image quality and process window. However, a major challenge in implementing curvilinear ILT for large-scale production is mask rule checking, an area currently under development by foundries and EDA vendors. Although recent research has incorporated mask complexity into the optimization process, much of it focuses on reducing e-beam shots, which does not align with the goals of curvilinear ILT. In this paper, we introduce a GPU-accelerated ILT algorithm that improves not only contour quality and process window but also the precision of curvilinear mask shapes. Our experiments on open benchmarks demonstrate a significant advantage of our algorithm over leading academic ILT engines. Source code will be available at https://github.com/phdyang007/curvyILT.
3. "Invited: Trailblazing the Future: Innovative Chip Design in the Era of Pervasive AI", Sudipto Kundu (Synopsys) [abstract] [slides] [video]
Abstract: Few engineering challenges are as complex and arduous as chip design, which typically requires multiple teams of experts and months of dedicated and diligent PPA (Performance, Power and Area) exploration work to achieve the desired goal. In the era of pervasive AI, chip design automation tools are witnessing a seismic shift, emerging as powerful intelligent AI agents that orchestrate various decision-making processes in every aspect of the chip design flow by leveraging multiple computes to explore different PPA strategies. This radical shift demands an AI ecosystem that connects data storage, compute resources, real time data analytics and efficient search space navigation technologies. In this talk, we will explore how physical design implementation tools are embracing AI agents to drive PPA exploration by using novel reinforcement learning techniques powered by deep insights of design and flow execution experience of prior runs. The core of such a system is built on a continuous learning paradigm so that with each successive iteration of the same design, the syst
12:30 - 13:30: Lunch
13:30 - 14:30: Keynote
Chair: Zhuo Li (Cadence)
"How Automotive Functional Safety is Disrupting Digital Implementation", Charles J Alpert (Cadence) [abstract] [slides]
Abstract: The automotive industry is experiencing transformative disruption as the demand for vehicle electrification, connectivity, and autonomy drives manufacturers toward creating a “datacenter on wheels.” As a result, the cost of silicon in vehicles is projected to rise significantly in the coming years, attracting many semiconductor companies to the market. However, unlike smartphones or data centers, safety is paramount in the automotive sector, prompting widespread adoption of the ISO 26262 functional safety standard. Meeting this standard introduces additional design time, rigorous processes, and increased silicon costs. In design implementation, safety can be achieved through inserting safety mechanisms such as parity, triple-voting flops, and dual-core lockstep. However, the silicon cost of implementing safety can significantly increase chip area (e.g., from 30-80%), so design teams need advanced methodologies to achieve safety with minimum pain, but also minimum area and power. In particular, the Dual Core Lock Step is a popular safety mechanism since it provides excellent safety coverage for the logic to which it is applied. However, having numerous DCLS modules in a single design can become a floorplanning nightmare, leading to massive congestion, area bloat, and overall performance degradation. We propose a novel methodology for DCLS insertion during logic synthesis and digital implementation to address these issues.
14:30 - 14:40: Break
14:40 - 15:40: Emerging Technologies
Chair: Ben Trombley (IBM)
1. "ML-QLS: Multilevel Quantum Layout Synthesis", Wan-Hsuan Lin and Jason Cong (UCLA) [abstract] [slides] [video]
Abstract: Quantum Layout Synthesis (QLS) plays a crucial role in optimizing quantum circuit execution on physical quantum devices. As we enter the era where quantum computers have hundreds of qubits, optimal QLS tools face scalability issues, while heuristic methods suffer significant optimality gaps due to the lack of global optimization. To address these challenges, we introduce a multilevel framework, which is an effective methodology for solving large-scale problems in VLSI design. In this paper, we present ML-QLS, the first multilevel quantum layout tool with a scalable refinement operation integrated with novel cost functions and clustering strategies. Our clustering provides valuable insights into generating a proper problem approximation for quantum circuits and devices. The experimental results demonstrate that ML-QLS can scale up to problems involving hundreds of qubits and achieve a remarkable 69% performance improvement over leading heuristic QLS tools for large circuits, which underscores the effectiveness of multilevel frameworks in quantum applications.
2. "LiDAR: Automated Curvy Waveguide Detailed Routing for Large-Scale Photonic Integrated Circuits", Hongjian Zhou, Keren Zhu and Jiaqi Gu (Arizona State University, Fudan University) [abstract] [slides] [video]
Abstract: As photonic integrated circuit (PIC) designs advance and grow in complexity, driven by innovations in photonic computing and interconnects, traditional manual physical design (PD) processes have become increasingly cumbersome. Available PIC layout automation tools are mostly schematic-driven, which has not alleviated the burden of manual waveguide planning and layout drawing. Previous research in PIC routing largely relies on off-the-shelf algorithms designed for electrical circuits, which only support high-level route planning to minimize waveguide crossings. These algorithms are not customized to handle unique photonics-specific routing constraints and metrics, such as curvy waveguides, bending, port alignment, and insertion loss. These approaches struggle with large-scale PICs and cannot produce real layouts without design-rule violations (DRVs). This highlights the pressing need for electronic-photonic design automation (EPDA) tools that can streamline the PD of PICs. In this paper, for the first time, we propose an automated PIC detailed routing tool, dubbed LiDAR, to generate DRV-free PIC layouts for large-scale real-world PICs. LiDAR features a grid-based curvy-aware A* engine with adaptive crossing insertion, congestion-aware net ordering and objective, and a crossing-waveguide optimization scheme, all tailored to the unique properties of PICs. On large-scale real-world photonic computing cores and interconnects, LiDAR generates a DRV-free layout with 14% lower insertion loss and 6.25× speedup than prior methods, paving the way for future advancements in the EPDA toolchain. Our codes are open-sourced at link.
3. "Invited: Physical Design for Systolic Array-Based Integrated Circuits", Jiang Hu (TAMU) [abstract] [slides] [video]
Abstract: Systolic arrays have become a popular hardware architecture for machine learning computing, which is a key driver for the growth of the semiconductor industry. Unlike many other circuits, systolic arrays exhibit distinct 2D regularity, which holds significant potential for improving physical design quality. However, this regularity is largely overlooked in existing physical design methodologies. Recent studies have demonstrated that leveraging this regularity can significantly enhance placement quality for FPGAs [1,3], cell placement [2], and mixed-size placement [4]. For instance, utilizing the regularity has resulted in over 20% wirelength reduction in FPGA placement compared to an industrial tool and a remarkable 53% wirelength reduction in mixed-size placement compared to a commercial tool. Despite these advantages, exploiting the regularity is not as simple as duplicating the schematic. FPGA DSP architectures, typically column-based, often fail to align with the 2D regularity of systolic arrays. Additionally, in cell and mixed-size placement, the placement of IO and control logic can disrupt this regularity. This invited talk will highlight techniques to address these challenges. Beyond placement, the discussion will extend to other opportunities in physical design for systolic arrays, including routing, routability prediction, clock network synthesis, and lithographic hot-spot prediction.
15:40 - 16:00: Break
16:00 - 16:40: Reliability
Chair: Jens Lienig (TU Dresden)
1. "Photonic Side-Channel Analyzer: Enabling Security-Aware Physical Design Methodology", Meizhi Wang, Yi-Ru Chen, S. S. Teja Nibhanupudi, Yinan Wang, Elham Amini, Antonio Saaverdra, Jean-Pierre Seifert and Jaydeep Kulkarni (The University of Texas, Austin, Technical University Berlin) [abstract] [slides] [video]
Abstract: Photon Emission (PE) from Integrated Circuits (IC) is an emerging non-invasive side channel that poses a serious security risk to modern System-on-Chips (SoCs). These emissions, generated during transistor switching, are determined by circuit operations and can be exploited in Side-Channel Analysis (SCA). Furthermore, physical design choices, such as standard cell placement and routing, affect how these emissions propagate and are detected. This makes it crucial to assess and mitigate such risks during the design phase. This paper presents a novel photonic side-channel analysis framework that integrates directly into the physical design flow. The framework enables designers to assess security vulnerabilities in digital ASIC designs by generating both time-resolved and accumulated PE maps at the standard-cell gate level. These PE maps can be applied to various side-channel analysis methods to identify vulnerable regions in the circuit. We demonstrate the framework by applying it to a 40nm 128-bit Advanced Encryption Standard (AES) core, where we employ localized Correlation PE Attacks (CPEA) on simulated time-resolved PE maps. This approach pinpoints regions with high side-channel leakage. The results showcase the framework’s effectiveness in providing early detection and allow designers to enhance the overall security of the design against PE-related vulnerabilities. To validate our simulation framework, we compared the simulated accumulated PE maps with real-world measurements from a 40nm AES test chip. The close alignment between simulated and measured data confirms the accuracy of our simulator in predicting photon emission behavior across the chip.
2. "Multi-Stage CSM Timing Waveform Propagation Accelerated by NLDM Assistance", Shih-Kai Lee, Pei-Yu Lee and Iris Hui-Ru Jiang (Synopsys, National Taiwan University) [abstract] [slides] [video]
Abstract: Static timing analysis (STA) is essential for timing closure. To address the complicated effects emerging at advanced technology nodes, the Current Source Model (CSM) has been developed to compute timing waveforms for timing propagation. Compared with the Non-Linear Delay Model (NLDM), CSM provides superior accuracy but suffers from efficiency and scalability issues. In this paper, we propose a multi-stage CSM timing propagation framework with three acceleration techniques with the assistance of NLDM. Our acceleration techniques are general and compatible with any CSM-based STA engine. Experimental results demonstrate the effectiveness of our acceleration techniques: Compared with CSM-based analysis, we achieve 4× speedups with only 0.4% accuracy loss.
16:40 - 16:50: Break
16:50 - 17:50: Invited Session on Retrospective and Prospective of Physical Design
Chair: David Chinnery (Siemens EDA)
1. "Invited: The Future of Functional ECO Automation and Logical Equivalence Checking for Advanced Digital Design Flows", Zhuo Li (Cadence) [abstract] [slides]
Abstract: Logical equivalence checking (LEC, or EC) is critical to design implementation and for decades has allowed cost-efficient RTL-level functional testing to be the dominant type of verification done on a project. Test once, then formally prove that the subsequent design stages later in the implementation flow are 100% logically equivalent. But over the last 10-15 years, SoCs have grown 100X in complexity, creating new challenges. Concurrently, the use of functional ECOs to shortcut long design implementation cycles has skyrocketed. While automated approaches have greatly improved the ECO process quality and accelerated this overall trend, the setup can be challenging for inexperienced designers. In order to significantly speed up and simplify EC and improve the entire functional ECO process, we require a new approach to both flows. This talk will highlight some of Cadence’s recent breakthrough research in this space, including the use of AI and ML to improve single-run results and multiply designer productivity while gathering insights and leveraging learnings across the duration of a project.
2. "Invited: Toward an ML EDA Commons: Establishing Standards, Accessibility, and Reproducibility in ML-driven EDA Research", Vidya A. Chhabria (Arizona State University) [abstract] [slides] [video]
Abstract: Machine learning (ML) is transforming electronic design automation (EDA), offering innovative solutions for designing and optimizing integrated circuits (ICs). However, the field faces significant challenges in standardization, accessibility, and reproducibility, limiting the impact of ML-driven EDA (ML EDA) research. To address these barriers, this paper presents a vision for an ML EDA Commons, a collaborative open ecosystem designed to unify the community and drive progress through establishing standards, shared resources, and stakeholder-based governance. The ML EDA Commons focuses on three objectives: (1) Maturing existing EDA infrastructure to support ML EDA research; (2) Establishing standards for benchmarks, metrics, and data quality and formats for consistent evaluation via governance that includes key stakeholders; and (3) Improving accessibility and reproducibility by providing open datasets, tools, models, and workflows with cloud computing resources, to lower barriers to ML EDA research and promote robust research practices via artifact evaluations, canonical evaluators, and integration pipelines. Inspired by successes of ML and MLCommons, the ML EDA Commons aims to catalyze transparency and sustainability in ML EDA research.
3. "Invited: Mapping Two Decades of Innovation: Lessons from 25 Years of ISPD Research", Matthew Guthaus (UCSC) [abstract] [slides] [video]
Abstract: The design automation research community has driven the evolution of integrated circuits from a handful of transistors in the 1960s to billions today. The International Symposium on Physical Design (ISPD) has been instrumental in tackling challenges like scaling complexities, hardware security, and the exponential growth in transistor counts. This study conducts a comprehensive bibliometric analysis of ISPD publications using Natural Language Processing, machine learning, and network analysis. It explores research themes, collaboration dynamics, and global contributions through citation networks, co-authorship graphs, geographical and spatial mapping, and topic modeling. Key areas of focus include Physical Design Optimization, Power Efficiency, and Emerging Technologies, with prominent topics such as placement, routing, clock skew, lithography, machine learning, and hardware security. The analysis highlights the evolution of foundational techniques like placement and routing while identifying emerging trends such as AI-driven design automation. These insights provide a roadmap for sustaining innovation in physical design over the next 25 years.
9:00 - 9:50: Panel: Heterogeneous Integration
Chair: JT Li (National Tsing Hua University) [slides]
Panelists:
Lihong Cao (ASE) [slides]
Henry Sheng (Synopsys)
Ksenia Roze (Cadence)
9:50 - 10:50: AI for Chip Design
Chair: Duo Ding (Samsung)
1. "HeLO: A Heterogeneous Logic Optimization Framework by Hierarchical Clustering and Graph Learning", Yuan Pu, Fangzhou Liu, Zhuolun He, Keren Zhu, Rongliang Fu, Ziyi Wang, Tsung-Yi Ho and Bei Yu (The Chinese University of Hong Kong, Fudan University) (Best Paper Nominee) [abstract] [slides] [video]
Abstract: Modern very large-scale integration (VLSI) designs usually consist of modules with various topological structures and functionalities. To better optimize such large and heterogeneous logic networks, it is essential to identify the structural and functional characteristics of its modules, and represent them with appropriate DAG types (such as AIG, MIG, XAG, etc.) for logic optimization. This paper proposes HeLO, a hetero-DAG logic optimization framework empowered by hierarchical clustering and graph learning. HeLO leverages a hierarchical clustering algorithm, which splits the original Boolean network into sub-circuits by considering both topological and functional characteristics. A novel graph neural network model is customized to generate the topological-functional embedding (used for distance calculation in hierarchical clustering) and predict the best-fit DAG type of each sub-circuit. Experimental results demonstrate that HeLO outperforms LSOracle, the SOTA heterogeneous logic optimization framework, in terms of node-depth product (for technology-independent logic optimization) and delay-area product (for technology mapping) by 8.7% and 6.9%, respectively.
2. "GraphCAD: Leveraging Graph Neural Networks for Accuracy Prediction Handling Crosstalk-affected Delays", Fangzhou Liu, Guannan Guo, Yuyang Ye, Ziyi Wang, Wenjie Fu, Weihua Sheng and Bei Yu (The Chinese University of Hong Kong, Huawei Design Automation Lab, HK, HiSilicon Technologies Co.) [abstract] [slides] [video]
Abstract: As chip fabrication technology advances, the capacitive effects between wires have become increasingly pronounced, making crosstalk-induced incremental delay a serious issue. Traditional static timing analysis involves complex and iterative calculations through timing windows, requiring precise alignment of aggressor and victim nets, along with delay and slew estimations, which significantly increase runtime and licensing costs. In our work, we develop a Graph Neural Network framework to predict crosstalk-affected delays, focusing on the impacts of the coupling effect and overlapping nets. Moreover, we employ a curriculum learning strategy that gradually integrates aggressors with victims, improving model convergence through progressively complex scenarios. Experimental results show that our framework precisely predicts crosstalk-affected delays, matching commercial tools’ performance with a fivefold speedup.
3. "Invited: AI-assisted Routing", Evangeline Young (CUHK) [abstract] [video]
Abstract: Routing is an important but complicated step in physical synthesis. Considering the potential of leveraging AI to seek higher efficiency and better quality in solving routing problems, we study in this work the methodology of AI-assisted routing in a systematic way. Decoupling the functionalities of different routing components will give a high flexibility in determining where and how AI can be used in an effective manner, while maintaining a high degree of interpretability. Two applications along this direction are presented, aiming at tackling the difficulties in routing with AI assistance. These provide examples of how to implement the methodology in practice, while revealing its effectiveness and potential.
10:50 - 11:10: Break
11:10 - 12:10: LLM for Chip Design
Chair: Tiago Reimann (Siemens EDA)
1. "DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent", Chen-Chia Chang, Chia-Tung Ho, Yaguang Li, Yiran Chen and Haoxing Ren (Duke University, NVIDIA) [abstract] [slides] [video]
Abstract: In advanced technology nodes, the integrated design rule checker (DRC) is often utilized in place and route tools for fast optimization loops for power-performance-area. Implementing integrated DRC checkers to meet the standard of commercial DRC tools demands extensive human expertise to interpret foundry specifications, analyze layouts, and debug code iteratively. However, this labor-intensive process, which must be repeated with every technology node update, prolongs the turnaround time of designing circuits. In this paper, we present DRC-Coder, a multi-agent framework with vision capabilities for automated DRC code generation. By incorporating vision language models and large language models (LLM), DRC-Coder can effectively process textual, visual, and layout information to perform rule interpretation and coding by two specialized LLMs. We also design an auto-evaluation function for LLMs to enable DRC code debugging. Experimental results show that, targeting a sub-3nm technology node for a state-of-the-art standard cell layout tool, DRC-Coder achieves a perfect F1 score of 1.000 in generating DRC codes that meet the standard of a commercial DRC tool, far outperforming standard prompting methods (F1=0.631). DRC-Coder generates code for each rule within four minutes on average, significantly accelerating technology advancement and reducing engineering costs.
2. "LEGO-Size: LLM-Enhanced GPU-Optimized Signoff-Accurate Differentiable VLSI Gate Sizing in Advanced Nodes", Yi-Chen Lu, Kishor Kunal, Geraldo Pradipta, Rongjian Liang, Ravikishore Gandikota and Haoxing Ren (NVIDIA) (Best Paper Nominee) [abstract] [slides] [video]
Abstract: On-Chip Variation (OCV)-aware and Path-Based Analysis (PBA)-accurate timing optimization achieved by gate sizing (including V_th-assignment) remains a pivotal step in modern signoff. However, in advanced nodes (e.g., 3nm), commercial tools often yield suboptimal results due to the intricate design demands and the vast choices of library cells that require substantial runtime and computational resources for exploration. To address these challenges, we introduce LEGO-Size, a generative framework that harnesses the power of Large Language Models (LLMs) and GPU-accelerated differentiable techniques for efficient gate sizing. LEGO-Size introduces three key innovations. First, it considers timing paths as sequences of tokenized library cells, casting gate sizing prediction as a language modeling task and solving it with self-supervised learning and supervised fine-tuning. Second, it employs a Graph Transformer (GT) with a linear-complexity attention mechanism for netlist encoding, enabling LLMs to make sizing decisions from a global perspective. Third, it integrates a differentiable Static Timing Analysis (STA) engine to refine LLM-predicted gate size probabilities by directly optimizing Total Negative Slack (TNS) through gradient descent. Experimental results on 5 unseen million-gate industrial designs in a commercial 3nm node show that LEGO-Size achieves up to 125x speedup with 37% TNS improvement over an industry-leading commercial signoff tool with minimal power and area overhead.
3. "Invited: Artificial Netlist Generation for Enhanced Circuit Data Augmentation", Seokhyeong Kang (Pohang University of Science and Technology) [abstract] [slides] [video]
Abstract: Optimizing power, performance, and area (PPA) at advanced nodes has become an increasingly challenging and complex task. To address these challenges, approaches such as machine learning (ML) and design-technology co-optimization (DTCO) have emerged as promising solutions. However, their effectiveness is limited by the lack of diverse training data and prolonged turnaround times (TAT). Artificial data has been widely used in various fields to address the limitations of real-world data. By augmenting datasets, artificial data improves the robustness of ML models against input perturbations, leading to improved performance. Similarly, in the physical design flow, artificial data has great potential for overcoming the scarcity of real-world circuit data [1], [2], [3]. Artificial circuits proposed in previous studies are typically designed for specific applications. By developing a method to generate artificial circuits that resemble real circuits, we can address data scarcity and TAT challenges in physical design. In this talk, we will discuss how leveraging artificial circuits to explore a wide range of circuit characteristics can enhance ML model performance for unseen real-world circuits and accelerate the PPA exploration flow.
12:10 - 13:10: Lunch
13:10 - 14:30: Cell Design
Chair: Mark Ren (NVIDIA)
1. "Cell-Flex Metrics for Designing Optimal Standard Cell Layout With Enhanced Cell Layout Flexibility", Byeonggon Kang, Chung-Kuan Cheng, Bill Lin, Yucheng Wang and Ying Yuan (University of California, San Diego) [abstract] [slides] [video]
Abstract: As physical pitch scaling slows, efforts to match its pace by reducing standard cell height and sacrificing horizontal routing tracks have introduced placement and routing challenges, making the design of high-quality standard cell layouts increasingly crucial. However, existing cell metrics only focus on pin accessibility and are insufficient to address issues in advanced nodes (e.g., Power Delivery Networks (PDN), increased routing blockages, etc.). We propose Cell Layout Flexibility (Cell-Flex) metrics, novel metrics that evaluate flexibility of standard cell layouts. Flexibility reflects the versatility of cell layouts to placement and routing demands, which influences optimizing block design. By using Cell-Flex metrics as objectives in designing cell layout, we achieve a 13.2% reduction in block area without increasing total Design Rule Violations (DRVs). We develop a Machine Learning (ML) model using Kolmogorov–Arnold Networks (KAN) that utilizes the Cell-Flex metrics as features to make DRV prediction. By adding Cell-Flex features, we improve accuracy from 0.65 to 0.79 and F1 score from 0.52 to 0.78, demonstrating that our metrics are important for DRV prediction and serve as robust indicators of cell layout quality.
2. "Scalable CFET Cell Library Synthesis with A DRC-Aware Lookup Table to Optimize Valid Pin Access", Ting-Wei Lee, Ting-Xin Lin and Yih-Lang Li (National Yang Ming Chiao Tung University) [abstract] [slides] [video]
Abstract: With the advent of CFET technology, which stacks P and N transistors together, the number of available tracks in a cell decreases. This poses a substantial challenge of hard-to-access pins during upper-level routing, which has been addressed in previous works by lengthening IO pins and increasing the spacing between adjacent IO pins. However, upper-level routing may generate DRC violations around IO pins in a cell, which compromises these efforts to improve pin accessibility. To overcome this challenge, we propose a scalable satisfiability modulo theories-based cell routing that establishes a DRC-aware scheme to enumerate potential DRC violations, enabling pin accessibility to be improved without producing DRC violations in upper-level routing. Our experimental results demonstrate that the proposed CFET cell generator is 100 times faster than previous work on average while delivering the same or better cell quality in terms of cell area. The scalability of the proposed method allows for the synthesis of large cells, including high-driving-strength cells and multi-bit flip-flops (MBFFs). Moreover, compared to previous work, the proposed method reduces DRC violations by an average of 99% in upper-level routing, and reduces both wire length and via usage effectively as well.
3. "LVFGen: Efficient Liberty Variation Format (LVF) Generation Using Variational Analysis and Active Learning", Junzhuo Zhou, Ting-Jung Lin, Haoxuan Xia, Li Huang, Wei Xing and Lei He (UCLA, Ningbo Institute of Digital Twin, Eastern Institute of Technology, Shenzhen BTD Technology Co., Ltd., The University of Sheffield) [abstract] [slides] [video]
Abstract: As transistor dimensions shrink, process variations significantly impact circuit performance, signifying the need for accurate statistical circuit analysis. In digital circuit timing analysis, the Liberty Variation Format (LVF) has emerged as an industry-leading representation of timing distributions in cell libraries at 22 nm and below. However, LVF characterization relies on the Monte Carlo (MC) method, which requires excessive SPICE simulations of cells with process variations. Similar challenges also exist for uncertainty propagation and quantification in chip manufacturing and the broader scientific communities. To resolve this foundational challenge, this paper presents LVFGen, a novel method that reduces the simulation costs of MC while generating a high-accuracy LVF library. LVFGen utilizes an active learning strategy based on variational analysis to identify process variation samples that impact timing distributions more significantly. Compared to the state-of-the-art Quasi-MC method, LVFGen demonstrates an overall 2.27× speedup in LVF library generation within an accuracy level of 5k-sample MC and a 4.06× speedup within a 100k-sample MC accuracy.
4. "Abuttable Analog Cell Library and Automatic AMS Layout", Tianjia Zhou, Cheng Chang, Lei He, Li Huang, Jingyun Gu, Zexin Ji, Hailang Liang, Ting-Jung Lin, Zhanfei Chen, Xiangyang Liu, Song Wang, Zhengping Li and Na Bai (University of California, Los Angeles, BTD Technology, Inc, Ningbo University, Ningbo Institute of Digital Twin, Eastern Institute of Technology, Anhui University) [abstract] [slides] [video]
Abstract: State-of-the-art analog circuit design mainly applies a full-custom layout methodology. This demands high expertise and a heavy manual workload. Additionally, the resulting layout cannot be reused easily across different designs or different PDKs. Learning from digital standard cells, existing work has proposed analog stem cells that are abuttable. But stem cells have a fixed 2× area overhead over same-sized Pcells, limiting their wide application. In this paper we develop a new type of abuttable analog cells (called Acells) for transistors and passive elements. Acells are compatible with digital standard cells and can be abutted in all directions, enabling the use of automatic digital place and route (PnR) engines. We automate Acell generation and show that the average area ratio over same-sized Pcells is 1.49 for 65nm technology and 1.3 for 28nm technology, and is expected to decrease for more advanced technologies. We then use digital PnR to automatically generate layouts of several analog and mixed-signal (AMS) circuits, mainly in 28nm. Compared to Pcell-based manual layout, Acell-based layout obtains similar performance and its circuit-level layout area is about 2% higher for large-scale AMS circuits in our experiments.
14:30 - 14:40: Break
14:40 - 15:40: 3D IC Part I
Chair: Piyush Verma (Synopsys)
1. "Placement-Aware 3D Net-to-Pad Assignment for Array-Style Hybrid Bonding 3D ICs", Pruek Vanna-Iampikul, Junsik Yoon, Chaeryung Park, Gary Yeap and Sung Kyu Lim (Burapha University, Synopsys, Georgia Institute of Technology) [abstract] [slides] [video]
Abstract: Hybrid bonding is emerging as a key technology for 3D integration, offering finer bonding pitches that address the high interconnect density requirements of modern VLSI applications. In advanced node technologies, where metal pitches are significantly smaller than bonding pitches, 3D net assignment becomes critical for achieving optimal design performance. Existing approaches primarily focus on either ensuring the legality of the assignment or optimizing the flexibility of 3D net locations for timing purposes in isolation. This limitation restricts the performance improvements of 3D designs over traditional 2D counterparts. To overcome these challenges, we introduce AnchorGrid, a novel 3D net assignment framework designed to concurrently assign 3D nets to legal locations while supporting their movement to enhance timing optimization. By modeling 3D nets as pairs of specialized “anchor” cells, accompanied by relative placement constraints, precise movement and alignment are achieved during the pre-route optimization phase, before final placement onto grid-based locations. Experimental results on advanced node commercial designs demonstrate that AnchorGrid achieves up to a 24.35% improvement in power, performance, and area (PPA) metrics, while reducing design rule check (DRC) violations by 90%, outperforming state-of-the-art methods.
2. "Invited: Physical Design for Advanced 3D ICs: Challenges and Solutions", Bei Yu (CUHK) [abstract] [slides]
Abstract: As technology scaling predicted by Moore’s law slows down, 3D integrated circuits (3D ICs) have emerged as a promising alternative to enhance performance while maintaining cost-effectiveness. With the advancement of fabrication and bonding technologies, wafer-level 3D integration enables fine-grain 3D interconnects that maximize the benefits in power, performance, and area (PPA). However, a multitude of challenges have obstructed traditional electronic design automation (EDA) methodologies for 3D IC implementations. This paper surveys the major challenges in the physical design of advanced 3D ICs. We provide a comprehensive review of existing solutions, analyzing their advantages and disadvantages in depth. Finally, we discuss open problems and research opportunities in the development of native 3D EDA tools.
3. "Invited: Chiplet-Based Integration - Scale-Down and Scale-Out", Boris Vaisband (University of California, Irvine) [abstract] [slides]
Abstract: Motivation: The demand for increased computation and memory in applications such as large language models has increased well beyond the reticle boundaries of a system-on-chip (SoC). Chiplet-based integration is a paradigm shift that shapes the way we design our future high-performance systems. The concept is to move away from large SoCs that are limited by communication, thermal design power, and reticle size, toward a robust plug-and-play approach, where small, hardened IP heterogeneous off-the-shelf chiplets are seamlessly integrated on a single platform. Problem statement: Recent technological breakthroughs in advanced packaging platforms have enabled the integration of hundreds to thousands of chiplets within a single platform. Nonetheless, building a functional and efficient ultra-large-scale high-performance computation system requires overcoming important system-level design challenges, including short- and long-range communication, power delivery and thermal management, testing, synchronization, and hardware security. Approach: In this talk, we will discuss the current state-of-the-art and challenges in chiplet integration as well as the scale-down and scale-out concepts. We will introduce the silicon interconnect fabric (Si-IF), an ultra-large wafer-scale heterogeneous integration platform, for applications such as high-performance computing. We will discuss paths to address the system-level challenges for designing and integrating a high-performance computation system on the Si-IF.
15:40 - 16:00: Break
16:00 - 18:00: Lifetime Achievement Session
Chair: David Z. Pan (UT Austin) and Peipei Zhou (Brown University)
1. "Invited: Innovation in Times of Technology Disruption", Bryan Preas [abstract] [slides]
Abstract: I hosted Jason Cong when he was an intern at the Xerox Palo Alto Research Center in 1987. Since then, we have been friends and collaborators. I have watched his extraordinary accomplishments with pride and pleasure over the years. Disruptive technologies are a well-studied business school topic. New, or significantly improved, technologies present new problems and allow new approaches for older challenges. These disruptions are often accompanied by substantial technical innovation and creation of new business values. In their time, electric power, automobiles and television disrupted society. More recent examples include the internet and e-commerce. This theme provides a lens through which to view Jason’s enormous contributions to physical design automation. His intelligence, creativity, drive and a well-stocked toolbox allow him to find radically new solutions. Established organizations often focus on incremental changes while Jason pursues revolutionary improvements.
2. "Impact of FlowMap on my Research in Partitioning and Clustering", Martin Wong (Hong Kong Baptist University) [slides]
3. "Invited: Shaping the Future of Interconnected Physical Design", David Pan (University of Texas, Austin) [abstract] [slides]
Abstract: Physical design has been a cornerstone of electronic design automation (EDA) since the early days of chip and board development, with placement and routing at its core. By the late 1980s, the field was prematurely declared “dead,” as many believed its challenges had been resolved. However, the advent of deep submicron scaling in the 1990s revitalized physical design research, establishing it as an indispensable part of the design process. Today, with technology advancing to the 1x nm regime and the rise of 3D heterogeneous integration (3DHI), physical design remains pivotal in achieving design closure across power, performance, area, cost, and turnaround time (PPACT). Over time, physical design has transformed from its classical “place and route” framework into a more holistic and interconnected discipline, crosscutting into physical synthesis, design for manufacturing, 3DHI, analog and RF, emerging technologies, and AI/ML. The seminal contributions of Prof. Jason Cong have been instrumental in shaping the field of interconnected physical design. The seeds he planted have grown into thriving forests, with his academic descendants emerging as key leaders driving advancements in the field. This talk will explore the synergistic aspects of interconnected physical design and highlight Prof. Cong’s profound influence and legacy in shaping the future of the field.
4. "Invited: Coping with Interconnects", Jason Cong (UCLA) [abstract] [slides]
Abstract: In this paper, I review the multi-decade research on overcoming the performance bottleneck of VLSI interconnects in deep sub-micrometer and nanometer technologies that started at UCLA in the early 1990s. Our research spans from interconnect topology and geometry optimization, to wire length reduction via scalable placement, to use of novel interconnect technologies such as 3D IC and RF-interconnects, to recent work on interconnect pipelining in chiplet designs, and the shift from interconnect to entanglement in quantum computing. The latter two efforts go beyond the typical physical design space and involve space-time co-optimization. This paper is dedicated to multiple generations of Ph.D. students, postdocs, and visiting researchers who contributed
18:30 - 20:30: Banquet [video]
9:00 - 9:50: Keynote
Chair: Tung-Chieh Chen (Synopsys)
"Automation and Optimization of Heterogeneous Systems", Henry Sheng (Synopsys) [abstract] [slides]
Abstract: Advances in heterogeneous integration have enabled the creation of systems built from chips and interconnects using multiple silicon node technologies, package technologies, optical technologies, thermal mitigation, and more. The scale of integration has increased non-linearly with the pitch of die-to-die interconnect such as bump and hybrid bonds. Traditionally, many systems have been constructed as an 'assembly' of parts using manual layout techniques. Advanced package technologies have evolved to a point where they have achieved densities that challenge the viability of assembly-based manual methods as multi-die systems scale to 5X, 8X or even 40-60X reticle size at wafer scale. The scale complexity grows both with the densification of die-to-die connections as well as the increase of allowable system footprints. Furthermore, the integration of heterogeneous components in a single design mandates a workflow that fuses previously disconnected heterogeneous workflows and competencies together. These are secular shifts that are opening new classes of problems including migration from manual assembly to automated assembly, and then from automated assembly to optimization of system QoR (quality of results). This requires design automation tools to have a unified representation and treatment of heterogeneous systems and operate on optimization at a system scale across heterogeneous components for
9:50 - 10:50: 3D IC Part II
Chair: Bill Swartz (Timberwolf and University of Texas at Dallas)
1. "ML-Based Fine-Grained Modeling of DC Current Crowding in Power Delivery TSVs for Face-to-Face 3D ICs", Zheng Yang, Zhen Zhuang, Bei Yu, Tsung-Yi Ho, Martin D.F. Wong and Sung-Kyu Lim (Georgia Institute of Technology, The Chinese University of Hong Kong, Hong Kong Baptist University) [abstract] [slides] [video]
Abstract: In contrast to uniform distribution in power wires, actual currents tend to exhibit complicated crowding phenomena at the connections between through-silicon vias (TSVs) and power wires. The current crowding effect degrades power integrity and increases the difficulty of 3D IC power delivery network (PDN) analysis. Therefore, a detailed analysis of current distribution and IR drops in power TSVs within 3D IC PDN is important. This paper will explore the complicated current behavior within TSVs and PDNs of the promising face-to-face 3D IC architecture. Since existing simulation methods are computationally intensive and time-consuming, we propose a graph attention network-based (GAT-based) framework, with novel aggregation methods in the GAT models and informative fine-grained graph generation methods, to achieve efficient analysis of current crowding and IR drops in face-to-face 3D IC TSVs. For current density and voltage predictions, the proposed framework attains R² scores of 0.9776 and 0.9952 compared to ground truth results, respectively. Our framework also demonstrates over 837× speedup over ANSYS Q3D Extractor. Furthermore, the proposed framework outperforms other machine learning-based (ML-based) methods, including the state-of-the-art method.
2. "Invited: Modeling and Design Methodology for Backside Integration of Voltage Converters", Sung Kyu Lim (Georgia Institute of Technology) [slides] [video]
Abstract: Qubit mapping is crucial in optimizing the performance of quantum algorithms for physical executions on quantum computing architectures. Many qubit mapping algorithms have been proposed for superconducting systems recently. However, due to their limitations on the physical qubit connectivity, costly SWAP gates are often required to swap logical qubits for proper quantum operations. Trapped-ion systems have emerged as an alternative quantum computing architecture and have gained much recent attention due to their relatively long coherence time, high-fidelity gates, and good scalability for multi-qubit coupling. However, the qubit mapping of the new trapped-ion systems remains a relatively untouched research problem. This paper proposes a new coupling constraint graph with multi-pin nets to model the unique constraints and connectivity patterns in one-dimensional trapped-ion systems. To minimize the time steps for quantum circuit execution satisfying the coupling constraints for trapped-ion systems, we devise a divide-and-conquer solution using Satisfiability Modulo Theories for efficient qubit mapping on trapped-ion quantum computing architectures. Experimental results demonstrate the superiority of our approach in scalability and effectiveness compared to the previous work.
3. "Invited: Next-Generation Power Integrity Concepts and Applications for Physical Design", Emrah Acar (Ansys) [abstract] [slides] [video]
Abstract: The rapid pace of innovation in electronics, driven by advancements in computing, machine learning, and artificial intelligence, has created an unprecedented demand for more efficient and powerful computing platforms. As integrated circuits (ICs) continue to scale and integrate into increasingly complex systems, they consume more power, leading to significant challenges in power integrity. These challenges are further exacerbated by the growing complexity of modern IC designs, necessitating more intelligent and actionable approaches to ensure robust power delivery networks. This presentation introduces a novel methodology for addressing power integrity issues in next-generation IC designs. We propose a victim/aggressor interaction model as a foundational concept for IR drop analysis. This model enables the decomposition of IR drop in a victim instance into contributions from multiple aggressors and other components. By understanding these interactions, designers can implement corrective actions during the placement and routing stages, as well as enhance power connectivity at higher levels of the design hierarchy. We will discuss the foundational advantages of RedHawk-SC SigmaDVD™ Technology, a cutting-edge solution for power integrity analysis and signoff. SigmaDVD™ is designed to address dynamic voltage drop (DVD) issues at advanced process nodes, providing comprehensive coverage and enabling early detection and prevention of voltage-drop-related problems. This technology is instrumental in achieving robust power integrity signoff, fixing IR violations, and ensuring timing closure with high confidence. Key applications of SigmaDVD™ in physical design, IR/STA (Static Timing Analysis), and IR/ECO (Engineering Change Order) tools will be highlighted. 
The presentation will demonstrate how SigmaDVD™ is becoming the industry-leading method for avoiding DVD-induced voltage and timing problems, enabling shift-left prevention of voltage-drop issues, and delivering high-coverage power integrity signoff for advanced-node designs. By leveraging these next-generation concepts and tools, designers can achieve more efficient, reliable, and high-performance ICs, paving the way for continued innovation in the electronics industry.
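As a rough illustration of the victim/aggressor decomposition described in the abstract above, the sketch below splits the static IR drop at one node of a toy resistive grid into per-instance contributions using linear superposition. It is not the SigmaDVD method, which the abstract does not detail and which targets dynamic voltage drop; the three-node chain, the conductance and current values, and the function names `solve` and `decompose_drop` are all assumptions made for this example.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular conductance matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def decompose_drop(g, currents, victim):
    """Split the IR drop at `victim` into one term per current source.

    Drops satisfy g * drop = currents, so drop[victim] is the sum over j of
    inv(g)[victim][j] * currents[j]. Because g is symmetric, that row of the
    inverse is the solution of g * x = e_victim.
    """
    n = len(currents)
    unit = [1.0 if i == victim else 0.0 for i in range(n)]
    transfer = solve(g, unit)
    return [transfer[j] * currents[j] for j in range(n)]


# Hypothetical chain: pad --1 ohm-- n0 --1 ohm-- n1 --1 ohm-- n2
G = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, -1.0],
     [0.0, -1.0, 1.0]]        # nodal conductance matrix in siemens
I = [0.1, 0.2, 0.3]           # current drawn by each instance in amperes

parts = decompose_drop(G, I, victim=2)
total = sum(parts)            # total drop at n2; parts[2] is the self term
```

In this toy case the victim at n2 sees about 0.1 V from the instance at n0, 0.4 V from n1, and 0.9 V from its own current, which is the kind of per-aggressor breakdown a designer could use to decide where to move cells or strengthen the grid.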
10:50 - 11:10: Break
11:10 - 11:50: Contest Summary/Results
Chair: Stephan Held (University of Bonn)
"ISPD 2025 Performance-Driven Large Scale Global Routing", Rongjian Liang (Nvidia) [abstract] [slides] [video]
Abstract: Global routing is a critical aspect of VLSI design, significantly impacting timing, power consumption, and routability. The ISPD2024 contest focused on addressing the scalability challenges of global routing by leveraging GPU and machine learning techniques. Building on this foundation, the ISPD2025 contest introduces several important updates to better reflect real-world routing challenges. These updates include the provision of industry-standard input files for more precise modeling and integration with OpenROAD for accurate performance assessment. Collectively, these updates aim to bring the contest closer to practical routing scenarios, fostering the development of scalable and efficient solutions for large-scale chip designs.
Stephan Held (University of Bonn) [slides] [video]
11:50 - 12:00: Outlook to ISPD 2026
12:00 - 17:00: Social Outing
12:00 - 12:30: Depart from the hotel
13:15 - 14:00: The Salt Lick BBQ
14:30 - 16:30: City tour with two stops, the Texas State Capitol and the Greetings from Austin mural
17:00: Drop-off at the hotel