Across the three days of ISPD 2026, we have 4 keynotes, 14 accepted papers, 15 invited talks, one panel on Monday with 6 panelists, 5 longer talks in Professor Jens Lienig's commemorative session, and finally the ISPD 2026 contest results.
16:30: Free Arithmeum Tour, Lennéstr. 2, 53113 Bonn (registration required)
Explore the history of calculating machines and enjoy constructive art!
18:00: Welcome Reception in the Arithmeum
9:00 - 9:10: Opening
9:10 - 10:00: 1. Keynote
Chair: Stephan Held (University of Bonn)
"Chip Design in the Era of AI & Quantum", Leon Stok (IBM) [abstract]
Abstract: By 2030, we will design monolithic chips with hundreds of billions of transistors and 3D-integrated systems surpassing a trillion—pushing the boundaries of complexity and creativity. These designs will emerge from millions of lines of RTL, but will they be crafted by engineers, or by intelligent agents? Generative AI is no longer a curiosity; it is reshaping the EDA landscape. Are we witnessing the birth of an era where autonomous AI agents orchestrate entire design flows, transforming high-level intent into 2nm, DRC-clean, timing-optimized layouts—bypassing traditional toolchains? Or will quantum computing claim the crown, solving optimization problems once deemed intractable? Or will most of this still be a decade out? This keynote looks beyond incremental progress to a future where heterogeneous compute—CPUs, AI accelerators, and quantum processors—collaborate seamlessly. What breakthroughs are needed and what disruptions and paradigm shifts lie ahead as we try to reimagine the very nature of chip design?
10:00 - 11:00: 2. 3D IC Placement & Planning
Chair: Patrick Madden (Binghamton University)
1. "Technology-Aware 3D Placement with ILP-Based Region Planning for Soft Modules", Cheng-Xun Song, Minh Anh Phan, Sheng-Tan Huang, Shao-Yun Fang, Tung-Chieh Chen, Kai-Shun Hu and Chin-Fang Cindy Shen (National Taiwan University of Science and Technology, Synopsys) [abstract]
Abstract: With the advancement of 3D IC technology, multilayer chip stacking enables improved performance and reduced power consumption, albeit at the cost of increased design complexity. In 3D IC designs, soft modules are also adopted to enable flexible floorplanning across multiple dies. To preserve module integrity and simplify interconnects, the standard cells and macros within the same module are typically constrained to be placed on the same layer and thus use the same technology. In addition, determining an appropriate shape and placement location for each soft module is critical for minimizing wirelength. This paper presents the first technology-aware 3D placement framework for mixed-size designs that incorporates region planning for soft modules. It combines an analytical placement engine with an integer linear programming (ILP)-based region planning strategy to handle the flexible nature of soft modules. To further enhance placement flexibility while limiting region complexity, the ILP formulation identifies an optimal L-shaped or rectangular region for each soft module. Experimental results demonstrate that, compared to a baseline approach, our flow achieves over 21% wirelength reduction, with only a 5% overhead compared to placement without the soft module constraint.
2. "IDDA-3D: Inter-Die Delay Aware Timing-Driven Placement on Face-to-Face Bonded 3D ICs", Zixian Yang, Shanyi Li, Leilei Jin, Tsung-Yi Ho and Chien-Nan Liu (National Yang Ming Chiao Tung University, The Chinese University of Hong Kong) [abstract]
Abstract: 3D ICs extend integration freedom beyond post-Moore limits and can improve performance. Yet, existing true-3D placers remain primarily wirelength-driven, and partition-based 3D flows struggle to incorporate timing during design-space exploration. Prior 2D timing-driven approaches often rely on RSMT-based routing lookahead, which is unstable under z-moves and lacks a smooth objective for gradient-based optimization; simple net weighting further fails to capture path-level timing. We present IDDA-3D, the first timing-driven placement framework for face-to-face (F2F) bonded 3D ICs. IDDA-3D introduces a quadratic RC formulation that models intra-/inter-die driver-sink delay as a differentiable timing cost for analytical placement. The RC parameters are derived directly from the technology library, ensuring that the model reflects physical delay accurately and remains applicable across diverse technology nodes without manual tuning. To handle the discrete nature of die assignment, we employ a finite-difference approximation (FDA)-based gradient computation with preconditioning, which integrates seamlessly with the analytical placement engine. Experimental results show that IDDA-3D improves total negative slack (TNS) by up to 44% and worst negative slack (WNS) by 22%, while maintaining competitive wirelength and runtime compared with state-of-the-art true-3D placers.
3. "Multi-Level Interconnect Planning for Signal-Power-Thermal Integrity in 2.5D/3D Integration", Siyuan Miao, Lingkang Zhu, Xiangqiao Meng, Wenkai Yang, Chengyu Zhu, Chen Wu and Lei He (University of California, Los Angeles, The Hong Kong Polytechnic University, Shanghaitech University, BTD Technology Inc., Ningbo Institute of Digital Twin, Eastern Institute of Technology) [abstract]
Abstract: Chiplets are a promising architecture for high-performance AI computing, but their package-level interconnects create a tightly coupled multiphysics problem involving signal delivery, power delivery, and heat dissipation. This challenge is compounded by the need to co-optimize the interposer and substrate, which have divergent design rules and performance sensitivities. To address these challenges, we propose MIP-SPT, a framework for multi-level interconnect planning. We introduce a hierarchical variable scheduling strategy that decouples interposer and substrate variables, significantly reducing the search space. MIP-SPT then employs a multi-phase Bayesian optimization scheme to fully explore the streamlined design space. Crucially, our framework quantitatively models the effects of multiphysics coupling during planning to achieve rapid design closure. Experimental results show that our work reduces manufacturing cost by 22.4% compared to the baseline single-phase Bayesian optimization under equivalent design constraints. In addition, it outperforms two existing works, lowering interconnect cost by 23.1% and 18.1%, respectively.
11:00 - 11:10: Break
11:10 - 12:10: 3. Routing for Photonics and Advanced Packaging
Chair: Rickard Ewetz (University of Florida)
1. "LiDAR 3.0: Photonics-Aware Planning-Guided Automated Electrical Routing for Large-Scale Active Photonic Integrated Circuits", Hongjian Zhou, Haoyu Yang, Nicholas Gangi, Bowen Liu, Meng Zhang, Haoxing Ren, Xu Wang, Rena Huang and Jiaqi Gu (Arizona State University, NVIDIA, Rensselaer Polytechnic Institute, Cadence) (Best Paper Nominee) [abstract]
Abstract: The rising demand for AI training and inference, as well as scientific computing, combined with stringent latency and energy budgets, is driving the adoption of integrated photonics for computing, sensing, and communications. As active photonic integrated circuits (PICs) scale in device count and functional heterogeneity, physical implementation by manual scripting and ad-hoc edits is no longer tenable. This creates an immediate need for an electronic–photonic design automation (EPDA) stack in which physical design automation is a core capability. However, there is currently no end-to-end fully automated routing flow that coordinates photonic waveguides and on-chip metal interconnect. Critically, available digital VLSI and analog/custom routers are not directly applicable to PIC metal routing due to a lack of customization to handle constraints induced by photonic devices and waveguides. We present, to our knowledge, the first end-to-end routing framework, LiDAR 3.0, for large-scale active PICs that addresses waveguides and metal wires within a unified flow. We introduce a physically-aware global planner that generates congestion- and crossing-aware routing guides while explicitly accounting for the region of photonic components and waveguides. We further propose a sequence-consistent track assignment and a soft guidance-assisted detailed routing to speed up the routing process with significantly optimized routability and via usage. Evaluated on various large PIC designs, our router delivers fast, high-quality active PIC routing solutions with fewer vias, lower congestion, and competitive runtime relative to manual and existing VLSI router baselines; on average it reduces via count by ∼99%, user-specified design rule violations by ∼98%, and runtime by 17×, establishing a practical foundation for EPDA at system scale.
2. "Invited: Navigating the Frontier of Optimality and Complexity: Advanced Design Automation for Wavelength-Routed ONoCs", Ulf Schlichtmann (Technical University of Munich) [abstract]
Abstract: The sustained growth of high-performance computing (HPC) workloads places increasing pressure on on-chip communication, where data movement, bandwidth, and power budget have become primary limiting factors. Optical networks-on-chip (ONoCs), particularly wavelength-routed ONoCs (WRONoCs), offer a transformative path toward ultra-high-speed and energy-efficient communication. The design of a WRONoC involves a complex interplay between two main design aspects: logic connection design (topological design) and layout synthesis (physical design). Existing design methodologies either separate the two design aspects into two sequential steps, which is computationally affordable but prone to suboptimality, or perform concurrent optimization that is theoretically holistic but computationally intractable. Such inefficiencies limit the scalability and generalizability of WRONoCs. To address these challenges, this paper presents two advanced design methodologies that have demonstrated their effectiveness in synthesizing high-performance WRONoCs. By pre-integrating layout constraints into the topological design phase, the logic connections created by both methodologies can be compatible with physical design, thereby resolving the long-standing conflict between solution quality and synthesis overhead.
3. "Any-Angle Die-to-Die Routing for Advanced Packages with Asymmetric Pin Row Structures, Via Constraints, and Shielding-Aware Reservation", Hsin-Tzu Chang, Iris Hui-Ru Jiang, Hua-Yu Chang and Chun-Hao Lai (National Taiwan University, Synopsys) [abstract]
Abstract: Die-to-die (D2D) routing in advanced packages now faces unprecedented challenges due to extremely dense signal communications, large via size, and strict requirements such as teardrops, staggered vias and full shielding. These constraints severely limit routing resources and necessitate any-angle routing to fully exploit limited space, yet this flexibility drastically increases algorithmic complexity, particularly when dies exhibit irregular, asymmetric pin rows typical in heterogeneous integration. Existing works mostly focus on die-to-substrate (D2S) routing, primarily adopt fixed-angle routing, and most importantly, they all employ sequential route methodologies. As a result, they cannot handle the tight global resource coupling and geometric irregularity of dense D2D scenarios, leading to inferior performance in modern heterogeneous package designs. This paper introduces a global, concurrent any-angle D2D routing framework that unifies routing and via planning, directly incorporates staggered via and teardrop constraints, handles arbitrary asymmetric die structures, and reserves space for full shielding. Experimental results on industrial-inspired D2D benchmarks show that our approach achieves 100% routability in dense regimes, while also accommodating full shielding, reducing total wirelength and maintaining near-linear runtime scalability.
12:10 - 13:10: Lunch
13:10 - 14:10: 4. Placement and Global Routing
Chair: Mehmet Yildiz (Cadence)
1. "Gradient-Guided RC Weighting for Timing-Driven Global Routing", Liang Xiao, Qinkai Duan, Leilei Jin, Jinwei Liu, Tsung-Yi Ho, Evangeline F.Y. Young and Martin Wong (The Chinese University of Hong Kong, Hong Kong Baptist University) (Best Paper Award) [abstract]
Abstract: As a critical step in electronic design automation (EDA), global routing provides a guide to subsequent steps and provides valuable feedback to previous steps, including congestion, timing, and power estimation. However, given the complexity of timing and power calculation, it is difficult to estimate the impact on timing and power during the routing process. To address this issue, we propose a gradient-guided framework that computes the “capacity sensitivity” and “resistance sensitivity” of each segment to estimate their influence on the timing objectives. Integrating these two values as weights to constrain the changes in capacitance and resistance of the wire segments, we develop a timing-driven global router with superior performance. Power is also considered by optimizing the cells’ switching power. Tested on ISPD25 Contest benchmarks, we can achieve 14.3% and 18.5% improvements in worst negative slack and total negative slack, respectively, with comparable congestion. With power optimization, we can further improve switching power by 10.6%.
2. "GrandPlan: Differentiable, Simultaneous Top-Level Floorplanning and Partition-Level Cell Placement for Large-Scale IP-Cores", Zhili Xiong, Yi-Chen Lu, David Z. Pan and Haoxing Ren (The University of Texas at Austin, NVIDIA) [abstract]
Abstract: Top-level floorplanning is a critical step in industrial physical design, where the die is partitioned into exactly abutted regions with carefully allocated areas to enable efficient hierarchical place-and-route and achieve desired power, performance, and area (PPA) trade-offs. In current practice, however, floorplanning remains largely manual and sub-optimal, as designers rely on RTL hierarchy with limited physical guidance; commercial tools cannot feasibly perform flat optimization at IP-core scale. As a result, late-stage routability-driven partition resizing often triggers cascading boundary changes, disrupting neighboring partitions and significantly increasing turnaround time and engineering cost. To address this challenge, we present GrandPlan, a GPU-accelerated, differentiable, end-to-end framework that co-optimizes top-level floorplanning and partition-level cell placement within a single automated loop. Leveraging custom CUDA kernels, GrandPlan generates clean, rectilinear partition boundaries while concurrently placing macros and standard cells. The framework consists of three tightly coupled stages: (1) flat IP-core placement with differentiable grouping objectives, (2) boundary refinement via simulated annealing under area and routability constraints, and (3) routability-aware fence-region placement. Experiments on eight large-scale industrial IP-cores (up to 25M cells) show that GrandPlan reduces total wirelength by up to 14% and cross-partition (feedthrough) wirelength by 27% on average compared to human-expert-crafted baselines, with an average runtime of only 1.2 hours.
3. "Invited: BonnRoute: Classic Routing Algorithms with Recent Advances", Jens Vygen (University of Bonn) [abstract]
Abstract: BonnRoute is the routing tool developed by the University of Bonn in cooperation with IBM. It is based on algorithms that solve core subproblems optimally or near optimally. Here we review its basic approach and mention some of its core components.
14:10 - 14:20: Break
14:20 - 15:40: 5. Advanced Cell & Transistor-Level Design
Chair: Mark Ho (NVIDIA)
1. "A Graph-Based Approach for Optimizing Pin Access in Nanosheet FET Standard Cell Library Synthesis", Meng-Yu Shih, Ting-Xin Lin and Yih-Lang Li (National Yang Ming Chiao Tung University) [abstract]
Abstract: This paper addresses the challenges associated with standard cell synthesis for Nanosheet FET technology, particularly the constraints on M2 layer usage and the need to consider M0 and M1 layers in block-level routing. We propose a flexible synthesis flow that can dynamically switch between single-row and multi-row cell structures. To improve pin accessibility, we introduce a method for dynamic pin allocation on the M0 and M1 layers, along with techniques to limit M0 pin length and mitigate vertical pin access conflicts. Experimental results demonstrate that, under 70% core utilization, our cell library achieves an average 3.2% reduction in chip area, an average 97.6% reduction in design rule violations, and an average 16.5% decrease in wirelength compared to NS3K. Under the same chip area, our cell library achieves a 97.5% reduction in design rule violations and a 15.1% decrease in total wirelength.
2. "TransOpt: A Scalable Transistor-Level Placement and Routing Optimization Framework Beyond Standard Cells", Chen-Hao Hsu and David Z. Pan (University of Texas at Austin) [abstract]
Abstract: The standard-cell methodology has become the dominant paradigm in modern VLSI design due to its scalability and reusability. However, optimizing individual cells in isolation often yields suboptimal design-level results. Breaking the rigid abstraction of standard cells enables fine-grained optimization opportunities such as diffusion sharing and direct connections through gate or metal-to-diffusion layers, particularly in advanced technologies. Prior work on pure transistor-level placement without standard-cell abstraction has faced scalability challenges for larger designs. A practical alternative is to use standard-cell placement as an initial legal transistor placement and refine it locally at the transistor level. Furthermore, the existing transistor-level routing framework overlooks escape pin locations, leading to excessive upper-layer routing demand. To address these limitations, this paper presents TransOpt, a transistor-level optimization framework that refines standard-cell placements through local transistor-level placement refinement and escape-net-HPWL-aware routing optimization. Experimental results on benchmarks using the open-source 3 nm GT3 PDK demonstrate the effectiveness of the proposed transistor-level optimization techniques, achieving significant reductions in wirelength and via count compared with standard-cell baselines.
3. "A New Approach to Performance-Driven Analog IC Placement", Donghao Fang, Hailiang Hu, Wuxi Li and Jiang Hu (Texas A&M University, AMD) [abstract]
Abstract: A major obstacle in analog design automation is that circuit performance is sensitive to layout, yet accurately capturing this impact within layout tools is very expensive. To address this challenge, we propose a performance-driven analog IC placement approach, called VPlace, guided by machine learning. Our approach leverages a novel application of the VQ-VAE technique to improve robustness during the placement stage, in conjunction with a recent machine learning-based macromodeling method. We further demonstrate that data preparation strategies, which directly affect the efficiency of investigating the solution space, play an important role in determining both the accuracy of machine learning models and the resulting circuit performance. Experimental results show that VPlace achieves 22%-26% and 10%-16% performance improvements over an open-source analog layout tool and a prior machine learning–based performance-driven analog placement technique, respectively.
4. "Invited: Physical Synthesis/Layout Issues Specific to FPGA-based Emulation and Prototyping", Helena Krupnova (Synopsys) [abstract]
Abstract: Implementing digital designs on FPGA (Field Programmable Gate Array) platforms is one of the most important verification technologies ([5], [8]). It allows us to build functional prototypes six months or more before real silicon becomes available. Using the prototype, design teams can verify functionality, establish an efficient regression process in which new RTL is validated on a regular basis, develop and validate embedded software, and confirm performance aspects of the designed chips. Emulation and prototyping have become mandatory sign-off steps before silicon production. Thanks to software development on emulators, chips can boot operating systems and run real operation scenarios on day one of silicon bring-up. As digital designs cross multi-billion-gate sizes, the capacity of FPGA-based emulation and prototyping platforms needs to follow. Synopsys ZeBu emulation and HAPS prototyping solutions are based on AMD Xilinx FPGAs; the latest devices are AMD Versal™ Premium VP1902 parts. Several hundred FPGAs are required to build systems capable of implementing the latest AI chips, GPUs, CPUs, and SoCs. Building an emulation platform can be viewed from different perspectives. In this paper we describe the problems that arise when building the latest-generation emulation solutions and implementing customer designs on FPGA systems. Complexity is the main challenge: the hugely growing size of customer chips, as well as the complexity of the latest generations of FPGAs.
15:40 - 16:00: Break
16:00 - 16:50: 6. Keynote
Chair: Gracieli Posser (Cadence)
"How Optical Lithography Enables the Digital Age", Thomas Stammler (Zeiss) [abstract]
Abstract: From a few thousand transistors in the Apollo Moon lander’s guidance computer to today’s AI chips with tens of billions of transistors, semiconductor scaling has been powered by optical lithography. Continuous improvements in resolution have posed extreme requirements for optical systems operating at nanometer precision. Building these systems demands two core competencies: excellence in optical design and engineering, and mastery of mechatronics for ultra-stable positioning. Manufacturing relies on advanced processes—grinding, polishing, ion beam figuring, coating—and on advanced metrology techniques to achieve sub-nanometer accuracy. Optical lithography remains indispensable, not only for its precision but also for its unrivaled data throughput, transferring vast amounts of pattern information in a single exposure. This talk will trace the evolution of lithography, highlight the engineering and manufacturing innovations behind these optical systems, and show how they continue to enable the next generation of semiconductor devices.
16:50 - 18:00: 7. Panel: Agentic AI
Chair: Andrew Kahng (University of California, San Diego) [abstract]
Abstract: Recent advances in large language models (LLMs) and tool-using autonomous agents present new opportunities for accelerating research and development in physical design. Unlike earlier uses of machine learning that focused narrowly on prediction or optimization subroutines, agentic AI systems can comprehend user specifications, modify code, run EDA tools, analyze results, perform multi-step reasoning, and iteratively refine design heuristics. This paper surveys the emerging landscape of agentic AI for physical design R&D, with emphasis on (i) tool-integrated agents for algorithm evolution, debugging, and workflow automation, (ii) autonomous exploration of heuristic spaces in placement, routing, and partitioning, and (iii) interfaces between agents and traditional EDA frameworks. We analyze recent experience with multi-agent workflows and benchmark evaluation, highlighting current capabilities, limitations, and research frontiers. We conclude by articulating the long-term prospects of agentic AI as a catalyst for accelerated innovation in physical design, including autonomous algorithm discovery, continuous tool improvement, and closed-loop learning from large design corpora.
Panelists:
Igor Markov (Synopsys)
Bei Yu (The Chinese University of Hong Kong) [abstract]
Abstract: Large language models have shown remarkable potential for electronic design automation (EDA), yet building effective LLM systems for EDA remains challenging due to complex tool-specific terminology and documentation. This paper surveys knowledge injection techniques that infuse domain expertise into LLM systems for EDA. We examine three complementary approaches: finetuning, which encodes EDA knowledge into model parameters through training on domain corpora and synthetic data; retrieval-augmented generation (RAG), which dynamically retrieves from external knowledge bases; and multi-agent flow, which decomposes complex tasks across specialized agents and leverages environment feedback for iterative refinement. As a case study, we present a graph-based RAG approach that addresses global queries requiring cross-chunk reasoning. The method trains document-customized embeddings via contrastive learning on knowledge graphs, detects semantically related entities using HDBSCAN clustering, and generates textual summaries integrated through hybrid retrieval. Experiments on OpenROAD documentation demonstrate significant improvements in answering global queries while maintaining local query performance. These findings highlight that domain customization is essential for effective knowledge injection, and graph-based techniques are particularly promising as they inherently encode domain knowledge through entity extraction and relationship modeling.
Mark Ho (NVIDIA) [abstract]
Abstract: Large language models (LLMs) excel at solving complex tasks by executing agentic workflows composed of detailed instructions and structured operations. However, building agents for diverse applications by manually embedding foundation models into agentic systems such as Chain-of-Thought, Self-Reflection, and ReACT through text interfaces limits scalability and efficiency. Recently, researchers have explored automating workflow generation using code-based representations, but most methods depend on labeled data, limiting their applicability to real-world, dynamic hardware design problems. We introduce Polymath, a self-improving agent with a dynamic hierarchical workflow that combines task flow graphs with code-represented workflows to address these challenges. Polymath employs an experience-driven optimization framework that integrates multi-level graph optimization using surrogate scores from historical evaluations with a self-reflection-guided evolutionary algorithm for workflow refinement, enabling unsupervised self-improvement without labeled data. Experiments show that Polymath outperforms a leading commercial agentic system by 16.23% pass@1 and 11.47% pass@3 on hardware benchmarks, and achieves an average 8.1% improvement over state-of-the-art baselines on coding, math, and multi-turn QA tasks.
Chuck Alpert (Cadence)
Ankur Gupta (Siemens) [abstract]
Abstract: Due to the increasing complexities of chip design and shrinking time-to-market windows, design methodologies must scale beyond traditional solutions. Today, AI-powered EDA is redefining physical design methodologies, enabling unprecedented gains in productivity as well as power, performance, and area (PPA) results of silicon products. The past 12 to 18 months have seen the acceleration of generative and agentic AI capabilities being integrated into EDA tools, enabling engineering teams to automate tasks, optimize decision-making, and accelerate design closure, achieving a 10x productivity boost while significantly improving PPA. The result is a significant improvement in engineering scalability and time-to-market. In this panel segment, we discuss how AI-powered physical design methodologies are reshaping EDA workflows, transforming interactions between silicon designers and EDA software, and what this means for the future of chip design.
9:00 - 9:50: 8. Keynote
Chair: David Chinnery (Siemens)
"Use of AI/ML in Electronic Design Automation and Engineering Simulation", Prith Banerjee (Ansys/Synopsys) [abstract]
Abstract: This talk will describe how AI/machine learning is being applied to the fields of engineering simulation and electronic design automation. First, we will describe how AI/ML can be used to speed up engineering simulation by developing surrogate models, training neural networks on actual multi-physics simulations over various CAD models and different boundary conditions. This requires customers to train the AI models using an AI platform. Second, we will discuss how to develop foundational models for simulation by training AI models over a wide range of CAD models and boundary conditions. This approach does not require the customer to train any AI models; instead, they can use pretrained models on any given CAD geometry. Third, we will discuss how AI/ML models are used to make simulation tools easier to use by automatically setting the parameters of the simulation tools in order to get the best performance and accuracy. Fourth, we will discuss how generative models can be used to explore new designs and optimize product designs. Next, we will explore how AI/ML methods are used in electronic design automation tools such as placement, routing, synthesis and verification using reinforcement learning. These techniques have been used for design space optimization, analog space optimization, verification space optimization, and test space optimization. Finally, we will conclude the talk by showing how agentic AI techniques can be used in engineering simulation and EDA to automate various tasks and workflows using the concept of “Agent Engineers”.
9:50 - 10:50: 9. AI/LLM in Physical Design
Chair: Evangeline Young (The Chinese University of Hong Kong)
1. "AstroTune: AST-Assisted LLM Retrieval for Cross-Stage Design Flow Parameter Tuner", Runzhi Wang, Jingyu Pan, Yiran Chen and Jiang Hu (Texas A&M University, Duke University) [abstract]
Abstract: Modern VLSI design relies on EDA tools, which expose designers to high-dimensional and complex parameter spaces. Efficiently optimizing these parameters remains challenging, as manual tuning is time-consuming and heavily dependent on expert experience. Recent advances in automatic parameter tuning utilize conventional tuning algorithms and machine learning approaches, but most still rely on slow, sequential iterative adjustments and struggle to retrieve relevant prior design knowledge when semantic information is modified or obscured. We present AstroTune, a structure-aware framework for LLM-assisted parameter tuning in chip design flows. By integrating both RTL source code and its abstract syntax tree (AST), AstroTune can retrieve and transfer design knowledge based on both semantic and structural relationships, remaining effective even if semantic information is modified or obscured. During execution, AstroTune employs stage-by-stage multi-candidate generation, propagation, and pruning via tournament selection to accelerate iterative tuning. Experiments on public benchmark suites show that AstroTune achieves superior tuning quality and substantial runtime reduction compared to state-of-the-art works.
2. "CHASE: A CHiplet Architecture Simulation and Exploration Framework with Decoupled Multi-Fidelity Optimization", Shixin Chen, Hengyuan Zhang, Jianwang Zhai and Bei Yu (The Chinese University of Hong Kong, Beijing University of Posts and Telecommunications) [abstract]
Abstract: Chiplet-based architecture is a promising emerging technology with benefits in cost, reusability, and performance. However, designing a complicated system that fulfills comprehensive design metrics is challenging, and designers frequently suffer from tedious evaluation iterations. We propose the CHASE framework, i.e., a CHiplet-based Architecture Simulation and Exploration framework, which jointly considers both performance metrics and manufacturing. In the framework, the simulation component ChipletSIM offers holistic modeling of chiplet-based architectures, integrating critical performance metrics (e.g., latency, power) and manufacturing metrics (e.g., yield, cost) across design stages. The exploration component, ChipletDSE, adopts a decoupled multi-fidelity exploration strategy to boost design exploration efficiency and reduce resource consumption. Our framework substantially improves the probability of attaining optimal designs in the early design phase via a comprehensive simulation process and an efficient exploration approach. Compared to previous methods, the experimental results demonstrate the effectiveness of the CHASE framework in comprehensive simulation and efficient exploration.
3. "Invited: Bidirectional Data Flow in VLSI Design Using Ontology and Knowledge Graph", Ilhami Torunoglu (Siemens) [abstract]
Abstract: Modern semiconductor design complexity continues to rise at an unprecedented pace, driven by advanced-node lithography limits, heterogeneous integration, and interconnect-dominated delays. These pressures expose limitations in unidirectional flows, where early design choices lack physical context and manufacturability insight, causing repeated power, performance, and area (PPA) divergence in place-and-route. We present a comprehensive bidirectional EDA framework that unifies front-end and back-end processes using early physical awareness, realistic wire resistance and capacitance (RC) based timing and power models, congestion prediction, and manufacturability-driven feedback. Our methodology enables teams to predict physical outcomes earlier, propagate manufacturing constraints upstream, and reduce costly iteration loops. At its core, a physical-aware synthesis stage ingests floorplan geometry, macro constraints, routing blockages, and preliminary RC extraction to raise timing correlation with post-route results. Treating congestion and routing resource availability as first-class optimization inputs reduces bottlenecks and minimizes disruptive engineering change orders (ECOs).
10:50 - 11:10: Break
11:10 - 12:30: 10. Advanced Synthesis: From Classical to Quantum Architectures
Chair: Pavlos Matthaiakis (Synopsys)
1. "Invited: Improving Runtime Scaling in the EDA Flow for Designs with Millions of Gates", David Chinnery (Siemens) [abstract]
Abstract: Digital circuits have scaled to billions of gates, but synthesis and automated place-and-route (SAPR) tools are practically limited to several million gates, beyond which runtimes stretch to weeks. Server hardware advances, improvements in software and algorithms, and parallelism, e.g., multi-threading, have provided modest speedups of about 10x over the past 20 years. However, SAPR software is difficult to parallelize: significant overheads arise from single-threaded synchronization, mutual exclusion (mutex) locks when reading from and writing to the database to avoid collisions, measures to ensure deterministic results, etc. This work discusses why SAPR runtime scaling has been limited, the approaches that have been taken to tackle it and why they have been insufficient, and what is possible in the future. Amdahl’s law limits the speedups achievable with massively parallel techniques such as GPU acceleration, which are practically limited to isolated portions of the SAPR flow. Flow restructuring, multi-objective optimization, and partitioning large designs to run in parallel on separate compute servers provide further opportunities for speedup, but are suboptimal vs. optimizing the design as a whole. As an example, partitioning to speed up swapping between and reordering of scan chains provides up to 3.5x speedup at the cost of up to 10% increased scan-chain wire length vs. the fastest results on hard testcases with high runtimes.
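The Amdahl's-law ceiling mentioned in the abstract is easy to quantify. The sketch below uses illustrative parallel fractions, not figures from the talk:

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Overall speedup when only `parallel_fraction` of the runtime parallelizes."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / workers)

# If 90% of a flow parallelizes, 8 threads give well under 8x, and no
# number of workers can exceed the 1 / (1 - 0.9) = 10x ceiling.
for workers in (8, 64, 10**6):
    print(workers, round(amdahl_speedup(0.9, workers), 2))
```

With a 90% parallel fraction the speedup saturates near 10x, which is why accelerating isolated flow stages (e.g. on GPUs) cannot by itself fix end-to-end SAPR runtimes.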
2. "Invited: Challenges and Opportunities in Advanced-node Design Closure", Will Reece (Cadence) [abstract]
Abstract: Electronic Design Automation (EDA) software tools are indispensable for designing and verifying modern semiconductors. The end of Moore’s Law and the advent of dedicated hardware for AI acceleration have led to designs that are physically large and have unprecedented instance counts. Both in mobile and other applications, power consumption has become key for many projects while achieving timing closure with high performance is as important as ever. Commercial SoCs often combine many different functions, which means many IP blocks need to be optimized and integrated together, providing further challenges for designers. The rapid improvement of AI-based techniques provides hope for further automation and chip performance improvements as EDA tools and design workflows are updated to exploit them. While significant gains have already been made, the potential for further development remains high. Improvements in classic EDA approaches are still relevant and can be complementary to AI algorithms. These topics will be explored in the context of a rapidly changing chip design ecosystem, where the number of SoC designs is increasing and custom ASICs are deployed for many more applications.
3. "An Improved Ion-Shuttling Approach for QCCD Architectures", Tung-Yeh Wu and Ting-Chi Wang (National Tsing Hua University) [abstract]
Abstract: Trapped-ion quantum computers offer high-fidelity operations, long coherence times, and all-to-all connectivity within a single trap. However, scaling to large circuits is limited by the trap’s ion capacity. The QCCD architecture addresses this by enabling ion shuttling across multiple traps, but excessive shuttling increases execution time and degrades fidelity. This paper presents an improved ion-shuttling approach targeting linear QCCD architectures. Building on a state-of-the-art approach [27] for linear trapped-ion devices, our approach introduces a front-layer-based scheduling method that dynamically prioritizes executable gates, along with a novel destination trap selection method for ion-shuttling, effectively reducing unnecessary ion movements. We also present a look-ahead gate selection strategy based on the gate dependency graph, avoiding the high runtime of the gate proximity approach used in the state-of-the-art approach, especially in circuits with all-to-all communication. Our approach achieves up to a 37.67% reduction in shuttle count and up to a 74.91% reduction in compilation time compared to the state-of-the-art approach for linear trapped-ion devices on actual NISQ benchmarks and synthetic circuits. On average, our approach achieves a 16.44% reduction in shuttle count and a 30.77% reduction in compilation time across all test cases. These improvements highlight the effectiveness of our approach, particularly for actual circuits.
4. "Timing-Aware End-to-End Circuit Compilation Framework for Modular Quantum Systems", Ching-Yao Huang and Wai-Kei Mak (National Tsing Hua University) [abstract]
Abstract: To address the scalability challenges of quantum computing, the industry is shifting from monolithic architectures to modular quantum systems. By interconnecting multiple quantum processing units (QPUs) through communication links, modular quantum systems can scale to a much higher number of qubits. However, the delay of inter-QPU operations is an order of magnitude greater than that of intra-QPU operations. To minimize the final circuit latency, a well-considered initial assignment of logical qubits in the circuit to QPUs in the system and careful insertion of inter-QPU operations are required. In this paper, we propose a timing-aware end-to-end circuit compilation framework for modular quantum systems. In the placement stage, a dependency- and interaction-aware assignment strategy is developed to assign logical qubits that interact early and frequently to nearby QPUs. In the routing stage, we optimize the insertion of inter-QPU operations and multiple intra-QPU operations to enable the execution of the gates. The selection of both inter- and intra-QPU operations is based on their circuit latency overhead, which is estimated using the operation delay and the available idle time of operand qubits that can cover the delay, as well as their benefit to subsequent gates. Our experiments assumed a realistic modular quantum system consisting of three interconnected QPUs with a total of 1,386 physical qubits. We evaluated our framework using two benchmark sets consisting of reversible arithmetic circuits and algorithmic circuits, with up to 1,300 qubits and over 800,000 two-qubit gates. Experimental results demonstrate that our approach outperformed the state-of-the-art compilation approach for modular quantum systems, with reductions of over 73.8% in final circuit latency and 46.9% in the number of inserted inter-QPU operations. In addition, this advantage is preserved on a larger modular quantum system with four QPUs arranged in a square topology.
12:30 - 13:30: Lunch
13:30 - 14:30: 11. Reliability/Electromigration
Chair: Ramprasath S (Indian Institute of Technology Madras)
1. "Invited: Toward Accurate, Large-scale Electromigration Analysis and Optimization in Integrated Systems", Sachin Sapatnekar (University of Minnesota) [abstract]
Abstract: Electromigration, a significant lifetime reliability concern in high-performance integrated circuits, is projected to grow even more important in future heterogeneously integrated systems that will service higher current loads. Today, EM checks are primarily based on rule-based methods, but these have known limitations. In recent years, there has been remarkable progress in enabling fast EM computations based on more accurate physics-based models, but such methods have not yet moved from research to practice. This paper overviews physics-based EM models, contrasts them with empirical models, and outlines several open problems that must be solved in order to enable accurate physics-based and circuit-aware EM analysis and optimization in future integrated systems.
2. "Invited: Electromigration Avoidance Strategies in Infineon", Shanthi Siemes (Infineon Technologies Dresden) [abstract]
Abstract: In this presentation, I discuss the strategies used at Infineon to avoid electromigration. Technology is shrinking at a fast pace; metal widths are reducing and current density is therefore increasing, making electromigration (EM) a key concern for integrated circuit (IC) reliability. Understanding current density rules, Blech (short-length) rules, and where in the technology voids occur most often is a must to avoid EM [1]. Transition metal layers and transition vias in designs pose challenges to electromigration sign-off; these layers have to be carefully routed and analyzed for electromigration. Understanding the current density limits of each back-end-of-line layer also helps in carefully choosing the metal and via layers for different routes. EM is highly temperature dependent (Black’s equation [2]): a 5 K rise can reduce the current density limit by half, which is a challenge. As technology scales with more metal layers far from the substrate, thermal simulation is needed. Fins in FinFET technology nodes dissipate heat to the substrate poorly, so thermal analysis should be run for the entire design, and heat sinks should be carefully designed to mitigate heating and the resulting electromigration. Understanding the product mission profile [3] and using the equivalent temperature and lifetime for a realistic electromigration analysis is essential. It is important to choose a realistic extraction corner instead of a pessimistic worst-case corner to reduce overdesign.
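Black's equation quantifies the temperature sensitivity described in the abstract. The sketch below is illustrative only: the activation energy, current-density exponent, and temperatures are assumed values (not Infineon sign-off parameters), and the exact derating factor for a given temperature rise depends strongly on those choices:

```python
import math

K_B_EV = 8.617e-5  # Boltzmann constant in eV/K

def allowed_j_ratio(t1_k: float, t2_k: float, ea_ev: float, n: float) -> float:
    """Black's equation: MTTF = A * J**(-n) * exp(Ea / (k*T)).
    Holding the target lifetime fixed, the allowed current density scales as
    J(t2) / J(t1) = exp(Ea / (n*k) * (1/t2 - 1/t1))."""
    return math.exp(ea_ev / (n * K_B_EV) * (1.0 / t2_k - 1.0 / t1_k))

# Assumed parameters: Ea = 0.9 eV, n = 2, junction temperature 125 C -> 130 C.
derating = allowed_j_ratio(398.15, 403.15, ea_ev=0.9, n=2.0)
print(f"allowed current density scales by {derating:.2f}")  # < 1: limit shrinks
```

With these assumed parameters the 5 K derating is milder than the halving quoted above; larger activation energies or smaller current-density exponents make the limit collapse far faster, which is why realistic mission-profile temperatures matter so much for sign-off.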
3. "Invited: Addressing Electromigration Challenges in 3D Integrated Circuit (3DIC) Wafer-On-Wafer Technology", Ingo Kühn (Global Foundries) [abstract]
Abstract: In the realm of high-density multi-wafer designs, conducting Electromigration (EM) analyses alongside electro-thermal co-simulation (ETCosim) has become increasingly essential, primarily due to the proximity of heat sources within these systems. A wafer-on-wafer (WoW) technology (similar to the TSV-based approach in [3], but flipped) requires solutions for post-layout simulations, particularly for EM analyses and electro-thermal co-simulation. Parasitic extraction (PEX) is a crucial step that serves as the foundation for later post-layout simulations. The coupling capacitances between wafers are extracted into a separate entity that is then used in simulation. Subsequent multi-wafer EM analysis requires a multi-technology simulation setup, as the individual wafers might be implemented in different technologies. Addressing the thermal challenges inherent in 3DICs involves examining the interplay between EM and self-heating (SHE) effects. This effort is made mainly to avoid both under- and overdesign of the back end of line (BEOL) metallization. This paper proposes a potential solution to analog circuit EM with self-heating, extending single-wafer techniques to 3DIC.
14:30 - 14:40: Break
14:40 - 16:00: 12. Automotive/Analog
Chair: Andreas Krinke (Dresden University of Technology)
1. "Invited: Substrate Netlist Extraction in Analog Design", Klaus Heinrich (XFAB) [abstract]
Abstract: The undesired coupling between various circuits or devices on a chip due to active (bipolar) and/or passive (RC) elements within the substrate has been a known issue for many years. PN-Solutions SA, a Swiss EDA company, and X-FAB Semiconductor Foundries have jointly developed two innovative products, PNAware and PNAwareRC, that address these issues. Using these two tools, it is now possible to run a circuit simulation on the extracted post-layout view as usual, but including all substrate coupling effects.
2. "Invited: Novel Concepts to Improve Custom Layout Automation Capabilities", Goeran Jerke, Thomas Burdick, Peter Herth, Vinko Marolt, Andrew Beckett (Bosch, Cadence) [abstract]
Abstract: Custom layout remains predominant in the design of today’s Analog, mmW/RF, MEMS, Power MOSFET, Silicon Photonics, Advanced Package, and other non-digital integrated circuits. Automation in these domains is often impeded by inherent design complexity, uncertainty due to missing information, limited compatibility and capability, and established design paradigms. This paper provides novel perspectives and interdependent concepts to address several long-standing custom automation challenges. We identify partitioning, abstraction, reification, epistemic uncertainty, and dialectic motion as theoretical foundations to improve custom layout automation. Furthermore, we derive and detail generic, automation-enhancing elementary and practical application concepts, including design context consideration, mixins, reactivity, and representation. We demonstrate these improved automation capabilities through various application examples.
3. "Invited: Analog Computation with Oscillatory Neural Networks", Aida Todri-Sanial (Eindhoven University of Technology) [abstract]
Abstract: Dynamical systems exhibit rich and intricate behaviors that can be harnessed for physical computation. Physical computing draws inspiration from complex systems that continuously adapt, self-organize, and minimize energy as they evolve toward stable configurations, naturally enabling parallel processing. These characteristics show promise for tackling difficult scientific challenges, including NP-hard combinatorial optimization problems. However, designing dynamical systems for computation remains challenging, particularly in choosing appropriate technologies and developing scalable circuit implementations. This invited talk will provide an overview of circuit-level implementations of physical computing using coupled oscillatory neural networks (ONNs).
4. "Invited: Analog IC Design Automation -- More than a Technical Challenge", Benjamin Prautsch (Fraunhofer EAS) [abstract]
Abstract: The design of analog integrated circuits (ICs) and analog IC components is still a largely manual process, especially when compared to digital design. Despite significant efforts in this field, which address all major steps of analog design, the automation level remains relatively low. In this extended abstract, we discuss some potential reasons for this situation. Besides technical challenges, we also consider non-technical aspects, as we have observed that they play a crucial role in limiting the usage of analog automation. For most of the challenges, we suggest mitigations, which should be taken into account when developing analog EDA (electronic design automation). Analog complexity: In [1], digital and analog design are distinguished by quantitative and qualitative complexity, respectively; the authors refer to the number of elements in digital design versus the diversity of requirements to be considered in analog design. We suggest purposefully restricting the design and/or algorithmic freedom of analog design choices through design-methodological constraints, for instance with a slicing-template approach for analog layouts such as in [2]. The idea is to trade off EDA simplicity against result quality, as industry will often (but not always) economically value “good enough” over “best-in-class”. This way, EDA can also be used on abstraction levels closer to the system and thus help trade off more global design choices.
16:00 - 16:20: Break
16:20 - 18:00: 13. Lifetime Achievement Session
Chair: Patrick Groeneveld (AMD)
"Invited: A History of Influences", Jürgen Scheible (Reutlingen University) [abstract]
Abstract: This paper summarizes my presentation at the ISPD 2026 Lifetime Achievement Session honoring Professor Jens Lienig. Jens Lienig and I have known each other for almost four decades. During this period, we have worked together in various contexts, repeatedly influencing each other's interests and activities. In the following, I outline some milestones of this journey, highlighting some key results of Jens Lienig’s contributions to EDA.
"Invited: The Many PD Faces of Professor Jens Lienig", Andrew Kahng (University of California, San Diego) [abstract]
Abstract: The 2026 ISPD Lifetime Achievement Award honors Professor Jens Lienig for his tremendous decades-long impact on physical design research, education, and community. Professor Lienig’s career uniquely weaves algorithmic methodology, industry-driven foci, and pedagogical codification for future generations. It has both mirrored and driven the broadening of physical design’s objective functions and notions of correctness, from geometric checks to reliability- and physics-aware methodologies. This invited paper provides a few personal perspectives to complement the excellent review by Knechtel et al. [47]. Three “PD faces” of Jens Lienig are highlighted: (i) early evolutionary and parallel metaheuristics for combinatorial problems in physical design; (ii) textbooks whose arc spans system context through layout practice and physical design automation, alongside landmark monographs on electromigration-aware design and 3D integration; and (iii) influence on the trajectory, research and community of ISPD.
"Invited: From Evolutionary Algorithms to Analog IC Design, 3D Integration, Reliability, and Beyond: On Jens Lienig's Contributions to Advance Physical Design", Johan Knechtel (New York University Abu Dhabi) [abstract]
Abstract: The 2026 International Symposium on Physical Design (ISPD) honors Jens Lienig with the Lifetime Achievement Award, recognizing his multi-decade impact on physical design automation, education, and professional service. While the semiconductor industry has relentlessly pursued power, performance, and area scaling, Lienig’s research has consistently highlighted a fourth, critical dimension: robustness and reliability. This paper reviews the trajectory of his contributions, beginning with foundational work on evolutionary algorithms for routing in the 1990s, moving through the rigorous automation of analog constraint handling, and culminating in his pioneering research on electromigration-aware physical design. We further examine his contributions for automating physical design for 3D integration, in particular handling thermal and mechanical challenges, and his recent collaborations to establish security as an emerging physical design objective. Beyond his technical achievements, this paper acknowledges his profound influence as an educator, whose textbooks and curriculum reforms have bridged the gap between theoretical algorithms and industrial reality for a generation of engineers.
"Invited: Layout Design Automation: From Academia to Industry and Back", Jens Lienig (Dresden University of Technology)
19:00 - 21:00: Banquet
9:00 - 9:50: 14. Keynote
Chair: Iris Hui-Ru Jiang (National Taiwan University)
"Studying the Brain from the Perspective of EE", Lou Scheffer (Janelia Research Campus) [abstract]
Abstract: There are many important tasks where the brain outperforms our best existing technologies. Its low power consumption and high learning rates, for example, are achieved by algorithms and hardware that are only partially understood. Better knowledge of how the brain does these tasks will likely lead to commensurate performance improvements in our own technology. But how to understand the brain? The first part of this talk will summarize our current efforts in brain study, many using techniques borrowed from, and understandable to, EEs. The second part of the talk will extend this analysis to aspects of the brain that are not yet well understood, again borrowing tools and techniques from electrical engineering. The talk will end with some (hopefully informed) speculation on where this may lead.
9:50 - 10:50: 15. Benchmarking for EDA
Chair: Ismail Bustany (AMD)
1. "Invited: Benchmarker: A Web-Based System for Tracking Experimental Results", Rahul Rana, Tejas Bachhav, Aniruddha Dhumal, Ashutosh Pareek, Riya Sara Angel Korrapolu, Sathya Sai Ram Prabhala, Patrick Madden, Dishant Bhatnagar (Binghamton University) [abstract]
Abstract: Benchmarks have been a cornerstone of research in integrated circuit design. Well-defined problems and metrics have allowed research teams to address key challenges and measure the impact of new ideas and methodologies. Traditionally, researchers could follow a handful of conferences and journals to stay abreast of advances. Over the past few years, the research pace has increased, and the number of publication venues has significantly expanded; staying up-to-date has become much more challenging for active research groups, paper reviewers, editors, and for anyone with an interest in a particular topic area. To streamline the dissemination of research results, reduce the chance of errors and misunderstandings, provide timely corrections and clarifications, and to make new publications and results more easily visible, we present Benchmarker, a web-based system for tracking experimental results.
2. "Invited: Toward Sustainable and Transparent Benchmarking for Academic Physical Design Research", Liwen Jiang, Andrew Kahng, Zhiang Wang, Zhiyu Zheng (Fudan University, University of California, San Diego) [abstract]
Abstract: This paper presents RosettaStone 2.0, an open benchmark translation and evaluation framework built on OpenROAD-Research [1]. RosettaStone 2.0 provides complete RTL-to-GDS reference flows for both conventional 2D designs and Pin-3D-style face-to-face (F2F) hybrid-bonded 3D designs, enabling rigorous apples-to-apples comparison across planar and three-dimensional implementation settings. The framework is integrated within OpenROAD-flow-scripts (ORFS)-Research [2]; it incorporates continuous integration (CI)-based regression testing and provides a standardized evaluation pipeline based on the METRICS2.1 convention, with structured logs and reports generated by ORFS-Research. To support transparent and reproducible research, RosettaStone 2.0 further provides a community-facing leaderboard, which is governed by verified pull requests and enforced through Developer Certificate of Origin (DCO) compliance.
3. "Invited: Modern Hypergraph Partitioning: KaHyPar, Mt-KaHyPar, and Beyond", Sebastian Schlag, Christian Schulz, Tobias Heuer (Heidelberg University, eBay) [abstract]
Abstract: Hypergraph partitioning is a standard abstraction for VLSI circuit partitioning. While hypergraphs naturally capture multi-terminal nets, they also arise in a wide range of non-VLSI applications, including sparse matrix computations, scientific simulations, and data-intensive workloads. Historically, many successful hypergraph partitioning tools have been carefully tuned to the requirements of specific application domains, leaving a gap for general-purpose frameworks that deliver high solution quality and scalable performance across diverse classes of instances. Since balanced hypergraph partitioning is computationally intractable, heuristics are used in practice, with the three-phase multi-level paradigm being the most prominent approach. Within this paradigm, the choice and interaction of coarsening, initial partitioning, and refinement techniques largely determine the resulting trade-offs between solution quality and running time. This talk traces the development of modern hypergraph partitioning through three representative tools: KaHyPar [1, 2, 5, 11–13, 15, 16], Mt-KaHyPar [4, 6–10, 14], and HeiCut [3]. KaHyPar advanced the state of the art in general-purpose hypergraph partitioning by employing a highly fine-grained multilevel scheme together with a combination of localized and flow-based refinement techniques, complemented by sparsification and structure-aware coarsening strategies. Mt-KaHyPar extends these ideas to shared-memory parallelism, reshaping the time–quality trade-off landscape by parallelizing the techniques introduced in KaHyPar. Looking beyond heuristic methods, HeiCut illustrates how engineered exact algorithms can complement heuristic approaches by making minimum cut computations feasible at scale through optimality-preserving reduction techniques.
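The three-phase multilevel paradigm named in the abstract (coarsen, initially partition, refine, then project back) can be illustrated on a toy hypergraph. This is a deliberately minimal sketch of the scheme only; it is not KaHyPar's actual matching, gain-cache, or flow-based refinement machinery:

```python
from collections import defaultdict

def cut_size(nets, part):
    """Number of hyperedges spanning both blocks (the standard cut metric)."""
    return sum(1 for net in nets if len({part[v] for v in net}) > 1)

def coarsen(nets, weights):
    """Merge one unmatched node pair per net, smallest nets first."""
    matched, mapping = set(), {}
    for net in sorted(nets, key=len):
        free = [v for v in net if v not in matched]
        if len(free) >= 2:
            mapping[free[1]] = free[0]
            matched.update(free[:2])
    rep = {v: mapping.get(v, v) for v in weights}
    c_weights = defaultdict(int)
    for v, w in weights.items():
        c_weights[rep[v]] += w
    c_nets = [cn for cn in (frozenset(rep[v] for v in net) for net in nets)
              if len(cn) > 1]
    return c_nets, dict(c_weights), rep

def greedy_bipartition(weights):
    """Initial partitioning: place heaviest nodes into the lighter block."""
    part, loads = {}, [0, 0]
    for v in sorted(weights, key=lambda v: -weights[v]):
        b = 0 if loads[0] <= loads[1] else 1
        part[v], loads[b] = b, loads[b] + weights[v]
    return part

def refine(nets, part, weights, epsilon=0.1):
    """Greedy passes: keep a node move only if it cuts fewer nets within balance."""
    cap = (1 + epsilon) * sum(weights.values()) / 2
    improved = True
    while improved:
        improved = False
        for v in part:
            before = cut_size(nets, part)
            part[v] ^= 1  # tentative move to the other block
            load = sum(w for u, w in weights.items() if part[u] == part[v])
            if cut_size(nets, part) < before and load <= cap:
                improved = True
            else:
                part[v] ^= 1  # revert
    return part

def multilevel_bipartition(nets, weights):
    c_nets, c_weights, rep = coarsen(nets, weights)
    part = refine(c_nets, greedy_bipartition(c_weights), c_weights)
    fine = {v: part[rep[v]] for v in weights}  # project back to the fine level
    return refine(nets, fine, weights)

# A 6-node chain hypergraph: the best bipartition cuts a single net.
nets = [{0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}]
weights = {v: 1 for v in range(6)}
part = multilevel_bipartition(nets, weights)
print(part, "cut =", cut_size(nets, part))
```

Production partitioners replace each phase with far stronger machinery (heavy-edge matchings, portfolio initial partitioning, FM gain structures, and flow-based refinement), but the coarsen/partition/refine/project skeleton is the same.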
10:50 - 11:10: Break
11:10 - 11:50: 16. Contest Summary/Results
Chair: Tung-Chieh Chen (Synopsys)
"ISPD 2026 Contest: Post-Placement Buffering and Sizing", Andrew B. Kahng, Seokhyeong Kang, Sayak Kundu, Yiting Liu, Davit Markarian, Seonghyeon Park, Zhiang Wang (University of California, San Diego, Pohang University of Science and Technology, Fudan University) [abstract]
Abstract: The ISPD 2026 Contest [22] challenges participants to develop post-detailed-placement buffering and sizing tools that optimize timing and fix electrical rule check (ERC) violations under real-world constraints. Unlike prior contests, this contest emphasizes practical physical design challenges including fixed macros and I/Os, power delivery network (PDN) blockages, soft placement blockages, and fixed routing resources. The contest provides eight public benchmarks and four hidden benchmarks, ranging from 15K to 1.4M instances, in the ASAP7 7nm technology node [4] with multi-threshold voltage cell libraries. Evaluation is performed using the open-source OpenROAD infrastructure, with scoring based on timing (total negative slack), power (dynamic and leakage), and penalties for ERC violations, displacement, routing congestion, and runtime. This paper describes the contest problem formulation, benchmarks, evaluation methodology, a review of related contests, and a two-year roadmap for continuation in the ISPD 2027 Contest.
11:50 - 12:00: Outlook to ISPD 2027
12:00 - 17:00: Social Outing
We will travel by bus and rack railway to the top of the Drachenfels (Engl.: dragon's rock) south of Bonn. It is topped with the ruins of an 850-year-old castle and is one of Germany's most visited natural monuments. On the way we will have lunch in the wine cellar of the Bredershof, a 400-year-old farmhouse.
17:00: Free Arithmeum Tour, Lennéstr. 2, 53113 Bonn (registration required)
Explore the history of calculating machines and enjoy constructive art!