### International Symposium on Physical Design 2010

# Performance Study of VeSFET-Based, High-Density Regular Circuits

Yi-Wei Lin<sup>1</sup>, Malgorzata Marek-Sadowska<sup>1</sup> and Wojciech Maly<sup>2</sup>

<sup>1</sup>Dept. of ECE, University of California, Santa Barbara <sup>2</sup>Dept. of ECE, Carnegie Mellon University

### **Outline**

- Introduction
- High Density Regular Circuits
  - Vertical Slit Field Effect Transistor (VeSFET)
  - High-density regular transistor array
  - Parasitics of diagonal interconnects
- Cell Layout Study
  - Three different layout styles
  - D-net re-routing strategy
- Experiments
  - Cell level and circuit level comparisons
  - Cell replacement for metal layer optimization
  - Implementations with different transistor heights
- Conclusions

### Introduction

### Regular fabric

- Interactions between components are easier to model and analyze
- Device and metal masks can be shared
- Restricted layout constraints could lead to performance and area overhead.

### VeSFET-based high-density regular circuits

- Memory-like, super-regular transistor array
- Similar performance with much less area
- New design challenges arising from the unique layout characteristics

### Vertical Slit Field Effect Transistor (VeSFET)



#### VeSFET vs. 65nm CMOS Transistors

- Smaller driving current (larger resistance)
- Smaller transistor capacitance
- Lower power consumption

# High-Density Regular Transistor Array



- All connections must be made by wires.
- Transistor sizing requires connecting multiple unit transistors in parallel.
- All wires are atop transistor pins.
- Vias are needed for turning connections.
- All unit transistors are prefabricated and are of identical size
- All wires on the same layer are parallel

# **Diagonal Wire Connections**



# Parasitics of Diagonal Wires





- DW = $(1/\sqrt{2})$ W
- DL = $\sqrt{2}$ L

#### Dimensions of diagonal and horizontal/vertical wires

| Wire type                | Wire width | Unit spacing | Unit segment length |
|--------------------------|------------|--------------|---------------------|
| Diagonal (Dia)           | 70nm       | 70nm         | 282nm               |
| Horizontal/vertical (HV) | 100nm      | 100nm        | 200nm               |
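
As a quick check of the relations DW = W/√2 and DL = √2·L against the table, the sketch below recomputes the diagonal dimensions from the HV values. The function names are illustrative and not from the slides. It assumes the table reports values truncated to whole nanometres.

```python
import math

def diagonal_width(hv_width_nm: float) -> float:
    """DW = W / sqrt(2): width (and spacing) of a diagonal wire."""
    return hv_width_nm / math.sqrt(2)

def diagonal_length(hv_length_nm: float) -> float:
    """DL = sqrt(2) * L: unit segment length of a diagonal wire."""
    return hv_length_nm * math.sqrt(2)

dw = diagonal_width(100)   # HV width and spacing: 100 nm
dl = diagonal_length(200)  # HV unit segment length: 200 nm

# Truncate to whole nanometres (assumed to match the table's convention).
print(int(dw), int(dl))  # prints: 70 282
```

Under that truncation assumption, both values agree with the Diagonal row of the table.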



# Performance Effects of Interconnect Parasitics





#### When T<sub>c</sub> is switching

⇒ Gate net and D/S-net always switch in opposite directions



- For minimal height VeSFET:
  (b) has 41% more delay and PDP than (a)
- For 65nm CMOS:
  (b) has 13% more delay and PDP than (a)

### Different Layout Realization

#### Inverter E



• All nets are routed in M1 & M2

#### Inverter *F*



- Input net is routed in M1 & M2
- Output net, VDD/GND are routed in M3 & M4

Inverter E has 43% more delay and PDP than Inverter F

## Different Layout Styles

- Style-A: All D-nets are routed at M1 & M2
- Style-B: All D-nets are routed at M3 & M4
- Style-C: Only critical nets are routed at M3 & M4

(All G-nets are routed on M1 & M2)



# Performance Ratios for Different Layout Styles

## Performance ratios for 3-input 1X NAND gate in 3x4 footprint implemented in different layout styles

| Switching | Style-B |       | Style-C_X |       | Style-C_Y |       | Style-C_Z |       |
|-----------|---------|-------|-----------|-------|-----------|-------|-----------|-------|
| input     | Delay   | PDP   | Delay     | PDP   | Delay     | PDP   | Delay     | PDP   |
| X         | 0.862   | 0.857 | 0.878     | 0.871 | 0.911     | 0.897 | 0.968     | 0.971 |
| Y         | 0.820   | 0.809 | 0.963     | 0.968 | 0.850     | 0.846 | 0.934     | 0.927 |
| Z         | 0.787   | 0.779 | 0.968     | 0.966 | 0.887     | 0.877 | 0.796     | 0.789 |
| Average   | 0.826   | 0.819 | 0.933     | 0.931 | 0.883     | 0.874 | 0.905     | 0.902 |



- Coupling capacitances between serially connected D-nets (Ca & Cb) can accelerate the switching of the output net.
- Performance overhead of Style-C\_K is only around 1%~3% when input K is critical
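
The overhead figure can be read off the table by comparing, for each critical input K, the Style-C\_K delay ratio with the Style-B delay ratio in the same row. The sketch below does that arithmetic. It assumes all ratios share one normalization baseline (the slide does not name it), so the differences can be read as percentage points. The variable and function names are illustrative.

```python
# Delay ratios copied from the table above (key = switching input).
style_b = {"X": 0.862, "Y": 0.820, "Z": 0.787}
# Style-C_K delay ratio taken from row K, i.e. when input K is the critical one.
style_c_critical = {"X": 0.878, "Y": 0.850, "Z": 0.796}

def overhead_points(k: str) -> float:
    """Delay overhead of Style-C_K over Style-B for input K, in percentage points."""
    return round((style_c_critical[k] - style_b[k]) * 100, 1)

overheads = {k: overhead_points(k) for k in style_b}
print(overheads)  # {'X': 1.6, 'Y': 3.0, 'Z': 0.9}
```

The three delay differences fall between roughly 1 and 3 points, in line with the bullet above.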

### D-net Re-Routing Strategy (Flipping)



- A unit transistor U is even if $[C_X(U) + C_Y(U)] \% 2 = 0$.
- A unit transistor U is odd if $[C_X(U) + C_Y(U)] \% 2 = 1$.
- Unit transistors should have the same orientation to simplify critical nets routing at M3 & M4.
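
The parity rule above yields a checkerboard pattern over the transistor array. The sketch below is a minimal illustration, assuming that $C_X(U)$ and $C_Y(U)$ are the integer column and row indices of unit transistor U (the slides do not define them explicitly). The function names are hypothetical.

```python
def is_even(cx: int, cy: int) -> bool:
    """A unit transistor is even when (C_X + C_Y) % 2 == 0, odd otherwise."""
    return (cx + cy) % 2 == 0

def parity_map(cols: int, rows: int) -> list:
    """Label every position of a cols x rows array as 'E' (even) or 'O' (odd)."""
    return [["E" if is_even(x, y) else "O" for x in range(cols)]
            for y in range(rows)]

# A 3-row by 4-column array (size chosen only for illustration).
for row in parity_map(cols=4, rows=3):
    print(" ".join(row))
# E O E O
# O E O E
# E O E O
```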







### D-net Re-Routing Strategy (Clustering)

- Partition the circuit into serially connected cluster groups (two-colorable sub-graphs).
- Each cluster group is a *flipping unit* to find feasible unit transistor orientations.
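
Two-colorability of each group can be checked with a standard breadth-first search. The sketch below is a generic graph routine, not the authors' implementation. It assumes a hypothetical graph whose nodes are unit transistors and whose edges join transistors that must receive opposite colors. How that graph is built from the circuit is not specified in the slides.

```python
from collections import deque

def two_color_groups(nodes, edges):
    """Split a graph into connected groups and try to 2-color each one.

    Returns a list of (coloring, ok) pairs: coloring maps node -> 0 or 1,
    and ok is False when the group has an odd cycle (not two-colorable).
    """
    adj = {n: [] for n in nodes}
    for a, b in edges:
        adj[a].append(b)
        adj[b].append(a)

    seen = set()
    groups = []
    for start in nodes:
        if start in seen:
            continue
        color = {start: 0}
        ok = True
        seen.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in color:
                    color[v] = 1 - color[u]  # neighbor gets the opposite color
                    seen.add(v)
                    queue.append(v)
                elif color[v] == color[u]:
                    ok = False  # same color on both ends: odd cycle
        groups.append((color, ok))
    return groups

# Hypothetical example: two independent groups of unit transistors.
groups = two_color_groups(
    ["t1", "t2", "t3", "t4", "t5"],
    [("t1", "t2"), ("t2", "t3"), ("t4", "t5")],
)
for coloring, ok in groups:
    print(coloring, ok)
```

Each returned group can then be treated as one unit whose coloring is either kept or inverted as a whole.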



### Cell Level Comparisons

|             | 65nm CMOS |      | Style-B |      | Style-C |      | ISPD 09 |      |
|-------------|-----------|------|---------|------|---------|------|---------|------|
| Static Cell | Timing    | PDP  | Timing  | PDP  | Timing  | PDP  | Timing  | PDP  |
| INV 4X      | 1.02      | 2.38 | 0.77    | 0.76 | -       | -    | 1.03    | 1.03 |
| INV 8X      | 1.01      | 2.36 | 0.70    | 0.70 | -       | -    | 0.98    | 0.98 |
| 2-NAND 2X   | 1.09      | 2.47 | 0.78    | 0.77 | 0.80    | 0.80 | 1.06    | 1.06 |
| 2-NOR 2X    | 1.08      | 2.47 | 0.79    | 0.79 | 0.81    | 0.80 | 1.03    | 1.02 |
| 3-NAND 1X   | 1.05      | 2.42 | 0.83    | 0.82 | 0.84    | 0.84 | 1.01    | 1.00 |
| 3-NAND 2X   | 1.04      | 2.42 | 0.76    | 0.75 | 0.79    | 0.78 | 0.98    | 0.98 |
| AOI21 1X    | 1.11      | 2.50 | 0.88    | 0.88 | 0.91    | 0.90 | 1.02    | 1.02 |
| AOI21 2X    | 1.07      | 2.40 | 0.83    | 0.82 | 0.87    | 0.86 | 0.99    | 0.99 |
| AOI31 1X    | 1.12      | 2.49 | 0.86    | 0.86 | 0.89    | 0.87 | 0.99    | 0.98 |
| OAI31 1X    | 1.11      | 2.46 | 0.87    | 0.86 | 0.89    | 0.88 | 1.03    | 1.02 |
| AOI22 1X    | 1.15      | 2.53 | 0.89    | 0.89 | 0.93    | 0.93 | 1.07    | 1.06 |
| OAI22 1X    | 1.00      | 2.35 | 0.89    | 0.88 | 0.94    | 0.93 | 1.04    | 1.04 |
| Average     | 1.07      | 2.44 | 0.82    | 0.81 | 0.87    | 0.86 | 1.02    | 1.02 |



- AATR: Average Available Track Ratio.
- Style-A cells have 100% AATRs in both M3 & M4.

## Circuit Level Comparisons

- Style-A circuits: composed of Style-A cells
- Style-B circuits: composed of Style-B cells
- Initial Style-C circuits:
  - Modified from Style-B circuits
  - Replace each Style-B cell by a Style-C cell corresponding to the latest arriving fan-in signal.
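
The replacement rule for initial Style-C circuits amounts to picking, per cell, the variant named after its latest-arriving input. A minimal sketch follows. The arrival times are made-up values, and the function name and the `Style-C_<pin>` naming are assumptions for illustration.

```python
def pick_style_c_variant(arrival_times: dict) -> str:
    """Return the Style-C variant for the latest-arriving fan-in pin.

    arrival_times maps an input pin name to its signal arrival time.
    """
    critical_pin = max(arrival_times, key=arrival_times.get)
    return f"Style-C_{critical_pin}"

# Hypothetical arrival times (arbitrary units) at a 3-input NAND cell.
variant = pick_style_c_variant({"X": 0.12, "Y": 0.31, "Z": 0.27})
print(variant)  # Style-C_Y
```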

|         | CMOS circuits |       | Style-B circuits |       | Initial Style-C circuits |       | ISPD09 circuits |       |
|---------|---------------|-------|------------------|-------|--------------------------|-------|-----------------|-------|
|         | Timing        | Power | Timing           | Power | Timing                   | Power | Timing          | Power |
| Average | 0.92          | 2.57  | 0.85             | 0.86  | 0.86                     | 0.87  | 0.99            | 1.02  |

### Cell Replacement for Metal Layer Optimization



- Apply LP-based slack budgeting technique.
  - Each cell is represented by a binary variable (replaced or unchanged).
- Cell's input capacitances are taken into account.
- ΔL for Style-B circuits: reduced from 1.25 to 0.42.
- ΔL for initial Style-C circuits: reduced from 1.08 to 0.25.

# Comparisons for Different Transistor Heights

Transistor height of VeSFET = Transistor width of CMOS

## Normalized performance ratio of *Style-B* layouts with different transistor heights at cell and circuit levels

| Transistor Height | Cell Level |       | Circuit Level |       |
|-------------------|------------|-------|---------------|-------|
|                   | Timing     | PDP   | Timing        | Power |
| 200μm             | 0.821      | 0.814 | 0.851         | 0.864 |
| 400μm             | 0.897      | 0.886 | 0.917         | 0.923 |
| 600μm             | 0.945      | 0.929 | 0.972         | 0.979 |

- For low-power applications (smaller transistor heights): Style-B cells and Style-C cells could improve circuit performance.
- For high-performance applications (larger transistor heights): Style-A cells could reduce metal layer usage.

### Conclusions

- Interconnect parasitics significantly affect the circuit performance.
- Two performance improvement techniques
  - Critical-net re-routing strategy
  - LP-based cell replacement
- We have demonstrated a tradeoff between performance and metal layer usage

## Q & A

# Thank you!