## Design and Test of a Mixed-Signal Application-Specific Video Encoder

Haiming Jin Intel Corporation haiming.jin@intel.com

### Abstract

After decades of research and development that closely tracked advances in process technology, electronic design automation tools have reached a mature level. Design flows, which have been the guiding principle and driving force behind point tools' interoperability and quality, have become efficient workhorses for digital silicon designers. In this paper, we demonstrate, through the design of a mixed-signal multi-function video encoder integrated circuit, the critical role the design flow plays in the design and test of system-on-chip silicon products. A flow diagram is presented, accompanied by a step-by-step explanation. Emphasis is placed on the noise-isolating features of the mixed-signal SoC design. Various quality aspects of the design process are addressed in order to deliver a functional chip. In striving towards a "total-quality" design of integrated circuit systems, we stress the continuing need for quality silicon collaterals and IP components, along with their qualified interface models. Finally, we report the success of the flow introduced herein, as evidenced by silicon test results and by the time to delivery.

## 1. Introduction

As early as the seventies, when semiconductor device integration was still primitive relative to today's scale, research and development of design automation was already under way<sup>[2][3]</sup>. Since then, numerous EDA point tools have been developed across all areas of silicon design and verification. At the front end, circuit simulation<sup>[1]</sup> and graphical schematic entry tools were among the first EDA products. At the backend, routing<sup>[2]</sup> and placement<sup>[3]</sup> automation tools started to emerge. Capacity and performance were easier to manage then. However, as Moore's law became the ruling principle of the semiconductor industry, the scale of device integration increased steadily. Meanwhile, the design world became increasingly digital-oriented. As a result, the challenges facing design automation, especially digital design methodology, skyrocketed.

"A fundamental rule in technology says that whatever can be done will be done"<sup>[6]</sup>, and this holds true for the technical challenges facing digital design automation. Digital simulation languages such as Verilog<sup>[5]</sup> were invented, accompanied by language-aware simulation tools and, later, synthesis tools<sup>[4]</sup>. These creations provide an interface between front-end design, simulation, and backend implementation, opening the door to what we know today as the "RTL-to-GDSII" ASIC design and assembly flow. Before long, the demand for such a solution was turned into reality, and the flow has even become the driving force behind point tools' quality and interoperability. Now almost all semiconductor manufacturers and fabless design houses making building blocks for the digital world are equipped with various sets of EDA tools, and have polished their flows to produce quality electronic components efficiently.

Much like the "virtual factory" concept developed inside Intel<sup>®</sup>, baseline flows can be "copied exactly" among design groups and across design projects. The quality and reliability of products generated from such design flows or their derivatives are believed to be under better control. This is particularly so for designs that share the same process technology, since the design libraries and IP blocks, together with the flow, are pre-qualified for reuse.

In this paper, we demonstrate a typical design flow through the development of a sample mixed-signal video encoder ASIC on a TSMC 0.18um CMOS process. We also cover some of the important checkpoints where extra attention is usually needed during the IC design process. Post-silicon debug results are presented to illustrate the quality and effectiveness of the flow. We close the paper with our view of the ultimate quality electronic design.

The paper is organized as follows. Section 2 gives a brief overview of the system microarchitecture. Section 3 gives a detailed description of the design, assembly, and verification flow that delivered this IC. Section 4 highlights the key results of the design. Section 5 touches upon the vital issues of silicon component qualification. Section 6 shows the quality of the flow through test results on its product, and Section 7 gives our concluding remarks.

In the presented design flow, the majority of the digital creation and implementation tools are from Synopsys<sup>®</sup>. Our formal verification and IP qualification tools come from Cadence<sup>®</sup>. Modelsim is the functional simulation tool and Calibre is used for physical verification, both from Mentor Graphics<sup>®</sup>. Timing interface modeling and verification, as well as silicon debug analysis, are also handled by Synopsys<sup>®</sup> dynamic and static timing tools. Artisan Components<sup>®</sup> supplied the digital component libraries, including standard cells, standard I/O, and compiled memories, while Leda Systems<sup>®</sup> provided the analog IP blocks.

## 2. Architectural overview

In previous generations, the functions of this chip were spread across several IC components, including a DAC, one or more timing generator chips, and the digital video encoder. The goal of this design is to provide a monolithic solution for these functions on a process technology with a smaller feature size. The area and cost advantages are obvious.





As shown in Figure 1, the digital video encoder makes up the core of the digital system, along with the system management bus interface. Also on chip are the timing-generating clock synthesizers and a thermal sensor that monitors system temperature via two sensor diodes. Communication between the digital and analog portions of the chip is managed by the system bus interface manager, in the same way as communication between the chip and the external system. The target maximum operating frequency of the digital core is 100MHz.

Noise-sensitive analog operation requires that the PLLs, DAC, and thermal sensor each work strictly within an isolated environment with dedicated power supplies (clean 1.8 V and 3.3 V sources) and an analog ground. The digital core, which operates from a 1.8 V supply, is relatively noisy and must be kept physically apart from the analog IP blocks. We extend these protective measures further by having the analog components carry their own built-in I/O buffers. I/O break cells are inserted at the boundaries between the analog and digital pad areas. Effectively, the integrated circuit is divided into a digital section and an analog section.

In the sections that follow, we demonstrate, through the design implementation of this video encoder, the thorough chip and power planning required of an SoC design, the detailed design and verification steps needed to implement the blocks and assemble the chip, and the comprehensive timing modeling, signal integrity analysis, and fixes performed prior to timing sign-off.

## 3. Design and verification flow

The design and verification process of a semiconductor integrated circuit starts from the micro-architecture specification, which is usually part of the hardware system specification. Front-end designers code the RTL according to the micro-architecture spec. System validators set up test benches around the chip under development. The test benches are then run against the RTL to check whether it produces the desired functional outputs. Meanwhile, physical designers prepare design constraints at the full-chip level and design budgets for the major soft blocks. These constraints are interpreted as goals for path-based timing optimization. They are therefore key to the timing closure flow, from logic synthesis through physical optimization. After the constraints are defined, a netlist consisting of (placed) collateral components can be obtained through a flow step called (physical) synthesis. Once the netlist is available, or sometimes when a good portion of it is synthesized with the rest of the design encapsulated in skeletal form, we may start floor planning.
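To illustrate how a constraint acts as a goal for path-based timing optimization, the following minimal sketch computes setup slack against a clock period. Only the 100MHz target frequency comes from this paper; the path names, delay values, and setup time are hypothetical, and the function is a simplified textbook model rather than the calculation performed by any particular timing tool.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Path:
    name: str             # hypothetical path label
    data_delay_ns: float  # launch clock-to-Q plus combinational delay
    setup_ns: float       # setup time of the capturing flop


def setup_slack(path: Path, period_ns: float, skew_ns: float = 0.0) -> float:
    """Slack = required time - arrival time; a negative value is a violation."""
    required = period_ns + skew_ns - path.setup_ns
    return required - path.data_delay_ns


PERIOD_NS = 10.0  # 100 MHz target from the text

# Hypothetical paths, for illustration only
paths: List[Path] = [
    Path("enc/mac_to_acc", data_delay_ns=8.7, setup_ns=0.2),
    Path("bus/addr_decode", data_delay_ns=10.1, setup_ns=0.2),
]

violations = [p.name for p in paths if setup_slack(p, PERIOD_NS) < 0]
print(violations)
```

Paths with negative slack are the ones an optimizer would work on first, which is why accurate constraints matter from synthesis onward.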

Floor planning is the first and one of the most critical backend physical design steps. It forms physical constraints for the chip and its subblocks. It is often a compromise between board level design constraints including system data accesses, power supply locations, and the internal core design requirements such as data flow and timing. In this particular design, the clock synthesizers are lined up on the bottom of the die, with the crystal oscillator reference clock located at the upper right portion of the pad ring. DAC and thermal sensor are given the right section of the die. Break cells are inserted between the reference clock driver pad and the digital IO pads, as well as above the PLL1 I/O buffers at the bottom. As can be seen, an analog sub-section is now formed along the right and bottom portions of the die. The rest of the die belongs to the digital core. Power planning and macro placement are completed on this part of the die as part of floor planning. Full chip physical structure is demonstrated in Figure 2.

Once the floor plan is ready, we push the design through the automated "RTL-to-GDSII" flow to complete the digital core design and assembly. The flow diagram is shown in Figure 3. In our flow, DFT insertion is a default operation that facilitates post-silicon ATPG tests. RTL is mapped into a gate-level netlist after DFT synthesis. The netlist is then imported into the "physical design database", in our case Milkyway<sup>™</sup>, which serves as the central repository of design information such as the gate-level netlists, placement and routing information, and clock-tree structure. The physical design flow after floor planning starts with placement. With an initial placement, more accurate net delays can be obtained from the Steiner route distances of the interconnects. An in-place timing optimization can then be carried out to try to bring the maximum path delay within constraints. The clock tree and high fan-out nets are synthesized, followed by a post-placement optimization step that further closes in on max delays and fixes min delay violations. At this point, a formal verification step against the synthesized netlist is recommended to catch any unexpected errors that may have been registered into the database. If the design database is consistent, the flow continues on to the routing steps, in which power rails are connected to the straps and clock and signal nets are routed in compliance with the design rules. The design is then extracted and taken into the timing sign-off engine, where layout parasitics are annotated to the corresponding nets and path delays are checked against design constraints. ECOs are normally needed to bring timing to final closure. Formal verification between the final post-layout netlist and the synthesized netlist is performed again to ensure conformity to the original design. With all verification results clean, the database is ready for GDSII stream out and physical verification. At this point, the chip should be ready for tape out. It is highly recommended that a "post-layout" functional validation be performed on the delay-annotated final netlist. This further helps prevent implementation-induced errors from entering the fabrication process.



Figure 2. Full Chip Physical Architecture

It is always helpful to be extra careful by performing additional verification tasks along the flow. Formal verification can be run whenever there is a change in the netlist. Timing checks can be performed after a layout or netlist change. In case of violations, it is not necessary to go through the whole flow again. Backtracking through the design steps and exporting the corresponding netlist and/or layout information helps determine at which step the failure occurred. A fix at the failure point normally suffices. However, the rest of the flow must be completed again from the point of the design fix.
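The backtrack-and-resume idea can be sketched as a small driver that runs the stages in order, stops at the first failing check, and restarts from that stage once a fix is in place. This is an illustrative sketch only: the stage names paraphrase the steps described in this section and are not tool commands, and the `checks` dictionary and the `eco_applied` flag are hypothetical stand-ins for real verification results.

```python
from typing import Callable, Dict, List, Optional

# Stage names paraphrase the flow steps in the text (assumed labels)
STAGES: List[str] = [
    "synthesis_dft", "placement", "in_place_opt", "clock_tree_synthesis",
    "post_place_opt", "formal_check_1", "routing", "extraction",
    "timing_signoff", "formal_check_2", "stream_out_physical_verification",
]


def run_flow(checks: Dict[str, Callable[[], bool]],
             start: str = STAGES[0]) -> Optional[str]:
    """Run stages in order from `start`; return the first failing stage or None."""
    for stage in STAGES[STAGES.index(start):]:
        passed = checks.get(stage, lambda: True)()
        if not passed:
            return stage
    return None


# Hypothetical scenario: timing sign-off fails until an ECO is applied
state = {"eco_applied": False}
checks = {"timing_signoff": lambda: state["eco_applied"]}

failed = run_flow(checks)                 # stops at the failing stage
state["eco_applied"] = True               # stand-in for the actual fix
result = run_flow(checks, start=failed)   # resume from the fix point onward
print(failed, result)
```

Resuming from the failing stage, rather than from the top, mirrors the point made above: earlier steps need not be repeated, but every step downstream of the fix does.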

Following the flow described above, we managed to complete the design of the multi-functional integrated digital video encoder.



Figure 3. Design and Verification Flow

## 4. Signal integrity and timing closure

In this paper, we treat power and ground robustness, clock skew, and clock latency as signal integrity issues, in addition to the commonly recognized issues of signal strength and crosstalk between neighboring nets.

The design of the power mesh structure needs to meet both the specification for maximum deviation from the supplies and the metal electromigration rules. Due to the isolation between the analog and digital sections, digital power and ground were initially supplied only from the top and left sides of the pad ring. Preliminary analyses indicated that the voltage drop around the lower right of the digital core was larger than the 100 mV target spec, potentially causing the chip to malfunction. Two sets of power pads were therefore inserted between the analog IP blocks, one at the bottom and the other on the right side. Although both must run a long metal span before reaching the digital core, layout techniques keep the final voltage drop below 78 mV in this 5-layer-metal 0.18um CMOS design.
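As a first-order illustration of why a long metal span matters, the sketch below estimates the IR drop along a uniform strap from its sheet resistance and geometry, and compares it with the 100 mV budget. The 100 mV budget, the 78 mV result, and the 1.8 V supply come from the text. The current, sheet resistance, and strap dimensions are hypothetical, and widening the strap is only one assumed example of a layout technique, not necessarily the one used in this design.

```python
def strap_ir_drop_mv(current_ma: float, sheet_res_mohm_per_sq: float,
                     length_um: float, width_um: float) -> float:
    """IR drop along a uniform metal strap: V = I * R_sheet * (L / W)."""
    squares = length_um / width_um
    resistance_ohm = sheet_res_mohm_per_sq * 1e-3 * squares
    return current_ma * resistance_ohm  # mA * ohm = mV


def meets_budget(drop_mv: float, budget_mv: float = 100.0) -> bool:
    """True when the drop is within the voltage-drop budget."""
    return drop_mv <= budget_mv


# Values taken from the text
SUPPLY_MV = 1800.0
FINAL_DROP_MV = 78.0
margin_mv = 100.0 - FINAL_DROP_MV                  # headroom against the spec
drop_percent = 100.0 * FINAL_DROP_MV / SUPPLY_MV   # drop relative to 1.8 V

# Hypothetical strap: 60 mA through 2000 um of 40 mohm/sq metal
narrow = strap_ir_drop_mv(60.0, 40.0, 2000.0, 30.0)
wide = strap_ir_drop_mv(60.0, 40.0, 2000.0, 60.0)
print(meets_budget(narrow), meets_budget(wide), margin_mv)
```

In this simplified model, doubling the strap width halves the number of squares and therefore halves the drop.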

Clock tree synthesis was also successful. The three major clocks, the 10MHz system clock, the 12-100MHz programmable pixel clock, and the 54-100MHz programmable video clock, all achieved core clock skews below the 200ps target. Because the reference clock (crystal oscillator) must be routed across half of the die to reach the PLLs, it is fully shielded for signal integrity.

Crosstalk figures are favorable as well. According to AstroRail, the maximum crosstalk impact between nets is estimated at 0.23Vcc away from the supply voltages, which is not significant enough to cause a logic error. According to the PrimeTime™ reports, we achieved timing closure with all timing-related measurements within their targets by sufficient margins.
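For reference, clock skew here can be read as the spread between the latest and earliest clock arrival times at the sinks of one clock. The minimal sketch below checks a set of sink latencies against the 200ps target and expresses that target as a fraction of the 100MHz period. The sink names and latency values are hypothetical; only the 200ps target and the 100MHz upper frequency come from the text.

```python
from typing import Dict


def clock_skew_ps(sink_latency_ps: Dict[str, float]) -> float:
    """Global skew: latest minus earliest clock arrival among the sinks."""
    return max(sink_latency_ps.values()) - min(sink_latency_ps.values())


SKEW_TARGET_PS = 200.0    # target from the text
PERIOD_PS = 1e6 / 100.0   # 100 MHz -> 10,000 ps

# Hypothetical clock-tree latencies for three pixel-clock sinks
pixel_clk = {"enc/ff_a": 1510.0, "enc/ff_b": 1385.0, "bus/ff_c": 1462.0}

skew = clock_skew_ps(pixel_clk)
within_target = skew < SKEW_TARGET_PS
budget_fraction = SKEW_TARGET_PS / PERIOD_PS  # share of the period spent on skew
print(skew, within_target, budget_fraction)
```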

In summary, the application-specific digital video encoder is implemented with satisfactory quality by all design metrics.

## 5. Collateral and IP qualifications

The quality of an integrated circuit depends not only on the quality of the design and verification tools and flows, but also on the quality of the silicon collaterals and IP blocks. The latter include the silicon-qualified design components and, equally important, the logical, physical, and timing models on which chip assembly is based. A collateral check, even if it has already been done for other designs, needs to be carefully conducted before the design starts. Among other things, we look at the richness of logic functions, the range of drive capabilities, timing characteristics, and physical dimensions in the standard cell library and compiled memories. For I/O buffers, the ESD structure and its usage are among the items examined. This process helps designers become familiar with what will come in handy for the design, making design ECOs easier during timing closure.

IP qualifications are harder to perform, since most IP blocks are analog in nature. Not only does it take a much longer time to complete the simulations, but it calls for a detailed understanding of analog circuits along with an intimate familiarity with the target process as well.

Checking the IP vendors' simulation waveforms against their datasheet specifications is one of the qualification tasks in our pre-design process. We also performed sample tests on a few of the five PLLs and on the DAC integrated on the chip.

## 6. Silicon test and diagnostics

As much as any designer would like the silicon to behave exactly as simulation and analysis tools predict, mismatches in behavior do occur. This is simply because the design flow uses abstracted models of logic gates and block components for capacity reasons. Because of these known imperfections, guard bands are placed in the timing closure flow wherever applicable. In general, the silicon functions as desired. However, one problem did occur. One of the data signals driven from the core to an IP block showed weak signal strength during test under the worst-corner operating condition (Figure 4; the net is on top, the clock at the bottom). The root cause was identified as an inaccurate pin capacitance characterized in the IP model. A smaller-than-real capacitance value misled the design tool into downsizing the buffer that drives the pin, resulting in an under-driven net. With an increase in supply voltage, the signal regained its strength under the typical operating condition (Figure 5). This example further emphasizes the importance of model accuracy, particularly for the interface characteristics of a component block.
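The mechanism behind this failure can be illustrated with a single-pole RC model, in which the 10%-90% transition time of a driven net is approximately ln(9)·R·C. The sketch below picks the smallest buffer that meets a transition limit for the modeled pin capacitance, then re-evaluates that choice with a larger actual capacitance. The buffer names, drive resistances, capacitances, and transition limit are all hypothetical, and real sizing tools use characterized library tables rather than this simplified formula.

```python
import math
from typing import Dict

# Hypothetical buffer library: name -> effective drive resistance in ohms
BUFFERS: Dict[str, float] = {
    "BUFX1": 4000.0, "BUFX2": 2000.0, "BUFX4": 1000.0, "BUFX8": 500.0,
}


def transition_ns(r_ohm: float, c_pf: float) -> float:
    """10%-90% transition of a single-pole RC: ln(9) * R * C (ohm * pF = ps)."""
    return math.log(9) * r_ohm * c_pf * 1e-3


def pick_buffer(c_pf: float, limit_ns: float) -> str:
    """Return the weakest buffer (highest resistance) that meets the limit."""
    for name, r_ohm in sorted(BUFFERS.items(), key=lambda kv: -kv[1]):
        if transition_ns(r_ohm, c_pf) <= limit_ns:
            return name
    raise ValueError("no buffer meets the transition limit")


LIMIT_NS = 0.5       # hypothetical max-transition constraint
MODELED_C_PF = 0.1   # hypothetical pin capacitance in the IP model
ACTUAL_C_PF = 0.4    # hypothetical real pin capacitance on silicon

chosen = pick_buffer(MODELED_C_PF, LIMIT_NS)                     # tool's view
actual_transition = transition_ns(BUFFERS[chosen], ACTUAL_C_PF)  # silicon's view
needed = pick_buffer(ACTUAL_C_PF, LIMIT_NS)                      # accurate model
print(chosen, round(actual_transition, 2), needed)
```

With the underestimated capacitance, the chosen buffer looks adequate to the tool, yet it misses the limit by a wide margin once the real load is applied.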



Figure 4. Weak net characteristic (worst condition)



Figure 5. Same net with higher Vcc (typical condition)

At this point, we declare with surety that the flow, including all design software and collateral IP components, has delivered a quality silicon system without a re-spin.

## 7. Conclusion

We have demonstrated in this paper a complete design and verification flow that successfully delivered a video encoder ASIC on a TSMC 0.18um 5LM CMOS process. Because of the flow, we managed to complete the design from RTL to GDSII in a short period of time, in parallel with IP co-development. We also illustrate the importance of design collaterals and IP blocks as an integral part of the flow. Through post silicon debug, we emphasize the criticality of the accuracy of an IP model, irrespective of its silicon quality.

We find, with satisfaction, that our design flows today have reached a mature level. Design tools have delivered what's needed to develop integrated circuits on time and with satisfactory silicon correlation. Meanwhile, IP modeling and qualification remain an area of development. This is partly due to the variety and complexity of IP components. It calls for an accumulation of expertise and continued development in the area.

## References

[1] Kuh, E.; "Elementary Operations which Generate Network Matrices"; *Circuit Theory, IRE Transactions on, Volume: 3, Issue: 2*, June 1956

[2] Ting, B. et al.; "The multilayer routing problem: Algorithms and necessary and sufficient conditions for the single-row, single-layer case"; *Circuits and Systems, IEEE Transactions on, Volume: 23, Issue: 1*, December 1976

[3] Goto, S. et al.; "An approach to the two-dimensional placement problem in circuit layout"; *Circuits and Systems, IEEE Transactions on, Volume: 25, Issue: 4*, April 1978

[4] de Geus, A.J.; "Logic synthesis speeds ASIC design"; Spectrum, IEEE, Volume: 26, Issue: 8, August 1989

[5] Palnitkar, Samir; *Verilog HDL*; SunSoft Press; 1996

[6] Andy Grove; www.intel.com