Hardware Emulation: Three Decades of Evolution – Part II
- June 12, 2015
- Posted by: Lauro Rizzatti
THE SECOND DECADE
In the second decade, the hardware emulation landscape changed considerably with a few mergers and acquisitions and new players entering the market. The hardware emulators improved notably via new architectures based on custom ASICs. The supporting software improved remarkably and new modes of deployment were devised. The customer base expanded outside the niche of processors and graphics, and hardware emulation slowly attracted more and more attention.
While commercial FPGAs continued to be used in the mainstream emulation systems of the time (those from Quickturn, Zycad and IKOS), four companies (three startups plus IBM) pioneered different approaches.
IBM continued the experimentation it started a decade earlier with the YSE and EVE. By 1995, it had perfected its technology, based on arrays of simple Boolean processors that processed a design data structure stored in a large memory via a scheduling mechanism. The technology was now applicable to emulation. While IBM never launched a commercial product, in 1995 it signed an exclusive OEM agreement with Quickturn that gave the partner the right to deploy the technology in a new emulation product.
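To make the idea of Boolean processors working through a scheduled design data structure more concrete, here is a minimal sketch in Python. It is not IBM's or Quickturn's implementation. The instruction format, the `run_cycle` function and the full-adder example are all hypothetical, chosen only to illustrate how a compiler could turn a netlist into time steps that simple processors execute in order.

```python
from typing import Dict, List, Tuple

# Hypothetical instruction format: (output net, Boolean op, input nets).
Instruction = Tuple[str, str, Tuple[str, ...]]

# The only operations a "simple Boolean processor" needs in this sketch.
OPS = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
    "NOT": lambda a: a ^ 1,
}


def run_cycle(schedule: List[List[Instruction]], nets: Dict[str, int]) -> Dict[str, int]:
    """Evaluate one design clock cycle.

    `schedule` is a list of time steps. Within a step, every instruction is
    assumed to run on its own processor, so all of them read the net values
    as they were when the step started.
    """
    values = dict(nets)  # stands in for the large memory holding the design state
    for step in schedule:
        results = {
            out: OPS[op](*(values[name] for name in ins))
            for out, op, ins in step
        }
        values.update(results)  # results become visible to the next step
    return values


# A 1-bit full adder, levelized by hand into three time steps.
FULL_ADDER: List[List[Instruction]] = [
    [("axb", "XOR", ("a", "b")), ("ab", "AND", ("a", "b"))],
    [("sum", "XOR", ("axb", "cin")), ("c_axb", "AND", ("axb", "cin"))],
    [("cout", "OR", ("ab", "c_axb"))],
]

out = run_cycle(FULL_ADDER, {"a": 1, "b": 1, "cin": 1})
print(out["sum"], out["cout"])  # prints: 1 1
```

In a scheme like this, compiling a design amounts to scheduling instructions rather than placing and routing logic on FPGAs, which is consistent with the faster setup the approach promised.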
By then, Quickturn had grown disappointed with the difficulties posed by the adoption of commercial FPGAs in an emulation system. To reach adequate design capacity, it was necessary to interconnect many hundreds of FPGAs mounted on several boards. Partitioning and routing such a huge array of FPGAs became a challenging task, with setup times on the order of many months. Design visibility had to be implemented through the compilation process: the probing logic competed with the DUT for routing resources, and every change of probes required a recompilation, which killed fast design iterations. Finally, the system did not scale linearly as design size increased, suffering significant performance drops.
The IBM technology promised to address all of these shortcomings:
- Very slow setup and compilation time
- Rather poor debugging capabilities
- Significant drop in execution speed as design size increased
A drawback of that technology not appreciated at the time was potentially higher power consumption than in the FPGA approach for the same design capacity.
In 1997, Quickturn introduced the Concurrent Broadcast Array Logic Technology (CoBALT) emulator. Based on the IBM technology, it became known as a processor-based emulator.
In 1998, Cadence® purchased Quickturn and over time launched five generations of processor-based emulators under the name of Palladium®. Two or so years later, Cadence discontinued the FPGA-based approach, including an experimental custom FPGA-based emulator called Mercury Plus.
The idea of developing a custom FPGA targeted to emulation came from a French startup by the name of Meta Systems [1]. Conceived as a programmable device similar to an FPGA but customized for emulation applications, the Meta custom FPGA would have been a poor choice as a general-purpose FPGA. Its fabric included configurable elements, a brilliant interconnect matrix, embedded multi-port memories, I/O channels, a debug engine with probing circuitry based on on-board memories, and clock generators.
The approach yielded three benefits:
- Easy setup and fast compilation
- Total design visibility without compilation
- Scalability as design size increased
In fact, the Meta custom FPGA provided the same benefits as the processor-based approach, with lower power consumption.
The processor-based approach was not unique to IBM. It was also used by Arkos, a startup with the lifespan of a falling star on a clear August night. After being acquired by Synopsys® in 1996, Arkos was sold soon after to Quickturn.
In the course of the second decade, significant progress was made in several aspects of the hardware emulator. For example, by the mid-2000s, design capacity had increased more than 10-fold to 20+ million ASIC-equivalent gates in a single chassis. By then, all vendors supported multi-chassis configurations that expanded the total capacity to well over 100 million gates. Speed approached the threshold of 1 MHz. Support for multiple concurrent users began to show up in datasheets.
Major enhancements were made in the supporting software. The compiler technology saw progress across the board. The two popular hardware description languages (HDLs), Verilog and VHDL, were supported. Synthesis and partitioning were improved.
New modes of deployment were devised in addition to ICE. It was now possible to connect an HDL testbench running on the host PC to a DUT mapped inside the high-speed emulator. This approach leveraged the existing RTL/HDL testbench and eliminated the need for the external rate adapters necessary to support ICE. It became known as simulation acceleration mode. As good as it sounded, it traded speed for flexibility. The weak link was the PLI interface between the simulator in charge of the testbench and the emulator in charge of the DUT. Typically, the acceleration factor was limited to a low single digit.
To address this drawback, IKOS [2] pioneered a new approach called transaction-based acceleration or TBX [3]. TBX raised the abstraction level of the testbench by moving the signal-level interface to the emulated DUT within the emulator and introducing a transaction-level interface in its place. The scheme achieved up to a million times faster execution speed, and simplified the writing of the testbench.
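The difference between a signal-level and a transaction-level interface can be illustrated with a toy model that simply counts how often the host and the emulator must talk to each other. This is a sketch under stated assumptions, not the IKOS implementation: the `Link` class, both write functions and the figure of four clock cycles per byte are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical host-to-emulator channel that counts crossings."""
    crossings: int = 0

    def send(self, payload) -> None:
        # Each call models one synchronization between host and emulator.
        self.crossings += 1


def signal_level_write(link: Link, data: bytes, cycles_per_byte: int = 4) -> None:
    """Testbench on the host drives the DUT pins on every clock cycle."""
    for byte in data:
        for cycle in range(cycles_per_byte):
            link.send((byte, cycle))


def transaction_level_write(link: Link, data: bytes) -> None:
    """Testbench sends one message; a transactor mapped inside the emulator
    is assumed to expand it into pin activity at emulator speed."""
    link.send(data)


burst = bytes(range(64))
sig, txn = Link(), Link()
signal_level_write(sig, burst)
transaction_level_write(txn, burst)
print(sig.crossings, txn.crossings)  # prints: 256 1
```

The fewer the crossings, the less time the emulator spends waiting on the host, which is the intuition behind the speedup of the transaction-based approach over signal-level simulation acceleration.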
Another mode of deployment, called targetless emulation, consisted of mapping the testbench together with the DUT onto the emulator. By removing the performance dependency on the software-based testbench executed on the host PC, it was possible to achieve the maximum speed of execution allowed by the emulator. The caveat was that the testbench had to be synthesizable, hence the name of Synthesizable Testbench (STB) mode.
Debugging also improved radically. One of the benefits of the processor-based emulators as well as of the custom FPGA-based emulators was 100% visibility into the design at run-time without requiring compilation. This led to very fast iteration times.
The cost of emulation decreased on a per-gate basis by 10X.
By the turn of the century, it seemed that emulators built on arrays of commercial FPGAs were destined for the dust bin. But two startups proved that premise to be false.
Although only a few years had passed since Quickturn’s dreadful experience with commercial FPGAs, a new breed of FPGAs developed by Xilinx and Altera changed the landscape forever. Fully loaded with programmable resources and enriched with extensive routing resources, they boasted high capacity, fast speed of execution and faster place-and-route time. The Virtex® family from Xilinx also included a read-back mechanism that provided full visibility of all registers and memory banks without requiring compilation. This capability came at the expense of a dramatic drop in speed during the read-back operation. All of the above were a windfall for two new players.
In 1999, Axis [4], a startup in Silicon Valley led by entrepreneurs from China, introduced a simulation accelerator based on a patented Re-Configurable Computing (RCC) technology. The accelerator, implemented in an array of FPGAs, was called Xcite. It was followed by an emulator built on the same technological foundation with the name Xtreme. Xtreme became successful for its ability to swap a design from the emulator onto a proprietary simulator to take advantage of the debugging interactivity of the simulator. This feature was called Hot-Swap.
On the other side of the Atlantic, a French startup named Emulation and Verification Engineering (EVE), led by four French engineers who left Mentor Graphics in 2000, developed an emulator implemented on a PC card with two of the largest Xilinx Virtex-6000/8000 devices. The product name was ZeBu, for Zero-Bugs. The implementation did not support ICE. Instead, it promoted transaction-based emulation based on a patented technology called “Reconfigurable Testbench” (RTB). The team also harnessed the read-back feature of the Virtex devices to implement 100% design visibility at run-time without compilation. As mentioned, the drawback was a drop in performance during the read-back operation.
Table 2: Characteristics of hardware emulators, circa 2005.
| Architectures | Arrays of processors, custom FPGAs, commercial FPGAs |
| Total Design Capacity | Over 100 million gates (*) |
| Deployment Modes | ICE, simulation acceleration, TBX, STB |
| Speed of Emulation | Up to 1 MHz |
| Time to Emulation | Up to 30 million gates/hour (**) |
| Ease of Use | Medium |
| Concurrent Users | Yes, max number dependent on the emulator |
| Dimensions | Similar to small home refrigerators |
| Reliability (MTBF) | Several weeks |
| Typical Cost | 10 cents/gate |
(*) Based on multi-box configurations.
(**) Requirements: a single PC with processor-based emulators; PC farms with FPGA-based emulators.
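To put the figures in Table 2 in perspective, the short calculation below applies the per-gate cost and the best-case compilation rate to two design sizes mentioned in this article. It is a back-of-the-envelope sketch: it assumes cost and compile time scale linearly with gate count, which the table does not claim, and the function names are hypothetical.

```python
# Figures taken from Table 2 (circa 2005).
COST_PER_GATE_USD = 0.10                  # "10 cents/gate"
COMPILE_RATE_GATES_PER_HOUR = 30_000_000  # "up to 30 million gates/hour"


def emulator_cost_usd(gates: int) -> float:
    """Rough cost, assuming the typical per-gate cost applies linearly."""
    return gates * COST_PER_GATE_USD


def min_compile_hours(gates: int) -> float:
    """Lower bound on time to emulation, since the table quotes a best-case rate."""
    return gates / COMPILE_RATE_GATES_PER_HOUR


# 20 million gates: one chassis; 100 million gates: a multi-chassis setup.
for gates in (20_000_000, 100_000_000):
    print(
        f"{gates // 1_000_000}M gates: "
        f"about ${emulator_cost_usd(gates) / 1e6:.0f}M, "
        f"at least {min_compile_hours(gates):.1f} h to compile"
    )
```

Under these assumptions, a 20-million-gate configuration lands at roughly $2 million, and a 100-million-gate design needs a little over three hours of compilation at best.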
By the end of the second decade, hardware emulation was for the first time being considered by companies outside its traditional core market of processor and graphics designs. Design teams in fields as different as embedded processors, networking, storage, video and multimedia started to adopt hardware emulation.
[1] Meta Systems was acquired by Mentor Graphics in 1996.
[2] IKOS was acquired by Mentor in 2002.
[3] Today, different vendors call it transaction-based verification (TBV) or transaction-based acceleration (TBA).
[4] Axis was acquired by Verisity on November 16, 2004. Three months later, Cadence purchased Verisity.