Hardware Emulation: Three Decades of Evolution – Part III
- November 11, 2015
- Posted by: Lauro Rizzatti
- Category: 2015
THE LAST DECADE
At the beginning of the third decade, circa 2005, system and chip engineers were developing ever more complex designs that mixed many interconnected blocks, embedded multicore processors, digital signal processors (DSPs) and a plethora of peripherals, supported by large memories. The combination of all of these components gave real meaning to the designation system on chip (SoC). The largest designs were pushing north of 100 million gates. Embedded software, once virtually absent in ASIC designs, started to implement chip functionality.
By the end of hardware emulation’s third decade, i.e., 2015, the largest designs are reaching into the billion-gate region, and well over 50% of functionality is now realized in software. Design verification continues to consume between 50% and 70% of the design cycle.
These trends became tremendous drivers for hardware emulator vendors to redesign and enhance their tools, opening the door to opportunities in all fields of the semiconductor business, from multimedia, to networking, to storage.
The preceding decade was turbulent: new players appeared suddenly and disappeared just as quickly, and a few mergers and acquisitions took place. By 2005, only three players competed in the emulation market: Cadence® Design Systems, Mentor Graphics® and EVE (Emulation and Verification Engineering), each promoting its own architecture.
After dropping the commercial FPGA-based approach around 2000, Cadence opted for processor-based emulation technology. It was first introduced in 1997 by Quickturn Design Systems under the commercial name CoBALT™ (Concurrent Broadcast Array Logic Technology). Based on IBM technology, the processor-based emulator became one of Cadence’s showcase technologies after it acquired Quickturn in 1999. Under the new ownership, five generations of the technology were introduced under the name Palladium®. Although Cadence does not disclose which process technology node was used in each version, the best guess is listed in Table I.
Table I. Here’s a look at five generations of Palladium beginning with its introduction in 2002, and the best-guess process node technology used in each.
While the principle of operation has been the same in all versions, each new generation expanded design capacity, improved speed of execution, added new capabilities for design analysis and design debug, and fine-tuned and accelerated the compilation process. The main advantages of Palladium have always been fast compile time and 100% design visibility at full emulation speed without compilation.
Palladium also excels in in-circuit emulation (ICE) mode, supported by a large catalog of speed bridges. The main drawbacks have been large physical dimensions and higher power consumption when compared to an FPGA-based emulator with an equivalent design capacity. These drawbacks have limited scalability: Palladium-XP2 would require 32 boxes to reach the maximum capacity of two billion gates specified in the datasheet. Finally, in transaction-based acceleration, Palladium is reported to perform at lower speed than its competitors.
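The scalability figure quoted above can be turned into a rough per-box number. The sketch below is only back-of-the-envelope arithmetic: it assumes capacity scales linearly with the number of boxes, which the datasheet figure alone does not guarantee, and the function name `boxes_needed` is made up for illustration.

```python
# Figures quoted in the text for Palladium-XP2.
MAX_CAPACITY_GATES = 2_000_000_000  # two billion gates
MAX_BOXES = 32

# Assumption: capacity is spread evenly across boxes (linear scaling).
GATES_PER_BOX = MAX_CAPACITY_GATES // MAX_BOXES  # 62,500,000 gates


def boxes_needed(design_gates: int) -> int:
    """Estimate how many boxes a design needs, rounding up to whole boxes."""
    return -(-design_gates // GATES_PER_BOX)  # integer ceiling division


if __name__ == "__main__":
    for gates in (100_000_000, 1_000_000_000, 2_000_000_000):
        print(f"{gates:>13,} gates -> {boxes_needed(gates)} box(es)")
```

Under that linear assumption, each box accounts for 62.5 million gates, so a one-billion-gate design would already occupy 16 boxes.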
After acquiring IKOS Systems in 2002, Mentor Graphics undertook the challenging task of merging the custom FPGA-based technology, inherited via the acquisition of Meta Systems in 1996, with the Virtual Machine Works technology implemented in the IKOS emulators. The marriage of the two patented approaches produced the Veloce® emulator, launched in 2007 with the tagline of “emulator-on-chip.” Fast and easy design setup, swift compile time, and 100% design visibility at full speed without compilation are the hallmarks of this approach.
The custom FPGA developed for emulation does not have the capacity of the largest commercial FPGAs on the market, so more custom devices are needed to map a design of any given size than with off-the-shelf FPGAs. The net result is that the emulator is physically larger and heavier, with longer interconnecting wires and longer propagation delays, than an emulator using commercial FPGAs.
Mentor also inherited the research IKOS conducted in the transaction-based verification field before the acquisition. Repackaged and enhanced under the name of TestBench Xpress™ (TBX), it is the most effective implementation of transaction-based verification in emulation. Mentor took the technology a step further by implementing VirtuaLAB, which bundles specialized testbenches for specific applications, such as USB and Ethernet, in an all-encompassing package.
In 2012, Mentor introduced a new generation of Veloce, called Veloce2. Veloce2 retained all the advantages of the emulator-on-chip architecture, doubled the maximum design capacity, and significantly expanded its design analysis capabilities, adding low-power verification, a unique power estimation approach, and functional coverage. For embedded software debugging, Mentor devised Questa® CodeLink which, unlike the JTAG approach, does not stall the emulator and does not intrude on or interfere with the operation of the system being run.
In the middle of the first decade of the new millennium, the merger of IKOS and Meta Systems technologies at Mentor Graphics proceeded at a slow pace. It opened the door for a startup named Emulation and Verification Engineering (EVE) to play a role in this attractive market. In 2003, EVE launched its first emulation product, implemented on a PC card, under the name of ZeBu®-ZV. (The name ZeBu is a contraction of zero bugs.) Unlike the custom silicon approach of Cadence and Mentor, EVE elected to use Xilinx® Virtex®-II FPGAs as the building block for its emulator.
The pioneering product met with modest success, leading to a chassis-based emulation platform with vastly more Virtex-II FPGAs, called ZeBu-XL. This dual implementation approach, PC card and chassis, became the development model for several years. Table II lists all of EVE’s emulation products and the FPGA type used in them.
Table II. Seven generations of ZeBu, their implementation footprint, the introduction year, and the Xilinx Virtex FPGA used are shown in this table.
For the same design capacity, EVE’s ZeBu platforms used fewer FPGAs in a smaller chassis than a corresponding custom-FPGA-based emulator. This led to shorter propagation delays inside the emulator and therefore faster execution.
The faster emulation speed came at the expense of severe limitations. Long setup and compile times were drawbacks that users had to accept to gain faster emulation. While ZeBu supported 100% design visibility without compilation using the built-in Xilinx Virtex feature called “read-back,” it did so at a very low speed of a few tens or hundreds of hertz.
Breaking away from the typical emulation setup, ZeBu-ZV did not support the ICE mode, focusing instead on transaction-based verification. Forging a new method, it mapped the transactors in a dedicated FPGA, called flexible testbench (FTB), clocked at twice the speed of the design-under-test (DUT) for higher bandwidth.
ZeBu-XL and all the following products supported ICE, but the FTB approach remained the priority and has been part of ZeBu’s DNA ever since.
Synopsys® acquired EVE in 2012 and, two years later, launched ZeBu-Server3 based on the Xilinx Virtex-7 FPGA. The latest generation of the commercial FPGA-based emulator boasted more capacity, lower cost and higher speed. It also improved compilation speed and expanded the design analysis capabilities.
NEW DEPLOYMENT MODES AND NEW VERIFICATION OBJECTIVES
Hardware emulation was conceived from the beginning to be deployed in ICE mode to support real I/O traffic, albeit at lower performance than the real speed. This remained the norm for almost two decades. Everything changed at the beginning of the third decade, when the transaction-based methodology pioneered by IKOS was adopted by all emulation vendors.
Today, hardware emulation is the only verification tool that can be deployed in several modes, some of which can be combined for added versatility. See Table III.
Table III. Hardware emulation includes four main modes of deployment, including two sub-modes.
|Deployment Mode||How to Implement||Benefits|
|In-Circuit Emulation (ICE)||DUT mapped inside the emulator, connected to physical target system in place of a chip or processor.||Fast execution speed with real I/O traffic (interfaces via speed-bridges).|
|Transaction-Based Acceleration (TBA, TBV or TBX)||Physical target system replaced by virtual target system written in C++, SystemVerilog, or SystemC, connected to DUT via transactors.||Execution speed as fast as ICE. No need for speed adapters. Deterministic debug. Unmanned remote access. Higher reliability.|
|Targetless Acceleration||• Synthesizable Testbench Acceleration: testbench mapped inside emulator together with DUT. • Embedded Software Acceleration: software code executed on DUT processor mapped inside emulator.||Fast performance mode. Deterministic debug. Unmanned remote access. Higher reliability.|
|Simulation Testbench Acceleration||RTL testbench drives DUT inside emulator via programming language interface (PLI).||Slowest performance mode. No need to change testbench.|
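To make the role of a transactor in the transaction-based mode more concrete, here is a minimal sketch of the idea: the virtual target system issues one high-level transaction, and the transactor expands it into cycle-by-cycle signal values for the DUT. The bus protocol (an address phase, a data phase, then an idle cycle) and all names such as `WriteTransaction` and `write_transactor` are hypothetical and chosen only for illustration; they do not describe any vendor's transactor library.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List


@dataclass(frozen=True)
class WriteTransaction:
    """One high-level request from the virtual target system (hypothetical)."""
    addr: int
    data: int


def write_transactor(txn: WriteTransaction) -> Iterator[Dict[str, int]]:
    """Expand one write into per-cycle pin values for a made-up three-cycle bus."""
    # Cycle 1: address phase.
    yield {"valid": 1, "write": 1, "addr": txn.addr, "data": 0}
    # Cycle 2: data phase.
    yield {"valid": 1, "write": 1, "addr": txn.addr, "data": txn.data}
    # Cycle 3: bus returns to idle.
    yield {"valid": 0, "write": 0, "addr": 0, "data": 0}


def drive(transactions: List[WriteTransaction]) -> List[Dict[str, int]]:
    """Collect the cycle-level activity produced by a list of transactions."""
    cycles: List[Dict[str, int]] = []
    for txn in transactions:
        cycles.extend(write_transactor(txn))
    return cycles


if __name__ == "__main__":
    activity = drive([WriteTransaction(0x10, 0xAB), WriteTransaction(0x14, 0xCD)])
    for n, pins in enumerate(activity):
        print(n, pins)
```

The point of the split is that the testbench side only handles a few coarse messages, while the cycle-level expansion happens close to the DUT, which is why this mode needs no speed adapters.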
Because of this versatility, hardware emulation can be used to achieve several verification objectives. See Table IV.
Table IV. Seven verification objectives are possible using hardware emulation.
|Verification Objective||What/Why/How to|
|Hardware Debugging||Foremost application, offering execution speeds 100,000X to 1,000,000X faster than hardware description language (HDL) simulators. It provides an accurate representation of the design based on a silicon implementation before actual silicon is released by the foundry.|
|Hardware/Software Co-verification or Integration||It ensures that embedded system software works as intended with the underlying hardware. It can trace a software bug propagating its effects into the hardware and, conversely, a hardware bug manifesting itself in the software’s behavior.|
|Embedded Software Validation||It provides the necessary processing power (in billions of cycles) for embedded software validation, from drivers, to operating systems, applications, diagnostics, and software-driven tests.|
|System-Level Prototyping||Only hardware emulation can accommodate designs of more than one billion gates and process several billion cycles without running out of steam.|
|Low-Power (Power Island) Verification||Low-power design verification can be achieved by modeling the switching on/off of power islands and related issues, such as retention, corruption, isolation, and level shifting, as defined in a power format file (UPF for Mentor/Synopsys or CPF for Cadence).|
|Power Estimation||Emulation can track switching activity at the functional level and generate a database to be fed to power estimation tools (Cadence and Synopsys) or directly feed the switching activity to power estimation tools via an API bypassing the file generation process (Mentor).|
|Performance Characterization||Design performance characterization can be carried out by verifying the number of cycles required for completing any specific design function.|
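The performance characterization objective in Table IV comes down to counting cycles between the start and the completion of a design function. The sketch below shows one way this could be done on an event trace exported from an emulation run. The trace format (a list of `(cycle, event_name)` pairs), the event names, and the sample values are all invented for illustration and do not correspond to any emulator's actual output.

```python
from typing import Iterable, List, Optional, Tuple


def cycles_per_operation(
    events: Iterable[Tuple[int, str]], start: str, done: str
) -> List[int]:
    """Return the cycle count of each start/done pair found in the trace."""
    latencies: List[int] = []
    start_cycle: Optional[int] = None
    for cycle, name in events:
        if name == start and start_cycle is None:
            # Remember when the operation began.
            start_cycle = cycle
        elif name == done and start_cycle is not None:
            # Operation finished: record how many cycles it took.
            latencies.append(cycle - start_cycle)
            start_cycle = None
    return latencies


# Hypothetical trace: two DMA transfers observed during a run.
trace = [
    (100, "dma_start"),
    (612, "dma_done"),
    (700, "dma_start"),
    (1180, "dma_done"),
]

if __name__ == "__main__":
    print(cycles_per_operation(trace, "dma_start", "dma_done"))  # [512, 480]
```

A completion event with no preceding start is ignored, so a trace that begins mid-operation does not produce a bogus measurement.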
Today, hardware emulation is at the foundation of every verification strategy, and it has become a permanent fixture in SoC design. Embedded software is forcing project teams to debug hardware and develop software in parallel. With emulation, they now have the means to do it fast, efficiently and cost effectively. New emulation solutions are meeting the need and creating a new era for hardware and embedded software designs.