FPGAs in the Storm

EETimes

The 35-year-old field programmable gate array (FPGA) is one of the most impressive semiconductor devices ever created, second only to the central processing unit (CPU). Today, it is more popular than ever, reinvigorated by its intrinsic raw processing power and adaptability, a perfect match for the rapid evolution of the electronics industry. It does not show its age. Oddly, its very versatility is creating a dilemma for FPGA vendors.

Communications and networking stood out among the applications that drove the FPGA’s early popularity because frequent changes in their standards necessitated field updates and upgrades, inconceivable with hard-wired ASIC implementations. In later years, FPGAs have spread into a wide range of automotive, consumer, military and defense applications. Lately, the FPGA renaissance has been bolstered by progress in artificial intelligence (AI) technologies and the proliferation of AI applications. Prominent among them is deep-learning acceleration (DLA).

An article by Stephen M. Trimberger, Ph.D., a Xilinx Fellow, titled “Three Ages of FPGAs: A Retrospective on the First Thirty Years of FPGA Technology,” traces the evolution of the FPGA from its origin in 1984 to 2015.

The Ages of the FPGA

Trimberger’s article, published in the March 2015 issue of the Proceedings of the IEEE, identifies three ages in succession:

  1. Age of Invention (1984-1991)
  2. Age of Expansion (1992-1999)
  3. Age of Accumulation (2000-2007)

Following the “Age of Accumulation,” the author submits that FPGA technology entered the age of “No Longer Programmable Logic.” Ironically, the title contradicts the essence of the device as captured in its very name, but it reflects the reality of its contents. Calling the years beyond 2015 the age of “No Longer Programmable Logic” may be appropriate, yet the structures of the latest FPGAs would be better served by naming it the “Age of Artificial Intelligence.”

Certainly, the new Xilinx adaptive compute acceleration platform (ACAP), first implemented in the Versal family, reinforces the case for renaming it the “AI age.” According to Victor Peng, Xilinx’s president and CEO, the ACAP is an FPGA logic fabric that includes multiple levels of distributed memory, hardware-programmable digital signal processor (DSP) blocks, a multicore system on chip (SoC), and one or more compute engines that are software programmable and hardware adaptable. Everything is connected via a low-latency, high-frequency on-chip network with arbitration and flow control. The platform also includes integrated RF-ADCs/DACs, support for a high-bandwidth memory (HBM) stack, various generations of DDR, and advanced SerDes technology. See Figure 1.

These block structures are the brains of DLAs.
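
To make that heterogeneity concrete, below is a minimal sketch in Python of the block mix Peng describes, with each block tagged by how it is programmed. The grouping and the labels are illustrative assumptions drawn from the description above, not Versal device specifications.

    # Conceptual model of the ACAP block mix described above.
    # Block names follow the text; the "programmability" labels are
    # illustrative assumptions, not Versal device specifications.
    from dataclasses import dataclass

    @dataclass
    class Block:
        name: str
        programmability: str  # "hardware", "software", or "both"

    ACAP_BLOCKS = [
        Block("FPGA logic fabric (LUTs)", "hardware"),
        Block("distributed memory hierarchy", "hardware"),
        Block("DSP blocks", "hardware"),
        Block("multicore SoC", "software"),
        Block("compute engines", "both"),  # software programmable, hardware adaptable
        Block("RF-ADC/DAC, HBM, DDR, SerDes I/O", "hardware"),
    ]

    # Every block hangs off the low-latency on-chip network with
    # arbitration and flow control described above.
    for block in ACAP_BLOCKS:
        print(f"{block.name:40} programmable in: {block.programmability}")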

Show me the money

According to a market analysis published by Barclays Research in 2018, the DLA market can be broken into three buckets: machine-learning training in the data center, inference in the data center, and inference at the edge (see the chart below).

The study predicted that until 2020, about two thirds of semiconductor sales would come from chips accelerating neural-network training, mostly Nvidia GPUs. Beyond 2020, it projected moderate growth in semiconductors for training neural networks and an explosive increase for inference in the data center and at the edge, reaching high double-digit growth in 2021 and continuing at that rate. Specifically, Barclays anticipated that in 2021, edge inference would grow slightly faster than data-center inference, since data at the edge would need local pre-processing to avoid the latency of transferring it to the data center for processing. From 2022 onward, inference would drive around 3.6 times as much semiconductor revenue as training.

As for the technologies fueling DLA (CPUs, GPUs, FPGAs and ASICs), analysis after analysis consistently favors escalating growth for ASICs and GPUs, moderate growth for FPGAs, and contraction for CPUs.

Obviously, FPGA vendors have been responding to these projections by stuffing in more and more functional blocks to address DLA, trading away silicon area from traditional lookup tables (LUTs). In the process, their devices are becoming more like re-programmable ASICs than FPGAs.

Arguably, the potential for driving DLAs with FPGAs motivated Intel to acquire Altera in 2015, and AMD to make an offer to acquire Xilinx in 2020. Good news for Achronix, the new lion in the FPGA landscape.

FPGAs and hardware-assisted verification platforms

Assuming the trend continues, it is sensible to ask what the future holds for hardware-assisted design verification tools based on commercial FPGAs.

Known as FPGA prototyping platforms, these tools make extensive use of the re-programmable logic provided by embedded LUTs to map virtually any design, regardless of size, architecture and target application. With design sizes reaching 10-billion ASIC-equivalent gates, prototyping vendors are building cabinets loaded with boards stuffed with vast arrays of the largest FPGAs in the quest to supply massive numbers of LUTs.
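
As a back-of-the-envelope illustration of why those cabinets grow so large, the Python sketch below estimates how many FPGAs a 10-billion-gate design would require. The per-device capacity and utilization figures are assumptions invented for the example, not vendor data.

    # Rough sizing of an FPGA prototype for a very large ASIC design.
    # All constants are illustrative assumptions, not vendor figures.
    import math

    ASIC_GATES = 10_000_000_000   # 10-billion ASIC-equivalent gates (from the text)
    GATES_PER_FPGA = 80_000_000   # assumed ASIC-gate capacity of one large FPGA
    UTILIZATION = 0.6             # assumed usable fraction after partitioning losses

    fpgas = math.ceil(ASIC_GATES / (GATES_PER_FPGA * UTILIZATION))
    print(f"FPGAs required: ~{fpgas}")  # ~209 devices, i.e., many boards per cabinet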

This poses a dilemma for FPGA providers. To effectively support FPGA prototyping platforms, FPGAs should trade DSPs, SoCs and other embedded blocks and processors for LUT area. To serve the DLA market, however, FPGAs should trade generic logic area for specialized silicon. The two requirements are irreconcilable, leaving the FPGA vendor in a quandary.
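
The tradeoff can be framed as a zero-sum area budget: every square millimeter given to hard DLA blocks comes out of LUT capacity, and vice versa. A minimal sketch, with numbers invented purely for illustration:

    # Zero-sum die-area model of the dilemma described above.
    # Both constants are invented for illustration only.
    DIE_AREA_MM2 = 800.0   # assumed programmable-fabric die area
    LUTS_PER_MM2 = 5_000   # assumed LUT density

    def lut_capacity(hard_block_fraction: float) -> int:
        """LUTs remaining after dedicating a die fraction to DSP/SoC/AI blocks."""
        return int(DIE_AREA_MM2 * (1.0 - hard_block_fraction) * LUTS_PER_MM2)

    print(lut_capacity(0.50))  # DLA-oriented part: 2,000,000 LUTs
    print(lut_capacity(0.10))  # prototyping-oriented part: 3,600,000 LUTs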

Should FPGA vendors create two distinct devices, one for the less-than-one-billion-dollar total available market of hardware-assisted verification platforms, and another that fully embraces the DLA avenue and serves a market worth several billion dollars and growing? If so, they may have an opportunity to charge a premium for programmable devices targeting prototyping platforms.

Final thought

Since 1985, the FPGA has evolved over five generations. In its present incarnation, its fabric is a far cry from the original structure. Today it reflects the evolution of the electronic chip itself: from a monolithic construct once well served by a sea of LUTs to a hierarchical, multi-block assembly that calls for a mirror image in the FPGA’s own structure.

Recently, Trimberger shared with me his thinking about the future of FPGAs, electing to name it the “Age of Computation” in the expectation that software processing will play a fundamental role in them. In his words, “In today’s devices, I/O communications infrastructure can consume over 50% of silicon area; in tomorrow’s, over 50% could be consumed by software computation units.”

From my perspective, that future may or may not extend to prototyping platforms.