We study the implementation of a hardware accelerator that computes a dot product of IEEE-754 floating-point numbers exactly. The accelerator uses a wide (640 or 4288 bits for single or double-precision respectively) fixed-point representation into which intermediate floating-point products are accumulated. We designed the accelerator as a generator in Chisel, which can synthesize various configurations of the accelerator that make different area-performance trade-offs. We integrated eight different configurations into an SoC comprised of RISC-V in-order scalar core, split L1 instruction and data caches, and unified L2 cache. In a TSMC 45 nm technology, the accelerator area ranges from 0.05mm^2 to 0.32mm^2, and all configurations could be clocked at frequencies in excess of 900 MHz. The accelerator successfully saturates the SoC’s memory system, achieving the same per-element efficiency (1 cycle-per-element) as Intel MKL running on an x86 machine with a similar cache configuration.
Title
A Hardware Accelerator for Computing an Exact Dot Product
Published
2018-05-11
Full Collection Name
Electrical Engineering & Computer Sciences Technical Reports
Other Identifiers
EECS-2018-51
Type
Text
Extent
33 p
Archive
The Engineering Library
Usage Statement
Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).