Description
The end of Moore's Law has motivated numerous innovations in computer architecture. The traditional approach of increasing the clock frequency and transistor count of general-purpose hardware is facing diminishing returns, so we must turn to data-level parallelism and specialized hardware accelerators for performance growth. This thesis describes a simple RISC-V vector unit implementation based on a microcode expander. We find that even with a single data path, the vector unit achieves considerable performance improvement over scalar code on some benchmarks. We also demonstrate that, by mapping a DSL for accelerator generation onto the microcode expander, existing hardware can be reused to implement custom instructions with minimal hardware overhead.