This report makes the case that a well-designed Reduced Instruction Set Computer (RISC) can match, and even exceed, the performance and code density of existing commercial Complex Instruction Set Computers (CISC) while maintaining the simplicity and cost-effectiveness that underpins the original RISC goals.We begin by comparing the dynamic instruction counts and dynamic instruction bytes fetched for the popular proprietary ARMv7, ARMv8, IA-32, and x86-64 Instruction Set Architectures (ISAs) against the free and open RISC-V RV64G and RV64GC ISAs when running the SPEC CINT2006 benchmark suite. RISC-V was designed as a very small ISA to support a wide range of implementations, and has a less mature compiler toolchain. However, we observe that on SPEC CINT2006 RV64G executes on average 16% more instructions than x86-64, 3% more instructions than IA-32, 9% more instructions than ARMv8, but 4% fewer instructions than ARMv7.CISC x86 implementations break up complex instructions into smaller internal RISC-like micro-ops, and the RV64G instruction count is within 2% of the x86-64 retired micro-op count. RV64GC, the compressed variant of RV64G, is the densest ISA studied, fetching 8% fewer dynamic instruction bytes than x86-64. We observed that much of the increased RISC-V instruction count is due to a small set of common multi-instruction idioms. Exploiting this fact, the RV64G and RV64GC effective instruction count can be reduced by 5.4% on average by leveraging macro-op fusion. Combining the compressed RISC-V ISA extension with macro-op fusion provides both the densest ISA and the fewest dynamic operations retired per program, reducing the motivation to add more instructions to the ISA. This approach retains a single simple ISA suitable for both low-end and high-end implementations, where high-end implementations can boost performance through microarchitectural techniques.




Download Full History