First, such processors often already employ several power reduction techniques, and these existing optimizations can strongly affect the effectiveness of further optimizations. Second, highly optimized production code tends to have significantly less schedule slack and a significantly higher density of memory accesses than unoptimized code. Finally, many such studies are done using high-level simulators, which may not accurately model the power consumption of real microprocessors. In addition, this study focuses on an embedded, synthesized processor rather than a high-performance, custom, hand-designed stand-alone microprocessor; a 400 MHz synthesized core (the TriMedia TM3270) has significantly different characteristics from a 3 GHz Pentium.
We carefully analyze the power consumption of the TriMedia TM3270, a commercial product, on both reference benchmark code and optimized code. We use commercial synthesis and simulation tools to obtain a detailed breakdown of where power is consumed. We find that increased functional unit utilization causes significant differences in power consumption between unoptimized and carefully hand-optimized code. We also apply some simple power-saving techniques that cause no performance degradation, and find that such techniques can greatly change the power profile of a microprocessor. In particular, clock gating of individual functional units is vital to keeping dynamic power low. Finally, we find that synthesizing for the fastest possible target frequency at a given voltage yields the most energy-efficient design.