CPU pipelines need a steady flow of instructions to operate at maximum effectiveness. A branch that changes the expected order of instruction execution forces the pipeline to be reloaded, costing several machine cycles per such branch. By examining the type of a branch and its past execution behavior (taken or not taken), it is possible to predict with high accuracy whether the branch will be taken, and, by remembering the previous branch target (destination), to predict the current branch target. In this paper we take a systematic approach to selecting good prediction strategies. Our studies are based on 26 program address traces grouped into four IBM 370 workloads (scientific, commercial, compiler, supervisor) plus CDC 6400 and DEC PDP-11 workloads. Our results show the effectiveness of various prediction strategies, the number of past branches that should be remembered, the amount of state required for each strategy, and the effect of workload and branch type. Improvements of 5% to 20% in CPU performance can be expected when a branch target buffer is installed. We also consider issues relating to the implementation of real branch target buffers.
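
As a rough illustration of the mechanism described above, the following minimal sketch models a branch target buffer that remembers each branch's last target and its recent taken/not-taken behavior. It is not the design evaluated in this paper: the direct-mapped organization, the 2-bit saturating counter, the 64-entry default, and all names (`BranchTargetBuffer`, `BTBEntry`, `accuracy`) are assumptions chosen for brevity.

```python
from dataclasses import dataclass


@dataclass
class BTBEntry:
    target: int       # last observed destination of this branch
    counter: int = 2  # assumed 2-bit counter: 0-1 predict not taken, 2-3 predict taken


class BranchTargetBuffer:
    """Hypothetical direct-mapped buffer indexed by branch address."""

    def __init__(self, num_entries=64):
        self.num_entries = num_entries
        self.entries = {}  # index -> (tag, BTBEntry)

    def predict(self, pc):
        """Return the predicted target, or None to predict fall-through."""
        slot = self.entries.get(pc % self.num_entries)
        if slot is None or slot[0] != pc:
            return None  # branch not in the buffer: predict not taken
        entry = slot[1]
        return entry.target if entry.counter >= 2 else None

    def update(self, pc, taken, target):
        """Record the actual outcome of the branch at address pc."""
        index = pc % self.num_entries
        slot = self.entries.get(index)
        if slot is None or slot[0] != pc:
            if taken:  # allocate (or replace) an entry only for taken branches
                self.entries[index] = (pc, BTBEntry(target=target))
            return
        entry = slot[1]
        if taken:
            entry.counter = min(3, entry.counter + 1)
            entry.target = target  # remember the most recent destination
        else:
            entry.counter = max(0, entry.counter - 1)


def accuracy(btb, trace):
    """Fraction of (pc, taken, target) records predicted correctly."""
    correct = 0
    for pc, taken, target in trace:
        actual = target if taken else None
        correct += btb.predict(pc) == actual
        btb.update(pc, taken, target)
    return correct / len(trace)


# Synthetic trace: a loop-closing branch taken nine times, then falling through.
loop_trace = [(0x100, True, 0x80)] * 9 + [(0x100, False, 0x80)]
print(accuracy(BranchTargetBuffer(), loop_trace))
```

On this synthetic trace only the first encounter (the branch is not yet in the buffer) and the final loop exit are mispredicted. Real traces mix many branches and branch types, which is why the choice of strategy and the amount of remembered state matter.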



