We first discuss the variations on the bisection algorithm for finding eigenvalues of symmetric tridiagonal matrices. We show the challenges in implementing a correct algorithm with floating point arithmetic. We show how reasonable looking but incorrect implementations can fail. We carefully define correctness, and present several implementations that we rigorously prove correct.
We then discuss a fast implementation of bisection using parallel prefix. We show many numerical examples of the instability of this algorithm, and then discuss its forward error and backward error analysis. We also discuss possible ways to stabilize it by using iterative refinement.
Finally, we discuss how to use a divide-and-conquer algorithm to compute the singular value decomposition and solve the linear least squares problem and how to implement Van Loan's algorithm for the generalized singular value decomposition using this divide-and-conquer algorithm. We show how our implementations achieve good speedups over the previous implementations. For example, on an IBM RS6000/590, our implementation runs 50 times faster than LAPACK's implementation for computing the bidiagonal SVD, and 13 times faster for computing the dense SVD for 1600 x 1600 random matrices.