As the gap between the cost of communication (i.e., data movement) and computation continues to grow, pursuing algorithms which minimize communication has become a critical research objective. Toward this end, we seek asymptotic communication lower bounds for general memory models and classes of algorithms. Recent work has established lower bounds for a wide set of linear algebra algorithms on a sequential machine and on a parallel machine with identical processors. This work extends these previous bounds to a heterogeneous model in which processors access data and perform floating point operations at differing speeds. We also present algorithms which prove that the lower bounds are tight (i.e., attainable) in the cases of dense matrix-vector and matrix-matrix multiplication.