With the advent of massive datasets, statistical learning and information processing techniques are expected to enable exceptional possibilities for engineering, data intensive sciences and better decision making. Unfortunately, existing algorithms for mathematical optimization, which is the core component in these techniques, often prove ineffective for scaling to the extent of all available data. In recent years, randomized dimension reduction has proven to be a very powerful tool for approximate computations over large datasets. In this thesis, we consider random projection methods in the context of general convex optimization problems on massive datasets. We explore many applications in machine learning, statistics and decision making and analyze various forms of randomization in detail. The central contributions of this thesis are as follows: (i) We develop random projection methods for convex optimization problems and establish fundamental trade-offs between the size of the projection and accuracy of solution in convex optimization. (ii) We characterize information-theoretic limitations of methods that are based on random projection, which surprisingly shows that the most widely used form of random projection is, in fact, statistically sub-optimal. (iii) We present novel methods, which iteratively refine the solutions to achieve statistical optimality and enable solving large scale optimization and statistical inference problems orders-of-magnitude faster than existing methods. (iv) We develop new randomized methodologies for relaxing cardinality constraints in order to obtain checkable and more accurate approximations than the state of the art approaches.