Description
In this dissertation, we explore algorithmic and data structure-driven solutions to these system design problems. We present Succinct, a distributed data store that addresses these challenges using a fundamentally new approach --- executing a wide range of queries (e.g., search, random access, range, wildcard) directly on a compressed representation of the input data --- thereby enabling efficient execution of queries on data sizes much larger than DRAM capacity. We then describe BlowFish, a system that builds on Succinct to enable a dynamic storage-performance tradeoff in data stores, providing applications the flexibility to modify the storage and performance fractionally, just enough to meet the desired goals. Finally, we explore approaches that enable even richer query semantics on compressed data, including graph queries using ZipG, a memory efficient graph store, and regular expression queries using Sprint, a query rewriting technique.