Privacy is becoming an increasingly important issue in electronic commerce and other online activities that continue to grow in popularity. This work introduces a framework, called Peers for Privacy (P4P), for implementing many useful algorithms with provable privacy and adequate efficiency, in a realistic adversary model, at reasonably large scale. The basic idea is to decompose an algorithm into a series of addition-only steps, which admit very efficient private implementations using cryptographic tools. This simple model is surprisingly general and supports many algorithms prevalent in distributed data mining. Examples include linear algorithms such as voting and summation, as well as nonlinear algorithms such as regression, classification, SVD, PCA, k-means, ID3, and machine learning algorithms based on Expectation Maximization (EM). In fact, all algorithms in the statistical query model are supported.
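As a concrete illustration of the addition-only pattern, consider one iteration of k-means. The following is a minimal sketch, not code from the thesis: the encoding of the per-user contribution, the names local_vector and aggregate, and the use of NumPy are all assumptions made for illustration. Each user computes a small local vector, and the aggregator only ever needs the element-wise sum of those vectors.

import numpy as np

def local_vector(x, centers):
    """Per-user work for one k-means step: emit (point, count) slots
    for the nearest center only, zeros everywhere else."""
    k, d = centers.shape
    v = np.zeros(k * (d + 1))
    j = np.argmin(((centers - x) ** 2).sum(axis=1))  # index of nearest center
    v[j * (d + 1): j * (d + 1) + d] = x              # contribute the point
    v[j * (d + 1) + d] = 1                           # and a count of one
    return v

def aggregate(users, centers):
    """The aggregator only needs the vector sum over all users;
    new centers are recovered from that sum alone."""
    total = sum(local_vector(x, centers) for x in users)
    k, d = centers.shape
    return np.array([total[j * (d + 1): j * (d + 1) + d]
                     / max(total[j * (d + 1) + d], 1)
                     for j in range(k)])

# Example: three points in the plane, two initial centers.
users = [np.array([0.0, 0.0]), np.array([0.2, 0.1]), np.array([5.0, 5.0])]
print(aggregate(users, np.array([[0.0, 0.0], [5.0, 5.0]])))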
The sums are computed using a highly efficient verifiable secret sharing (VSS) scheme that allows secret-shared arithmetic to be carried out over small fields (e.g., 32 or 64 bits), where private arithmetic operations cost the same as normal arithmetic. This thesis shows that this paradigm admits efficient zero-knowledge tools for verifying properties of user data such as equality and boundedness. These tools provide practical mechanisms for dealing with cheating users. One such tool is an extremely efficient zero-knowledge proof that the L2-norm of a user's data is bounded by a constant, which prevents a malicious user from exerting too much influence on the computation. The verification uses a linear number of inexpensive small-field operations and only a logarithmic number of large-field (1024 bits or more) cryptographic operations, achieving orders-of-magnitude reductions in running time over standard techniques (from hours to seconds) for large-scale problems. Concrete examples demonstrate how the framework supports private computation of popular algorithms such as SVD, link analysis, and association rule mining. The thesis also includes schemes for scalable multicast encryption and bidirectional group communication; these provide secure data transmission for the communication patterns required by the P4P framework and many other group-oriented applications.
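The small-field secret sharing behind the sums can be sketched as follows. This is an illustrative example under assumed parameters, not the thesis' VSS protocol: the modulus 2**32, the two-server setup, and the helper names are chosen for illustration, and the zero-knowledge verification is omitted. Each user splits its vector into two additive shares, one per server; each server accumulates only the shares it receives, and adding the two accumulated sums reconstructs exactly the plaintext vector sum, using nothing but ordinary machine arithmetic.

import secrets

FIELD = 2 ** 32  # assumed small-field modulus; fits in normal integer arithmetic

def share(vec):
    """Split an integer vector into two additive shares mod FIELD."""
    s1 = [secrets.randbelow(FIELD) for _ in vec]
    s2 = [(v - r) % FIELD for v, r in zip(vec, s1)]
    return s1, s2

def add_vectors(a, b):
    return [(x + y) % FIELD for x, y in zip(a, b)]

# Each of two non-colluding servers sees only one random-looking share per user.
users = [[3, 1, 4], [1, 5, 9], [2, 6, 5]]
server1 = [0, 0, 0]
server2 = [0, 0, 0]
for u in users:
    s1, s2 = share(u)
    server1 = add_vectors(server1, s1)
    server2 = add_vectors(server2, s2)

# Combining the two accumulated shares recovers the exact sum of all vectors.
total = add_vectors(server1, server2)
assert total == [6, 12, 18]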
Title
P4P: A Practical Framework for Privacy-Preserving Distributed Computation
Published
2007-12-19
Full Collection Name
Electrical Engineering & Computer Sciences Technical Reports
Other Identifiers
EECS-2007-165
Type
Text
Extent
154 p
Archive
The Engineering Library
Usage Statement
Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).