Manipulation-resistant online learning

Christiano, Paul

Description

Learning algorithms are now routinely applied to data aggregated from millions of untrusted users, including reviews and feedback that are used to define learning systems’ objectives. If some of these users behave manipulatively, traditional learning algorithms offer almost no performance guarantee to the “honest” users of the system. This dissertation begins to fill in this gap.

Our starting point is the traditional online learning model. In this setting a learner makes a series of decisions, receives a loss after each decision, and aims to achieve a total loss which is nearly as low as if they had chosen the best fixed decision-making strategy in hindsight.
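As an illustration of this baseline setting only, the sketch below runs the standard Hedge (multiplicative weights) rule over a fixed set of strategies and compares the learner's expected loss with that of the best fixed strategy in hindsight. It is not the dissertation's algorithm; the function name `hedge`, the learning rate `eta`, and the random loss sequence are assumptions made for the example.

```python
import math
import random


def hedge(loss_rounds, eta):
    """Standard Hedge over n fixed strategies (illustrative sketch).

    loss_rounds[t][i] is the loss in [0, 1] of strategy i in round t.
    Returns (expected loss of the learner, loss of the best fixed strategy).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n          # one weight per strategy
    totals = [0.0] * n           # cumulative loss of each fixed strategy
    learner_loss = 0.0
    for losses in loss_rounds:
        z = sum(weights)
        probs = [w / z for w in weights]
        # Expected loss when a strategy is sampled in proportion to its weight.
        learner_loss += sum(p * l for p, l in zip(probs, losses))
        for i, l in enumerate(losses):
            weights[i] *= math.exp(-eta * l)   # penalize lossy strategies
            totals[i] += l
    return learner_loss, min(totals)


if __name__ == "__main__":
    random.seed(0)
    T, n = 1000, 5
    rounds = [[random.random() for _ in range(n)] for _ in range(T)]
    eta = math.sqrt(8 * math.log(n) / T)
    loss, best = hedge(rounds, eta)
    print("regret:", loss - best)
```

With losses in [0, 1] and this choice of `eta`, the textbook analysis of Hedge bounds the regret by sqrt(T ln(n) / 2) for any loss sequence, which is the sense in which the learner does "nearly as well" as the best fixed strategy.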

We extend this model by introducing a set of users U. Each of the learner’s decisions is made on behalf of a particular user u ∈ U, and u reports the loss they incur from the decision. We assume that there is some (unknown) set of “honest” users H ⊂ U, who report their losses honestly, while the other users may behave adversarially. Our goal is to ensure that the total loss incurred by users in H is nearly as small as if all users in H had used the single best fixed decision-making strategy in hindsight. We say that an algorithm is manipulation-resistant if it achieves a bound of this form.
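One way to write the shape of such a bound is sketched below. The notation is assumed for illustration and does not appear in this record: u_t is the user served in round t, a_t the learner's decision, ℓ_t the true loss function in round t, Π the set of fixed decision-making strategies, and R an unspecified regret term.

```latex
\sum_{t \,:\, u_t \in H} \ell_t(a_t)
\;\le\;
\min_{\pi \in \Pi} \sum_{t \,:\, u_t \in H} \ell_t(\pi) \;+\; R
```

Both sums range only over rounds played on behalf of honest users, so losses reported by users outside H do not enter the comparison directly; they can affect the guarantee only through the term R.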

This dissertation proposes and analyzes manipulation-resistant algorithms for prediction with expert advice, contextual bandits, and collaborative filtering. These algorithms guarantee that the honest users perform nearly as well as if they had known each other’s identities in advance, pooled all of their data, and then used a traditional learning algorithm. This bounds the total amount of damage that can be done per manipulative user. More significantly, we give bounds that can be considerably smaller in the realistic setting where the users are vertices of a graph (such as a social graph) with disproportionately few edges between honest and manipulative users.

As a key technical ingredient, we introduce the problem of online local learning, and propose a novel semidefinite programming algorithm for this problem. This algorithm allows us to effectively perform online learning over the exponentially large space of all possible sets H ⊂ U, and as a side effect provides the first asymptotically optimal algorithm for online max cut.

Details

Title

Manipulation-resistant online learning

Creator

Christiano, Paul, Author

Published

2017-05-15

Full Collection Name

Electrical Engineering & Computer Sciences Technical Reports

Other Identifiers

EECS-2017-107

Type

Text

Format

technical reports

Extent

60 p

Archive

The Engineering Library

Usage Statement

Researchers may make free and open use of the UC Berkeley Library’s digitized public domain materials. However, some materials in our online collections may be protected by U.S. copyright law (Title 17, U.S.C.). Use or reproduction of materials protected by copyright beyond that allowed by fair use (Title 17, U.S.C. § 107) requires permission from the copyright owners. The use or reproduction of some materials may also be restricted by terms of University of California gift or purchase agreements, privacy and publicity rights, or trademark law. Responsibility for determining rights status and permissibility of any use or reproduction rests exclusively with the researcher. To learn more or make inquiries, please see our permissions policies (https://www.lib.berkeley.edu/about/permissions-policies).

Collection

EECS Technical Reports
