Description
In the second part, we discuss implications of the RRC formalism. In particular, it allows us to learn from multiple feedback types at once. Through case studies and experiments, we show how RRC can be used to combine feedback types and to actively select among them. Furthermore, once a person has access to multiple types of feedback, their choice of which type to give is itself informative about the reward function. We use RRC to formalize and learn from this meta-choice.
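To make this concrete, the following is a minimal sketch, under assumptions of our own (a Boltzmann-rational choice model, groundings represented as feature vectors, and a discrete set of candidate reward vectors), of how RRC likelihoods from heterogeneous feedback types can be multiplied into a single posterior over rewards. The names rrc_likelihood and reward_posterior are illustrative, not the implementation used in the thesis.

    import numpy as np

    def rrc_likelihood(choice, choice_set, grounding, reward, beta=1.0):
        """Boltzmann-rational probability of `choice` among `choice_set`,
        where `grounding` maps a choice to a feature vector and the value
        of a choice is the dot product of `reward` with its features."""
        utilities = np.array([reward @ grounding(c) for c in choice_set])
        logits = beta * utilities
        probs = np.exp(logits - logits.max())  # numerically stable softmax
        probs /= probs.sum()
        return probs[choice_set.index(choice)]

    def reward_posterior(reward_candidates, prior, observations):
        """Bayesian update over discrete candidate rewards. Each observation
        is a (choice, choice_set, grounding, beta) tuple; different feedback
        types simply carry different choice sets and groundings, and a
        meta-choice over feedback types can be encoded the same way."""
        post = np.array(prior, dtype=float)
        for choice, choice_set, grounding, beta in observations:
            for i, r in enumerate(reward_candidates):
                post[i] *= rrc_likelihood(choice, choice_set, grounding, r, beta)
        return post / post.sum()

    # Example: one pairwise comparison between two trajectories, each
    # grounded as a (hypothetical) two-dimensional feature vector.
    features = {"traj_a": np.array([1.0, 0.2]), "traj_b": np.array([0.1, 1.0])}
    comparison = ("traj_a", ["traj_a", "traj_b"], lambda c: features[c], 5.0)
    rewards = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
    print(reward_posterior(rewards, [0.5, 0.5], [comparison]))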
Finally, in the third part, we study settings in which the human may violate the reward-rational assumption. First, we consider the case where the human is pedagogic, i.e., optimizes their feedback to teach the reward function. We show that the reward-rational assumption still yields robust reward inference in this case. Second, we consider the case where the human faces temptation and acts in ways that systematically deviate from their target preferences. We theoretically analyze this setting and show that, with the right feedback type, one can still efficiently recover the individual’s preferences. Lastly, we consider the recommender system setting. There, it is difficult to model all user behavior as rational, but by leveraging one strong, explicit signal (e.g., “don’t show me this”), we can still operationalize and optimize for a notion of “value” on these systems.