Description

Given a collection of $r \geq 2$ linear regression problems in $p$ dimensions, suppose that the regression coefficients share partially common supports. This set-up suggests the use of $\ell_1/\ell_\infty$-regularized regression for joint estimation of the $p \times r$ matrix of regression coefficients. We analyze the high-dimensional scaling of $\ell_1/\ell_\infty$-regularized quadratic programming, considering both consistency rates in $\ell_\infty$-norm and how the minimal sample size $n$ required for variable selection grows as a function of the model dimension, sparsity, and overlap between the supports. We begin by establishing bounds on the $\ell_\infty$-error as well as sufficient conditions for exact variable selection, both for fixed design matrices and for designs drawn randomly from general Gaussian ensembles. Our second set of results applies to $r = 2$ linear regression problems with standard Gaussian designs whose supports overlap in a fraction $\alpha \in [0,1]$ of their entries: for this problem class, we prove that the $\ell_1/\ell_\infty$-regularized method undergoes a phase transition (a sharp change from failure to success) characterized by the rescaled sample size $\theta_{1,\infty}(n, p, s, \alpha) = n/\{(4 - 3 \alpha) s \log(p-(2-\alpha) \, s)\}$. More precisely, given sequences of problems specified by $(n, p, s, \alpha)$, for any $\delta > 0$, the probability of successfully recovering both supports converges to $1$ if $\theta_{1, \infty}(n, p, s, \alpha) > 1+\delta$, and converges to $0$ if $\theta_{1,\infty}(n,p,s, \alpha) < 1 - \delta$. An implication of this threshold is that $\ell_1/\ell_\infty$-regularization yields improved statistical efficiency when the overlap is large enough ($\alpha > 2/3$), but has \emph{worse} statistical efficiency than a naive Lasso-based approach for moderate to small overlap ($\alpha < 2/3$). Empirical simulations illustrate the close agreement between these theoretical predictions and the actual behavior in practice. These results indicate that some caution is needed in applying $\ell_1/\ell_\infty$ block regularization: if the data does not match the assumed shared-support structure closely enough, it can impair statistical performance relative to computationally less expensive schemes.
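As an illustration of the threshold, the following Python sketch (hypothetical, not from the paper) evaluates the rescaled sample size $\theta_{1,\infty}$ and compares the implied sample-size requirement of $\ell_1/\ell_\infty$-regularization against that of a naive per-problem Lasso. The Lasso threshold $n \approx 2 s \log(p - s)$ is an assumption drawn from standard Lasso support-recovery theory; under it, the two requirements cross exactly at $\alpha = 2/3$, matching the claim above.

```python
import numpy as np

def theta_1inf(n, p, s, alpha):
    """Rescaled sample size theta_{1,inf}(n, p, s, alpha) from the abstract.

    Support recovery is predicted to succeed when this exceeds 1
    and to fail when it falls below 1.
    """
    return n / ((4 - 3 * alpha) * s * np.log(p - (2 - alpha) * s))

# Compare implied sample-size requirements: l1/linf needs roughly
# n > (4 - 3*alpha) * s * log(p - (2 - alpha) * s), while a naive
# per-problem Lasso needs roughly n > 2 * s * log(p - s) (an assumed
# threshold from standard Lasso theory). The curves cross at alpha = 2/3.
p, s = 1000, 10
for alpha in [0.0, 0.5, 2 / 3, 0.9, 1.0]:
    n_linf = (4 - 3 * alpha) * s * np.log(p - (2 - alpha) * s)
    n_lasso = 2 * s * np.log(p - s)
    print(f"alpha={alpha:.2f}: n_linf ~ {n_linf:.0f}, n_lasso ~ {n_lasso:.0f}")
```

Running this shows $\ell_1/\ell_\infty$ requiring about twice as many samples as the Lasso at $\alpha = 0$ and about half as many at $\alpha = 1$, consistent with the phase-transition result.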
