Description

Given a collection of $r \geq 2$ linear regression problems in $p$ dimensions, suppose that the regression coefficients share partially common supports. This set-up suggests the use of $\ell_1/\ell_\infty$-regularized regression for joint estimation of the $p \times r$ matrix of regression coefficients. We analyze the high-dimensional scaling of $\ell_1/\ell_\infty$-regularized quadratic programming, considering both consistency rates in $\ell_\infty$-norm and how the minimal sample size $n$ required for variable selection grows as a function of the model dimension, sparsity, and overlap between the supports. We begin by establishing bounds on the $\ell_\infty$-error as well as sufficient conditions for exact variable selection, both for fixed design matrices and for designs drawn randomly from general Gaussian ensembles. Our second set of results applies to $r = 2$ linear regression problems with standard Gaussian designs whose supports overlap in a fraction $\alpha \in [0,1]$ of their entries: for this problem class, we prove that the $\ell_1/\ell_\infty$-regularized method undergoes a phase transition (a sharp change from failure to success) characterized by the rescaled sample size $\theta_{1,\infty}(n, p, s, \alpha) = n/\{(4 - 3 \alpha) s \log(p-(2-\alpha) \, s)\}$. More precisely, given sequences of problems specified by $(n, p, s, \alpha)$, for any $\delta > 0$, the probability of successfully recovering both supports converges to $1$ if $\theta_{1, \infty}(n, p, s, \alpha) > 1+\delta$, and converges to $0$ if $\theta_{1,\infty}(n,p,s, \alpha) < 1 - \delta$. An implication of this threshold is that $\ell_1/\ell_\infty$-regularization yields improved statistical efficiency when the overlap is large enough ($\alpha > 2/3$), but has \emph{worse} statistical efficiency than a naive Lasso-based approach for moderate to small overlap ($\alpha < 2/3$). Empirical simulations illustrate the close agreement between these theoretical predictions and the actual behavior in practice. These results indicate that some caution is needed in applying $\ell_1/\ell_\infty$ block regularization: if the data does not match the assumed shared-support structure closely enough, it can impair statistical performance relative to computationally less expensive schemes.
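As an illustration of the threshold, the following Python sketch (hypothetical, not from the paper) evaluates the rescaled sample size $\theta_{1,\infty}$ and compares the implied sample-size requirement of $\ell_1/\ell_\infty$-regularization against that of a naive per-problem Lasso. The Lasso threshold $n \approx 2 s \log(p - s)$ is an assumption drawn from standard Lasso support-recovery theory; under it, the two requirements cross exactly at $\alpha = 2/3$, matching the claim above.

```python
import numpy as np

def theta_1inf(n, p, s, alpha):
    """Rescaled sample size theta_{1,inf}(n, p, s, alpha) from the abstract.

    Support recovery is predicted to succeed when this exceeds 1
    and to fail when it falls below 1.
    """
    return n / ((4 - 3 * alpha) * s * np.log(p - (2 - alpha) * s))

# Compare implied sample-size requirements: l1/linf needs roughly
# n > (4 - 3*alpha) * s * log(p - (2 - alpha) * s), while a naive
# per-problem Lasso needs roughly n > 2 * s * log(p - s) (an assumed
# threshold from standard Lasso theory). The curves cross at alpha = 2/3.
p, s = 1000, 10
for alpha in [0.0, 0.5, 2 / 3, 0.9, 1.0]:
    n_linf = (4 - 3 * alpha) * s * np.log(p - (2 - alpha) * s)
    n_lasso = 2 * s * np.log(p - s)
    print(f"alpha={alpha:.2f}: n_linf ~ {n_linf:.0f}, n_lasso ~ {n_lasso:.0f}")
```

Running this shows $\ell_1/\ell_\infty$ requiring about twice as many samples as the Lasso at $\alpha = 0$ and about half as many at $\alpha = 1$, consistent with the phase-transition result.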
