Sparse high-dimensional regression models

We consider variable selection in high-dimensional sparse multiresponse linear regression models, in which a q-dimensional response vector has a linear relationship with a p-dimensional covariate vector through a sparse coefficient matrix \(B \in \mathbb{R}^{p \times q}\). Sequential model averaging for high-dimensional linear. To address this issue, we further extend the correlation learning to marginal nonparametric learning. Estimation of regression functions via penalization and selection.

Nonasymptotic analysis of semiparametric regression models with high-dimensional parametric coefficients, Zhu, Ying, Annals of Statistics, 2017. Asymptotic properties of bridge estimators in sparse high-dimensional regression models. The underlying model is the same as in equation 1, but we impose a sparsity constraint on the index set j. Our methods combine ideas from sparse linear modeling and additive nonparametric regression. We propose a model/feature selection procedure based on penalized maximum likelihood with a complexity penalty on the model size and derive the nonasymptotic bounds for the resulting misclassification excess risk. Adaptive lasso for sparse high-dimensional regression models.

Work in first analyzed high-dimensional sparse regression with arbitrary corruptions in covariates. High-dimensional sparse models (HDSM): models, motivating examples. Bayesian adaptive elastic-net for high-dimensional sparse quantile regression models. Zhao and Yu [27] have shown that the lasso is variable selection consistent for nonrandom high-dimensional regressors under an irrepresentable condition (IC) on the sample covariance matrix and regression coefficients. We consider high-dimensional binary classification by sparse logistic regression. Recent developments of theory, methods, and implementations in penalized least squares and penalized likelihood methods are highlighted.

Partial correlation estimation by joint sparse regression models. Sparse model identification and learning for ultrahigh. In a standard linear model, we have at our disposal (x_i, y_i) supposed to be linked with. We study the asymptotic properties of bridge estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase to infinity with the sample size. Much insight from this work can be gained to understand high-dimensional or sparse regression, and it comes as no surprise that Donoho and Johnstone made the first contributions on this topic in the early nineties. We study the asymptotic properties of the adaptive lasso estimators in sparse, high-dimensional, linear regression models when the number of covariates may increase with the sample size. A variable screening procedure via correlation learning was proposed by Fan and Lv (2008) to reduce dimensionality in sparse ultrahigh-dimensional models.
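The correlation-learning screening step attributed above to Fan and Lv (2008) can be illustrated with a minimal sketch: rank the covariates by the absolute value of their marginal correlation with the response and keep the top d. The function names, the toy design and the choice d = 2 are all assumptions made for illustration; they are not taken from the source.

```python
import math

def marginal_correlations(X, y):
    """Pearson correlation of each column of X (a list of rows) with y."""
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    y_c = [v - y_mean for v in y]
    y_norm = math.sqrt(sum(v * v for v in y_c))
    corrs = []
    for j in range(p):
        col = [row[j] for row in X]
        col_mean = sum(col) / n
        c = [v - col_mean for v in col]
        norm = math.sqrt(sum(v * v for v in c))
        dot = sum(a * b for a, b in zip(c, y_c))
        # Constant columns carry no information; give them correlation 0.
        corrs.append(dot / (norm * y_norm) if norm > 0 and y_norm > 0 else 0.0)
    return corrs

def screen_by_correlation(X, y, d):
    """Indices of the d covariates with largest |marginal correlation|."""
    corrs = marginal_correlations(X, y)
    ranked = sorted(range(len(corrs)), key=lambda j: -abs(corrs[j]))
    return ranked[:d]

# Toy design (hypothetical): four mutually orthogonal, centered columns.
cols = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
]
X = [list(row) for row in zip(*cols)]
# The response depends strongly on columns 0 and 2, and weakly on column 3.
y = [3 * r[0] - 2 * r[2] + 0.1 * r[3] for r in X]

kept = screen_by_correlation(X, y, d=2)
print(kept)
```

In practice the screened set would then be passed to a penalized method such as the lasso; the cutoff d is a tuning choice that this sketch does not address.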

Finally, the ultrahigh-dimensional assumption includes the high-dimensional setting, say p = O(n^b) for some b > 0, as a special case. Hence, SMA is directly applicable to the high-dimensional model. Asymptotic properties of bridge estimators in sparse high-dimensional regression models, Jian Huang, Joel L. Horowitz and Shuangge Ma, University of Iowa, Northwestern University and Yale University. High-dimensional classification by sparse logistic regression. Inference for high-dimensional sparse econometric models. Variable selection in high-dimensional sparse multiresponse. In this work we consider the problem of linear quantile regression in high dimensions where the num. Although it is well known in regression problems, explicit theoretical quanti. Noise accumulation is a common phenomenon in high-dimensional prediction. The models distinguish themselves from ordinary multivariate regression models in two aspects. The main assumption is that the p-dimensional parameter vector is sparse, with many components being exactly zero or negligibly small, and each nonzero component stands for the contribution of an important predictor.

Least squares after model selection in high-dimensional sparse models. In this article, we deal with sparse high-dimensional multivariate regression models. The performance of our new estimators is compared with commonly used estimators in terms of predictive accuracy and errors in variable selection. Generalized ridge regression estimator in high-dimensional sparse regression models, article, August 2018.

Envelope models for parsimonious and efficient multivariate linear regression. Bayesian models for sparse regression analysis of high. Boosting methods for variable selection in high dimensional. Lasso-type sparse regression and high-dimensional Gaussian graphical models, by Xiaohui Chen, M. This paper focuses on the simultaneous sparse model identification and learning for ultrahigh-dimensional APLMs, which strikes a delicate balance between the simplicity of the standard linear regression models and the flexibility of the additive regression models. It generates a sequence of solutions iteratively, based on support detection using primal and dual information and root.

The lasso is an attractive approach to variable selection in sparse, high-dimensional regression models. We consider variable selection using the adaptive lasso, where the L1 norms in the penalty are reweighted by data-dependent weights. Big data, lecture 2: high-dimensional regression with the lasso. Journal of the Royal Statistical Society, Series B (Statistical Methodology). Regularized estimation in sparse high-dimensional time series models. Our motivation comes from studies that try to correlate a certain phenotype with high-dimensional genomic data. We consider linear, high-dimensional sparse (HDS) regression models in econometrics. Even when the true model is linear, the marginal regression can be highly nonlinear. In such models, the overall number of regressors p is very large, possibly much larger than the sample size n. These methods typically model the data using the sum of a small number of univariate or very low-dimensional functions. High-dimensional time series, stochastic regression, vector autoregression.
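The adaptive lasso described above, in which the L1 penalty is reweighted by data-dependent weights, can be sketched in two stages: an initial lasso fit, followed by a weighted lasso whose weights are the reciprocals of the initial coefficient magnitudes. This is a minimal illustration under stated assumptions, not the procedure of any particular paper: the coordinate-descent solver, the function names, the use of a plain lasso as the initial estimator, the tuning values and the toy data are all hypothetical, and the data are assumed centered so that no intercept is fitted.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def weighted_lasso(X, y, lam, weights, n_sweeps=200, tol=1e-10):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w_j |b_j|, lam > 0."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    z = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    resid = list(y)
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if z[j] == 0.0:
                continue  # an all-zero column cannot be updated
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + z[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / z[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta

def adaptive_lasso(X, y, lam_init, lam):
    """Stage 1: plain lasso. Stage 2: lasso with weights 1/|initial coefficient|."""
    init = weighted_lasso(X, y, lam_init, [1.0] * len(X[0]))
    # A zero initial estimate gets an infinite weight, which keeps it at zero.
    weights = [1.0 / abs(b) if b != 0.0 else float("inf") for b in init]
    return weighted_lasso(X, y, lam, weights)

# Toy data (hypothetical): orthogonal columns, strong signal on 0 and 2, weak on 3.
cols = [
    [1, -1, 1, -1, 1, -1, 1, -1],
    [1, 1, -1, -1, 1, 1, -1, -1],
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, -1, 1, 1, -1, -1, 1],
]
X = [list(row) for row in zip(*cols)]
y = [3 * r[0] - 2 * r[2] + 0.1 * r[3] for r in X]

beta_hat = adaptive_lasso(X, y, lam_init=0.05, lam=0.05)
print(beta_hat)
```

On this toy design the weak coefficient survives the initial lasso but receives a large weight, so the second stage sets it to zero while shrinking the two strong coefficients only slightly. That is the qualitative behaviour the reweighting is meant to produce.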

Partial correlation estimation by joint sparse regression models, Jie Peng, Pei Wang, Nengfeng Zhou, and Ji Zhu. In this article, we propose a computationally efficient approach, space (Sparse PArtial Correlation Estimation), for selecting nonzero partial correlations under the high-dimension-low-sample-size setting. The limits of dimensionality that regularization methods can handle, the role of penalty functions, and their statistical properties are detailed. High-dimensional sparse econometric models, 2010, Advances in Economics and Econometrics, 10th World Congress. Scaled sparse linear regression chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. Estimation and inference with outline for econometric theory of big data, part I. The HDS regression model allows for a large number of regressors, p, which is possibly much larger than the sample size n.
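The iteration described above for scaled sparse linear regression (estimate the noise level from the mean residual square, rescale the penalty in proportion, refit) can be sketched as follows. This is a rough illustration under assumptions rather than a reference implementation: the inner solver is a simple coordinate-descent lasso, the base penalty lam0 = sqrt(2 log(p) / n) is one common choice that the source does not specify, and the function names and toy data are hypothetical.

```python
import math

def soft_threshold(z, t):
    """Soft-thresholding operator: shrink z toward zero by t."""
    return z - t if z > t else z + t if z < -t else 0.0

def lasso(X, y, lam, n_sweeps=200, tol=1e-12):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    z = [sum(v * v for v in col) / n for col in cols]
    beta, resid = [0.0] * p, list(y)
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if z[j] == 0.0:
                continue
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + z[j] * beta[j]
            new = soft_threshold(rho, lam) / z[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - delta * c for r, c in zip(resid, cols[j])]
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta

def scaled_lasso(X, y, lam0, max_iter=200, tol=1e-10):
    """Alternate between a lasso fit with penalty lam0 * sigma and a noise update."""
    n = len(y)
    sigma = math.sqrt(sum(v * v for v in y) / n)  # crude starting noise level
    beta = [0.0] * len(X[0])
    for _ in range(max_iter):
        beta = lasso(X, y, lam0 * sigma)
        resid = [yi - sum(b * x for b, x in zip(beta, row)) for row, yi in zip(X, y)]
        new_sigma = math.sqrt(sum(r * r for r in resid) / n)  # root mean residual square
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma

# Toy data (hypothetical): three orthogonal covariates plus a small orthogonal disturbance.
x0 = [1, -1, 1, -1, 1, -1, 1, -1]
x1 = [1, 1, -1, -1, 1, 1, -1, -1]
x2 = [1, 1, 1, 1, -1, -1, -1, -1]
noise = [0.1 * v for v in [1, -1, -1, 1, 1, -1, -1, 1]]
X = [list(row) for row in zip(x0, x1, x2)]
y = [3 * a - 2 * c + e for a, c, e in zip(x0, x2, noise)]

lam0 = math.sqrt(2 * math.log(len(X[0])) / len(y))  # assumed base penalty level
beta_hat, sigma_hat = scaled_lasso(X, y, lam0)
print(beta_hat, sigma_hat)
```

The point of the design is that the penalty never has to be tuned against an unknown noise scale: the loop settles at a fixed point where the penalty level and the residual-based noise estimate agree with each other.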

Penalized regression, high-dimensional data, variable selection, asymptotic normality, oracle property. Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. This work proposes new inference methods for the estimation of a regression coe. A road to classification in high-dimensional space. It is observed that our approach has better prediction performance for highly sparse high-dimensional linear regression models. Sparse modeling has been widely used to deal with high dimensionality. In this paper, we propose a convex formulation for sparse sliced inverse regression in the high-dimensional setting by adapting techniques from sparse canonical correlation analysis (Vu et al.).

Nonparametric independence screening in sparse ultrahigh. Such an assumption is crucial in ensuring the identifiability of the true underlying sparse model especially. In this paper, we consider quantile regression in high-dimensional sparse models (HDSMs). With such data, the dimension of the covariate vector can be much larger than the sample size. For example, in a linear regression model with noise variance. We present a new class of models for high-dimensional nonparametric regression and classification. High-dimensional structured quantile regression, Vidyashankar Sivakumar and Arindam Banerjee. Abstract: Quantile regression aims at modeling the conditional median and quantiles of a response variable given certain predictor variables.

Our proposal estimates the central subspace directly and performs variable selection simultaneously. Asymptotic analysis of high-dimensional LAD regression with lasso, Xiaoli Gao and Jian Huang, Oakland University and University of Iowa. Abstract: In high-dimensional statistical modeling, it is a fundamental problem to identify important explanatory variables. Sparse high-dimensional models in economics, Princeton's ORFE. For linear regression models, many penalization methods have been proposed to conduct variable selection and estimation, and much e. The HDS regression model has a large number of regressors p, possibly much larger than the sample size n, but only a relatively small number s < n of these regressors are important for capturing accurately the main features of the regression function.

This paper considers the task of building efficient regression models for sparse multivariate analysis of high-dimensional data sets; in particular, it focuses on cases where the numbers q of responses y y k,1. A two-stage sequential conditional selection approach to. In this paper we investigate sparse additive models (SpAMs), which extend the advantages of sparse linear models to the additive nonparametric setting. High dimensionality poses numerous challenges to statistical theory, methods, and implementations in those problems. We consider high-dimensional models where the number of. We are particularly interested in the use of bridge estimators to distinguish between covariates whose coefficients are zero and covariates whose coefficients are nonzero. We propose a consistent procedure for the purpose of identifying the nonzeros in B. L1-penalized quantile regression in high dimensional.

Scaled sparse linear regression, Biometrika, Oxford Academic. Sparse high-dimensional regression. AMS 2000 subject classifications. They show that, by replacing the standard inner product in matching pursuit with a trimmed version, one can recover from an.

Robust inference in high-dimensional approximately sparse. For instance, in GWAS, our primary problem of interest is to. Asymptotic properties of bridge estimators in sparse high-dimensional regression models, Jian Huang, Joel Horowitz, Shuangge Ma. Presenter: Minjing Tao. We propose a constructive approach to estimating sparse, high-dimensional linear regression models. Jian Huang, Joel L. Horowitz, and Shuangge Ma: Department of Statistics and Actuarial Science, University of Iowa; Department of Economics, Northwestern University; Department of Biostatistics, University of Washington. March 2006, The University of Iowa, Department of Statistics. We show that, if a reasonable initial estimator is available, then under appropriate. We provide a novel and, to the best of our knowledge, the first algorithm for high-dimensional sparse regression with corruptions in explanatory and/or response variables.

This article is about estimation and inference methods for high-dimensional sparse (HDS) regression models in econometrics. Massachusetts Institute of Technology, Department of Economics, Working Paper Series: Penalized quantile regression in high. These variable selection methods are effective in sparse high-dimensional modeling. Of appropriate dimension with all components zero. High-dimensional sparse models arise in situations where many regressors or series terms are available and the regression function is well approximated by a parsimonious, yet unknown, set of regressors. Multivariate regression modeling with a multivariate response y.