Variable selection of regularized stochastic gradient descent in logistic regression

Volume 9, Issue 1, February 2024 | PP. 38-44 | Pub. Date: May 18, 2022
DOI: 10.54647/mathematics11319


Ping Guo, College of Mathematics and Statistics, Guangxi Normal University, Guilin, Guangxi, China

In the modern big data environment, stochastic gradient descent (SGD) is an important and widely used method for training neural networks, processing large-scale data sets, and solving optimization problems. The existing literature on SGD considers the stopping condition of the parameter iteration. In practice, however, the estimates of unimportant parameters are not always exactly 0 during the iteration, so even when the stopping condition is reached it remains unclear whether a given parameter is important. We consider variable selection for the SGD parameter iteration with L1 regularization in the generalized linear regression model, taking logistic regression as an example. Monte Carlo numerical simulations and practical application examples are given to illustrate the consistency of variable selection. The results show that high accuracy can be achieved by building the model with the selected variables.
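As a rough illustration of the setting described above, the following is a minimal sketch of SGD with an L1 penalty for logistic regression. The abstract does not state the exact update rule, so the soft-thresholding (proximal) step, the function names `soft_threshold` and `l1_sgd_logistic`, the synthetic data, and all parameter values (`lam`, `lr`, `epochs`) are assumptions made for illustration only and are not taken from the paper.

```python
import numpy as np

def soft_threshold(w, t):
    # Shrink every coordinate toward zero by t; values within [-t, t] become 0.
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def l1_sgd_logistic(X, y, lam=0.01, lr=0.02, epochs=30, seed=0):
    # Hypothetical helper: proximal SGD on the logistic loss plus lam * ||w||_1.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    w = np.zeros(p)
    for _ in range(epochs):
        for i in rng.permutation(n):
            prob = 1.0 / (1.0 + np.exp(-(X[i] @ w)))  # predicted P(y = 1 | x_i)
            grad = (prob - y[i]) * X[i]               # gradient of the loss on sample i
            w = soft_threshold(w - lr * grad, lr * lam)
    return w

# Synthetic data (assumed): 10 predictors, only the first 3 are relevant.
data_rng = np.random.default_rng(1)
n, p = 2000, 10
X = data_rng.standard_normal((n, p))
true_w = np.zeros(p)
true_w[:3] = [2.0, -2.0, 1.5]
y = (data_rng.random(n) < 1.0 / (1.0 + np.exp(-(X @ true_w)))).astype(float)

w_hat = l1_sgd_logistic(X, y)
ranking = np.argsort(-np.abs(w_hat))  # coordinates ordered by estimated magnitude
print("estimated coefficients:", np.round(w_hat, 3))
print("three largest coordinates:", sorted(ranking[:3].tolist()))
```

In a run like this, the last iterate will often carry small nonzero values on the irrelevant coordinates, which is the difficulty the abstract points to. Ranking coordinates by magnitude is only one illustrative way to read off a selection and should not be taken as the paper's procedure.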

SGD; Lasso; Logistic regression; Variable selection

Cite this paper
Ping Guo, Variable selection of regularized stochastic gradient descent in logistic regression, SCIREA Journal of Mathematics. Volume 9, Issue 1, February 2024 | PP. 38-44. 10.54647/mathematics11319

