Title: | Correlation-Based Penalized Estimators |
---|---|
Description: | Provides correlation-based penalty estimators for both linear and logistic regression models by implementing a new regularization method that incorporates correlation structures within the data. This method encourages a grouping effect where strongly correlated predictors tend to be in or out of the model together. See Tutz and Ulbricht (2009) <doi:10.1007/s11222-008-9088-5> and Algamal and Lee (2015) <doi:10.1016/j.eswa.2015.08.016>. |
Authors: | Mohammad Arashi [ctb], Mahdi Rahimi [ctb], Mina Norouzirad [aut, cre, cph], FCT, I.P. [fnd] (under the scope of the projects UIDB/00297/2020 and UIDP/00297/2020 (NovaMath)) |
Maintainer: | Mina Norouzirad <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.0 |
Built: | 2024-11-08 04:57:42 UTC |
Source: | https://github.com/mnrzrad/cbpe |
This function computes the correlation-based estimator for linear regression models.
CBPLinearE(X, y, lambda)
X |
A numeric matrix of predictors where rows represent observations and columns represent variables. |
y |
A numeric vector of responses, one per row of X. |
lambda |
The regularization parameter that controls the strength of the correlation-based penalty; larger values penalize more heavily. |
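The correlation-based penalized linear estimator is calculated as:

$$\hat{\beta} = \arg\min_{\beta} \left\{ \sum_{k=1}^{n} (y_k - x_k^\top \beta)^2 + \lambda \sum_{i=1}^{p-1} \sum_{j>i} \left[ \frac{(\beta_i - \beta_j)^2}{1 - \rho_{ij}} + \frac{(\beta_i + \beta_j)^2}{1 + \rho_{ij}} \right] \right\}$$

where $\rho_{ij}$ denotes the (empirical) correlation between the $i$th and the $j$th predictor.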
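Because the correlation-based penalty of Tutz and Ulbricht (2009), the sum over predictor pairs of (beta_i - beta_j)^2 / (1 - rho_ij) + (beta_i + beta_j)^2 / (1 + rho_ij), is quadratic in the coefficients, it can be written as beta' W beta, and a penalized least-squares fit then has a ridge-like closed form. The following is a minimal sketch of that idea, not the package's source code. The names `cbp_penalty_matrix`, `cbp_linear_sketch`, and `cbp_penalty_pairwise` are hypothetical, and the sketch assumes no intercept, no standardization of X, and no pair of predictors with correlation exactly 1 or -1.

```r
# Sketch only: NOT the cbpe implementation. Assumes no intercept and
# |rho_ij| < 1 for every pair of distinct predictors.

# Build W such that t(beta) %*% W %*% beta equals the pairwise penalty.
cbp_penalty_matrix <- function(X) {
  R <- cor(X)                 # empirical correlations rho_ij
  D <- 1 - R^2
  diag(D) <- 1                # placeholder to avoid dividing by zero
  W <- -2 * R / D             # off-diagonal: -2 * rho_ij / (1 - rho_ij^2)
  A <- 2 / D
  diag(A) <- 0
  diag(W) <- rowSums(A)       # diagonal: 2 * sum_{s != i} 1 / (1 - rho_is^2)
  W
}

# Minimizer of ||y - X b||^2 + lambda * b' W b via the normal equations.
cbp_linear_sketch <- function(X, y, lambda) {
  W <- cbp_penalty_matrix(X)
  drop(solve(crossprod(X) + lambda * W, crossprod(X, y)))
}

# Direct evaluation of the double sum, useful for checking W.
cbp_penalty_pairwise <- function(beta, R) {
  p <- length(beta)
  total <- 0
  for (i in seq_len(p - 1)) {
    for (j in (i + 1):p) {
      total <- total +
        (beta[i] - beta[j])^2 / (1 - R[i, j]) +
        (beta[i] + beta[j])^2 / (1 + R[i, j])
    }
  }
  total
}

set.seed(42)
X <- matrix(rnorm(100 * 4), 100, 4)
y <- drop(X %*% c(0.5, -1, 2, 5) + rnorm(100))
beta_hat <- cbp_linear_sketch(X, y, lambda = 0.1)
print(beta_hat)
```

With `lambda = 0` the sketch reduces to ordinary least squares without an intercept. Whether `CBPLinearE` centers or scales the data internally is not stated here, so its output may differ from this sketch.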
A numeric vector of the estimated coefficients for the specified model.
Tutz, G., & Ulbricht, J. (2009). Penalized regression with correlation-based penalty. Statistics and Computing, 19, 239–253.
# Simulate a small linear model with four independent predictors
set.seed(42)
n <- 100
p <- 4
X <- matrix(rnorm(n * p), n, p)
beta_true <- c(0.5, -1, 2, 5)
y <- X %*% beta_true + rnorm(n)

# Fit the correlation-based penalized linear estimator
lambda <- 0.1
result <- CBPLinearE(X, y, lambda = lambda)
print(result)
This function computes the correlation-based estimator for logistic regression models.
CBPLogisticE(X, y, lambda, max_iter = 100, tol = 1e-06)
X |
A numeric matrix of predictors where rows represent observations and columns represent variables. |
y |
A numeric vector of binary outcomes (0 or 1). |
lambda |
The regularization parameter that controls the strength of the correlation-based penalty; larger values penalize more heavily. |
max_iter |
An integer specifying the maximum number of iterations for the logistic regression algorithm. Default is 100. |
tol |
A numeric value specifying the convergence tolerance for the logistic regression algorithm. Default is 1e-06. |
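The correlation-based penalized logistic estimator is calculated as:

$$\hat{\beta} = \arg\min_{\beta} \left\{ -\sum_{k=1}^{n} \left[ y_k \ln \pi_k + (1 - y_k) \ln(1 - \pi_k) \right] + \lambda \sum_{i=1}^{p-1} \sum_{j>i} \left[ \frac{(\beta_i - \beta_j)^2}{1 - \rho_{ij}} + \frac{(\beta_i + \beta_j)^2}{1 + \rho_{ij}} \right] \right\}$$

where $\pi_k = \Pr(y_k = 1 \mid x_k) = 1 / (1 + \exp(-x_k^\top \beta))$ and $\rho_{ij}$ denotes the (empirical) correlation between the $i$th and the $j$th predictor.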
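One plausible way to compute such an estimator is Newton's method on the negative log-likelihood plus the quadratic correlation-based penalty, which would also explain the `max_iter` and `tol` arguments. The sketch below is an assumption about how this could be done, not the algorithm `CBPLogisticE` is confirmed to use. The name `cbp_logistic_sketch` is hypothetical. The sketch assumes no intercept, starts from zero coefficients, uses no step-size control, and stops when the largest coefficient update falls below `tol`.

```r
# Sketch only: NOT the cbpe implementation. Newton iterations for
#   -loglik(beta) + lambda * t(beta) %*% W %*% beta,
# where W encodes the correlation-based penalty. Assumes no intercept
# and |rho_ij| < 1 for every pair of distinct predictors.
cbp_logistic_sketch <- function(X, y, lambda, max_iter = 100, tol = 1e-06) {
  # Penalty matrix W: t(b) %*% W %*% b equals the pairwise penalty.
  R <- cor(X)
  D <- 1 - R^2
  diag(D) <- 1                      # placeholder to avoid dividing by zero
  W <- -2 * R / D
  A <- 2 / D
  diag(A) <- 0
  diag(W) <- rowSums(A)

  beta <- rep(0, ncol(X))
  for (iter in seq_len(max_iter)) {
    prob <- drop(plogis(X %*% beta))            # fitted probabilities pi_k
    grad <- -crossprod(X, y - prob) + 2 * lambda * W %*% beta
    hess <- crossprod(X, X * (prob * (1 - prob))) + 2 * lambda * W
    step <- drop(solve(hess, grad))             # Newton direction
    beta <- beta - step
    if (max(abs(step)) < tol) break             # converged
  }
  beta
}

set.seed(42)
X <- matrix(rnorm(100 * 4), 100, 4)
y <- rbinom(100, 1, plogis(drop(X %*% c(0.5, -1, 2, 5))))
beta_hat <- cbp_logistic_sketch(X, y, lambda = 0.1)
print(beta_hat)
```

With `lambda = 0` the iterations reduce to ordinary maximum-likelihood logistic regression without an intercept. A production implementation would likely add step halving or another safeguard for poorly conditioned data.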
A numeric vector of the estimated coefficients for the specified model.
Algamal, Z. Y., & Lee, M. H. (2015). Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification. Expert Systems with Applications, 42(23), 9326-9332.
# Simulate binary outcomes from a logistic model with four predictors
set.seed(42)
n <- 100
p <- 4
X <- matrix(rnorm(n * p), n, p)
beta_true <- c(0.5, -1, 2, 5)
y <- rbinom(n, 1, 1 / (1 + exp(-X %*% beta_true)))

# Fit the correlation-based penalized logistic estimator
lambda <- 0.1
result <- CBPLogisticE(X, y, lambda)
print(result)