Type: Package
Title: Algorithms using Alternating Direction Method of Multipliers
Version: 0.3.3
Description: Provides algorithms to solve popular optimization problems in statistics, such as regression or denoising, based on the Alternating Direction Method of Multipliers (ADMM). See Boyd et al (2010) <doi:10.1561/2200000016> for a complete introduction to the method.
License: GPL (≥ 3)
Encoding: UTF-8
Imports: Rcpp, Matrix, Rdpack, stats, doParallel, foreach, parallel, utils
LinkingTo: Rcpp, RcppArmadillo
RoxygenNote: 7.1.1
RdMacros: Rdpack
NeedsCompilation: yes
Packaged: 2021-08-07 04:02:53 UTC; kisung
Author: Kisung You
Maintainer: Kisung You <kisungyou@outlook.com>
Repository: CRAN
Date/Publication: 2021-08-08 04:20:08 UTC
ADMM : Algorithms using Alternating Direction Method of Multipliers
Description
The introduction of the Alternating Direction Method of Multipliers (ADMM) has been a breakthrough in solving complex and non-convex optimization problems in a reasonably stable as well as scalable fashion. This package aims to provide handy tools for fast computation on well-known problems using the method. Interested users and readers are encouraged to visit Prof. Stephen Boyd's website, which is entirely devoted to the topic.
Basis Pursuit
Description
For an underdetermined system, Basis Pursuit aims to find a sparse solution that solves
\textrm{min}_x ~ \|x\|_1 \quad \textrm{s.t.} \quad Ax=b
which is a relaxed version of the strict non-zero support finding problem. The implementation is borrowed from Stephen Boyd's MATLAB code.
Usage
admm.bp(
A,
b,
xinit = NA,
rho = 1,
alpha = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
A | an (m\times n) design matrix |
b | a length-m response vector |
xinit | a length-n vector for initial value |
rho | an augmented Lagrangian parameter |
alpha | an overrelaxation parameter in [1,2] |
abstol | absolute tolerance stopping criterion |
reltol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- x: a length-n solution vector
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
Examples
## generate sample data
n = 30
m = 10
A = matrix(rnorm(n*m), nrow=m) # design matrix
x = c(stats::rnorm(3),rep(0,n-3)) # coefficient
x = base::sample(x)
b = as.vector(A%*%x) # response
## run example
output = admm.bp(A, b)
niter = length(output$history$s_norm)
history = output$history
## report convergence plot
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(1:niter, history$objval, "b", main="cost function")
plot(1:niter, history$r_norm, "b", main="primal residual")
plot(1:niter, history$s_norm, "b", main="dual residual")
par(opar)
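As an additional check, the following is a minimal sketch building on the example above; it assumes only the documented output fields x and history. It verifies that the recovered solution is feasible and sparse, and reads the stopping condition directly from the history data frame.
## (sketch) verify feasibility and sparsity of the solution
xhat = output$x
max(abs(A %*% xhat - b))       # should be small at convergence since Ax = b is enforced
sum(abs(xhat) > 1e-4)          # number of (numerically) non-zero coefficients
## (sketch) programmatic convergence check from the iteration history
tail(history$r_norm, 1) < tail(history$eps_pri, 1)   # primal residual below its tolerance
tail(history$s_norm, 1) < tail(history$eps_dual, 1)  # dual residual below its tolerance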
Elastic Net Regularization
Description
Elastic Net regularization is a combination of an \ell_2 stability constraint and an \ell_1 sparsity constraint, simultaneously solving the following,
\textrm{min}_x ~ \frac{1}{2}\|Ax-b\|_2^2 + \lambda_1 \|x\|_1 + \lambda_2 \|x\|_2^2
with nonnegativity constraints on \lambda_1 and \lambda_2. Note that if both lambda values are 0, it reduces to the least-squares solution.
Usage
admm.enet(
A,
b,
lambda1 = 1,
lambda2 = 1,
rho = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
A | an (m\times n) design matrix |
b | a length-m response vector |
lambda1 | a regularization parameter for the \ell_1 term |
lambda2 | a regularization parameter for the \ell_2 term |
rho | an augmented Lagrangian parameter |
abstol | absolute tolerance stopping criterion |
reltol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- x: a length-n solution vector
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
Author(s)
Xiaozhi Zhu
References
Zou H, Hastie T (2005). “Regularization and variable selection via the elastic net.” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67(2), 301–320. ISSN 1369-7412, 1467-9868, doi: 10.1111/j.1467-9868.2005.00503.x.
Examples
## generate underdetermined design matrix
m = 50
n = 100
p = 0.1 # proportion of non-zero elements
x0 = matrix(Matrix::rsparsematrix(n,1,p))
A = matrix(rnorm(m*n),nrow=m)
for (i in 1:ncol(A)){
A[,i] = A[,i]/sqrt(sum(A[,i]*A[,i]))
}
b = A%*%x0 + sqrt(0.001)*matrix(rnorm(m))
## run example with both regularization values = 1
output = admm.enet(A, b, lambda1=1, lambda2=1)
niter = length(output$history$s_norm)
history = output$history
## report convergence plot
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(1:niter, history$objval, "b", main="cost function")
plot(1:niter, history$r_norm, "b", main="primal residual")
plot(1:niter, history$s_norm, "b", main="dual residual")
par(opar)
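The following is an illustrative sketch, not part of the original documentation: it reuses A and b from the example above and varies lambda1 to show how the \ell_1 weight controls the number of (numerically) non-zero coefficients, with lambda2 held fixed.
## (sketch) effect of the sparsity weight lambda1 on the number of non-zeros
for (l1 in c(0.1, 1, 10)){
  fit = admm.enet(A, b, lambda1=l1, lambda2=1)
  cat("lambda1 =", l1, ": non-zeros =", sum(abs(fit$x) > 1e-4), "\n")
}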
Generalized LASSO
Description
Generalized LASSO solves the following problem,
\textrm{min}_x ~ \frac{1}{2}\|Ax-b\|_2^2 + \lambda \|Dx\|_1
where the choice of the regularization matrix D leads to different problem formulations.
Usage
admm.genlasso(
A,
b,
D = diag(length(b)),
lambda = 1,
rho = 1,
alpha = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
A | an (m\times n) design matrix |
b | a length-m response vector |
D | a regularization matrix of n columns |
lambda | a regularization parameter |
rho | an augmented Lagrangian parameter |
alpha | an overrelaxation parameter in [1,2] |
abstol | absolute tolerance stopping criterion |
reltol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- x: a length-n solution vector
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
Author(s)
Xiaozhi Zhu
References
Tibshirani RJ, Taylor J (2011). “The solution path of the generalized lasso.” The Annals of Statistics, 39(3), 1335–1371. ISSN 0090-5364, doi: 10.1214/11-AOS878.
Zhu Y (2017). “An Augmented ADMM Algorithm With Application to the Generalized Lasso Problem.” Journal of Computational and Graphical Statistics, 26(1), 195–204. ISSN 1061-8600, 1537-2715, doi: 10.1080/10618600.2015.1114491.
Examples
## generate sample data
m = 100
n = 200
p = 0.1 # proportion of non-zero elements
x0 = matrix(Matrix::rsparsematrix(n,1,p))
A = matrix(rnorm(m*n),nrow=m)
for (i in 1:ncol(A)){
A[,i] = A[,i]/sqrt(sum(A[,i]*A[,i]))
}
b = A%*%x0 + sqrt(0.001)*matrix(rnorm(m))
D = diag(n);
## set regularization lambda value
regval = 0.1*Matrix::norm(t(A)%*%b, 'I')
## solve LASSO via reducing from Generalized LASSO
output = admm.genlasso(A,b,D,lambda=regval) # set D as identity matrix
niter = length(output$history$s_norm)
history = output$history
## report convergence plot
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(1:niter, history$objval, "b", main="cost function")
plot(1:niter, history$r_norm, "b", main="primal residual")
plot(1:niter, history$s_norm, "b", main="dual residual")
par(opar)
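Since the description notes that the choice of D changes the problem, here is a small illustrative sketch; the piecewise-constant signal, the first-difference matrix D2, and the lambda value are assumptions for demonstration, not from the original documentation. With A as the identity and D as the first-difference operator, admm.genlasso performs 1-dimensional fused-lasso (total-variation) denoising.
## (sketch) fused lasso / 1D total variation via a first-difference D
n2 = 100
b2 = c(rep(0,40), rep(3,30), rep(-1,30)) + 0.3*rnorm(n2)   # piecewise-constant signal + noise
A2 = diag(n2)                                              # identity design
D2 = matrix(0, nrow=n2-1, ncol=n2)                         # first-difference operator
for (i in 1:(n2-1)){ D2[i,i] = -1; D2[i,i+1] = 1 }
fit = admm.genlasso(A2, b2, D2, lambda=2)
opar <- par(no.readonly=TRUE)
plot(1:n2, b2, pch=19, cex=0.5, main="fused lasso via admm.genlasso")
lines(1:n2, fit$x, col="red", lwd=2)
par(opar)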
Least Absolute Deviations
Description
Least Absolute Deviations (LAD) is an alternative to traditional Least Squares that uses the cost function
\textrm{min}_x ~ \|Ax-b\|_1
i.e., the \ell_1 norm instead of the squared loss, for robust estimation of the coefficients.
Usage
admm.lad(
A,
b,
xinit = NA,
rho = 1,
alpha = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
A | an (m\times n) design matrix |
b | a length-m response vector |
xinit | a length-n vector for initial value |
rho | an augmented Lagrangian parameter |
alpha | an overrelaxation parameter in [1,2] |
abstol | absolute tolerance stopping criterion |
reltol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- x: a length-n solution vector
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
Examples
## generate data
m = 1000
n = 100
A = matrix(rnorm(m*n),nrow=m)
x = 10*matrix(rnorm(n))
b = A%*%x
## add impulsive noise to 10% of positions
idx = sample(1:m, round(m/10))
b[idx] = b[idx] + 100*rnorm(length(idx))
## run the code
output = admm.lad(A,b)
niter = length(output$history$s_norm)
history = output$history
## report convergence plot
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(1:niter, history$objval, "b", main="cost function")
plot(1:niter, history$r_norm, "b", main="primal residual")
plot(1:niter, history$s_norm, "b", main="dual residual")
par(opar)
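A brief comparison sketch follows; it assumes output$x is the LAD coefficient estimate, as documented above. Since the data generated in the example contain impulsive noise, the LAD estimate should stay closer to the true coefficient vector x than an ordinary least-squares fit.
## (sketch) LAD vs ordinary least squares under impulsive noise
xlad = output$x
xls  = qr.solve(A, b)                                      # ordinary least-squares solution
sqrt(sum((as.vector(xlad) - as.vector(x))^2))              # estimation error of LAD
sqrt(sum((as.vector(xls)  - as.vector(x))^2))              # estimation error of least squares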
Least Absolute Shrinkage and Selection Operator
Description
LASSO, or L1-regularized regression, is an optimization problem to solve
\textrm{min}_x ~ \frac{1}{2}\|Ax-b\|_2^2 + \lambda \|x\|_1
for sparsifying the coefficient vector x. The implementation is borrowed from Stephen Boyd's MATLAB code.
Usage
admm.lasso(
A,
b,
lambda = 1,
rho = 1,
alpha = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
A | an (m\times n) design matrix |
b | a length-m response vector |
lambda | a regularization parameter |
rho | an augmented Lagrangian parameter |
alpha | an overrelaxation parameter in [1,2] |
abstol | absolute tolerance stopping criterion |
reltol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- x: a length-n solution vector
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
References
Tibshirani R (1996). “Regression Shrinkage and Selection via the Lasso.” Journal of the Royal Statistical Society. Series B (Methodological), 58(1), 267–288. ISSN 00359246.
Examples
## generate sample data
m = 50
n = 100
p = 0.1 # proportion of non-zero elements
x0 = matrix(Matrix::rsparsematrix(n,1,p))
A = matrix(rnorm(m*n),nrow=m)
for (i in 1:ncol(A)){
A[,i] = A[,i]/sqrt(sum(A[,i]*A[,i]))
}
b = A%*%x0 + sqrt(0.001)*matrix(rnorm(m))
## set regularization lambda value
lambda = 0.1*base::norm(t(A)%*%b, "F")
## run example
output = admm.lasso(A, b, lambda)
niter = length(output$history$s_norm)
history = output$history
## report convergence plot
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
plot(1:niter, history$objval, "b", main="cost function")
plot(1:niter, history$r_norm, "b", main="primal residual")
plot(1:niter, history$s_norm, "b", main="dual residual")
par(opar)
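A short follow-up sketch, using only objects from the example above and the documented output field x: compare the estimated support with the true support of x0.
## (sketch) compare estimated and true supports
xhat = output$x
sum(abs(xhat) > 1e-4)        # number of (numerically) non-zero estimates
sum(abs(x0)   > 0)           # number of true non-zeros (roughly p*n by construction)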
Robust Principal Component Analysis
Description
Given a data matrix M, it finds a decomposition
\textrm{min}~\|L\|_* + \lambda \|S\|_1 \quad \textrm{s.t.} \quad L+S=M
where \|L\|_* denotes the nuclear norm of a matrix L, \|S\|_1 = \sum |S_{i,j}|, and \lambda is a balancing/regularization parameter. The choice of these norms imposes a low-rank structure on L and sparsity on S.
Usage
admm.rpca(
M,
lambda = 1/sqrt(max(nrow(M), ncol(M))),
mu = 1,
tol = 1e-07,
maxiter = 1000
)
Arguments
M | an (m\times n) data matrix |
lambda | a regularization parameter |
mu | an augmented Lagrangian parameter |
tol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- L: an (m\times n) low-rank matrix
- S: an (m\times n) sparse matrix
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
For the RPCA implementation, we chose a very simple stopping criterion,
\|M-(L_k+S_k)\|_F \le tol \cdot \|M\|_F
at each iteration step k. For this method, therefore, we provide a vector of relative errors only,
- error: relative error computed at each iteration
References
Candès EJ, Li X, Ma Y, Wright J (2011). “Robust principal component analysis?” Journal of the ACM, 58(3), 1–37. ISSN 00045411, doi: 10.1145/1970392.1970395.
Examples
## generate data matrix from standard normal
X = matrix(rnorm(20*5),nrow=5)
## try different regularization values
out1 = admm.rpca(X, lambda=0.01)
out2 = admm.rpca(X, lambda=0.1)
out3 = admm.rpca(X, lambda=1)
## visualize sparsity
opar <- par(no.readonly=TRUE)
par(mfrow=c(1,3))
image(out1$S, main="lambda=0.01")
image(out2$S, main="lambda=0.1")
image(out3$S, main="lambda=1")
par(opar)
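As a quick sanity check, here is a sketch using only the documented output fields L and S: it confirms that the decomposition reproduces X and inspects how lambda trades low rank against sparsity (the thresholds 1e-8 and the use of Matrix::rankMatrix are illustrative choices).
## (sketch) check the decomposition L + S = M and the lambda trade-off
norm(X - (out2$L + out2$S), "F") / norm(X, "F")   # relative reconstruction error
Matrix::rankMatrix(out2$L)                        # numerical rank of the low-rank part
sum(abs(out2$S) > 1e-8)                           # non-zeros in the sparse part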
Semidefinite Programming
Description
We solve the following standard semidefinite programming (SDP) problem
\textrm{min}_X ~ \textrm{tr}(CX) \quad \textrm{s.t.} \quad A(X)=b, ~ X \geq 0
with A(X)_i = \textrm{tr}(A_i^\top X) = b_i for i=1,\ldots,m, where X \geq 0 stands for positive semidefiniteness of the matrix X. In the standard form, the matrices C, A_1, A_2, \ldots, A_m are symmetric, and the solution X is symmetric and positive semidefinite. This function implements the alternating direction augmented Lagrangian method.
Usage
admm.sdp(
C,
A,
b,
mu = 1,
rho = 1,
abstol = 1e-10,
maxiter = 496,
print.progress = FALSE
)
Arguments
C | an (n\times n) symmetric matrix for the cost |
A | a length-m list of (n\times n) symmetric constraint matrices |
b | a length-m vector of constraint values |
mu | penalty parameter; positive real number. |
rho | step size for updating. |
abstol | absolute tolerance stopping criterion. |
maxiter | maximum number of iterations. |
print.progress | a logical; TRUE to show the progress at each iteration, FALSE otherwise. |
Value
a named list containing
- X: an (n\times n) solution matrix
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
- gap: gap between the primal and dual cost functions
The stopping criterion breaks the iteration when eps_pri, eps_dual, and gap all become smaller than abstol.
Author(s)
Kisung You
References
Wen Z, Goldfarb D, Yin W (2010). “Alternating direction augmented Lagrangian methods for semidefinite programming.” Mathematical Programming Computation, 2(3-4), 203–230. ISSN 1867-2949, 1867-2957, doi: 10.1007/s12532-010-0017-1.
Examples
## a toy example
# generate parameters
C = matrix(c(1,2,3,2,9,0,3,0,7),nrow=3,byrow=TRUE)
A1 = matrix(c(1,0,1,0,3,7,1,7,5),nrow=3,byrow=TRUE)
A2 = matrix(c(0,2,8,2,6,0,8,0,4),nrow=3,byrow=TRUE)
A = list(A1, A2)
b = c(11, 19)
# run the algorithm
run = admm.sdp(C,A,b)
hst = run$history
# visualize
opar <- par(no.readonly=TRUE)
par(mfrow=c(2,2))
plot(hst$objval, type="b", cex=0.25, main="objective value")
plot(hst$eps_pri, type="b", cex=0.25, main="primal feasibility")
plot(hst$eps_dual, type="b", cex=0.25, main="dual feasibility")
plot(hst$gap, type="b", cex=0.25, main="primal-dual gap")
par(opar)
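Before the CVXR comparison, here is a short verification sketch; it assumes the returned list carries the solution matrix as run$X, as the comparison code below also does. Since each A_i is symmetric, tr(A_i X) equals the sum of the elementwise product.
## (sketch) verify the linear constraints and positive semidefiniteness
Xadmm = run$X
c(sum(A1*Xadmm), sum(A2*Xadmm))                              # tr(A_i X); should be close to b = c(11, 19)
min(eigen(Xadmm, symmetric=TRUE, only.values=TRUE)$values)   # smallest eigenvalue; should be >= 0 up to tolerance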
## Not run:
## comparison with CVXR's result
require(CVXR)
# problems definition
X = Variable(3,3,PSD=TRUE)
myobj = Minimize(sum_entries(C*X)) # objective
mycon = list( # constraint
sum_entries(A[[1]]*X) == b[1],
sum_entries(A[[2]]*X) == b[2]
)
myp = Problem(myobj, mycon) # problem
# run and visualize
res = solve(myp)
Xsol = res$getValue(X)
opar = par(no.readonly=TRUE)
par(mfrow=c(1,2), pty="s")
image(run$X, axes=FALSE, main="ADMM result")
image(Xsol, axes=FALSE, main="CVXR result")
par(opar)
## End(Not run)
Sparse PCA
Description
Sparse Principal Component Analysis aims at finding a sparse vector by solving
\textrm{max}_x ~ x^T \Sigma x \quad \textrm{s.t.} \quad \|x\|_2 \le 1, ~ \|x\|_0 \le K
where \|x\|_0 is the number of non-zero elements in a vector x. A convex relaxation of this problem was proposed in the form of the following problem,
\textrm{max}_X ~ \langle\Sigma,X\rangle \quad \textrm{s.t.} \quad \textrm{Tr}(X)=1, ~ \|X\|_0 \le K^2, ~ X \geq 0, ~ \textrm{rank}(X)=1
where X=xx^T is a (p\times p) matrix that is the outer product of a vector x with itself, and X \geq 0 means the matrix X is positive semidefinite. With the rank condition dropped, it can be restated as
\textrm{max}_X ~ \langle\Sigma,X\rangle - \rho\|X\|_1 \quad \textrm{s.t.} \quad \textrm{Tr}(X)=1, ~ X \geq 0.
After acquiring each principal component vector, an iterative step based on the Schur complement deflation method is applied to regress out the impact of previously computed projection vectors. It should be noted that the resulting sparse basis may not be orthonormal.
Usage
admm.spca(
Sigma,
numpc,
mu = 1,
rho = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
Sigma | a (p\times p) (sample) covariance matrix. |
numpc | number of principal components to be extracted. |
mu | an augmented Lagrangian parameter. |
rho | a regularization parameter for sparsity. |
abstol | absolute tolerance stopping criterion. |
reltol | relative tolerance stopping criterion. |
maxiter | maximum number of iterations. |
Value
a named list containing
- basis: a (p\times numpc) matrix whose columns are sparse principal components
- history: a length-numpc list of dataframes recording iteration numerics. See the Iteration History section for more details.
Iteration History
For the SPCA implementation, the main computation is performed sequentially for each projection vector. The history field is a list of length numpc, where each element is a data frame recording the following fields over iterates:
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
References
Ma S (2013). “Alternating Direction Method of Multipliers for Sparse Principal Component Analysis.” Journal of the Operations Research Society of China, 1(2), 253–274. ISSN 2194-668X, 2194-6698, doi: 10.1007/s40305-013-0016-9.
Examples
## generate a random matrix and compute its sample covariance
X = matrix(rnorm(1000*5),nrow=1000)
covX = stats::cov(X)
## compute 3 sparse basis
output = admm.spca(covX, 3)
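A small follow-up sketch using only the documented output field basis: count the non-zero loadings per component (the 1e-6 threshold is an illustrative choice) and confirm the remark above that the sparse basis need not be orthonormal.
## (sketch) inspect sparsity and (non-)orthonormality of the extracted basis
basis = output$basis
apply(basis, 2, function(v) sum(abs(v) > 1e-6))   # non-zero loadings per component
round(crossprod(basis), 3)                        # not necessarily the identity matrix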
Total Variation Minimization
Description
1-dimensional total variation minimization, also known as signal denoising, solves the following,
\textrm{min}_x ~ \frac{1}{2}\|x-b\|_2^2 + \lambda \sum_i |x_{i+1}-x_i|
for a given signal b. The implementation is borrowed from Stephen Boyd's MATLAB code.
Usage
admm.tv(
b,
lambda = 1,
xinit = NA,
rho = 1,
alpha = 1,
abstol = 1e-04,
reltol = 0.01,
maxiter = 1000
)
Arguments
b | a length-m response vector |
lambda | regularization parameter |
xinit | a length-m vector for initial value |
rho | an augmented Lagrangian parameter |
alpha | an overrelaxation parameter in [1,2] |
abstol | absolute tolerance stopping criterion |
reltol | relative tolerance stopping criterion |
maxiter | maximum number of iterations |
Value
a named list containing
- x: a length-m solution vector
- history: dataframe recording iteration numerics. See the Iteration History section for more details.
Iteration History
When you run the algorithm, the output returns not only the solution but also the iteration history recording the following fields over iterates:
- objval: objective (cost) function value
- r_norm: norm of primal residual
- s_norm: norm of dual residual
- eps_pri: feasibility tolerance for the primal feasibility condition
- eps_dual: feasibility tolerance for the dual feasibility condition
In accordance with the paper, the iteration stops when both r_norm and s_norm become smaller than eps_pri and eps_dual, respectively.
Examples
## generate sample data
x1 = as.vector(sin(1:100)+0.1*rnorm(100))
x2 = as.vector(cos(1:100)+0.1*rnorm(100)+5)
x3 = as.vector(sin(1:100)+0.1*rnorm(100)+2.5)
xsignal = c(x1,x2,x3)
## run example
output = admm.tv(xsignal)
## visualize
opar <- par(no.readonly=TRUE)
plot(1:300, xsignal, type="l", main="TV Regularization")
lines(1:300, output$x, col="red", lwd=2)
par(opar)
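As a final illustrative sketch (the value lambda = 10 is an arbitrary choice for demonstration, not from the original documentation), rerunning with a larger regularization parameter yields a flatter, more heavily denoised reconstruction.
## (sketch) effect of a larger regularization parameter
out2 = admm.tv(xsignal, lambda=10)
opar <- par(no.readonly=TRUE)
plot(1:300, xsignal, type="l", main="effect of lambda")
lines(1:300, output$x, col="red",  lwd=2)   # lambda = 1 (default)
lines(1:300, out2$x,   col="blue", lwd=2)   # lambda = 10
par(opar)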