Supported solvers
The following list contains all supported solvers and the packages that provide access to them. A solver of name X will always provide at least one of the two methods :XSOS or :XMoment. The former models the sum-of-squares formulation of the problem - each monomial corresponds to one constraint, and the solution comes from the dual vector. The latter models the Lasserre moment hierarchy of the problem - each monomial corresponds to a primal variable. Which of the methods (or both) a solver offers depends on the particular interface the solver supports and which form fits more naturally to this interface. Every solver also provides an alias :X
that defaults to the recommended method for that solver. The given performance indicators are based purely on experience and may not be accurate for your particular problem. In particular, the maximum recommended basis size depends heavily on the structure of the final problem, which can easily shift the number up or down by 100 or more. All solvers may expose options that influence the runtime behavior.
Solver | Package | License | Methods | Speed | Accuracy | Memory | max. recomm. basis size |
---|---|---|---|---|---|---|---|
Clarabel | Clarabel.jl | Apache | moment | 👍👍👍 | 👍👍👍 | 👍👍 | ~200 |
COPT | COPT.jl | commercial | moment | 👍👍👍 | 👍👍👍 | 👍👍👍 | ~700 |
Hypatia[1] | Hypatia.jl | MIT | moment | 👍👍 | 👍👍 | 👍 | ~100 |
LANCELOT[2] | GALAHAD.jl | BSD | nonlinear | n.a. | n.a. | 👍👍👍 | n.a. |
Loraine | ∅[3] | MIT | primal moment | 👍👍👍 | 👍👍 | 👍👍👍 | moderately large |
LoRADS | LoRADS | MIT | primal moment | 👍👍👍 | 👍 | 👍👍👍 | very large |
Mosek[4] | Mosek.jl | commercial | SOS, moment | 👍👍👍 | 👍👍👍 | 👍👍 | ~300 - 500 |
ProxSDP | ProxSDP.jl | MIT | primal moment | 👍👍👍 | 👍👍 | 👍👍👍 | very large |
SCS | SCS.jl | MIT | moment | 👍 | 👍 | 👍👍👍 | |
SketchyCGAL | ∅ | MIT | primal moment | 👎 | 👍 | 👍👍👍 | |
SpecBM | ∅[5] | MIT | SOS | n.a. | n.a. | 👍👍👍 | |
Packaged solvers
During the development of this package, several interesting solvers proposed in the research literature were implemented; they are documented on this page. They can be accessed from the PolynomialOptimization.Solvers submodule.
Loraine
The Loraine solver is suitable for large-scale low-rank semidefinite programming. Note that the implementation provided by PolynomialOptimization supports only a subset of Loraine's features (but supports this subset much more efficiently): the direct solver is not available, and data matrices are assumed not to be rank-one and are always treated as sparse.
PolynomialOptimization.Solvers.Loraine.Model
— TypeModel(; A, C, b, [c_lin, A_lin,] coneDims, check=true)
Constructs a model for PolynomialOptimization
's rewrite of the Loraine solver. This solves the problem
\[ \min \vec c_{\mathrm{lin}} \cdot \vec x + \sum_j \langle C_j, X_j\rangle \\ \text{such that} \\ x_i \geq 0 \ \forall i \\ X_j \succeq 0 \ \forall j \\ A_{\mathrm{lin}, k} \cdot \vec x + \sum_j \langle \operatorname{mat}(A_{k, j}), X_j\rangle = b_k \ \forall k\]
with the following representation in the variables:
- 1 ≤ j ≤ length(coneDims) = length(A) = length(C)
- A is a vector of SparseMatrixCSCs, where every matrix corresponds to a semidefinite optimization variable. Each row in the matrix corresponds to one constraint; when the row is reshaped into a matrix in column-major order (which must then be symmetric), it defines the coefficient matrix $\operatorname{mat}(A_{k, j})$. The side dimension of this matrix is stored in coneDims.
- If present, c_lin is a SparseVector and A_lin a SparseMatrixCSC that define the objective and constraint coefficients for the nonnegative variables. They may be omitted only together.
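For illustration, a minimal sketch that encodes min ⟨diag(1, 2), X₁⟩ subject to tr(X₁) = 1, X₁ ⪰ 0; only the layout of A follows from the description above, while the exact element types accepted for C and b are assumptions:
using SparseArrays
import PolynomialOptimization.Solvers.Loraine
coneDims = [2]                    # one PSD block X₁ of side dimension 2
A = [sparse([1.0 0.0 0.0 1.0])]   # one constraint row: col-major vectorization of I₂, i.e. tr(X₁) = b₁
C = [sparse([1.0 0.0; 0.0 2.0])]  # objective ⟨C₁, X₁⟩ with C₁ = diag(1, 2)
b = [1.0]
model = Loraine.Model(; A, C, b, coneDims)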
PolynomialOptimization.Solvers.Loraine.Preconditioner
— TypePreconditioner
- per CG iteration, PRECONDITIONER_NONE is faster (lower complexity) than PRECONDITIONER_HBETA, which is faster than PRECONDITIONER_HALPHA
- as a preconditioner, PRECONDITIONER_HALPHA is better than PRECONDITIONER_HBETA, which is better than PRECONDITIONER_NONE, in the sense of CG iterations needed to solve the linear system
- some SDP problems are "easy", meaning that CG always converges without a preconditioner (i.e., preconditioner = PRECONDITIONER_NONE), so it is always worth trying this option
- PRECONDITIONER_HYBRID starts with the (cheaper) H_beta and, once it gets into difficulties, switches to H_alpha
PolynomialOptimization.Solvers.Loraine.Solver
— TypeSolver(model; tol_cg=0.01, tol_cg_up=0.5, tol_cg_min=1e-7, eDIMACS=1e-7,
preconditioner=PRECONDITIONER_HALPHA, erank=1, aamat=AMAT_DIAGAᵀA, fig_ev=0,
verb=VERBOSITY_SHORT, initpoint=INITPOINT_LORAINE, maxit=100)
Defines a solver for a previously defined model. Only the iterative conjugate gradient method is implemented.
PolynomialOptimization.Solvers.Loraine.solve!
— Functionsolve!(solver)
Solves a freshly initialized solver. Note that calling this function twice will produce the same results (unless parameters are changed), as the initial state is always restored at the beginning of the call.
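Continuing the model sketch from above, setting up and running the iterative solver might look as follows; that the Preconditioner values are accessible through the Loraine submodule is an assumption, and the keyword values merely deviate from the defaults for illustration:
solver = Loraine.Solver(model; preconditioner=Loraine.PRECONDITIONER_HYBRID, eDIMACS=1e-8, maxit=200)
Loraine.solve!(solver)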
LoRADS
The experimental LoRADS solver is suitable for large-scale low-rank semidefinite programming. Note that it is currently in a rather early stage of development and lacks some configurability (in particular with regard to logging). The version that ships with PolynomialOptimization
is a slight modification of the original.
PolynomialOptimization.Solvers.LoRADS.Solver
— TypeSolver()
Creates a new empty LoRADS solver object. This has to be initialized by calling init_solver.
PolynomialOptimization.Solvers.LoRADS.ConeType
— TypeConeType
The type of the current cone. Only CONETYPE_DENSE_SDP
and CONETYPE_SPARSE_SDP
are implemented.
PolynomialOptimization.Solvers.LoRADS.init_solver
— Functioninit_solver(solver, nConstrRows, coneDims::AbstractVector{LoRADSInt}, nLpCols)
Initializes a fresh Solver
object with nConstrRows
constraints, positive semidefinite variables of side dimension coneDims
(a vector of integers), and nLpCols
scalar nonnegative variables.
PolynomialOptimization.Solvers.LoRADS.set_dual_objective
— Functionset_dual_objective(solver, dObj::AbstractVector{Cdouble})
Sets the dual objective, i.e., the right-hand side of the constraints, in an initialized Solver
object.
PolynomialOptimization.Solvers.LoRADS.conedata_to_userdata
— Functionconedata_to_userdata(cone::ConeType, nConstrRows, dim,
coneMatBeg::AbstractVector{LoRADSInt}, coneMatIdx::AbstractVector{LoRADSInt},
coneMatElem::AbstractVector{Cdouble})
Allocates a new user data object and sets its conic data. This consists of a cone type (only CONETYPE_DENSE_SDP and CONETYPE_SPARSE_SDP are supported), the number of rows (which is the same as the number of constraints in the solver) and the side dimension of the semidefinite variable, followed by the constraint matrices in zero-indexed CSR format. Every row contains the column-wise vectorized lower triangle of one constraint matrix. The zeroth row is the coefficient matrix for the objective. Therefore, nConstrRows + 1 = length(coneMatBeg) - 1 should hold (+1 for the objective; -1 for CSR).
The returned userdata pointer should be assigned to a solver, which will take care of freeing the allocated data. Note that the vectors passed to this function must be preserved until the preprocess function has been called, after which they can be freed.
See also set_cone
, init_cone_data
.
PolynomialOptimization.Solvers.LoRADS.set_cone
— Functionset_cone(solver, iCone, userCone)
Sets the iCone
th cone to the data previously defined using conedata_to_userdata
.
See also init_cone_data
.
PolynomialOptimization.Solvers.LoRADS.set_lp_cone
— Functionset_lp_cone(solver, nConstrRows, nLpCols, lpMatBeg::AbstractVector{LoRADSInt},
lpMatIdx::AbstractVector{LoRADSInt}, lpMatElem::AbstractVector{Cdouble})
Set the data of the constraint matrix for the linear variables according to the CSR data specified in the parameters.
PolynomialOptimization.Solvers.LoRADS.init_cone_data
— Functioninit_cone_data(solver, coneMat, coneDims, lpMat)
Initializes the solver for a problem in the form
\[ \min \vec a_0 \cdot \vec x + \sum_j \langle\operatorname{mat}(G_{j, 0}), Z_j\rangle \\ \text{such that} \\ x_i \geq 0 \ \forall i \\ Z_j \succeq 0 \ \forall j \\ \vec a_k \cdot \vec x - \sum_j \langle\operatorname{mat}(G_{j, k}), Z_j\rangle = c_k \ \forall k\]
with the following representation in the variables:
- 1 ≤ j ≤ length(coneDims) = length(coneMat)
- coneMat is a vector of matrices, lpMat is a matrix. They should be in CSR storage, where the row index (starting at 0 for the objective, then k for the kth constraint) is the constraint. Since CSR is not natively supported by Julia, the transpose of a SparseMatrixCSC{Cdouble,LoRADSInt} is expected.
- mat makes the unscaled lower triangle into a full matrix
This is a convenience function that does the job of conedata_to_userdata
, set_cone
, and preprocess
in one step. However, note that it is more efficient to call these functions individually.
PolynomialOptimization.Solvers.LoRADS.preprocess
— Functionpreprocess(solver, coneDims::AbstractVector{LoRADSInt})
Invokes the preprocessor. This should be called after all cones have been set up; afterwards, their original data may be reused or destroyed.
See also conedata_to_userdata
, set_cone
.
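As a rough sketch of how these functions fit together, consider a single 2×2 PSD block with objective matrix diag(1, 2) and the single constraint tr(Z₁) = 1. The element type LoRADS.LoRADSInt and the 1-based cone index passed to set_cone are assumptions; only the CSR layout follows from the descriptions above:
import PolynomialOptimization.Solvers.LoRADS
coneDims = LoRADS.LoRADSInt[2]
solver = LoRADS.Solver()
LoRADS.init_solver(solver, 1, coneDims, 0)    # 1 constraint, no nonnegative scalar variables
LoRADS.set_dual_objective(solver, [1.0])      # right-hand side: tr(Z₁) = 1
# zero-indexed CSR rows over the lower triangle [(1,1), (2,1), (2,2)]:
# row 0 (objective) = [1, 0, 2], row 1 (constraint) = [1, 0, 1]
coneMatBeg  = LoRADS.LoRADSInt[0, 2, 4]
coneMatIdx  = LoRADS.LoRADSInt[0, 2, 0, 2]
coneMatElem = Cdouble[1.0, 2.0, 1.0, 1.0]
user = LoRADS.conedata_to_userdata(LoRADS.CONETYPE_DENSE_SDP, 1, 2, coneMatBeg, coneMatIdx, coneMatElem)
LoRADS.set_cone(solver, 1, user)              # cone index assumed to be 1-based
LoRADS.preprocess(solver, coneDims)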
PolynomialOptimization.Solvers.LoRADS.load_sdpa
— Functionload_sdpa(fn)
Loads a problem from a file fn
in SDPA format and returns a preprocessed Solver
instance, a vector containing the cone dimensions, and the number of nonnegative scalar variables.
PolynomialOptimization.Solvers.LoRADS.solve
— Functionsolve(solver, params, coneDims)
Solves a preprocessed Solver
instance.
PolynomialOptimization.Solvers.LoRADS.get_X
— Functionget_X(solver, i)
Returns the i
th PSD solution matrix $X_i$. The result will be a freshly allocated symmetric view of a dense matrix.
PolynomialOptimization.Solvers.LoRADS.get_Xlin
— Functionget_Xlin(solver)
Returns the linear solution vector $x$. The result will be a vector backed by internal solver data and will be invalidated if the solver is destroyed. Copy it if desired.
PolynomialOptimization.Solvers.LoRADS.get_S
— Functionget_S(solver, i)
Returns the i
th slack variable for the PSD solution matrix $S_i$. The result will be a freshly allocated symmetric view of a dense matrix.
PolynomialOptimization.Solvers.LoRADS.get_Slin
— Functionget_Slin(solver[, i::AbstractUnitRange])
Returns the slack variables to the nonnegative variables of index i
(by default, all). The result will be a freshly allocated vector.
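Once solve has returned (the construction of the params argument is not shown here), the solution can be read out; a brief sketch continuing the setup above:
X1 = LoRADS.get_X(solver, 1)          # dense symmetric solution block
xlin = copy(LoRADS.get_Xlin(solver))  # copy, since the vector aliases solver memory
S1 = LoRADS.get_S(solver, 1)          # slack block belonging to X₁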
SketchyCGAL
While the solver was implemented for the purpose of being used within PolynomialOptimization, it also works as a standalone routine (and could in principle be a separate package). SketchyCGAL is a solver that scales very well for large problem sizes by maintaining a sketch of the semidefinite matrices, based on the assumption that the optimal solution has low rank; indeed, in polynomial optimization, if there is an optimal point for the problem that can be encoded in the chosen basis, then this automatically gives rise to a rank-one semidefinite encoding of this point.
PolynomialOptimization.Solvers.SketchyCGAL.sketchy_cgal
— Functionsketchy_cgal(A, b, C; α=(0, 1), rank, ϵ=1e-4, max_iter=0, time_limit=0, verbose=false,
β₀=1, K=∞, method=:auto, callback=(_) -> ()[, A_norm][, A_normsquare])
Enhanced implementation of the SketchyCGAL algorithm. This solves the following problem:
\[ \min \bigl\{ \sum_i \operatorname{tr}(C_i X_i) : \sum_i \operatorname{tr}(A_{j, i} X_i) = b_j \ \forall j, \alpha_1 \leq \sum_i \operatorname{tr}(X_i) \leq \alpha_2, X_i \succeq 0 \ \forall i \bigr\}\]
The function returns the optimization status, objective value, and the optimal matrix $X$ (in the form of an Eigen
factorization object).
Parameters
- A is an AbstractMatrix whose entries are of type AbstractMatrix themselves. Alternatively, A can also be an AbstractVector of AbstractMatrices; in this case, $A_{j, i}$ is given by taking the matrix A[i] and reshaping its columns into a square matrix.
- b is an AbstractVector of real numbers
- C is an AbstractVector whose entries are of type AbstractMatrix themselves
- α is a 2-tuple of nonnegative numbers that defines the bounds on the sum of all traces
- rank controls the rank that is used to approximate the end result. It may either be an integer, which puts the same rank constraint on all $X_i$, or a tuple/vector of integers, which allows specifying a different rank constraint for each block.
- The solution accuracy can be controlled by the parameter ϵ; however, no more than max_iter iterations are carried out, and no more iterations will be performed if time_limit (in seconds) was exceeded, regardless of ϵ. Set any of those three parameters to zero to disable the respective check.
- The callback may be called after each iteration and will receive a Status as parameter. If the callback returns false, the iteration will be the last one.
- The parameters β₀ and K allow tuning the optimization. β₀ is a smoothing parameter; K limits the dual vector to a generalized sphere of radius K around the origin.
- The method determines the way in which the smallest eigenvalue and its eigenvector are calculated during each iteration. Possible values are (this might also be a vector/tuple, to specify the method for each $X_i$):
  - :lanczos_space (uses the space-efficient implementation described in the SketchyCGAL paper, memory is linear in the problem size)
  - :lanczos_time (uses the same principle, but can save about half of the operations by using more memory: quadratic in the problem size)
  - :lobpcg_fast (uses the LOBPCG solver from the IterativeSolvers package, bounding the number of iterations with the same heuristic as for the Lanczos methods)
  - :lobpcg_accurate (bounds the error instead to ϵ/100)
  - :auto chooses :lobpcg_accurate for problem sizes smaller than 10, :lanczos_time for problem sizes less than 11500 (where roughly 1 GiB is required for the speedup), and :lanczos_space for all larger problems.
- A_normsquare (or A_norm) is supposed to hold the sum of the squares of the Frobenius-to-ℓ₂ norms of all the linear operators contained in the columns of A. If both parameters are omitted, it is calculated automatically; however, this requires memory that scales quartically in the largest side dimension of A (and may not be supported for all AbstractMatrix types). If both parameters are specified and their values are not consistent, the behavior is undefined.
Optimization status values
- :optimal: the desired accuracy ϵ was reached in both the relative suboptimality gap as well as the relative infeasibility
- :max_iter: the maximum number of iterations was reached
- :timeout: the maximum computation time was hit
- :canceled: the callback returned false
- :unknown: an internal error has happened
See also Status
.
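As a minimal sketch of the matrix form, the following minimizes -tr(X₁) over a single 2×2 block subject to tr(X₁) = 1 with the default trace bound α = (0, 1). Whether a single block returns one Eigen object or a one-element vector is not specified above, so the destructuring is an assumption:
using LinearAlgebra
import PolynomialOptimization.Solvers.SketchyCGAL
C = [-Matrix{Float64}(I, 2, 2)]           # objective coefficient of the single block
A = [reshape([1.0, 0.0, 0.0, 1.0], 4, 1)] # one column per constraint; column 1 reshapes to A₁,₁ = I₂
b = [1.0]                                 # tr(X₁) = 1
status, obj, X = SketchyCGAL.sketchy_cgal(A, b, C; rank=1, ϵ=1e-5)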
sketchy_cgal(primitive1!, primitive2!, primitive3!, n, b, α; rank, primitive3_norm=0,
primitive3_normsquare=0, ϵ=1e-8, max_iter=10_000, verbose=false, rescale_C=1,
rescale_A=[1, ...], rescale_X=1, β₀=1, K=∞, method=:auto, callback=(_) -> ())
This is the black-box version that allows for matrix-free operations. n is a tuple or vector that indicates the side dimensions of the semidefinite variables, b is the right-hand side of the constraints (a vector of length d); and the primitives effectively calculate
- primitive1!(v, u, i, α, β) = (v = α * C[i] * u + β * v), $u, v \in \mathbb R^{n_i}$
- primitive2!(v, u, z, i, α, β) = (v = α * adjoint(A[:, i])(z) * u + β * v), $u, v \in \mathbb R^{n_i}$, $z \in \mathbb R^d$
- primitive3!(v, u, i, α, β) = (v = α * A[:, i](u * u') + β * v), $u \in \mathbb R^{n_i}$, $v \in \mathbb R^d$
A[:, i](X) is the linear map [⟨X, A[1, i]⟩, ..., ⟨X, A[d, i]⟩] and adjoint(A[:, i])(z) = sum(z[j] * A[j, i]). All of them must also return their outputs (which is v).
If you are able to calculate these oracles faster or more memory-efficiently than the straightforward implementation (which is based on mul!
), use the blackbox method. It is recommended to obey the following normalization conditions:
\[ \sum_i \lVert C_i\rVert_{\mathrm F}^2 = 1; \quad \sum_i \lVert primitive3!_i\rVert_{\mathrm F \to \ell_2}^2 = 1; \quad \sum_i \lVert A_{1, i}\rVert^2 = \sum_i \lVert A_{2, i}\rVert^2 = \dotsb\]
However, in any case, you need to specify the norm of primitive3!
(i.e., the supremum of norm(primitive3!)
applied to matrices with unit Frobenius norm) in the parameter primitive3_norm
(or primitive3_normsquare
, which is the sum of the individual normsquares). If the norm is unavailable, you need to at least give a lower bound. You may not specify both parameters inconsistently, else the behavior is undefined. If you had to rescale those matrices in order to achieve this normalization condition, you may pass the corresponding rescaling factors in rescale_C
(implying that all C
have been scaled by this one factor) and rescale_A
(implying that a whole row of A
has been scaled by one element from this vector). Additionally, the upper bound in α
should be one for better performance, which is achievable through rescale_X
(implying that all X
have been scaled by this factor). These are posterior factors that indicate the multiplication that has been done before calling the function in order to enforce compliance. They will be taken into account when calculating the termination criteria (such that ϵ
then corresponds to the original problem and not the rescaled one), filling the status structure or verbose output, and the final values of primal objective and optimal $X$.
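To make the oracle contract concrete, here is a hypothetical set of primitives for a single n×n block with objective C₁ = Diagonal(c) and the single constraint tr(X₁) = 1 (so d = 1); the trace map has Frobenius-to-ℓ₂ norm √n, hence primitive3_normsquare = n. The normalization conditions above are deliberately not enforced, so this is purely illustrative:
using LinearAlgebra
import PolynomialOptimization.Solvers.SketchyCGAL
n = 50
c = collect(range(-1.0, 1.0; length=n))   # diagonal of C₁
# v ← α*C₁*u + β*v with C₁ = Diagonal(c)
primitive1!(v, u, i, α, β) = (v .= α .* c .* u .+ β .* v; v)
# adjoint(A[:, 1])(z) = z[1]*I, hence v ← α*z[1]*u + β*v
primitive2!(v, u, z, i, α, β) = (v .= (α * z[1]) .* u .+ β .* v; v)
# A[:, 1](u*u') = [tr(u*u')] = [u ⋅ u], hence v ← α*[u ⋅ u] + β*v
primitive3!(v, u, i, α, β) = (v[1] = α * dot(u, u) + β * v[1]; v)
status, obj, X = SketchyCGAL.sketchy_cgal(primitive1!, primitive2!, primitive3!, (n,), [1.0], (0.0, 1.0);
    rank=1, primitive3_normsquare=n)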
PolynomialOptimization.Solvers.SketchyCGAL.Status
— TypeStatus{R}
This struct contains the current information for the Sketchy CGAL solver. Per-iteration callbacks will receive this structure to gather current information.
See also sketchy_cgal
.
SpecBM
While the solver was implemented for the purpose of being used within PolynomialOptimization, it also works as a standalone routine (and could in principle be a separate package). SpecBM is a spectral bundle algorithm for primal semidefinite programs and is based on the assumption that the optimal dual solution has low rank; indeed, in polynomial optimization, if there is an optimal point for the problem that can be encoded in the chosen basis, then this automatically gives rise to a rank-one semidefinite moment matrix for this point. The implementation also allows for free variables and multiple semidefinite constraints and contains further improvements compared to the reference implementation. It requires either Hypatia or a rather recent version of Mosek (at least 10.1.13) as a subsolver.
PolynomialOptimization.Solvers.SpecBM.specbm_primal
— Functionspecbm_primal(A, b, c; [num_frees,] psds, ρ, r_past, r_current, ϵ=1e-4, β=0.1,
maxiter=10000, maxnodescent=15, adaptiveρ=false, α=1., adaptiveα=true, αmin=1e-5,
αmax=1000., ml=0.001, mu=min(1.5β, 1), Nmin=10, verbose=false, step=20, offset=0,
At=transpose(A), AAt=A*At, [subsolver,]
callback=(data, mastersolver_data)->nothing)
Solves the minimization problem
\[ \min_x \{ ⟨c, x⟩ : A x = b, x = (x_{\mathrm{free}}, \operatorname{svec}(X_1), \dotsc), X_i ⪰ 0, \sum_i \operatorname{tr}(X_i) ≤ ρ \} + \mathit{offset}\]
where the vector $x$ contains num_frees
free variables, followed by the vectorized and scaled lower triangles of PSD matrices $X_i$ that have side dimensions given in psds
. Scaled here means that the off-diagonal elements must be multiplied by $\sqrt2$ when going from the matrix to its vectorization, so that scalar products are preserved. This corresponds to the :LS
format of an SPMatrix
from the StandardPacked
package.
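For example, the scaled column-wise lower-triangle vectorization of a 2×2 symmetric matrix can be written in plain Julia (without the StandardPacked package) as:
X = [2.0 1.0; 1.0 3.0]
svec_X = [X[1, 1], sqrt(2) * X[2, 1], X[2, 2]]  # scaled lower triangle; preserves scalar products with other symmetric matrices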
Arguments
Problem formulation
- A::AbstractMatrix{R}: a sparse or dense matrix
- At::AbstractMatrix{R}: the transpose of A. If omitted, transpose(A) is used instead. However, if the transpose is already known in explicit form (in particular, as another SparseMatrixCSC), some operations can be carried out faster.
- AAt::AbstractMatrix{R}: the product A*At, which is also calculated automatically, but can be given if it is already known.
- b::AbstractVector{R}: a dense or sparse vector
- c::AbstractVector{R}: a dense or sparse vector
- offset::Real: an offset that is added to the objective
- num_frees: the number of free variables in the problem. The first num_frees entries in $x$ will be free. If this value is omitted, it is automatically calculated based on the dimensions of A and psds.
- psds::AbstractVector{<:Integer}: a vector that, for each semidefinite matrix in the problem, specifies its side dimension. A side dimension of $n$ will affect $\frac{n(n +1)}{2}$ variables.
- ρ::Real: an upper bound on the total trace in the problem. Note that by setting adaptiveρ=true, this bound will effectively be removed by dynamically growing as necessary. In this case, the value specified here is the initial value.
- adaptiveρ::Bool: effectively sets $\rho \to \infty$; note that an initial ρ still has to be provided.
Spectral bundle parameters
- r_past::Integer: the number of past eigenvectors to keep, must be nonnegative
- r_current::Integer: the number of current eigenvectors to keep, must be positive
- β::Real: a step is recognized as a descent step if the decrease in the objective value is at least a factor $\beta \in (0, 1)$ smaller than the decrease predicted by the model.
- α::Real: the regularization parameter for the augmented Lagrangian; must be positive
- adaptiveα::Bool=true: enables adaptive updating of α depending on the following five parameters, as described in Liao et al.
- αmin::Real: lower bound that α may not fall below in the adaptive algorithm
- αmax::Real: upper bound that α may not exceed in the adaptive algorithm
- ml::Real: α is doubled if the decrease in the objective value is at least a factor $m_{\mathrm l} \in (0, \beta)$ larger than predicted by the model, provided no descent step was recognized for at least Nmin iterations.
- mu::Real: α is halved if the decrease in the objective value is at least a factor $m_{\mathrm u} > \beta$ smaller than predicted by the model.
- Nmin::Integer: minimum number of no-descent steps before ml becomes relevant
- subsolver::Symbol: subsolver to solve the quadratic semidefinite subproblem in every iteration of SpecBM. Currently, :Hypatia and :Mosek are supported; however, note that Mosek will require at least version 10.1.11 (better 10.1.13 to avoid some rare crashes).
Termination criteria
- ϵ::Real: minimum quality of the result in order for the algorithm to terminate successfully (status :Optimal)
- maxiter::Integer: maximum number of iterations before the algorithm terminates anyway (status :IterationLimit). Must be at least 2.
- maxnodescent::Integer: maximum number of consecutive iterations that may report no descent step before the algorithm terminates (status :SlowProgress). Must be positive, or zero to disable this check.
Logging
- verbose::Bool: print the status every step iterations. Note that the first (incomplete) iteration will never be printed.
- step::Integer: skip a number of iterations and only print every stepth.
Advanced solver interaction
- callback::Function: a callback that is called with the last problem data (type Data) and the last mastersolver data (type MastersolverData) before the mastersolver is called anew. Changes to the structures may be made.
See also Result
.
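A minimal sketch that solves min ⟨diag(1, 2), X₁⟩ subject to tr(X₁) = 1 over a single 2×2 PSD block; Hypatia is assumed to be installed and loaded as the subsolver, and the indicated result is only what one would expect analytically:
using Hypatia                      # subsolver; Mosek would work as well
import PolynomialOptimization.Solvers.SpecBM
A = [1.0 0.0 1.0]                  # svec(I₂)ᵀ, i.e. tr(X₁) = 1
b = [1.0]
c = [1.0, 0.0, 2.0]                # svec(diag(1, 2)); off-diagonal entries would carry a factor √2
result = SpecBM.specbm_primal(A, b, c; psds=[2], ρ=1.0, r_past=0, r_current=1, subsolver=:Hypatia)
result.status, result.objective    # expected: (:Optimal, ≈ 1) with X₁ ≈ diag(1, 0)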
PolynomialOptimization.Solvers.SpecBM.Result
— TypeResult
Contains the result of a SpecBM run
Fields
- status::Symbol: one of :Optimal, :IterationLimit, :SlowProgress
- objective::R: the objective value
- x::Vector{R}: the optimal vector of primal variables: first, num_frees free variables, then all scaled vectorized lower triangles of the PSD variables
- y::Vector{R}: the optimal vector of dual variables, one for each constraint
- iterations::Int: the number of iterations until the given status was reached
- quality::R: the optimality quantifier that is compared against ϵ to determine convergence, which is determined by the maximum of the relative quantities below and the negative primal infeasibility.
- primal_infeas::R
- dual_infeas::R
- gap::R
- rel_accuracy::R
- rel_primal_infeas::R
- rel_dual_infeas::R
- rel_gap
LANCELOT
PolynomialOptimization provides an interface to LANCELOT. While LANCELOT is part of the GALAHAD library, which has a fairly recent Julia interface, the LANCELOT part is still pure Fortran without even a C interface. Therefore, we exploit here the fact that the pre-packaged binaries are compiled with GFortran (version at least 9), so that we know the binary layout of the parameters and can pretend that we like Fortran. Currently, only LANCELOT_simple is supported, which is of course not quite ideal[6]. Since GALAHAD.jl is a weak dependency, that package has to be loaded first before the Solvers.LANCELOT module becomes available.
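For example (a minimal loading sketch; the alias name is purely for convenience):
using GALAHAD                 # weak dependency; must be loaded before the module appears
using PolynomialOptimization
const LANCELOT = PolynomialOptimization.Solvers.LANCELOT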
PolynomialOptimization.Solvers.LANCELOT.LANCELOT_simple
— FunctionLANCELOT_simple(n, X, MY_FUN; MY_GRAD=missing, MY_HESS=missing, BL=nothing, BU=nothing,
neq=0, nin=0, CX=nothing, Y=nothing, maxit=1000, gradtol=1e-5, feastol=1e-5,
print_level=1)
Purpose
A simple and somewhat NAIVE interface to LANCELOT B for solving the nonlinear optimization problem
\[\min_x f(x)\]
possibly subject to constraints of one or more of the forms
\[\begin{aligned} b_{\mathrm l} & \leq x \leq b_{\mathrm u}, \\ c_{\mathrm e}( x ) & = 0, \\ c_{\mathrm i}( x ) & \leq 0 \end{aligned}\]
where $f\colon \mathbb R^n \to \mathbb R$, $c_{\mathrm e}: \mathbb R^n \to \mathbb R^{n_{\mathrm{eq}}}$ and $c_{\mathrm i}\colon \mathbb R^n \to \mathbb R^{n_{\mathrm{in}}}$ are twice-continuously differentiable functions.
Why naive?
At variance with more elaborate interfaces for LANCELOT, the present one completely ignores underlying partial separability or sparsity structure, restricts the possible forms under which the problem may be presented to the solver, and drastically limits the range of available algorithmic options. If simpler to use than its more elaborate counterparts, it therefore provides a possibly substantially inferior numerical performance, especially for difficult/large problems, where structure exploitation and/or careful selection of algorithmic variants matter.
How to use it?
Unconstrained problems
The user should provide, at the very minimum, suitable values for the following input arguments:
- n::Integer: the number of variables,
- X::AbstractVector{Float64} (strided vector of size n): the starting point for the minimization
- MY_FUN: a function for computing the objective function value for any X, whose interface has the default form MY_FUN(X::AbstractVector{Float64})::Float64, where X[1:n] contains the values of the variables on input, and which returns a double precision scalar representing the value $f(X)$.
- If the gradient of $f$ can be computed, then the (optional) keyword argument MY_GRAD must be specified and given a function computing the gradient, whose interface must be of the form MY_GRAD(G::AbstractVector{Float64}, X::AbstractVector{Float64}), where G is a double precision vector of size n in which the function returns the value of the gradient of $f$ at X.
- If, additionally, the second-derivative matrix of $f$ at X can be computed, the (optional) keyword argument MY_HESS must be specified and given a function computing the Hessian, whose interface must be of the form MY_HESS(H::SPMatrix{Float64}, X::AbstractVector{Float64}), where H is a double precision symmetric matrix in packed storage format (upper triangular by column, see the StandardPacked.jl package) containing the Hessian of $f$ at X.
In all cases, the best value of $x$ found by LANCELOT B is returned to the user in the vector X
and the associated objective function value is the first return value. The second return value reports the number of iterations performed by LANCELOT before exiting. Finally, the last return value contains the exit status of the LANCELOT run, the value 0
indicating a successful run. Other values indicate errors in the input or unsuccessful runs, and are detailed in the specsheet of LANCELOT B (with the exception of the value 19, which reports a negative value for one or both input arguments nin
and neq
).
Example
Let us consider the optimization problem
\[\min_{x_1, x_2} f(x_1, x_2) = 100 ( x_2 - x_1^2 )^2 + ( 1 - x_1 )^2\]
which is the ever-famous Rosenbrock "banana" problem. The most basic way to solve the problem (but NOT the most efficient) is, assuming the starting point X = [-1.2, 1.] is known, to perform the call PolynomialOptimization.LANCELOT_simple(2, X, FUN)
where the user-provided function FUN
is given by
FUN(X) = @inbounds 100 * (X[2] - X[1]^2)^2 + (1 - X[1])^2
The solution is returned in 60 iterations with exit code 0
.
If we now wish to use first and second derivatives of the objective function, one should use the call
PolynomialOptimization.LANCELOT_simple(2, X, FUN, MY_GRAD=GRAD!, MY_HESS=HESS!)
and provide the additional routines
GRAD!(G, X) = @inbounds begin
G[1] = -400 * (X[2] - X[1]^2) * X[1] - 2 * (1 - X[1])
G[2] = 200 * (X[2] - X[1]^2)
end
HESS!(H, X) = @inbounds begin
H[1, 1] = -400 * (X[2] - 3 * X[1]^2) + 2
H[1, 2] = -400 * X[1]
H[2, 2] = 200
end
Convergence is then obtained in 23 iterations. Note that using exact first derivatives only is also possible: MY_HESS should then be absent from the calling sequence, and providing the subroutine HESS! is then unnecessary.
Bound constrained problems
Bounds on the problem variables may be imposed by specifying one or both of
- BL::AbstractVector{Float64} (double precision vector of size n): the lower bounds on X,
- BU::AbstractVector{Float64} (double precision vector of size n): the upper bounds on X.
Note that infinite bounds (represented by a number larger than 1e20
in absolute value) are acceptable, as well as equal lower and upper bounds, which amounts to fixing the corresponding variables. Except for the specification of BL
and/or BU
, the interface is identical to that for unconstrained problems.
Example
If one now wishes to impose zero upper bounds on the variables of our unconstrained problem, one could use the following call
PolynomialOptimization.LANCELOT_simple(2, X, FUN, MY_GRAD=GRAD!, MY_HESS=HESS!,
BU=zeros(2))
in which case convergence is obtained in 6 iterations.
Equality constrained problems
If, additionally, general equality constraints are also present in the problem, this must be declared by specifying the following (optional) input argument:
- neq::Integer: the number of equality constraints.
In this case, the equality constraints are numbered from 1 to neq
and the value of the i
-th equality constraint must be computed by a user-supplied routine of the form FUN(X, i)
(with i = 1, ..., neq
) where the function now returns the value of the i
-th equality constraint evaluated at X
if i
is specified. (This extension of the unconstrained case can be implemented by adding an optional argument i
to the unconstrained version of FUN
or by defining a three-parameter method on its own.) If derivatives are available, then the MY_GRAD
and MY_HESS
subroutines must be adapted as well: MY_GRAD(G, X, i)
and MY_HESS(H, X, i)
for computing the gradient and Hessian of the i
-th constraint at X
. Note that, if the gradient of the objective function is available, so must be the gradients of the equality constraints. The same level of derivative availability is assumed for all problem functions (objective and constraints). The final values of the constraints and the values of their associated Lagrange multipliers are optionally returned to the user in the (optional) double precision keyword arguments CX
and Y
, respectively (both being of size neq
).
Inequality constrained problems
If inequality constraints are present in the problem, their inclusion is similar to that of equality constraints. One then needs to specify the (optional) input argument
- nin::Integer: the number of inequality constraints.
The inequality constraints are then numbered from neq+1
to neq+nin
and their values or those of their derivatives are again computed by calling, for i = 1, ..., nin
, FUN(X, i)
, MY_GRAD(G, X, i)
, MY_HESS(H, X, i)
. The inequality constraints are internally converted into equality ones by the addition of slack variables, whose names are set to 'Slack_i', where the character i in this string takes the integer values 1 to nin
. The values of the inequality constraints at the final X
are finally returned (as for equalities) in the optional double precision keyword argument CX
of size nin
. The values of the Lagrange multipliers are returned in the optional double precision output argument Y
of size nin
.
Problems with equality and inequality constraints
If there are both equalities and inequalities, neq
and nin
must be specified and the values and derivatives of the constraints are computed by FUN(X, i)
, GRAD(G, X, i)
, HESS(H, X, i)
(i = 1, ..., neq
) for the equality constraints, and FUN(X, i)
, GRAD(G, X, i)
, HESS(H, X, i)
(i = neq+1, ..., neq+nin
) for the inequality constraints. Again, the same level of derivative availability is assumed for all problem functions (objective and constraints). Finally, the optional arguments CX
and/or Y
, if used, are then of size neq+nin
.
Example
If we now wish to add to the unconstrained version the new constraints
\[\begin{aligned} 0 & \leq x_1 \\ x_1 + 3x_2 - 3 & = 0 \\ x_1^2 + x_2^2 - 4 & \leq 0, \end{aligned}\]
we may transform our call to
CX = Vector{Float64}(undef, 2)
Y = Vector{Float64}(undef, 2)
LANCELOT_simple(2, X, FUN; MY_GRAD=GRAD!, MY_HESS=HESS!, BL=[0., -1e20], neq=1, nin=1,
CX, Y)
(assuming we need CX
and Y
), and add methods for FUN
, GRAD!
and HESS!
as follows
FUN(X, i) = @inbounds begin
if i == 1 # the equality constraint
return X[1] + 3X[2] - 3
elseif i == 2 # the inequality constraint
return X[1]^2 + X[2]^2 - 4
end
return NaN # should never happen
end
GRAD!(G, X, i) = @inbounds begin
if i == 1 # equality constraint's gradient components
G[1] = 1
G[2] = 3
elseif i == 2 # inequality constraint's gradient components
G[1] = 2X[1]
G[2] = 2X[2]
end
return
end
HESS!(H, X, i) = @inbounds begin
if i == 1 # equality constraint's Hessian
fill!(H, 0.) # the constraint is linear, so its Hessian vanishes
elseif i == 2 # inequality constraint's Hessian
H[1] = 2
H[2] = 0
H[3] = 2
end
return
end
Convergence is then obtained in 8 iterations. Note that, in our example, the objective function or its derivatives are computed whenever the index i is omitted (see above). Of course, the above examples can easily be modified to represent new minimization problems :-).
Available algorithmic options
Beyond the choice of derivative level for the problem functions, the following arguments allow a (very limited) control of the algorithmic choices used in LANCELOT.
- maxit::Integer: maximum number of iterations (default: 1000)
- gradtol::Real: the threshold on the infinity norm of the gradient (or of the Lagrangian's gradient) for declaring convergence (default: 1.0e-5)
- feastol::Real: the threshold on the infinity norm of the constraint violation for declaring convergence (for constrained problems) (default: 1.0e-5)
- print_level::Integer: a nonnegative number proportional to the amount of output by the package: 0 corresponds to the silent mode, 1 to a single line of information per iteration (default), while higher values progressively produce more output.
Other sources
The user is encouraged to consult the specsheet of the (non-naive) interface to LANCELOT within the GALAHAD software library for a better view of all possibilities offered by an intelligent use of the package. The library is described in the paper
N. I. M. Gould, D. Orban, Ph. L. Toint,
GALAHAD, a library of thread-safe Fortran 90 packages for large-scale
nonlinear optimization,
ACM Transactions on Mathematical Software, vol. 29(4),
pp. 353-372, 2003
The book
A. R. Conn, N. I. M. Gould, Ph. L. Toint,
LANCELOT, A Fortran Package for Large-Scale Nonlinear Optimization
(Release A),
Springer Verlag, Heidelberg, 1992
is also a good source of additional information.
Main author: Ph. Toint, November 2007. Copyright reserved, Gould/Orban/Toint, for GALAHAD productions
- [1] Note that by default, a sparse solver is used (unless the problem was constructed with a factor_coercive different from one). This is typically a good idea for large systems with not too many monomials. However, if you have a very dense system, the sparse solver will take forever; better pass dense=true to the optimization routine. This will then be much faster (and always much more accurate).
- [2] LANCELOT is a nonlinear solver that directly works on the problem itself. It does not use a relaxation. Therefore, it cannot provide lower-bound guarantees on the objective value; however, there is no problem with the extraction of a solution, as the solver directly works on the decision variables. When invoking the LANCELOT solver, a function is returned which performs the optimization and which requires a vector of initial values as parameter. This function will then return a 2-tuple with the (locally) optimal objective value and the point of the local optimum. Currently, the LANCELOT interface does not support complex-valued problems.
- [3] There is a separate Julia package for Loraine. However, the implementation is so bad that it is not only unnecessarily inefficient, but would also not allow solving large-scale systems. Therefore, PolynomialOptimization provides a rewritten implementation, which is heavily based on the original source code but tries to use the available memory more efficiently (though there is still room for improvement). Since only a subset of the features of the original package has been implemented, this is not yet ready to be contributed back to the solver; but it can be expected that in the future, an external package will be required to use Loraine.
- [4] :MosekMoment requires at least version 10; :MosekSOS already works with version 9. The moment variant is more prone to failure in case of close-to-ill-posed problems; sometimes, this is an issue of the presolver, which can be turned off by passing MSK_IPAR_PRESOLVE_USE="MSK_PRESOLVE_MODE_OFF" to poly_optimize. The performance indicators in the table are valid for :MosekSOS. The new PSD cone interface of Mosek 10 that is used by the moment-based variant proves to be much slower than the old one; therefore, using :MosekMoment is not recommended.
- [5] SpecBM is provided by PolynomialOptimization; however, it requires a subsolver for the quadratic master problem. Currently, Mosek and Hypatia are implemented; one of them must therefore be loaded to make SpecBM work.
- [6] The branch lancelot goes further and defines an interface for the full version of LANCELOT, which is a lot more sophisticated. Unfortunately, it also seems to be broken at the moment, and bugfixing will require some debugging of the disassembly. This is not a priority at the moment (which is whenever you read this statement).