Supported solvers

The following table lists all supported solvers together with the packages that provide access to them. A solver named X always provides at least one of the two methods :XSOS or :XMoment. The former models the sum-of-squares formulation of the problem: each monomial corresponds to one constraint, and the solution is obtained from the dual vector. The latter models the Lasserre moment hierarchy of the problem: each monomial corresponds to a primal variable. Which of the two methods (or both) is offered depends on the particular interface supported by the solver and on which form fits this interface more naturally. Every solver additionally provides an alias method :X that defaults to the recommended method for this solver; a schematic illustration of this naming scheme follows after the table. The given performance indicators are based purely on experience and may not be accurate for your particular problem. In particular, the maximally recommended basis size depends heavily on the structure of the final problem, which can easily shift the number up or down by 100 or more. All solvers may expose options that influence the runtime behavior.

Solver | Package | License | Methods | Speed / Accuracy / Memory | max. recomm. basis size
------ | ------- | ------- | ------- | ------------------------- | -----------------------
Clarabel | Clarabel.jl | Apache | moment | 👍👍👍👍👍👍👍👍 | ~200
COPT | COPT.jl | commercial | moment | 👍👍👍👍👍👍👍👍👍 | ~700
Hypatia[1] | Hypatia.jl | MIT | moment | 👍👍👍👍👍 | ~100
LANCELOT[2] | GALAHAD.jl | BSD | nonlinear | n.a. / n.a. / 👍👍👍 | n.a.
Loraine[3] | | MIT | primal moment | 👍👍👍👍👍👍👍👍 | moderately large
LoRADS | LoRADS | MIT | primal moment | 👍👍👍👍👍👍👍 | very large
Mosek[4] | Mosek.jl | commercial | SOS, moment | 👍👍👍👍👍👍👍👍 | ~300 - 500
ProxSDP | ProxSDP.jl | MIT | primal moment | 👍👍👍👍👍👍👍👍 | very large
SCS | SCS.jl | MIT | moment | 👍👍👍👍👍 |
SketchyCGAL | | MIT | primal moment | 👎👍👍👍👍 |
SpecBM[5] | | MIT | SOS | n.a. / n.a. / 👍👍👍 |
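To illustrate the naming scheme only, the following schematic calls assume an already constructed problem prob; the actual signature of poly_optimize (further positional and keyword arguments) is documented with that function and is not reproduced here.

poly_optimize(:Clarabel, prob)      # use the recommended method of Clarabel (here: the moment form)
poly_optimize(:MosekSOS, prob)      # explicitly request the SOS formulation with Mosek
poly_optimize(:MosekMoment, prob)   # explicitly request the moment formulation with Mosek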

Packaged solvers

During the development of this package, several interesting solvers were proposed in research. The ones that were implemented are documented on this page. They can be accessed from the PolynomialOptimization.Solvers submodule.

Loraine

The Loraine solver is suitable for large-scale low-rank semidefinite programming. Note that the implementation provided by PolynomialOptimization supports only a subset of Loraine's features (but much more efficiently): the direct solver is not available, and data matrices are always assumed to be sparse and never rank-one.

PolynomialOptimization.Solvers.Loraine.ModelType
Model(; A, C, b, [c_lin, A_lin,] coneDims, check=true)

Constructs a model for PolynomialOptimization's rewrite of the Loraine solver. This solves the problem

\[ \min \vec c_{\mathrm{lin}} \cdot \vec x + \sum_j \langle C_j, X_j\rangle \\ \text{such that} \\ x_i \geq 0 \ \forall i \\ X_j \succeq 0 \ \forall j \\ A_{\mathrm{lin}, k} \cdot \vec x + \sum_j \langle \operatorname{mat}(A_{k, j}), X_j\rangle = b_k \ \forall k\]

with the following representation in the variables:

  • 1 ≤ j ≤ length(coneDims) = length(A) = length(C)
  • A is a vector of SparseMatrixCSCs, where every matrix corresponds to one semidefinite optimization variable. Each row in the matrix corresponds to one constraint; when the row is reshaped into a matrix in column-major order (the result must then be symmetric), it defines the coefficient matrix $\operatorname{mat}(A_{k, j})$. The side dimension of this matrix is stored in coneDims.
  • If present, c_lin is a SparseVector and A_lin a SparseMatrixCSC that define the objective and constraint coefficients for the nonnegative variables. They may only be omitted together.

See also Solver, solve!.

Info

This function checks the validity of the variables; however, the symmetry checks are quite expensive. They can be disabled by setting check to false.

source
PolynomialOptimization.Solvers.Loraine.PreconditionerType
Preconditioner
  • per CG iteration, PRECONDITIONER_NONE is faster (lower complexity) than PRECONDITIONER_HBETA which is faster than PRECONDITIONER_HALPHA
  • as a preconditioner, PRECONDITIONER_HALPHA is better than PRECONDITIONER_HBETA is better than PRECONDITIONER_NONE, in the sense of CG iterations needed to solve the linear system
  • some SDP problems are "easy", meaning that CG always converges without preconditioner (i.e., preconditioner = PRECONDITIONER_NONE), so it's always worth trying this option
  • PRECONDITIONER_HYBRID starts with (cheaper) H_beta and once it gets into difficulties, switches to H_alpha
source
PolynomialOptimization.Solvers.Loraine.SolverType
Solver(model; tol_cg=0.01, tol_cg_up=0.5, tol_cg_min=1e-7, eDIMACS=1e-7,
    preconditioner=PRECONDITIONER_HALPHA, erank=1, aamat=AMAT_DIAGAᵀA, fig_ev=0,
    verb=VERBOSITY_SHORT, initpoint=INITPOINT_LORAINE, maxit=100)

Defines a solver for a previously defined model. Only the iterative conjugate gradient method is implemented.

See also Model, solve!.

source
PolynomialOptimization.Solvers.Loraine.solve!Function
solve!(solver)

Solves a freshly initialized solver. Note that calling this function twice will produce the same results (unless parameters are changed), as the initial state is always restored at the beginning of the call.

source
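As a minimal sketch of how Model, Solver, and solve! interact, consider a single 2×2 semidefinite variable X with objective ⟨I, X⟩ and the single constraint ⟨[0 1; 1 0], X⟩ = 1 (optimal value 1). The format assumed below for C (a vector of sparse symmetric matrices) and the way results are read back afterwards are assumptions, not part of the docstrings above.

using SparseArrays, LinearAlgebra, PolynomialOptimization
const Loraine = PolynomialOptimization.Solvers.Loraine

# one PSD block of side dimension 2 and one constraint:
# the single row of A[1] is the column-major vectorization of [0 1; 1 0]
A = [sparse([1, 1], [2, 3], [1.0, 1.0], 1, 4)]
C = [sparse(1.0I, 2, 2)]   # objective coefficient matrix for ⟨I, X⟩ (format assumed)
b = [1.0]
coneDims = [2]

model = Loraine.Model(; A, C, b, coneDims)
solver = Loraine.Solver(model; preconditioner=Loraine.PRECONDITIONER_NONE)  # an "easy" problem
Loraine.solve!(solver)
# the optimal objective and matrices are now stored in the solver object (field names not shown here)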

LoRADS

The experimental LoRADS solver is suitable for large-scale low-rank semidefinite programming. Note that it is currently in a rather early stage of development and lacks some configurability (in particular with regard to logging). The version that ships with PolynomialOptimization is a slight modification of the original.

PolynomialOptimization.Solvers.LoRADS.init_solverFunction
init_solver(solver, nConstrRows, coneDims::AbstractVector{LoRADSInt}, nLpCols)

Initializes a fresh Solver object with nConstrRows constraints, positive semidefinite variables of side dimension coneDims (a vector of integers), and nLpCols scalar nonnegative variables.

source
PolynomialOptimization.Solvers.LoRADS.conedata_to_userdataFunction
conedata_to_userdata(cone::ConeType, nConstrRows, dim,
    coneMatBeg::AbstractVector{LoRADSInt}, coneMatIdx::AbstractVector{LoRADSInt},
    coneMatElem::AbstractVector{Cdouble})

Allocates a new user data object and sets its conic data. This consists of a cone type (only CONETYPE_DENSE_SDP and CONETYPE_SPARSE_SDP are supported), the number of rows (which is the same as the number of constraints in the solver) and the side dimension of the semidefinite variable, followed by the constraint matrices in zero-indexed CSR format. Every row corresponds to the column-wise vectorized lower triangle of a constraint matrix. The zeroth row is the coefficient matrix for the objective. Therefore, nConstrRows +1 = length(coneMatBeg) -1 should hold (+1 for the objective; -1 for CSR).

The returned userdata pointer should be assigned to a solver, which will take care of freeing the allocated data. Note that the vectors passed to this function must be preserved until the preprocess function has been called; after that, they can be freed.

See also set_cone, init_cone_data.

source
PolynomialOptimization.Solvers.LoRADS.set_lp_coneFunction
set_lp_cone(solver, nConstrRows, nLpCols, lpMatBeg::AbstractVector{LoRADSInt},
    lpMatIdx::AbstractVector{LoRADSInt}, lpMatElem::AbstractVector{Cdouble})

Set the data of the constraint matrix for the linear variables according to the CSR data specified in the parameters.

Warning

This function is not exported by the original code release and can therefore not be used with it; it is only available in the patched version that ships with PolynomialOptimization. The patched version should be used in any case, as it also fixes heap corruption errors that can arise during the optimization.

source
PolynomialOptimization.Solvers.LoRADS.init_cone_dataFunction
init_cone_data(solver, coneMat, coneDims, lpMat)

Initializes the solver for a problem in the form

\[ \min \vec a_0 \cdot \vec x + \sum_j \langle\operatorname{mat}(G_{j, 0}), Z_j\rangle \\ \text{such that} \\ x_i \geq 0 \ \forall i \\ Z_j \succeq 0 \ \forall j \\ \vec a_k \cdot \vec x - \sum_j \langle\operatorname{mat}(G_{j, k}), Z_j\rangle = c_k \ \forall k\]

with the following representation in the variables:

  • 1 ≤ j ≤ length(coneDims) = length(coneMat)
  • coneMat is a vector of matrices, lpMat is a matrix. They should be in CSR storage, where the row index (starting at 0 for the objective, then k for the kth constraint) is the constraint. Since CSR is not natively supported by Julia, the transpose of a SparseMatrixCSC{Cdouble,LoRADSInt} is expected.
  • $\operatorname{mat}$ turns the unscaled lower triangle into a full symmetric matrix

This is a convenience function that does the job of conedata_to_userdata, set_cone, and preprocess in one step. However, note that it is more efficient to call these functions individually.

source
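To illustrate the expected CSR layout, the following sketch sets up the data for a single 2×2 semidefinite variable Z with objective ⟨I, Z⟩, one nonnegative scalar variable x with objective coefficient 1, and one constraint whose semidefinite coefficient is $\operatorname{mat}([0, 1, 0])$, i.e. [0 1; 1 0]. The concrete integer type behind LoRADSInt and the prior construction of the Solver object are not covered here and are assumptions.

using SparseArrays, PolynomialOptimization
const LoRADS = PolynomialOptimization.Solvers.LoRADS
const LInt = LoRADS.LoRADSInt

# one PSD block Z of side dimension 2: its lower triangle has the 3 entries (1,1), (2,1), (2,2).
# Logical CSR row 1 (index 0) is the objective, row 2 the single constraint.
coneDims = LInt[2]
# CSR is expressed as the transpose of a CSC matrix, so constraints become CSC columns:
# column 1 (objective ⟨I, Z⟩): lower triangle [1, 0, 1]; column 2 (constraint): lower triangle [0, 1, 0]
G = sparse(LInt[1, 3, 2], LInt[1, 1, 2], Cdouble[1, 1, 1], 3, 2)
coneMat = [transpose(G)]
# linear part: objective coefficient 1 for x, no coefficient in the constraint
L = sparse(LInt[1], LInt[1], Cdouble[1], 1, 2)
lpMat = transpose(L)

# with `solver` a Solver that has already been set up (e.g. via init_solver; not shown here):
LoRADS.init_cone_data(solver, coneMat, coneDims, lpMat)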
PolynomialOptimization.Solvers.LoRADS.load_sdpaFunction
load_sdpa(fn)

Loads a problem from a file fn in SDPA format and returns a preprocessed Solver instance, a vector containing the cone dimensions, and the number of nonnegative scalar variables.

Warning

This function will produce memory leaks. The data that is allocated by the LoRADS library cannot be freed by Julia, as no corresponding functions are exported. Only use it for quick tests, not in production.

source
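For a quick test, the three return values can be captured as follows (the file name is only a placeholder):

solver, coneDims, nLpCols = LoRADS.load_sdpa("/path/to/problem.dat-s")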
PolynomialOptimization.Solvers.LoRADS.get_XFunction
get_X(solver, i)

Returns the ith PSD solution matrix $X_i$. The result will be a freshly allocated symmetric view of a dense matrix.

Warning

This method may only be called once per i. All further calls with the same i will give wrong output, as the internal solver data is modified.

source
PolynomialOptimization.Solvers.LoRADS.get_XlinFunction
get_Xlin(solver)

Returns the linear solution vector $x$. The result will be a vector backed by internal solver data and will be invalidated if the solver is destroyed. Copy it if desired.

Warning

This method may only be called once. All further calls will give wrong output, as the internal solver data is modified.

source
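Given the warnings above, a typical pattern is to retrieve everything that is needed from the solver exactly once (a sketch; the index 1 refers to the first PSD block):

X1 = LoRADS.get_X(solver, 1)       # may be called only once for i = 1; already freshly allocated
x = copy(LoRADS.get_Xlin(solver))  # copy, since the returned vector is backed by solver memory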

SketchyCGAL

While the solver was implemented for the purpose of being used within PolynomialOptimization, it also works as a standalone routine (and could in principle be a separate package). SketchyCGAL is a solver that scales very well for large problem sizes by maintaining a sketch of the semidefinite matrices, based on the assumption that the optimal solution has low rank; indeed, in polynomial optimization, if there is an optimal point for the problem that can be encoded in the chosen basis, then this automatically gives rise to a rank-one semidefinite encoding of this point.

PolynomialOptimization.Solvers.SketchyCGAL.sketchy_cgalFunction
sketchy_cgal(A, b, C; α=(0, 1), rank, ϵ=1e-4, max_iter=0, time_limit=0, verbose=false,
    β₀=1, K=∞, method=:auto, callback=(_) -> ()[, A_norm][, A_normsquare])

Enhanced implementation of the SketchyCGAL algorithm. This solves the following problem:

\[ \min \bigl\{ \sum_i \operatorname{tr}(C_i X_i) : \sum_i \operatorname{tr}(A_{j, i} X_i) = b_j \ \forall j, \alpha_1 \leq \sum_i \operatorname{tr}(X_i) \leq \alpha_2, X_i \succeq 0 \ \forall i \bigr\}\]

The function returns the optimization status, objective value, and the optimal matrix $X$ (in the form of an Eigen factorization object).

Parameters

  • A is an AbstractMatrix whose entries are of type AbstractMatrix themselves. Alternatively, A can also be an AbstractVector of AbstractMatrices; in this case, $A_{j, i}$ is given by taking the matrix A[i] and reshaping its columns into a square matrix.
  • b is an AbstractVector of real numbers
  • C is an AbstractVector whose entries are of type AbstractMatrix themselves
  • α is a 2-tuple of nonnegative numbers that defines the bounds on the sum of all traces
  • rank controls the rank that is used to approximate the end result. It might either be an integer, which then puts the same rank constraint on all $X_i$, or a tuple/vector of integers, which allows specifying a different rank constraint for each $X_i$.
  • The solution accuracy can be controlled by the parameter ϵ; however, no more than max_iter iterations are carried out, and no more iterations will be performed if time_limit was exceeded (in seconds), regardless of ϵ. Set any of those three parameters to zero to disable the check.
  • The callback may be called after each iteration and will receive a Status as parameter. If the callback returns false, the iteration will be the last one.
  • The parameters β₀ and K allow tuning the optimization: β₀ is a smoothing parameter, and K limits the dual vector to a generalized sphere of radius K around the origin.
  • The method determines the way in which the smallest eigenvalue and its eigenvector are calculated during each iteration. Possible values are (this might also be a vector/tuple, to specify the method for each $X_i$)
    • :lanczos_space (uses the space-efficient implementation described in the SketchyCGAL paper, memory is linear in the problem size)
    • :lanczos_time (uses the same principle, but can save about half of the operations by using more memory: quadratic in the problem size)
    • :lobpcg_fast (uses the LOBPCG solver from the IterativeSolvers package, bounding the number of iterations with the same heuristic as for the Lanczos methods)
    • :lobpcg_accurate (bounds the error instead to ϵ/100)
    • :auto chooses :lobpcg_accurate for problem sizes smaller than 10, :lanczos_time for problem sizes less than 11500 (where roughly 1 GiB is required for the speedup), and :lanczos_space for all larger problems.
  • A_normsquare (or A_norm) is supposed to hold the sum of the squares of the Frobenius-to-ℓ₂-norms of all the linear operators contained in the columns of A. If both parameters are omitted, it is calculated automatically; however, this requires memory that scales quartically in the largest side dimension occurring in A (and may not be supported for all AbstractMatrix types). If both parameters are specified and their values are not consistent, the behavior is undefined.

Optimization status values

  • :optimal: the desired accuracy ϵ was reached in both the relative suboptimality gap as well as the relative infeasibility
  • :max_iter: the maximum number of iterations was reached
  • :timeout: the maximum computation time was hit
  • :canceled: the callback returned false
  • :unknown: an internal error has happened

See also Status.
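Before turning to the matrix-free variant below, here is a minimal sketch of the matrix-based call for a single 2×2 block: minimize $-X_{12} - X_{21}$ subject to $X_{11} = X_{22} = \frac12$ and a unit trace bound (optimal value $-1$, attained by a rank-one matrix). The way the returned factorization is unpacked is an assumption.

using PolynomialOptimization
const SketchyCGAL = PolynomialOptimization.Solvers.SketchyCGAL

C = [[0.0 -1.0; -1.0 0.0]]                                    # one objective block
A = reshape([[1.0 0.0; 0.0 0.0], [0.0 0.0; 0.0 1.0]], 2, 1)   # A[j, 1]: constraint coefficient matrices
b = [0.5, 0.5]
status, objective, X = SketchyCGAL.sketchy_cgal(A, b, C; α=(0.0, 1.0), rank=1)
# X is the (rank-limited) optimizer in factorized form; status should be :optimal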

sketchy_cgal(primitive1!, primitive2!, primitive3!, n, b, α; rank, primitive3_norm=0,
    primitive3_normsquare=0, ϵ=1e-8, max_iter=10_000, verbose=false, rescale_C=1,
    rescale_A=[1, ...], rescale_X=1, β₀=1, K=∞, method=:auto, callback=(_) -> ())

This is the black-box version that allows for matrix-free operations. n is a tuple or vector that indicates the side dimensions of the semidefinite variables, b is the right-hand side vector of the constraints (of length d); and the primitives effectively calculate

  • primitive1!(v, u, i, α, β) = (v = α * C[i] * u + β * v), $u, v \in \mathbb R^{n_i}$
  • primitive2!(v, u, z, i, α, β) = (v = α * adjoint(A[:, i])(z) * u + β * v), $u, v \in \mathbb R^{n_i}$, $z \in \mathbb R^d$
  • primitive3!(v, u, i, α, β) = (v = α * A[:, i](u * u') + β * v), $u \in \mathbb R^{n_i}$, $v \in \mathbb R^d$

A[:, i](X) is the linear map [⟨X, A[1, i]⟩, ..., ⟨X, A[d, i]⟩] and adjoint(A[:, i])(z) = sum(z[j] * A[j, i]). All of them must also return their outputs (which is v).

If you are able to calculate these oracles faster or more memory-efficiently than the straightforward implementation (which is based on mul!), use the blackbox method. It is recommended to obey the following normalization conditions:

\[ \sum_i \lVert C_i\rVert_{\mathrm F}^2 = 1; \quad \sum_i \lVert \text{primitive3!}_i\rVert_{\mathrm F \to \ell_2}^2 = 1; \quad \sum_i \lVert A_{1, i}\rVert^2 = \sum_i \lVert A_{2, i}\rVert^2 = \dotsb\]

However, in any case, you need to specify the norm of primitive3! (i.e., the supremum of norm(primitive3!) applied to matrices with unit Frobenius norm) in the parameter primitive3_norm (or primitive3_normsquare, which is the sum of the individual normsquares). If the norm is unavailable, you need to at least give a lower bound. You may not specify both parameters inconsistently, else the behavior is undefined. If you had to rescale those matrices in order to achieve this normalization condition, you may pass the corresponding rescaling factors in rescale_C (implying that all C have been scaled by this one factor) and rescale_A (implying that a whole row of A has been scaled by one element from this vector). Additionally, the upper bound in α should be one for a better performance, which is achievable through rescale_X (implying that all X have been scaled by this factor). These are posterior factors that indicate the multiplication that has been done before calling the function in order to enforce compliance. They will be taken into account when calculating the termination criteria (such that ϵ then corresponds to the original problem and not the rescaled one), filling the status structure or verbose output, and the final values of primal objective and optimal $X$.

source
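Continuing the sketch above, the three primitives can be written down explicitly for the same toy problem (one 2×2 block, constraints $X_{11} = X_{22} = \frac12$). This only illustrates the oracle definitions and ignores the recommended normalization of $C$ ($\lVert C_1\rVert_{\mathrm F} = \sqrt2$ here).

using LinearAlgebra

const C1 = [0.0 -1.0; -1.0 0.0]

# v ← α C₁ u + β v
primitive1!(v, u, i, α, β) = (mul!(v, C1, u, α, β); v)
# adjoint(A[:, 1])(z) = z[1]⋅E₁₁ + z[2]⋅E₂₂ = Diagonal(z), acting on u
primitive2!(v, u, z, i, α, β) = (v .= α .* z .* u .+ β .* v; v)
# A[:, 1](u u') = [u[1]^2, u[2]^2]
primitive3!(v, u, i, α, β) = (v .= α .* u .^ 2 .+ β .* v; v)

# the Frobenius-to-ℓ₂ norm of primitive3! is 1 for this choice of constraint matrices
status, objective, X = SketchyCGAL.sketchy_cgal(primitive1!, primitive2!, primitive3!,
    (2,), [0.5, 0.5], (0.0, 1.0); rank=1, primitive3_norm=1.0)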

SpecBM

While the solver was implemented for the purpose of being used within PolynomialOptimization, it also works as a standalone routine (and could in principle be a separate package). SpecBM is a spectral bundle algorithm for primal semidefinite programs and is based on the assumption that the optimal dual solution has low rank; indeed, in polynomial optimization, if there is an optimal point for the problem that can be encoded in the chosen basis, then this automatically gives rise to a rank-one semidefinite moment matrix encoding this point. The implementation also allows for free variables and multiple semidefinite constraints and contains further improvements compared to the reference implementation. It requires either Hypatia or a rather recent version of Mosek (at least 10.1.13) as a subsolver.

PolynomialOptimization.Solvers.SpecBM.specbm_primalFunction
specbm_primal(A, b, c; [num_frees,] psds, ρ, r_past, r_current, ϵ=1e-4, β=0.1,
    maxiter=10000, maxnodescent=15, adaptiveρ=false, α=1., adaptiveα=true, αmin=1e-5,
    αmax=1000., ml=0.001, mu=min(1.5β, 1), Nmin=10, verbose=false, step=20, offset=0,
    At=transpose(A), AAt=A*At, [subsolver,]
    callback=(data, mastersolver_data)->nothing)

Solves the minimization problem

\[ \min_x \{ ⟨c, x⟩ : A x = b, x = (x_{\mathrm{free}}, \operatorname{svec}(X_1), \dotsc), X_i ⪰ 0, \sum_i \operatorname{tr}(X_i) ≤ ρ \} + \mathit{offset}\]

where the vector $x$ contains num_frees free variables, followed by the vectorized and scaled lower triangles of PSD matrices $X_i$ that have side dimensions given in psds. Scaled here means that the off-diagonal elements must be multiplied by $\sqrt2$ when going from the matrix to its vectorization, so that scalar products are preserved. This corresponds to the :LS format of an SPMatrix from the StandardPacked package.

Arguments

Problem formulation

  • A::AbstractMatrix{R}: a sparse or dense matrix
  • At::AbstractMatrix{R}: the transpose of A. If omitted, transpose(A) is used instead. However, if the transpose is already known in explicit form (in particular, as another SparseMatrixCSC), some operations can be carried out faster.
  • AAt::AbstractMatrix{R}: the product A*At, which is also calculated automatically, but can be given if it is already known.
  • b::AbstractVector{R}: a dense or sparse vector
  • c::AbstractVector{R}: a dense or sparse vector
  • offset::Real: an offset that is added to the objective
  • num_frees: the number of free variables in the problem. The first num_frees entries in $x$ will be free. If this value is omitted, it is automatically calculated based on the dimensions of A and psds.
  • psds::AbstractVector{<:Integer}: a vector that, for each semidefinite matrix in the problem, specifies its side dimension. A side dimension of $n$ will affect $\frac{n(n +1)}{2}$ variables.
  • ρ::Real: an upper bound on the total trace in the problem. Note that by setting adaptiveρ=true, this bound will effectively be removed by dynamically growing as necessary. In this case, the value specified here is the initial value.
  • adaptiveρ::Bool: effectively sets $\rho \to \infty$; note that an initial ρ still has to be provided.

Spectral bundle parameters

  • r_past::Integer: the number of past eigenvectors to keep, must be nonnegative
  • r_current::Integer: the number of current eigenvectors to keep, must be positive
  • β::Real: a step is recognized as a descent step if the decrease in the objective value is at least $\beta$ times the decrease predicted by the model, with $\beta \in (0, 1)$.
  • α::Real: the regularization parameter for the augmented Lagrangian; must be positive
  • adaptiveα::Bool=true: enables adaptive updating of α depending on the following five parameters, as described in Liao et al.
  • αmin::Real: lower bound for the adaptive algorithm that α may not exceed
  • αmax::Real: upper bound for the adaptive algorithm that α may not exceed
  • ml::Real: α is doubled if the decrease in the objective value is at least a factor $m_{\mathrm l} \in (0, \beta)$ larger than predicted by the model, provided no descent step was recognized for at least Nmin iterations.
  • mu::Real: α is halved if the decrease in the objective value is at least a factor $m_{\mathrm u} > \beta$ smaller than predicted by the model.
  • Nmin::Integer: minimum number of no-descent-steps before ml becomes relevant
  • subsolver::Symbol: subsolver to solve the quadratic semidefinite subproblem in every iteration of SpecBM. Currently, :Hypatia and :Mosek are supported; however, note that Mosek requires at least version 10.1.11 (preferably 10.1.13, to avoid some rare crashes).

Termination criteria

  • ϵ::Real: minimum quality of the result in order for the algorithm to terminate successfully (status :Optimal)
  • maxiter::Integer: maximum number of iterations before the algorithm terminates anyway (status :IterationLimit). Must be at least 2.
  • maxnodescent::Integer: maximum number of consecutive iterations that may report no descent step before the algorithm terminates (status :SlowProgress). Must be positive, or zero to disable this check.

Logging

  • verbose::Bool: print the status every step iterations. Note that the first (incomplete) iteration will never be printed.
  • step::Integer: skip a number of iterations in the output and only print every stepth one.

Advanced solver interaction

  • callback::Function: a callback that is called with the last problem data (type Data) and the last mastersolver data (type MastersolverData) before the mastersolver is called anew. Changes to the structures may be made.

See also Result.

source
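As a minimal sketch, the same toy problem as above can be written in the scaled-vectorization format: $x = (X_{11}, \sqrt2 X_{21}, X_{22})$ for a single 2×2 block, with trace constraint $X_{11} + X_{22} = 1$ and objective $-2 X_{21}$ (optimal value $-1$). Hypatia is loaded as the subsolver; the exact numerical values of the result fields are of course only approximate.

using Hypatia, PolynomialOptimization
const SpecBM = PolynomialOptimization.Solvers.SpecBM

A = [1.0 0.0 1.0]          # tr(X) = 1 in scaled svec coordinates
b = [1.0]
c = [0.0, -sqrt(2), 0.0]   # ⟨c, x⟩ = -2 X₂₁
result = SpecBM.specbm_primal(A, b, c; psds=[2], ρ=1.0, r_past=0, r_current=1,
    subsolver=:Hypatia)
result.objective           # ≈ -1
result.x                   # ≈ [0.5, √2/2, 0.5]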
PolynomialOptimization.Solvers.SpecBM.ResultType
Result

Contains the result of a SpecBM run

Fields

  • status::Symbol: one of :Optimal, :IterationLimit, :SlowProgress
  • objective::R: the objective value
  • x::Vector{R}: the optimal vector of primal variables: first, num_frees free variables, then all scaled vectorized lower triangles of the PSD variables
  • y::Vector{R}: the optimal vector of dual variables, one for each constraint
  • iterations::Int: the number of iterations until the given status was reached
  • quality::R: the optimality quantifier that is compared against ϵ to determine convergence, which is determined by the maximum of the relative quantities below and the negative primal infeasibility.
  • primal_infeas::R
  • dual_infeas::R
  • gap::R
  • rel_accuracy::R
  • rel_primal_infeas::R
  • rel_dual_infeas::R
  • rel_gap::R
source

LANCELOT

PolynomialOptimization provides an interface to LANCELOT. While LANCELOT is part of the GALAHAD library, which has a quite recent Julia interface, the LANCELOT part is still pure Fortran without even a C interface. Therefore, we exploit here that the pre-packaged binaries are compiled with GFortran (version at least 9), so that we know the binary layout of the parameters and can pretend that we like Fortran. Currently, only LANCELOT_simple is supported, which is of course not quite ideal[6]. Since GALAHAD.jl is a weak dependency, the package has to be loaded first before the Solvers.LANCELOT module becomes available.
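For example (a sketch; GALAHAD is assumed to be the module name of the GALAHAD.jl package):

using GALAHAD                  # loading the weak dependency activates the LANCELOT interface
using PolynomialOptimization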

PolynomialOptimization.Solvers.LANCELOT.LANCELOT_simpleFunction
LANCELOT_simple(n, X, MY_FUN; MY_GRAD=missing, MY_HESS=missing, BL=nothing, BU=nothing,
    neq=0, nin=0, CX=nothing, Y=nothing, maxit=1000, gradtol=1e-5, feastol=1e-5,
    print_level=1)

Purpose

A simple and somewhat NAIVE interface to LANCELOT B for solving the nonlinear optimization problem

\[\min_x f(x)\]

possibly subject to constraints of the one or more of the forms

\[\begin{aligned} b_{\mathrm l} & \leq x \leq b_{\mathrm u}, \\ c_{\mathrm e}( x ) & = 0, \\ c_{\mathrm i}( x ) & \leq 0 \end{aligned}\]

where $f\colon \mathbb R^n \to \mathbb R$, $c_{\mathrm e}: \mathbb R^n \to \mathbb R^{n_{\mathrm{eq}}}$ and $c_{\mathrm i}\colon \mathbb R^n \to \mathbb R^{n_{\mathrm{in}}}$ are twice-continuously differentiable functions.

Why naive?

At variance with more elaborate interfaces for LANCELOT, the present one completely ignores underlying partial separability or sparsity structure, restricts the possible forms under which the problem may be presented to the solver, and drastically limits the range of available algorithmic options. While simpler to use than its more elaborate counterparts, it therefore provides a possibly substantially inferior numerical performance, especially for difficult/large problems, where structure exploitation and/or careful selection of algorithmic variants matter.

Warning

The best performance obtainable with LANCELOT B is probably not with the present interface.

How to use it?

Unconstrained problems

The user should provide, at the very minimum, suitable values for the following input arguments:

  • n::Integer: the number of variables,
  • X::AbstractVector{Float64} (strided vector of size n): the starting point for the minimization
  • MY_FUN: a function for computing the objective function value for any X, whose interface has the default form MY_FUN(X::AbstractVector{Float64})::Float64 where X[1:n] contains the values of the variables on input, and which returns a double precision scalar representing the value $f(X)$.
  • If the gradient of $f$ can be computed, then the (optional) keyword argument MY_GRAD must be specified and given a function computing the gradient, whose interface must be of the form MY_GRAD(G::AbstractVector{Float64}, X::AbstractVector{Float64}), where G is a double precision vector of size n in which the function returns the value of the gradient of $f$ at X.
  • If, additionally, the second-derivative matrix of $f$ at X can be computed, the (optional) keyword argument MY_HESS must be specified and given a function computing the Hessian, whose interface must be of the form MY_HESS(H::SPMatrix{Float64}, X::AbstractVector{Float64}), where H is a double precision symmetric matrix in packed storage format (upper triangular by column, see the StandardPacked.jl package) in which the function returns the Hessian of $f$ at X.

In all cases, the best value of $x$ found by LANCELOT B is returned to the user in the vector X and the associated objective function value is the first return value. The second return value reports the number of iterations performed by LANCELOT before exiting. Finally, the last return value contains the exit status of the LANCELOT run, the value 0 indicating a successful run. Other values indicate errors in the input or unsuccessful runs, and are detailed in the specsheet of LANCELOT B (with the exception of the value 19, which reports a negative value for one or both input arguments nin and neq).

Example

Let us consider the optimization problem

\[\min_{x_1, x_2} f(x_1, x_2) = 100 ( x_2 - x_1^2 )^2 + ( 1 - x_1 )^2\]

which is the ever-famous Rosenbrock "banana" problem. The most basic way to solve the problem (but NOT the most efficient) is, assuming the starting point X = [-1.2, 1.] is known, to perform the call PolynomialOptimization.LANCELOT_simple(2, X, FUN) where the user-provided function FUN is given by

FUN(X) = @inbounds 100 * (X[2] - X[1]^2)^2 + (1 - X[1])^2

The solution is returned in 60 iterations with exit code 0.

If we now wish to use first and second derivatives of the objective function, one should use the call

PolynomialOptimization.LANCELOT_simple(2, X, FUN, MY_GRAD=GRAD!, MY_HESS=HESS!)

and provide the additional routines

GRAD!(G, X) = @inbounds begin
    G[1] = -400 * (X[2] - X[1]^2) * X[1] - 2 * (1 - X[1])
    G[2] = 200 * (X[2] - X[1]^2)
end

HESS!(H, X) = @inbounds begin
    H[1, 1] = -400 * (X[2] - 3 * X[1]^2) + 2
    H[1, 2] = -400 * X[1]
    H[2, 2] = 200
end

Convergence is then obtained in 23 iterations. Note that using exact first derivatives only is also possible: MY_HESS should then be absent from the calling sequence, and providing the subroutine HESS! is unnecessary.
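The return values described above can also be captured explicitly; this sketch assumes they come back as a tuple of objective value, iteration count, and exit status, in this order.

X = [-1.2, 1.]
fx, iters, exit_code = PolynomialOptimization.LANCELOT_simple(2, X, FUN, MY_GRAD=GRAD!, MY_HESS=HESS!)
# X now holds the best point found, fx the objective value there; exit_code == 0 indicates success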

Bound constrained problems

Bounds on the problem variables may be imposed by specifying one or both of

  • BL::AbstractVector{Float64} (double precision vector of size n): the lower bounds on X,
  • BU::AbstractVector{Float64} (double precision vector of size n): the upper bounds on X.

Note that infinite bounds (represented by a number larger than 1e20 in absolute value) are acceptable, as well as equal lower and upper bounds, which amounts to fixing the corresponding variables. Except for the specification of BL and/or BU, the interface is identical to that for unconstrained problems.

Example

If one now wishes to impose zero upper bounds on the variables of our unconstrained problem, one could use the following call

PolynomialOptimization.LANCELOT_simple(2, X, FUN, MY_GRAD=GRAD!, MY_HESS=HESS!,
    BU=zeros(2))

in which case convergence is obtained in 6 iterations.

Equality constrained problems

If, additionally, general equality constraints are also present in the problem, this must be declared by specifying the following (optional) input argument:

  • neq::Integer: the number of equality constraints.

In this case, the equality constraints are numbered from 1 to neq and the value of the i-th equality constraint must be computed by a user-supplied routine of the form FUN(X, i) (with i = 1, ..., neq), where the function now returns the value of the i-th equality constraint evaluated at X if i is specified. (This extension of the unconstrained case can be implemented by adding an optional argument i to the unconstrained version of FUN or by defining a separate three-parameter method.) If derivatives are available, then the MY_GRAD and MY_HESS routines must be adapted as well: MY_GRAD(G, X, i) and MY_HESS(H, X, i) compute the gradient and Hessian of the i-th constraint at X. Note that, if the gradient of the objective function is available, so must be the gradients of the equality constraints. The same level of derivative availability is assumed for all problem functions (objective and constraints). The final values of the constraints and the values of their associated Lagrange multipliers are optionally returned to the user in the (optional) double precision keyword arguments CX and Y, respectively (both being of size neq).

Inequality constrained problems

If inequality constraints are present in the problem, their inclusion is similar to that of equality constraints. One then needs to specify the (optional) input argument

  • nin::Integer: the number of inequality constraints.

The inequality constraints are then numbered from neq+1 to neq+nin, and their values or those of their derivatives are again computed by calling, for i = 1, ..., nin, FUN(X, i), MY_GRAD(G, X, i), MY_HESS(H, X, i). The inequality constraints are internally converted into equality constraints by the addition of slack variables, whose names are set to 'Slack_i', where the character i in this string takes the integer values 1 to nin. The values of the inequality constraints at the final X are finally returned (as for equalities) in the optional double precision keyword argument CX of size nin. The values of the Lagrange multipliers are returned in the optional double precision output argument Y of size nin.

Problems with equality and inequality constraints

If both equality and inequality constraints are present, neq and nin must be specified, and the values and derivatives of the constraints are computed by FUN(X, i), MY_GRAD(G, X, i), MY_HESS(H, X, i) for the equality constraints (i = 1, ..., neq) and for the inequality constraints (i = neq+1, ..., neq+nin). Again, the same level of derivative availability is assumed for all problem functions (objective and constraints). Finally, the optional arguments CX and/or Y, if used, are then of size neq+nin.

Example

If we now wish to add to the unconstrained version the new constraints

\[\begin{aligned} 0 & \leq x_1 \\ x_1 + 3x_2 - 3 & = 0 \\ x_1^2 + x_2^2 - 4 & \leq 0, \end{aligned}\]

we may transform our call to

CX = Vector{Float64}(undef, 2)
Y = Vector{Float64}(undef, 2)
LANCELOT_simple(2, X, FUN; MY_GRAD=GRAD!, MY_HESS=HESS!, BL=[0., -1e20], neq=1, nin=1,
    CX, Y)

(assuming we need CX and Y), and add methods for FUN, GRAD! and HESS! as follows

FUN(X, i) = @inbounds begin
    if i == 1 # the equality constraint
        return X[1] + 3X[2] - 3
    elseif i == 2 # the inequality constraint
        return X[1]^2 + X[2]^2 - 4
    end
    return NaN # should never happen
end

GRAD!(G, X, i) = @inbounds begin
    if i == 1 # equality constraint's gradient components
        G[1] = 1
        G[2] = 3
    elseif i == 2 # inequality constraint's gradient components
        G[1] = 2X[1]
        G[2] = 2X[2]
    end
    return
end

HESS!(H, X, i) = @inbounds begin
    if i == 1 # equality constraint's Hessian
        fill!(H, 1.)
    elseif i == 2 # inequality constraint's Hessian
        H[1] = 2
        H[2] = 0
        H[3] = 2
    end
    return
end

Convergence is then obtained in 8 iterations. Note that, in our example, the objective function and its derivatives are computed whenever the index i is omitted (see above). Of course, the above examples can easily be modified to represent new minimization problems :-).

Available algorithmic options

Beyond the choice of derivative level for the problem functions, the following arguments allow a (very limited) control of the algorithmic choices used in LANCELOT.

  • maxit::Integer: maximum number of iterations (default: 1000)
  • gradtol::Real: the threshold on the infinity norm of the gradient (or of the Lagrangian's gradient) for declaring convergence (default: 1.0e-5)
  • feastol::Real: the threshold on the infinity norm of the constraint violation for declaring convergence (for constrained problems) (default: 1.0e-5)
  • print_level::Integer: a positive number proportional to the amount of output by the package: 0 corresponds to the silent mode, 1 to a single line of information per iteration (default), while higher values progressively produce more output.

Other sources

The user is encouraged to consult the specsheet of the (non-naive) interface to LANCELOT within the GALAHAD software library for a better view of all possibilities offered by an intelligent use of the package. The library is described in the paper

N. I. M. Gould, D. Orban, Ph. L. Toint,
GALAHAD, a library of thread-safe Fortran 90 packages for large-scale
nonlinear optimization,
ACM Transactions on Mathematical Software, vol. 29(4),
pp. 353-372, 2003

The book

A. R. Conn, N. I. M. Gould, Ph. L. Toint,
LANCELOT, A Fortran Package for Large-Scale Nonlinear Optimization
(Release A),
Springer Verlag, Heidelberg, 1992

is also a good source of additional information.

Main author: Ph. Toint, November 2007. Copyright reserved, Gould/Orban/Toint, for GALAHAD productions

source
  • [1] Note that by default, a sparse solver is used (unless the problem was constructed with a factor_coercive different from one). This is typically a good idea for large systems with not too many monomials. However, if you have a very dense system, the sparse solver will take forever; better pass dense=true to the optimization routine. This will then be much faster (and always much more accurate).
  • [2] LANCELOT is a nonlinear solver that directly works on the problem itself. It does not use a relaxation. Therefore, it cannot provide lower-bound guarantees on the objective value; however, there is no problem with the extraction of a solution, as the solver directly works on the decision variables. When invoking the LANCELOT solver, a function is returned which performs the optimization and which requires a vector of initial values as parameter. This function will then return a 2-tuple with the (locally) optimal objective value and the point of the local optimum. Currently, the LANCELOT interface does not support complex-valued problems.
  • [3] There is a separate Julia package for Loraine. However, its implementation is not only unnecessarily inefficient, but would also not allow solving large-scale systems. Therefore, PolynomialOptimization provides a rewritten implementation, which is heavily based on the original source code, but tries to use the available memory more efficiently (though there is still room for improvement). Since only a subset of the features of the original package has been implemented, this is not yet ready to be contributed back to the solver; but it can be expected that in the future, an external package will be required to use Loraine.
  • [4] :MosekMoment requires at least version 10, :MosekSOS already works with version 9. The moment variant is more prone to failure in case of close-to-ill-posed problems; sometimes, this is an issue of the presolver, which can be turned off by passing MSK_IPAR_PRESOLVE_USE="MSK_PRESOLVE_MODE_OFF" to poly_optimize. The performance indicators in the table are valid for :MosekSOS. The new PSD cone interface of Mosek 10 that is used by the moment-based variant proves to be much slower than the old one; therefore, using :MosekMoment is not recommended.
  • [5] SpecBM is provided by PolynomialOptimization itself; however, it requires a subsolver for the quadratic master problem. Currently, Mosek and Hypatia are implemented; the chosen one must be loaded to make SpecBM work.
  • [6] The branch lancelot goes further and defines an interface for the full version of LANCELOT, which is a lot more sophisticated. Unfortunately, it also seems to be broken at the moment and bugfixing will require some debugging of the disassembly. This is not a priority at the moment (which is whenever you read this statement).