BFGS, conjugate gradient, SANN and Nelder-Mead Maximization
These functions are wrappers for optim, adding
constrained optimization and fixed parameters.
Usage
maxBFGS(fn, grad=NULL, hess=NULL, start, fixed=NULL,
control=NULL,
constraints=NULL,
finalHessian=TRUE,
parscale=rep(1, length=length(start)),
... )
maxCG(fn, grad=NULL, hess=NULL, start, fixed=NULL,
control=NULL,
constraints=NULL,
finalHessian=TRUE,
parscale=rep(1, length=length(start)), ...)
maxSANN(fn, grad=NULL, hess=NULL, start, fixed=NULL,
control=NULL,
constraints=NULL,
finalHessian=TRUE,
parscale=rep(1, length=length(start)),
... )
maxNM(fn, grad=NULL, hess=NULL, start, fixed=NULL,
control=NULL,
constraints=NULL,
finalHessian=TRUE,
parscale=rep(1, length=length(start)),
...)
Arguments
- fn
function to be maximised. Must have the parameter vector as the first argument. In order to use a numeric gradient or the BHHH method, fn must return a vector of observation-specific likelihood values; those are summed internally where necessary. If the parameters are out of range, fn should return NA. See Details for constant parameters.
- grad
gradient of fn. Must have the parameter vector as the first argument. If NULL, a numeric gradient is used (maxNM and maxSANN do not use gradients). The gradient may return a matrix where columns correspond to the parameters and rows to the observations (useful for maxBHHH); the columns are summed internally.
- hess
Hessian of fn. Not used by any of these methods; included for compatibility with maxNR.
- start
initial values for the parameters. If the start values are named, those names are also carried over to the results.
- fixed
parameters to be treated as constants at their start values. If present, it is treated as an index vector of the start parameters (see the sketch after this list).
- control
list of control parameters or a ‘MaxControl’ object. If it is a list, default values are used for the parameters left unspecified by the user. These functions accept the following parameters:
- reltol
sqrt(.Machine$double.eps), stopping condition. Relative convergence tolerance: the algorithm stops if the relative improvement between iterations is less than ‘reltol’. Note: for compatibility reasons, ‘tol’ is equivalent to ‘reltol’ for optim-based optimizers.
- iterlim
integer, maximum number of iterations. Default values are 200 (‘BFGS’), 500 (‘CG’ and ‘NM’), and 10000 (‘SANN’). Note that ‘iteration’ may mean different things for different optimizers.
- printLevel
integer, larger number prints more working information. Default 0, no information.
- nm_alpha
1, Nelder-Mead simplex method reflection coefficient (see Nelder & Mead, 1965)
- nm_beta
0.5, Nelder-Mead contraction coefficient
- nm_gamma
2, Nelder-Mead expansion coefficient
- sann_cand
NULL or a function for the "SANN" algorithm to generate a new candidate point; if NULL, a Gaussian Markov kernel is used (see argument gr of optim).
- sann_temp
10, starting temperature for the “SANN” cooling schedule. See optim.
- sann_tmax
10, number of function evaluations at each temperature for the “SANN” optimizer. See optim.
- sann_randomSeed
123, integer to seed random numbers in order to ensure replicability of “SANN” optimization while leaving R's random number sequence otherwise unaffected. Use options such as sann_randomSeed=Sys.time() or sann_randomSeed=sample(100,1) if you want stochastic results.
- constraints
either NULL for unconstrained optimization or a list with two components. The components may be either eqA and eqB for equality-constrained optimization \(A \theta + B = 0\), or ineqA and ineqB for inequality constraints \(A \theta + B > 0\). More than one row in ineqA and ineqB corresponds to more than one linear constraint; in that case all of these must be zero (equality) or positive (inequality constraints). The equality-constrained problem is forwarded to sumt, the inequality-constrained case to constrOptim2.
- finalHessian
how (and if) to calculate the final Hessian. Either FALSE (do not calculate), TRUE (use analytic/numeric Hessian) or "bhhh"/"BHHH" for the information-equality approach. The latter approach is only suitable for maximizing a log-likelihood function. It requires the gradient/log-likelihood to be supplied by individual observations; see maxBHHH for details.
- parscale
a vector of scaling values for the parameters. Optimization is performed on 'par/parscale' and these should be comparable in the sense that a unit change in any element produces about a unit change in the scaled value (see optim).
- ...
further arguments for fn and grad.
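To illustrate the ‘fixed’ and ‘control’ arguments together, here is a minimal sketch (the data x and the objective loglikN are our own illustration, not part of this page; it assumes the maxLik package is attached):

x <- rnorm(100, mean = 1, sd = 2)
# observation-wise normal log-likelihood: returning a vector lets the
# numeric-gradient and BHHH machinery sum it internally
loglikN <- function(theta) dnorm(x, mean = theta[1], sd = theta[2], log = TRUE)
# optimize over the mean only: element 2 of 'start' (the sd) stays fixed,
# while the control list tightens the relative convergence tolerance
res <- maxBFGS(loglikN, start = c(mean = 0, sd = 2),
               fixed = 2,
               control = list(reltol = 1e-12))
coef(res)       # 'sd' is unchanged at its start value
activePar(res)  # FALSE marks the fixed parameter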
Details
In order to provide a consistent interface, all these functions also
accept arguments that other optimizers use. For instance,
maxNM accepts the ‘grad’ argument despite being a
gradient-less method.
The ‘state’ (or ‘seed’) of R's random number generator
is saved at the beginning of the maxSANN function
and restored at the end, so maxSANN does not affect the subsequent
generation of random numbers in the session, even though the ‘SANN’
algorithm itself uses random numbers (seeded through the
sann_randomSeed control parameter), as the sketch below shows.
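A small sketch of this behavior (our illustration, not from the original page):

set.seed(100)
r1 <- runif(1)                                # draw with a known seed
set.seed(100)
sann <- maxSANN(function(x) -x^2, start = 5)  # uses random numbers internally
r2 <- runif(1)
identical(r1, r2)   # TRUE: maxSANN restored the RNG state it found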
Value
object of class "maxim". Data can be extracted through the following functions:
- maxValue
fn value at the maximum (the last calculated value if not converged).
- coef
estimated parameter value.
- gradient
vector, last calculated gradient value. Should be close to 0 in case of normal convergence.
- estfun
matrix of gradients at the parameter value estimate, evaluated at each observation (only if grad returns a matrix, or if grad is not specified and fn returns a vector).
- hessian
Hessian at the maximum (the last calculated value if not converged).
- returnCode
integer. Success code; 0 is success (see optim).
- returnMessage
a short message, describing the return code.
- activePar
logical vector indicating which parameters are optimized over. Contains only TRUE values if no parameters are fixed.
- nIter
number of iterations. Two-element integer vector giving the number of calls to fn and gr, respectively. This excludes calls needed to compute the Hessian, if requested, and any calls to fn to compute a finite-difference approximation to the gradient.
- maximType
character string, type of maximization.
- maxControl
the optimization control parameters in the form of a MaxControl object.
The following components can only be extracted directly (with $):
- constraints
a list describing the constrained optimization (NULL if unconstrained). Includes the following components:
- type
type of constrained optimization
- outer.iterations
number of iterations in the constraints step
- barrier.value
value of the barrier function
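For concreteness, a hedged sketch of the extractors on a simple unconstrained fit (illustrative objective; assumes maxLik is attached):

res <- maxBFGS(function(x) -(x - 2)^2, start = 0)
maxValue(res)       # objective value at the maximum, near 0
coef(res)           # estimate, near 2
gradient(res)       # close to zero at an interior maximum
returnMessage(res)  # text form of the success code
res$constraints     # NULL: this run was unconstrained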
References
Nelder, J. A. & Mead, R. (1965) A Simplex Method for Function Minimization. The Computer Journal 7, 308–313.
Examples
# Maximum Likelihood estimation of the Poisson distribution
library(maxLik)   # maxBFGS, maxNM etc. come from the maxLik package
n <- rpois(100, 3)
loglik <- function(l) n*log(l) - l - lfactorial(n)
# we use numeric gradient
summary(maxBFGS(loglik, start=1))
#> --------------------------------------------
#> BFGS maximization
#> Number of iterations: 24
#> Return code: 0
#> successful convergence
#> Function value: -197.1893
#> Estimates:
#> estimate gradient
#> [1,] 2.849999 2.29794e-05
#> --------------------------------------------
# you would probably prefer mean(n) instead of that ;-)
# Note also that maxLik is better suited for Maximum Likelihood
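# An analytic gradient can be supplied instead of the numeric one. A hedged
# variant of the example above (gradlik is our addition, the per-observation
# derivative of the log-likelihood):
gradlik <- function(l) n/l - 1   # d/dl [n*log(l) - l - lfactorial(n)]
summary(maxBFGS(loglik, grad = gradlik, start = 1))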
###
### Now an example of constrained optimization
###
f <- function(theta) {
x <- theta[1]
y <- theta[2]
exp(-(x^2 + y^2))
## you may want to use exp(- theta %*% theta) instead
}
## use constraints: x + y >= 1
A <- matrix(c(1, 1), 1, 2)
B <- -1
res <- maxNM(f, start=c(1,1), constraints=list(ineqA=A, ineqB=B),
control=list(printLevel=1))
#> Nelder-Mead direct search function minimizer
#> function value for initial parameters = -0.135135
#> Scaled convergence tolerance is 2.01367e-09
#> Stepsize computed as 0.100000
#> BUILD 3 -0.109500 -0.135135
#> LO-REDUCTION 5 -0.109500 -0.135135
#> EXTENSION 7 -0.132455 -0.196709
#> LO-REDUCTION 9 -0.135135 -0.196709
#> EXTENSION 11 -0.196709 -0.375079
#> LO-REDUCTION 13 -0.196709 -0.375079
#> HI-REDUCTION 15 -0.276440 -0.375079
#> REFLECTION 17 -0.367648 -0.484044
#> LO-REDUCTION 19 -0.375079 -0.484044
#> HI-REDUCTION 21 -0.429307 -0.484044
#> REFLECTION 23 -0.484044 -0.545734
#> LO-REDUCTION 25 -0.484044 -0.545734
#> HI-REDUCTION 27 -0.513326 -0.545734
#> REFLECTION 29 -0.534921 -0.572951
#> REFLECTION 31 -0.545734 -0.572951
#> HI-REDUCTION 33 -0.560758 -0.572951
#> REFLECTION 35 -0.572951 -0.590899
#> LO-REDUCTION 37 -0.572951 -0.590899
#> REFLECTION 39 -0.582560 -0.601943
#> HI-REDUCTION 41 -0.590442 -0.601943
#> LO-REDUCTION 43 -0.590899 -0.601943
#> HI-REDUCTION 45 -0.596296 -0.601943
#> HI-REDUCTION 47 -0.598979 -0.601943
#> REFLECTION 49 -0.600631 -0.604251
#> LO-REDUCTION 51 -0.601943 -0.604251
#> HI-REDUCTION 53 -0.603278 -0.604251
#> HI-REDUCTION 55 -0.603935 -0.604251
#> REFLECTION 57 -0.604160 -0.605162
#> HI-REDUCTION 59 -0.604251 -0.605162
#> HI-REDUCTION 61 -0.604694 -0.605162
#> LO-REDUCTION 63 -0.604718 -0.605162
#> HI-REDUCTION 65 -0.604949 -0.605162
#> REFLECTION 67 -0.605080 -0.605354
#> HI-REDUCTION 69 -0.605162 -0.605354
#> LO-REDUCTION 71 -0.605207 -0.605354
#> REFLECTION 73 -0.605318 -0.605454
#> LO-REDUCTION 75 -0.605354 -0.605454
#> HI-REDUCTION 77 -0.605409 -0.605454
#> LO-REDUCTION 79 -0.605422 -0.605454
#> LO-REDUCTION 81 -0.605452 -0.605459
#> HI-REDUCTION 83 -0.605454 -0.605459
#> LO-REDUCTION 85 -0.605458 -0.605459
#> HI-REDUCTION 87 -0.605459 -0.605459
#> HI-REDUCTION 89 -0.605459 -0.605459
#> LO-REDUCTION 91 -0.605459 -0.605460
#> HI-REDUCTION 93 -0.605459 -0.605460
#> HI-REDUCTION 95 -0.605460 -0.605460
#> LO-REDUCTION 97 -0.605460 -0.605460
#> HI-REDUCTION 99 -0.605460 -0.605460
#> HI-REDUCTION 101 -0.605460 -0.605460
#> HI-REDUCTION 103 -0.605460 -0.605460
#> Exiting from Nelder Mead minimizer
#> 105 function evaluations used
print(summary(res))
#> --------------------------------------------
#> Nelder-Mead maximization
#> Number of iterations: 105
#> Return code: 0
#> successful convergence
#> Function value: 0.6064308
#> Estimates:
#> estimate gradient
#> [1,] 0.5001167 -0.6065724
#> [2,] 0.5000479 -0.6064889
#>
#> Constrained optimization based on constrOptim
#> 1 outer iterations, barrier value -0.0009712347
#> --------------------------------------------
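# For comparison, a hedged sketch of the equality-constrained variant (our
# addition): the same A and B now encode x + y = 1, which maxNM forwards to
# the SUMT routine via the eqA/eqB components
resEq <- maxNM(f, start = c(1, 1), constraints = list(eqA = A, eqB = B))
summary(resEq)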