This is an abstract class. Subclasses must implement the method
optimize().
Acquisition functions are strategies for proposing new sample points to a
surrogate. The acquisition functions here are modeled as objects so that
state can be carried across the learning process. This design also gives
the optimize() method a common API across different acquisition
strategies.
Parameters:
optimizer – Continuous optimizer to be used for the acquisition
function. Default is Differential Evolution (DE) from pymoo. (default: None)
mi_optimizer – Mixed-integer optimizer to be used for the acquisition
function. Default is Genetic Algorithm (MixedVariableGA) from pymoo. (default: None)
rtol (float) – Minimum distance between a candidate point and the
previously selected points, relative to the domain size. (default: 1e-6)
Minimum distance between a candidate point and the previously selected
points. This appears as a constraint in the optimization problem
solved in optimize().
This method is used to check if the acquisition function has converged
based on a termination criterion. The default implementation always
returns False.
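For illustration, a minimal concrete subclass might look like the sketch below; the surrogate call signature and the random-search body are assumptions made for this example, not the library's internals:

    import numpy as np

    class RandomAcquisition:
        """Toy subclass: propose the best of a pool of random candidates."""

        def __init__(self, rtol: float = 1e-6) -> None:
            self.rtol = rtol  # minimum relative distance to known samples

        def has_converged(self) -> bool:
            return False  # mirrors the default behavior described above

        def optimize(self, surrogate, bounds, n: int = 1) -> np.ndarray:
            lb, ub = np.asarray(bounds, dtype=float).T
            cand = lb + (ub - lb) * np.random.rand(128, len(lb))
            scores = np.asarray([surrogate(x) for x in cand])
            return cand[np.argsort(scores)[:n]]  # n best candidates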
Coordinate perturbation acquisition function over the non-dominated set.
This acquisition method was proposed in [1]. It locally perturbs each of
the non-dominated sample points to find new sample points. The perturbation
is performed by acquisitionFunc.
Obtain endpoints of the Pareto front as described in [2].
For each component i in the target space, this algorithm solves a cheap
auxiliary optimization problem to minimize the i-th component of the
trained surrogate model. Points that are too close to each other or to
training sample points are eliminated. If this would eliminate all points,
the whole variable domain is considered instead, and the algorithm samples
the point that maximizes the minimum distance to the training sample points.
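A hedged sketch of this endpoint search, assuming a multi-output surrogate with an eval() method and using scipy's differential evolution in place of the library's optimizer (the maximin fallback is omitted for brevity):

    import numpy as np
    from scipy.optimize import differential_evolution

    def pareto_endpoints(surrogate, bounds, train_x, rtol=1e-6):
        n_obj = len(surrogate.eval(np.mean(bounds, axis=1)))
        span = np.linalg.norm(np.ptp(bounds, axis=1))
        points = []
        for i in range(n_obj):
            # Cheap auxiliary problem: minimize the i-th surrogate output.
            res = differential_evolution(
                lambda x: surrogate.eval(x)[i], bounds
            )
            known = np.vstack([train_x] + points)
            # Keep the minimizer only if it is far from all known points.
            if np.min(np.linalg.norm(known - res.x, axis=1)) > rtol * span:
                points.append(res.x)
        return np.asarray(points)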
Minimize the objective function with surrogate constraints. If a feasible
solution is found and is different from previous sample points, return it as
the new sample. Otherwise, the new sample is the point that is farthest from
previously selected sample points.
This acquisition function can only acquire one point at a time.
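A sketch of this step under stated assumptions: gs(x) <= 0 encodes the surrogate constraints, scipy's minimize stands in for the actual solver, and the fallback picks the random candidate farthest from previous samples:

    import numpy as np
    from scipy.optimize import NonlinearConstraint, minimize

    def acquire_feasible(obj, gs, bounds, train_x, rtol=1e-6):
        bounds = np.asarray(bounds, dtype=float)
        res = minimize(
            obj, np.mean(bounds, axis=1), bounds=bounds,
            constraints=[NonlinearConstraint(gs, -np.inf, 0.0)],
        )
        span = np.linalg.norm(np.ptp(bounds, axis=1))
        if res.success and np.min(
            np.linalg.norm(train_x - res.x, axis=1)
        ) > rtol * span:
            return res.x  # feasible and sufficiently new
        # Fallback: the candidate farthest from previously selected points.
        cand = np.random.uniform(
            bounds[:, 0], bounds[:, 1], size=(1024, len(bounds))
        )
        dmin = np.min(
            np.linalg.norm(cand[:, None] - train_x[None], axis=-1), axis=1
        )
        return cand[np.argmax(dmin)]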
Acquisition by maximization of the expected improvement of a Gaussian
Process.
It starts by running a
global optimization algorithm to find a point xs that maximizes the EI. If
this point is found and the sample size is 1, return this point. Otherwise,
create a pool of candidates using sampler and xs. From this pool,
select the set of points that maximize the expected improvement. If
avoid_clusters is True, avoid points that are too close to already
chosen ones, following a strategy inspired by [4]. Note that the latter
strategy can considerably slow down the acquisition process, although it is
advisable for a high-quality sample.
Parameters:
sampler – Sampler to generate candidate points. Stored in
sampler. (default: None)
avoid_clusters (bool) – When True, use a strategy that avoids points too
close to already chosen ones. Stored in avoid_clusters. (default: True)
Run a global optimization procedure to try to find a point that has the
highest expected improvement for the Gaussian Process.
Moreover, if ybest isn’t provided, run a global optimization procedure
to find the minimum value of the surrogate model. Use the minimum point
as a candidate for this acquisition.
This implementation only works for continuous design variables.
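The EI criterion itself can be written compactly; the sketch below assumes a scikit-learn-style GP with predict(X, return_std=True) and a minimization setting:

    import numpy as np
    from scipy.stats import norm

    def expected_improvement(gp, X, ybest):
        mu, sigma = gp.predict(X, return_std=True)
        sigma = np.maximum(sigma, 1e-12)  # guard against zero variance
        z = (ybest - mu) / sigma
        return (ybest - mu) * norm.cdf(z) + sigma * norm.pdf(z)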
Obtain sample points that are local minima of the surrogate model.
This implementation is based on the one of MISO-MS used in the paper [5].
The original method, Multi-level Single-Linkage, was described in [6].
In each iteration, the algorithm generates a pool of candidates and selects
the best candidates (lowest predicted value) that are far enough from each
other. The number of candidates chosen, as well as the distance threshold,
vary with each iteration. The hypothesis is that the successful candidates
each belong to a different region of the space, which may contain a local
minimum, and that those regions cover the whole search space. Next, the
algorithm runs multiple local minimization procedures using the successful
candidates as initial guesses. The results of these minimizations are
collected into the new sample.
Parameters:
nCand (int) – Number of candidates used in each iteration.
rtol (float) – Minimum distance between a candidate point and the
previously selected points, relative to the domain size. (default: 1e-3)
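A sketch of the candidate-screening step, with names and the fixed radius as illustrative assumptions (in the actual method the count and threshold change across iterations):

    import numpy as np

    def screen_candidates(cand, pred, radius, max_keep):
        order = np.argsort(pred)  # lowest predicted values first
        kept = []
        for i in order:
            if len(kept) == max_keep:
                break
            if all(np.linalg.norm(cand[i] - cand[j]) >= radius for j in kept):
                kept.append(i)
        return cand[kept]  # initial guesses for the local searches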
Obtain sample points that fill gaps in the Pareto front from [7].
The algorithm proceeds as follows to find each new point:

1. Find a target value \(\tau\) that should fill a gap in the Pareto
front. Make sure to use a target value that wasn't used before.
2. Solve a multi-objective optimization problem whose i-th objective is
\(|s_i(x)-\tau_i|\) over \(x\) in the search space, where
\(s_i(x)\) is the i-th target value predicted by the surrogate for
\(x\) (a pymoo sketch of this auxiliary problem follows the parameter
list below).
3. If a Pareto-optimal solution was found for the problem above, choose the
point that minimizes the L1 distance to \(\tau\) to be part of the
new sample.
Parameters:
optimizer – Continuous multi-objective optimizer. If None, use
NSGA2 from pymoo.
mi_optimizer – Mixed-integer multi-objective optimizer. If None, use
MixedVariableGA from pymoo with RankAndCrowding survival strategy.
oldTV – Old target values to be avoided in the acquisition.
Copied to oldTV. (default: ())
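A minimal pymoo sketch of the auxiliary problem in step 2, assuming a surrogate with an eval() method that returns all target values at once:

    import numpy as np
    from pymoo.algorithms.moo.nsga2 import NSGA2
    from pymoo.core.problem import ElementwiseProblem
    from pymoo.optimize import minimize

    class GapFillProblem(ElementwiseProblem):
        def __init__(self, surrogate, tau, xl, xu):
            super().__init__(n_var=len(xl), n_obj=len(tau), xl=xl, xu=xu)
            self.surrogate, self.tau = surrogate, tau

        def _evaluate(self, x, out, *args, **kwargs):
            # One objective per target component: |s_i(x) - tau_i|.
            out["F"] = np.abs(self.surrogate.eval(x) - self.tau)

    # res = minimize(GapFillProblem(s, tau, xl, xu), NSGA2())
    # xnew = res.X[np.argmin(np.sum(res.F, axis=1))]  # closest to tau in L1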
Find a target value that should fill a gap in the Pareto front.
As suggested by Mueller (2017), the algorithm fits a linear RBF
model to the points in the Pareto front. This represents the
(d-1)-dimensional Pareto front surface. Then, the algorithm searches for
a value on that surface that maximizes the distances to previously
selected target values and to the training points of the RBF model. This
value is projected into the d-dimensional space to obtain \(\tau\).
Parameters:
paretoFront (ndarray) – Pareto front in the objective space.
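A hedged sketch of this target search, using scipy's RBFInterpolator for the linear RBF model and a coarse random search in place of the library's optimizer:

    import numpy as np
    from scipy.interpolate import RBFInterpolator

    def gap_target(pareto_front, old_tv, n_cand=1000):
        # Model the last objective as a function of the first d-1 ones.
        y, z = pareto_front[:, :-1], pareto_front[:, -1]
        surface = RBFInterpolator(y, z, kernel="linear")
        cand = np.random.uniform(
            y.min(axis=0), y.max(axis=0), size=(n_cand, y.shape[1])
        )
        known = np.vstack([y] + ([old_tv[:, :-1]] if len(old_tv) else []))
        dmin = np.min(
            np.linalg.norm(cand[:, None] - known[None], axis=-1), axis=1
        )
        best = cand[np.argmax(dmin)]  # farthest from known target values
        return np.append(best, surface(best[None])[0])  # tau in d dims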
Target value acquisition function for the RBF model based on [8], [9],
and [10].
Every iteration of the algorithm sequentially chooses a number from 0 to
cycleLength + 1 (inclusive) and runs one of the following procedures:
Inf-step (0): Selects a sample point that minimizes the
\(\mu\) measure, i.e., mu_measure(). The point selected is
the farthest from the current sample using the kernel measure.
Global search (1 to cycleLength): Minimizes the product of the
\(\mu\) measure and the distance to a target value. The target value
is based on the distance to the current minimum of the surrogate. The
resulting quantity is known as the 'bumpiness measure'.
Local search (cycleLength + 1): Minimizes the bumpiness measure with
a target value equal to the current minimum of the surrogate. If the
current minimum is already represented by the training points of the
surrogate, do a global search with a target value slightly smaller than
the current minimum.
After each sample point is chosen, we verify how close it is to the current
sample. If it is too close, we replace it with a random point in the domain
drawn from a uniform distribution. This strategy was proposed in [11].
Parameters:
cycleLength (int) – Length of the global search cycle. Stored in
cycleLength. (default: 6)
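The cycle over procedures can be summarized as in the sketch below; the string labels are placeholders for the actual steps:

    def choose_procedure(iteration: int, cycle_length: int) -> str:
        k = iteration % (cycle_length + 2)  # cycles through 0..cycleLength+1
        if k == 0:
            return "inf-step"       # minimize the mu measure
        if k <= cycle_length:
            return "global search"  # bumpiness with a shifted target
        return "local search"       # bumpiness with target = surrogate min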
The bumpiness measure \(g_y\) was first defined by Gutmann (2001),
who suggested its use for global optimization with RBF functions. Gutmann
notes that \(g_y(x)\) tends to infinity
when \(x\) tends to a training point of the surrogate, and so uses
\(-1/g_y(x)\) in the minimization problem. Björkman and
Holmström use \(-\log(1/g_y(x))\), which is the same as minimizing
\(\log(g_y(x))\), to avoid a flat minimum. This option seems to
slow down convergence rates when \(g_y(x)\) lies in [0,1], since the
logarithm increases distances in that range.
The present implementation uses genetic algorithms by default, so there
is no point in trying to make \(g_y\) smoother.
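Written out, the two classical transformations mentioned above are (with g_y as a placeholder for the bumpiness measure):

    import numpy as np

    def gutmann_objective(g_y, x):
        # Gutmann (2001): bounded near training points, where g_y blows up.
        return -1.0 / g_y(x)

    def bjorkman_holmstrom_objective(g_y, x):
        # -log(1/g_y(x)) == log(g_y(x)); avoids a flat minimum.
        return np.log(g_y(x))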
Select candidates based on the minimization of a weighted average score.
The weighted average is \(w f_s(x) + (1-w) (-d_s(x))\), where
\(f_s(x)\) is the surrogate value at \(x\) and \(d_s(x)\) is the
distance of \(x\) to its closest neighbor in the current sample. Both
values are scaled to the interval [0, 1], based on the maximum and minimum
values for the pool of candidates. The sampler generates the candidate
points to be scored and then selected.
This acquisition method also handles multi-objective optimization,
following the random perturbation strategy in [12] and [13]. More
specifically, the algorithm takes the average of the target values given by
the surrogate. In other words, \(f_s(x)\) is the average of
the target components of the surrogate model evaluated at \(x\).
Parameters:
sampler (Sampler) – Sampler to generate candidate points.
Stored in sampler.
weightpattern (float|sequence) – Weight(s) w to be used in the score.
Stored in weightpattern.
The default value is [0.2, 0.4, 0.6, 0.9, 0.95, 1].
maxeval (int) – Maximum number of evaluations.
Stored in maxeval. (default: 0)
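A sketch of the score over a candidate pool, following the formula above; fx holds surrogate values and dists the distance of each candidate to its nearest current sample point:

    import numpy as np

    def weighted_score(fx, dists, w):
        def unit(v):  # scale to [0, 1] over the candidate pool
            rng = np.ptp(v)
            return (v - v.min()) / rng if rng > 0 else np.zeros_like(v)

        # Low surrogate value and large distance give a low (good) score.
        return w * unit(fx) + (1 - w) * (1.0 - unit(dists))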
Select n points from a pool of candidates using argminscore()
iteratively.
The score in iteration i > 1 uses the distances to the candidates
selected in iterations 0 to i-1.
Parameters:
x (ndarray) – Matrix with candidate points.
distx (ndarray) – Matrix with the distances between the candidate points and
the m points sampled so far. It has one row per row of x.
fx (ndarray) – Vector with the estimated values for the objective function
on the candidate points.
n (int) – Number of points to be selected for the next costly
evaluation.
tol (float) – Tolerance value for excluding candidates that are too close to
current sample points.
Return type:
tuple[ndarray, ndarray]
Returns:
n-by-dim matrix with the selected points.
n-by-(n+m) matrix with the distances between the n selected points
and the (n+m) sampled points (m is the number of points that have
been sampled so far).
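The iterative selection can be sketched as below; argminscore is a stand-in for the scoring routine, and after each pick a new distance column is appended so later scores account for the chosen points:

    import numpy as np

    def select_n(x, distx, fx, n, tol, argminscore):
        chosen = []
        for _ in range(n):
            dmin = distx.min(axis=1)
            valid = np.flatnonzero(dmin >= tol)  # drop too-close candidates
            i = valid[argminscore(fx[valid], dmin[valid])]
            chosen.append(i)
            distx = np.hstack(
                [distx, np.linalg.norm(x - x[i], axis=1, keepdims=True)]
            )
        return x[chosen], distx[chosen]  # n-by-dim and n-by-(n+m)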
Parameters:
bounds (sequence) – List with the limits [x_min, x_max] of each
direction x in the space.
n (int) – Number of points requested. (default: 1)
xbest – Best point so far. Used if sampler is an instance
of soogo.sampling.NormalSampler. If not provided,
it is computed from the training data for the surrogate.
Return type:
ndarray
Returns:
m-by-dim matrix with the selected points, where m <= n.
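A hypothetical call pattern, with acq and surrogate assumed to exist and the signature inferred from the parameters above:

    # bounds = [[0.0, 1.0], [-5.0, 5.0]]  # [x_min, x_max] per direction
    # xnew = acq.optimize(surrogate, bounds=bounds, n=3)
    # xnew.shape == (m, 2) with m <= 3    # may return fewer than n points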