This is an abstract class. Subclasses must implement the method
optimize().
Acquisition functions are strategies to propose new sample points to a
surrogate. The acquisition functions here are modeled as objects with the
goal of adding state to the learning process. Moreover, this design
enables the definition of the optimize() method with a similar API
across different acquisition strategies.
Parameters:
optimizer – Continuous optimizer to be used for the acquisition
function. Default is Differential Evolution (DE) from pymoo. (default: None)
mi_optimizer – Mixed-integer optimizer to be used for the acquisition
function. Default is Genetic Algorithm (MixedVariableGA) from pymoo. (default: None)
rtol (float) – Minimum distance between a candidate point and the
previously selected points relative to the domain size. Default is 1e-6. (default: 1e-06)
Minimum distance between a candidate point and the previously selected
points. This appears as a constraint in the optimization problem
solved in optimize().
This method is used to check if the acquisition function has converged
based on a termination criterion. The default implementation always
returns False.
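A minimal sketch of this interface, assuming only what the description above states (the constructor signature mirrors the documented parameters; optimizer internals are omitted):

```python
from abc import ABC, abstractmethod


class AcquisitionFunction(ABC):
    """Sketch of the abstract interface described above (simplified)."""

    def __init__(self, optimizer=None, mi_optimizer=None, rtol: float = 1e-6):
        self.optimizer = optimizer        # continuous optimizer (e.g., DE)
        self.mi_optimizer = mi_optimizer  # mixed-integer optimizer
        self.rtol = rtol  # min relative distance to previously chosen points

    @abstractmethod
    def optimize(self, surrogateModel, bounds, n=1):
        """Propose n new sample points; implemented by subclasses."""

    def has_converged(self) -> bool:
        # Default termination criterion: never signals convergence.
        return False
```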
Alternated acquisition function that cycles through a list of acquisition
functions.
The current acquisition function moves to the next in the list when the
current one’s termination criterion is met. To progress through the
acquisition functions, the update method must be called after each
optimization step. This provides the function with the current optimization
state, allowing it to determine if the termination condition has been
satisfied.
Parameters:
acquisitionFuncArray (Sequence[AcquisitionFunction]) – List of acquisition functions to be used in
sequence.
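The cycling logic can be sketched as follows; this is a simplified illustration of the behavior described above, not the library's implementation:

```python
class AlternatedAcquisition:
    """Sketch: cycle through acquisition functions, advancing when the
    current one's termination criterion is met."""

    def __init__(self, acquisitionFuncArray):
        self.funcs = list(acquisitionFuncArray)
        self.idx = 0

    def optimize(self, *args, **kwargs):
        # Delegate to the currently active acquisition function.
        return self.funcs[self.idx].optimize(*args, **kwargs)

    def update(self):
        # Call after each optimization step so the cycle can progress.
        if self.funcs[self.idx].has_converged():
            self.idx = (self.idx + 1) % len(self.funcs)
```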
Coordinate perturbation acquisition function over the nondominated set.
This acquisition method was proposed in [1]. It locally perturbs each of
the non-dominated sample points to find new sample points. The perturbation
is performed by acquisitionFunc.
Obtain endpoints of the Pareto front as described in [2].
For each component i in the target space, this algorithm solves a cheap
auxiliary optimization problem to minimize the i-th component of the
trained surrogate model. Points that are too close to each other and to
training sample points are eliminated. If all points were to be eliminated,
consider the whole variable domain and sample at the point that maximizes
the minimum distance to training sample points.
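The per-component minimization can be sketched as below, assuming the surrogate is exposed as a plain callable and using SciPy in place of the pymoo optimizers mentioned elsewhere in this document (the helper name is hypothetical):

```python
import numpy as np
from scipy.optimize import minimize


def pareto_endpoints(predict, bounds, x0):
    """For each output component i, minimize the i-th component of the
    surrogate prediction to approximate a Pareto-front endpoint."""
    n_obj = len(np.atleast_1d(predict(np.asarray(x0))))
    endpoints = []
    for i in range(n_obj):
        # Cheap auxiliary problem: minimize the i-th predicted target.
        res = minimize(lambda x, i=i: float(predict(x)[i]), x0, bounds=bounds)
        endpoints.append(res.x)
    return np.array(endpoints)
```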
Minimize the objective function with surrogate constraints. If a feasible
solution is found and is different from previous sample points, return it as
the new sample. Otherwise, the new sample is the point that is farthest from
previously selected sample points.
This acquisition function is only able to acquire 1 point at a time.
surrogateModel (Surrogate) – Multi-target surrogate model for the constraints.
bounds (sequence) – List with the limits [x_min,x_max] of each
direction x in the space.
n (int) – Unused. (default: 1)
constraintTransform (callable) – Function to transform the constraints.
If not provided, use the identity function. The optimizer, pymoo,
expects that negative and zero values are feasible, and positive
values are infeasible. (default: None)
bounds (sequence) – List with the limits [x_min,x_max] of each
direction x in the space.
n (int) – Number of points to be acquired. (default: 1)
points (Optional[ndarray]) – Points to consider for distance maximization. If None,
use all previously sampled points in the surrogate model. (default: None)
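The fallback step, picking the point farthest from the previously selected samples, can be sketched over a finite candidate pool (a simplification; the actual method solves a continuous optimization problem):

```python
import numpy as np


def farthest_point(candidates, sampled):
    """Return the candidate that maximizes the minimum distance to the
    previously selected sample points."""
    # Pairwise distances: candidates (m, d) vs. sampled (k, d) -> (m, k).
    d = np.linalg.norm(candidates[:, None, :] - sampled[None, :, :], axis=-1)
    return candidates[np.argmax(d.min(axis=1))]
```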
Acquisition by maximization of the expected improvement of a Gaussian
Process.
It starts by running a
global optimization algorithm to find a point xs that maximizes the EI. If
this point is found and the sample size is 1, return this point. Otherwise,
create a pool of candidates using sampler and xs. From this pool,
select the set of points that maximize the expected improvement. If
avoid_clusters is True, avoid points that are too close to already
chosen ones, a strategy inspired by [5]. Mind that the latter
strategy can considerably slow down the acquisition process, although it is
advisable for a sample of good quality.
Parameters:
sampler – Sampler to generate candidate points. Stored in
sampler. (default: None)
avoid_clusters (bool) – When True, use a strategy that avoids points too
close to already chosen ones. Stored in avoid_clusters. (default: True)
Run a global optimization procedure to try to find a point that has the
highest expected improvement for the Gaussian Process.
Moreover, if ybest isn’t provided, run a global optimization procedure
to find the minimum value of the surrogate model. Use the minimum point
as a candidate for this acquisition.
This implementation only works for continuous design variables.
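For reference, the standard closed-form expected improvement for a Gaussian Process posterior with mean mu and standard deviation sigma, in the minimization convention; a sketch of what the maximization step above evaluates (the helper name is hypothetical):

```python
import numpy as np
from scipy.stats import norm


def expected_improvement(mu, sigma, ybest):
    """EI = (ybest - mu) * Phi(z) + sigma * phi(z), z = (ybest - mu) / sigma."""
    mu, sigma = np.asarray(mu, float), np.asarray(sigma, float)
    sigma = np.maximum(sigma, 1e-12)  # guard against zero posterior variance
    z = (ybest - mu) / sigma
    return (ybest - mu) * norm.cdf(z) + sigma * norm.pdf(z)
```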
Obtain sample points that are local minima of the surrogate model.
This implementation is based on the one of MISO-MS used in the paper [6].
The original method, Multi-level Single-Linkage, was described in [7].
In each iteration, the algorithm generates a pool of candidates and selects
the best candidates (lowest predicted value) that are far enough from each
other. The number of candidates chosen as well as the distance threshold
vary with each iteration. The hypothesis is that the successful candidates
each belong to a different region in the space, which may contain a local
minimum, and those regions cover the whole search space. Subsequently, the
algorithm runs multiple local minimization procedures using the successful
candidates as initial guesses. The results of the minimization are
collected for the new sample.
Parameters:
nCand (int) – Number of candidates used on each iteration.
rtol (float) – Minimum distance between a candidate point and the
previously selected points relative to the domain size. Default is 1e-3. (default: 0.001)
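One reading of the candidate selection step, keeping the lowest-value candidates subject to a pairwise distance threshold, can be sketched greedily (the helper name and the greedy order are assumptions, not the library's exact rule):

```python
import numpy as np


def select_spread_candidates(candidates, values, k, min_dist):
    """Greedily keep up to k lowest-value candidates that are at least
    min_dist apart from every already-chosen candidate."""
    chosen = []
    for i in np.argsort(values):  # visit candidates from best to worst
        if all(np.linalg.norm(candidates[i] - candidates[j]) >= min_dist
               for j in chosen):
            chosen.append(i)
            if len(chosen) == k:
                break
    return candidates[chosen]
```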
Obtain sample points that fill gaps in the Pareto front from [8].
The algorithm proceeds as follows to find each new point:
Find a target value \(\tau\) that should fill a gap in the Pareto
front. Make sure to use a target value that wasn’t used before.
Solve a multi-objective optimization problem that minimizes
\(\|s_i(x)-\tau\|\) for all \(x\) in the search space, where
\(s_i(x)\) is the i-th target value predicted by the surrogate for
\(x\).
If a Pareto-optimal solution was found for the problem above, choose the
point that minimizes the L1 distance to \(\tau\) to be part of the
new sample.
Parameters:
optimizer – Continuous multi-objective optimizer. If None, use
NSGA2 from pymoo.
mi_optimizer – Mixed-integer multi-objective optimizer. If None, use
MixedVariableGA from pymoo with RankAndCrowding survival strategy.
oldTV – Old target values to be avoided in the acquisition.
Copied to oldTV. (default: ())
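The final selection step, choosing the Pareto-optimal solution closest to the target in L1 distance, can be sketched as (names are hypothetical):

```python
import numpy as np


def closest_to_target(pareto_solutions, pareto_objectives, tau):
    """Among the Pareto-optimal solutions of the auxiliary problem, return
    the one whose objective vector minimizes the L1 distance to tau."""
    i = np.argmin(np.abs(pareto_objectives - tau).sum(axis=1))
    return pareto_solutions[i]
```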
Find a target value that should fill a gap in the Pareto front.
As suggested by Mueller (2017), the algorithm fits a linear RBF
model with the points in the Pareto front. This will represent the
(d-1)-dimensional Pareto front surface. Then, the algorithm searches for
a value on the surface that maximizes the distances to previously
selected target values and to the training points of the RBF model. This
value is projected into the d-dimensional space to obtain \(\tau\).
Parameters:
paretoFront (ndarray) – Pareto front in the objective space.
Target value acquisition function for the RBF model based on [9], [10],
and [11].
Every iteration of the algorithm sequentially chooses a number from 0 to
cycleLength + 1 (inclusive) and runs one of the procedures:
Inf-step (0): Selects a sample point that minimizes the
\(\mu\) measure, i.e., mu_measure(). The point selected is
the farthest from the current sample using the kernel measure.
Global search (1 to cycleLength): Minimizes the product of the
\(\mu\) measure and the distance to a target value. The target value
is based on the distance to the current minimum of the surrogate. The
described measure is known as the ‘bumpiness measure’.
Local search (cycleLength + 1): Minimizes the bumpiness measure with
a target value equal to the current minimum of the surrogate. If the
current minimum is already represented by the training points of the
surrogate, do a global search with a target value slightly smaller than
the current minimum.
After each sample point is chosen, we verify how close it is to the current
sample. If it is too close, we replace it by a random point in the domain
drawn from a uniform distribution. This strategy was proposed in [12].
Parameters:
cycleLength (int) – Length of the global search cycle. Stored in
cycleLength. (default: 6)
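The mapping from the iteration counter to the three procedures above can be sketched as a simple modular cycle (a minimal interpretation of the description, not the library's exact bookkeeping):

```python
def cycle_step(iteration: int, cycleLength: int = 6) -> int:
    """Map the iteration counter to the procedure index:
    0 -> inf-step, 1..cycleLength -> global search,
    cycleLength + 1 -> local search."""
    return iteration % (cycleLength + 2)
```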
The bumpiness measure \(g_y\) was first defined by Gutmann (2001),
with suggestions for its use in global optimization with RBF functions. Gutmann
notes that \(g_y(x)\) tends to infinity
when \(x\) tends to a training point of the surrogate, and so they
use \(-1/g_y(x)\) for the minimization problem. Björkman and
Holmström use \(-\log(1/g_y(x))\), which is the same as minimizing
\(\log(g_y(x))\), to avoid a flat minimum. This option seems to
slow down convergence rates for \(g_y(x)\) in [0,1] since it
increases distances in that range.
The present implementation uses genetic algorithms by default, so there
is no point in trying to make \(g_y\) smoother.
Transition search acquisition function as described in [13].
This acquisition function is used to find new sample points by perturbing
the current best sample point and uniformly selecting points from the
domain. The scoreWeight parameter can be used to control the transition
from local to global search. A scoreWeight close to 1.0 will favor
the predicted function value (local search), while a scoreWeight close to
0.0 will favor the distance to previously sampled points (global search).
The evaluability of candidate points is predicted using the evaluability
surrogate model. If the evaluability probability of a candidate point
is below a threshold, the point is discarded. If no candidate points
remain after this filtering, all candidate points are considered
evaluable.
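This filtering rule, with its keep-everything fallback, can be sketched as (the helper name is hypothetical):

```python
import numpy as np


def filter_evaluable(candidates, eval_prob, threshold=0.25):
    """Discard candidates whose predicted evaluability probability falls
    below the threshold; keep all candidates if nothing would remain."""
    mask = np.asarray(eval_prob) >= threshold
    return candidates[mask] if mask.any() else candidates
```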
The candidate points are scored using a weighted value of their predicted
function value and the distance to previously sampled points. The candidate
with the best total score is selected as the new sample point.
Generate candidate points by perturbing the current best point and
uniformly sampling from the bounds. A total of 2 * nCand candidate
points are generated, where the first nCand points are perturbations of
the current best point and the second nCand points are uniformly sampled
from the bounds.
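A sketch of this candidate generation, assuming a Gaussian perturbation whose step size sigma is a hypothetical fraction of the domain width:

```python
import numpy as np


def generate_candidates(xbest, bounds, nCand, sigma=0.1, seed=None):
    """First nCand points: Gaussian perturbations of xbest clipped to the
    bounds. Second nCand points: uniform samples from the bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, float).T
    dim = lo.size
    local = np.clip(
        xbest + sigma * (hi - lo) * rng.standard_normal((nCand, dim)), lo, hi
    )
    uniform = rng.uniform(lo, hi, size=(nCand, dim))
    return np.vstack([local, uniform])
```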
Parameters:
surrogateModel (Surrogate) – Surrogate model for the objective function.
bounds – List with the limits [x_min, x_max] of each direction.
nCand (int) – Number of candidate points to be generated.
xbest (ndarray) – Current best point. If None, use the best point from the
surrogate model. (default: None)
This acquisition function generates candidate points by perturbing the
current best point and uniformly sampling from the bounds. It then
selects the best candidate point based on a weighted score that combines
the predicted function value and the distance to previously sampled
points.
Parameters:
surrogateModel (Surrogate) – Surrogate model for the objective function.
bounds – List with the limits [x_min, x_max] of each direction.
n (int) – Number of points to be acquired. (default: 1)
evaluabilitySurrogate (Surrogate) – Surrogate model for the evaluability
probability of the candidate points. If provided, candidates with
evaluability probability below the threshold are discarded. (default: None)
evaluabilityThreshold (float) – Threshold for the evaluability probability. (default: 0.25)
scoreWeight (float) – Weight for the predicted function value and distance
scores in the total score. The total score is computed as:
scoreWeight * valueScore + (1 - scoreWeight) * distanceScore. (default: 0.5)
Select the best candidate points based on the predicted function
value and distance to previously sampled points. The candidates are
scored using a weighted score that combines the predicted function
value and the distance to previously sampled points.
Parameters:
surrogateModel (Surrogate) – Surrogate model for the objective function.
candidates (ndarray) – Array of candidate points.
bounds – List with the limits [x_min, x_max] of each direction.
n (int) – Number of best candidates to return. (default: 1)
scoreWeight (float) – Weight for the predicted function value and distance
scores in the total score. (default: 0.5)
evaluabilityThreshold (float) – Threshold for the evaluability
probability. Candidates with evaluability probability below this
threshold are discarded. (default: 0.25)
evaluabilitySurrogate (Surrogate) – Surrogate model for the evaluability
probability of the candidate points. If provided, candidates with
evaluability probability below the threshold are discarded. (default: None)
Select candidates based on the minimization of a weighted average score.
The weighted average is \(w f_s(x) + (1-w) (-d_s(x))\), where
\(f_s(x)\) is the surrogate value at \(x\) and \(d_s(x)\) is the
distance of \(x\) to its closest neighbor in the current sample. Both
values are scaled to the interval [0, 1], based on the maximum and minimum
values for the pool of candidates. The sampler generates the candidate
points to be scored and then selected.
This acquisition method also handles multi-objective optimization,
following the random perturbation strategy in [14] and [15]. More
specifically, the
algorithm takes the average value among the predicted target values given by
the surrogate. In other words, \(f_s(x)\) is the average value of
the target components of the surrogate model evaluated at \(x\).
Parameters:
sampler (Sampler) – Sampler to generate candidate points.
Stored in sampler.
weightpattern (float|sequence) – Weight(s) w to be used in the score.
Stored in weightpattern.
The default value is [0.2, 0.4, 0.6, 0.9, 0.95, 1].
maxeval (int) – Description
Stored in maxeval. (default: 0)
bounds (sequence) – List with the limits [x_min,x_max] of each
direction x in the space.
n (int) – Number of points requested. (default: 1)
xbest – Best point so far. Used if sampler is an instance
of soogo.sampling.NormalSampler. If not provided,
compute it based on the training data for the surrogate.
Return type:
ndarray
Returns:
m-by-dim matrix with the selected points, where m <= n.
Computes the weighted score from the predicted value of the surrogate
model at a point and the minimum distance from the point to the set of
previously selected evaluation points.
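The weighted score \(w f_s(x) + (1-w)(-d_s(x))\), with both criteria scaled to [0, 1] over the candidate pool, can be sketched as below (here the distance term is written as 1 minus the scaled distance, which gives the same ranking; lower scores are better):

```python
import numpy as np


def weighted_score(pred_values, min_dists, w):
    """Combine surrogate predictions and minimum distances, each scaled
    to [0, 1] over the candidate pool; lower scores are better."""
    def unit_scale(v):
        v = np.asarray(v, float)
        span = v.max() - v.min()
        return (v - v.min()) / span if span > 0 else np.zeros_like(v)

    # Low predicted value and large distance both push the score down.
    return w * unit_scale(pred_values) + (1 - w) * (1 - unit_scale(min_dists))
```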