soogo.model.gp module

Gaussian process module.

class soogo.model.gp.GaussianProcess(scaler=None, **kwargs) → None

Bases: Surrogate

Gaussian Process model.

This model uses default attributes and parameters from GaussianProcessRegressor with the following exceptions:

kernel: Default is sklearn.gaussian_process.kernels.RBF().
optimizer: Default is _optimizer().
normalize_y: Default is True.
n_restarts_optimizer: Default is 10.

For the remaining attributes and parameters, see the GaussianProcessRegressor documentation at https://scikit-learn.org/dev/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html.
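For orientation, the documented defaults correspond roughly to the direct scikit-learn configuration sketched here. This is not soogo's implementation: the package's private _optimizer() is not reproduced, so scikit-learn's built-in optimizer is used instead, and the training data and random_state are made up for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Illustrative training data (assumption): 8 points in 2 dimensions.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(8, 2))
y_train = np.sin(3.0 * X_train[:, 0]) + X_train[:, 1] ** 2

# Mirror the documented defaults. The optimizer is left at scikit-learn's
# default because soogo's _optimizer() is private and not shown in the docs.
gpr = GaussianProcessRegressor(
    kernel=RBF(),
    normalize_y=True,
    n_restarts_optimizer=10,
    random_state=0,
)
gpr.fit(X_train, y_train)

# After fitting, kernel_ holds the kernel with optimized hyperparameters.
fitted_length_scale = gpr.kernel_.length_scale
```

Any other GaussianProcessRegressor keyword would presumably be forwarded through **kwargs in the same way, though the source does not spell this out.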

Parameters: scaler – Scaler for the input data. For details, see https://scikit-learn.org/stable/modules/preprocessing.html. (default: None)

scaler: Scaler used to preprocess input data.

model: The underlying GaussianProcessRegressor model instance. This is initialized with the provided parameters and can be accessed for further customization or inspection.

property X: ndarray

Get the training data points.

Returns: m-by-d matrix with m training points in a d-dimensional space.

property Y: ndarray

Get f(x) for the sampled points.

__call__(x: ndarray, i: int = -1, return_std: bool = False, return_cov: bool = False)

Evaluates the model at one or multiple points.

Parameters:

x (ndarray) – m-by-d matrix with m point coordinates in a d-dimensional space.
i (int) – Index of the target dimension to evaluate. If -1, evaluate all. (default: -1)
return_std (bool) – If True, returns the standard deviation of the predictions. (default: False)
return_cov (bool) – If True, returns the covariance of the predictions. (default: False)

Returns:

m-by-n matrix with m predictions.
If return_std is True, the second output is an m-by-n matrix with the standard deviations.
If return_cov is True, the third output is an m-by-m matrix with the covariances if n=1; otherwise it is an m-by-m-by-n array.
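The relationship between the two uncertainty outputs can be seen with the underlying scikit-learn regressor. The sketch below calls GaussianProcessRegressor.predict directly rather than soogo's __call__, so the single-target shapes are scikit-learn's (1-D arrays of length m) and may differ from the wrapper's m-by-n layout; the data are invented for illustration.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
X_train = rng.uniform(0.0, 1.0, size=(6, 2))  # 6 training points, d = 2
y_train = np.cos(4.0 * X_train[:, 0]) + X_train[:, 1]

gpr = GaussianProcessRegressor(kernel=RBF(), normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)

X_query = rng.uniform(0.0, 1.0, size=(4, 2))  # m = 4 query points

# scikit-learn allows only one of return_std / return_cov per call.
mean, std = gpr.predict(X_query, return_std=True)
_, cov = gpr.predict(X_query, return_cov=True)

# The standard deviations are the square roots of the covariance diagonal
# (clipped at zero to guard against tiny negative round-off).
std_from_cov = np.sqrt(np.clip(np.diag(cov), 0.0, None))
```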

check_initial_design(sample: ndarray) → bool

Check if the sample is able to generate a valid surrogate.

Parameters: sample (ndarray) – m-by-d matrix with m training points in a d-dimensional space.
Return type: bool

eval_kernel(x, y=None)

Evaluate the kernel function at a pair (x,y).

The kernel has the same structure as the one passed as a parameter, but with optimized hyperparameters.

Parameters:

x – First entry in the tuple (x,y).
y – Second entry in the tuple (x,y). If None, use x. (default: None)

Returns:

Kernel evaluation result.
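In scikit-learn terms, the kernel with optimized hyperparameters is the fitted regressor's kernel_ attribute, which can be called on one or two point sets. The sketch below uses scikit-learn directly on invented data; it assumes, without confirmation from the source, that eval_kernel behaves similarly, and it does not model any input scaling the wrapper may apply.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(2)
X_train = rng.uniform(-2.0, 2.0, size=(10, 1))
y_train = np.sin(X_train[:, 0])

gpr = GaussianProcessRegressor(kernel=RBF(), normalize_y=True, random_state=0)
gpr.fit(X_train, y_train)

A = np.array([[0.0], [0.5]])
B = np.array([[0.0], [1.0], [2.0]])

# Pairwise kernel values between the rows of A and the rows of B.
K_ab = gpr.kernel_(A, B)

# With the second argument omitted, the kernel of A with itself is returned.
K_aa = gpr.kernel_(A)
```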

expected_improvement(x, ybest)
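The source gives no description for this method. For reference, the textbook expected-improvement criterion for minimization is sketched below using only the standard library. Whether soogo uses this exact sign convention, and how it treats a zero predicted standard deviation, are assumptions; the helper name expected_improvement_min and its scalar arguments mu and sigma are hypothetical.

```python
import math


def expected_improvement_min(mu: float, sigma: float, ybest: float) -> float:
    """Textbook EI for minimization at a single point.

    mu and sigma are the predictive mean and standard deviation at the
    point, and ybest is the best (lowest) objective value seen so far.
    """
    if sigma <= 0.0:
        # No predictive uncertainty: the improvement is deterministic.
        return max(ybest - mu, 0.0)
    z = (ybest - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (ybest - mu) * cdf + sigma * pdf
```

In this form, mu and sigma would come from a model evaluation with return_std=True.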

property iindex: tuple[int, ...]

Return iindex, the sequence of integer variable indexes.

min_design_space_size(dim: int) → int

Return the minimum design space size for a given space dimension.

Return type: int

reserve(n: int, dim: int, ntarget: int = 1) → None

Reserve space for training data.

Parameters:

n (int) – Number of training points to reserve.
dim (int) – Dimension of the input space.
ntarget (int) – Dimension of the target space. (default: 1)

Return type:

None

reset_data() → None

Reset the surrogate model training data.

This method is used to clear the training data of the surrogate model, allowing it to be reused for a new optimization run.

Return type: None

update(Xnew, ynew) → None

Updates the model with new pairs of data (x,y).

When the default optimizer, _optimizer(), is used, this routine reports warnings differently from sklearn.gaussian_process.GaussianProcessRegressor.fit(). The latter reports any convergence failure in L-BFGS-B. This implementation reports only the last convergence failure among the multiple L-BFGS-B runs, and only if all the runs fail. The number of optimization runs is n_restarts_optimizer + 1.

Parameters:

Xnew – m-by-d matrix with m point coordinates in a d-dimensional space.
ynew – Function values on the sampled points.

Return type:

None
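The warning policy described for update() can be sketched as follows. This is an illustration of the stated behavior, not soogo's code: the function best_of_restarts, the RunResult tuple layout, and the warning text are all hypothetical, and each optimizer run is represented by a plain callable instead of a real L-BFGS-B call. Taking the minimum over all runs, including failed ones, is also an assumption.

```python
import warnings
from typing import Callable, List, Optional, Tuple

# Hypothetical result of one optimizer run:
# (objective value, converged flag, failure message).
RunResult = Tuple[float, bool, str]


def best_of_restarts(runs: List[Callable[[], RunResult]]) -> float:
    """Return the lowest objective value; warn only if every run failed."""
    best_value = float("inf")
    any_converged = False
    last_failure: Optional[str] = None

    for run in runs:
        value, converged, message = run()
        if converged:
            any_converged = True
        else:
            last_failure = message  # keep only the most recent failure
        best_value = min(best_value, value)

    # Stay silent if at least one run converged; otherwise report the
    # last failure once.
    if not any_converged and last_failure is not None:
        warnings.warn(
            f"All {len(runs)} optimizer runs failed; last failure: {last_failure}",
            RuntimeWarning,
        )
    return best_value
```

With n_restarts_optimizer = 10, the list would hold 11 runs: the initial one plus ten restarts.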