soogo.model.gp module

Gaussian process module.

class soogo.model.gp.GaussianProcess(scaler=None, **kwargs) → None

Bases: Surrogate

Gaussian Process model.

This model uses default attributes and parameters from GaussianProcessRegressor with the following exceptions:

  • kernel: Default is sklearn.gaussian_process.kernels.RBF().

  • optimizer: Default is _optimizer().

  • normalize_y: Default is True.

  • n_restarts_optimizer: Default is 10.

Check other attributes and parameters for GaussianProcessRegressor at https://scikit-learn.org/dev/modules/generated/sklearn.gaussian_process.GaussianProcessRegressor.html.
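As a rough illustration of these defaults, the sketch below configures a plain scikit-learn GaussianProcessRegressor with the same kernel, normalize_y and n_restarts_optimizer values. It is a stand-in under stated assumptions, not the soogo class itself: the custom _optimizer() is internal to soogo and is not reproduced here, so scikit-learn's built-in optimizer is used instead. The training data and random_state are illustrative only.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Plain scikit-learn regressor mirroring the documented defaults.
# Assumption: soogo's _optimizer() is replaced by sklearn's default optimizer.
gpr = GaussianProcessRegressor(
    kernel=RBF(),
    normalize_y=True,
    n_restarts_optimizer=10,
    random_state=0,  # only to make the restarts reproducible in this sketch
)

# Toy 1-D training set (illustrative data only).
X_train = np.linspace(0.0, 1.0, 8).reshape(-1, 1)
y_train = np.sin(2.0 * np.pi * X_train).ravel()
gpr.fit(X_train, y_train)

# After fitting, kernel_ holds the kernel with optimized hyperparameters.
fitted_kernel = gpr.kernel_
mean = gpr.predict(X_train)
```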

Parameters:

scaler – Scaler for the input data. For details, see https://scikit-learn.org/stable/modules/preprocessing.html. (default: None)
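For orientation, the sketch below shows what a scikit-learn scaler does to input coordinates. How GaussianProcess applies the scaler internally is not described here, so this only illustrates the kind of object that could be passed as scaler; the choice of MinMaxScaler and the sample values are assumptions made for the example.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two input variables with very different ranges (illustrative values).
X = np.array([[0.0, 100.0], [5.0, 300.0], [10.0, 200.0]])

scaler = MinMaxScaler()
X_scaled = scaler.fit_transform(X)           # each column mapped to [0, 1]
X_back = scaler.inverse_transform(X_scaled)  # recovers the original coordinates
```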

scaler

Scaler used to preprocess input data.

model

The underlying GaussianProcessRegressor model instance. This is initialized with the provided parameters and can be accessed for further customization or inspection.

property X: ndarray

Get the training data points.

Returns:

m-by-d matrix with m training points in a d-dimensional space.

property Y: ndarray

Get f(x) for the sampled points.

__call__(x: ndarray, i: int = -1, return_std: bool = False, return_cov: bool = False)

Evaluates the model at one or multiple points.

Parameters:
  • x (ndarray) – m-by-d matrix with m point coordinates in a d-dimensional space.

  • i (int) – Index of the target dimension to evaluate. If -1, evaluate all. (default: -1)

  • return_std (bool) – If True, returns the standard deviation of the predictions. (default: False)

  • return_cov (bool) – If True, returns the covariance of the predictions. (default: False)

Returns:

  • m-by-n matrix with m predictions.

  • If return_std is True, the second output is an m-by-n matrix with the standard deviations.

  • If return_cov is True, the third output is an m-by-m matrix with the covariances if n=1; otherwise it is an m-by-m-by-n array.
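To make the relation between the three outputs concrete, here is a minimal NumPy sketch of a single-target (n=1) Gaussian process posterior. It is a textbook zero-mean formulation with a fixed RBF kernel, not the soogo implementation; the helper names rbf and gp_posterior and the noise value are hypothetical. In this sketch the mean and standard deviation are length-m vectors, and the standard deviation is the square root of the diagonal of the m-by-m covariance.

```python
import numpy as np

def rbf(a, b, length_scale=1.0):
    # Squared-exponential kernel between the rows of a and the rows of b.
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(X_train, y_train, X_query, noise=1e-8):
    # Zero-mean GP posterior for one target; noise is a small jitter term.
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    Ks = rbf(X_query, X_train)
    Kss = rbf(X_query, X_query)
    mean = Ks @ np.linalg.solve(K, y_train)
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)
    std = np.sqrt(np.clip(np.diag(cov), 0.0, None))
    return mean, std, cov

X_train = np.array([[0.0], [0.5], [1.0]])
y_train = np.sin(X_train).ravel()
X_query = np.array([[0.25], [0.75]])  # m = 2 query points, d = 1

mean, std, cov = gp_posterior(X_train, y_train, X_query)
```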

check_initial_design(sample: ndarray) → bool

Check if the sample is able to generate a valid surrogate.

Parameters:

sample (ndarray) – m-by-d matrix with m training points in a d-dimensional space.

Return type:

bool

eval_kernel(x, y=None)

Evaluate the kernel function at a pair (x,y).

The structure of the kernel is the same as the one passed as parameter but with optimized hyperparameters.

Parameters:
  • x – First entry in the tuple (x,y).

  • y – Second entry in the tuple (x,y). If None, use x. (default: None)

Returns:

Kernel evaluation result.
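The sketch below shows how a scikit-learn kernel object is evaluated on a pair of point sets, which is presumably what eval_kernel delegates to; that delegation is an assumption, not something this reference states. A fixed RBF length scale stands in for the optimized hyperparameters, which scikit-learn exposes as the kernel_ attribute of a fitted regressor.

```python
import numpy as np
from sklearn.gaussian_process.kernels import RBF

# Stand-in for a fitted kernel; the length scale is chosen for illustration.
kernel = RBF(length_scale=0.5)

x = np.array([[0.0, 0.0], [1.0, 0.0]])
y = np.array([[0.0, 1.0]])

K_xy = kernel(x, y)  # shape (2, 1): one row per point in x, one column per point in y
K_xx = kernel(x)     # shape (2, 2): omitting the second argument uses x for both
```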

expected_improvement(x, ybest)

property iindex: tuple[int, ...]

Return iindex, the sequence of integer variable indexes.

min_design_space_size(dim: int) → int

Return the minimum design space size for a given space dimension.

Return type:

int

reserve(n: int, dim: int, ntarget: int = 1) → None

Reserve space for training data.

Parameters:
  • n (int) – Number of training points to reserve.

  • dim (int) – Dimension of the input space.

  • ntarget (int) – Dimension of the target space. (default: 1)

Return type:

None

reset_data() → None

Reset the surrogate model training data.

This method is used to clear the training data of the surrogate model, allowing it to be reused for a new optimization run.

Return type:

None

update(Xnew, ynew) → None

Updates the model with new pairs of data (x,y).

When the default optimizer, _optimizer(), is used, this routine reports warnings differently from sklearn.gaussian_process.GaussianProcessRegressor.fit(). The latter reports any convergence failure in L-BFGS-B. This implementation reports only the last convergence failure among the multiple L-BFGS-B runs, and only if all of the runs fail. The number of optimization runs is n_restarts_optimizer + 1.

Parameters:
  • Xnew – m-by-d matrix with m point coordinates in a d-dimensional space.

  • ynew – Function values on the sampled points.

Return type:

None
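The warning policy described for update() can be sketched with SciPy as follows. This is a simplified illustration, not the code of _optimizer(): the function multi_start_lbfgsb, the toy objective, and the way starting points are drawn are all hypothetical. It runs L-BFGS-B from n_restarts_optimizer + 1 starting points, keeps the best result, and warns with the last failure message only when every run fails.

```python
import warnings
import numpy as np
from scipy.optimize import minimize

def multi_start_lbfgsb(fun, starts, bounds):
    """Run L-BFGS-B from each start; warn only if every run fails."""
    best, last_failure, n_failed = None, None, 0
    for x0 in starts:
        res = minimize(fun, x0, method="L-BFGS-B", bounds=bounds)
        if not res.success:
            n_failed += 1
            last_failure = res
        if best is None or res.fun < best.fun:
            best = res
    if n_failed == len(starts):
        # Every run failed: report only the last failure.
        warnings.warn(f"L-BFGS-B failed in all runs: {last_failure.message}")
    return best

def objective(x):
    # Toy objective with its minimum at (1.0, -0.5).
    return float((x[0] - 1.0) ** 2 + (x[1] + 0.5) ** 2)

n_restarts_optimizer = 3
rng = np.random.default_rng(0)
# One initial point plus n_restarts_optimizer random restarts.
starts = [np.zeros(2)] + list(rng.uniform(-2.0, 2.0, size=(n_restarts_optimizer, 2)))
result = multi_start_lbfgsb(objective, starts, bounds=[(-2.0, 2.0), (-2.0, 2.0)])
```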