GP
src.geostat.model.GP
dataclass
Gaussian Process (GP) model class with a mean function and a kernel.
This class represents a Gaussian Process with specified mean and kernel functions. If no mean is provided, a zero mean is used by default. The kernel must always be specified. The class supports addition to combine two GP models, and it allows gathering variables from the mean and kernel.
Parameters:
-
mean
(Mean
, default:None
) –The mean function of the Gaussian Process. If not provided or set to 0, a ZeroTrend is used as the default mean.
-
kernel
(Kernel
, default:None
) –The kernel function of the Gaussian Process. This parameter is required.
Details:
This is how to specify a GP with a squared exponential kernel and superimposed uncorrelated noise:
import geostat.kernel as krn
from geostat import GP, Model, Parameters
p = Parameters(range=1., sill=1., nugget=1.)
kernel = krn.SquaredExponential(range=p.range, sill=p.sill) + krn.Noise(nugget=p.nugget)
gp = GP(0, kernel)
To use the GP, it must be wrapped in a model:
This model object can then be used to generate synthetic data,
fit its parameters to provided data, or make predictions, see
Model
.
In Geostat, GPs can be defined on locations in Euclidean space
of any dimension \(\mathbb{R}^D\), with the number of dimensions
specified implicitly by the shape of the location matrix given
to fit()
or generate()
in the locs
argument.
GPs can also be defined on locations in \(\mathbb{R}^D \times
\mathbb{Z}\) using the Mix
operator.
This construction is for modeling multiple spatial quantities,
with each quantity occupying a different 'plane' of the space.
When multiple spatial quantities are involved, these are specified
in the cats
argument of fit()
, generate()
or predict()
.
GPs can be superimposed:
Examples:
A linear regression is a special case of GP regression that can be modeled by Geostat. Suppose $$ u_i = \beta_1 + \beta_2 x_i + \beta_3 y_i + \beta_4 x_i^2 \ + \beta_5 x_i y_i + \beta_6 y_i^2 + \epsilon_i $$ where \(u_i\) is an observation, \(x_i\) and \(y_i\) are model inputs, \(\epsilon_i \sim \mathcal{N}(0, \sigma^2)\) describes observation noise, and \(\beta_1, \ldots, \beta_6\) are regression coefficients. Geostat can be used to fit this regression (though not in the most efficient way):
@featurizer
def trend_terms(x, y):
return 1, x, y, x*x, x*y, y*y
p = Parameters(beta=np.zeros([6]), sigma2=1.)
mean = mn.Trend(trend_terms, beta=p.beta)
kernel = krn.Noise(nugget=p.sigma2)
gp = GP(mean, kernel)
Model(gp).fit(locs, vals) # locs.shape = [N, 2], vals.shape = [N]
Source code in src/geostat/model.py
31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 |
|