NormalizingFeaturizer class for producing normalized feature matrices (F matrix) with an intercept.
The NormalizingFeaturizer takes raw location data and applies a specified featurization function.
It normalizes the resulting features using the mean and standard deviation computed from the
original data, stores these normalization parameters for reuse on new data, and prepends an
intercept feature (a column of ones) to the matrix.
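The transformation can be written compactly as follows. This is a sketch read off the source listing for this class; the symbols $\Phi$ (the featurization function), $\mu_j$, $\sigma_j$, $n$ (number of initial locations) and $k$ (number of features) are notation introduced here for illustration, not names used by the library.

```latex
\mu_j = \frac{1}{n}\sum_{i=1}^{n} \Phi_j(x_i), \qquad
\sigma_j = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \bigl(\Phi_j(x_i) - \mu_j\bigr)^2}, \qquad
F(x) = \Bigl[\, 1,\ \frac{\Phi_1(x) - \mu_1}{\sigma_1},\ \dots,\ \frac{\Phi_k(x) - \mu_k}{\sigma_k} \Bigr]
```

Here $x_1, \dots, x_n$ are the locations passed at construction, and $x$ is any location featurized later.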
Parameters:
- featurization (Callable) – A function or callable that defines how the input location data is featurized.
- locs (array-like or Tensor) – The location data used to compute the normalization parameters (mean and standard
deviation). New data featurized later is normalized with these same parameters.
Examples:
Creating a NormalizingFeaturizer using a custom featurization function and location data:

import tensorflow as tf
from geostat.model import NormalizingFeaturizer

# Define a simple featurization function
def custom_featurizer(x, y):
    return x, y, x * y

# Sample location data
locs = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# Create the NormalizingFeaturizer
norm_featurizer = NormalizingFeaturizer(custom_featurizer, locs)

Using the NormalizingFeaturizer to featurize new location data:

new_locs = tf.constant([[7.0, 8.0], [9.0, 10.0]])
F_matrix = norm_featurizer(new_locs)
print(F_matrix)  # F_matrix will contain normalized features with an additional intercept column
# tf.Tensor(
# [[1.        2.4494898 2.4494898 3.5676992]
#  [1.        3.6742349 3.6742349 6.50242  ]], shape=(2, 4), dtype=float32)
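
To see where the printed numbers come from, the same arithmetic can be reproduced without TensorFlow. The following is a minimal sketch using only the Python standard library. The helper names `featurize`, `fit_stats` and `normalize` are hypothetical and exist only for this illustration; they are not part of geostat. It assumes the population standard deviation (dividing by n), which is what reproduces the values shown above.

```python
from statistics import fmean, pstdev


def custom_featurizer(x, y):
    # Same featurization as in the example above: x, y and their product.
    return x, y, x * y


def featurize(locs):
    # Hypothetical helper: one row of raw features per location.
    return [list(custom_featurizer(x, y)) for x, y in locs]


def fit_stats(locs):
    # Hypothetical helper: per-feature mean and population standard deviation.
    columns = list(zip(*featurize(locs)))
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means, stds


def normalize(locs, means, stds):
    # Hypothetical helper: prepend the intercept (1.0), then standardize
    # each feature with the statistics computed from the initial data.
    return [
        [1.0] + [(f - m) / s for f, m, s in zip(row, means, stds)]
        for row in featurize(locs)
    ]


locs = [(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)]
new_locs = [(7.0, 8.0), (9.0, 10.0)]

means, stds = fit_stats(locs)
F_matrix = normalize(new_locs, means, stds)
for row in F_matrix:
    print([round(v, 4) for v in row])
# [1.0, 2.4495, 2.4495, 3.5677]
# [1.0, 3.6742, 3.6742, 6.5024]
```

For instance, the first feature of the initial data has mean 3 and standard deviation sqrt(8/3) ≈ 1.633, so the new value 7 maps to (7 - 3) / 1.633 ≈ 2.4495.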
Notes:
- The normalization parameters (unnorm_mean and unnorm_std) are computed once, per feature, from the
featurized locs data provided during initialization.
- The __call__ method applies this stored normalization and adds an intercept feature when used
to featurize new location data.
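
One consequence of fixing the parameters at initialization is worth checking before use. The source listing for this class divides by `unnorm_std` directly, so a feature that is constant over the initial locs has a standard deviation of zero and its normalized values are not finite. The sketch below illustrates one possible guard in plain Python. The helper `safe_scale` and its `eps` threshold are hypothetical and are not part of geostat.

```python
from statistics import fmean, pstdev


def safe_scale(column, eps=1e-12):
    """Hypothetical helper: standardize a column, mapping constant columns to zeros."""
    mean = fmean(column)
    std = pstdev(column)
    if std < eps:
        # Constant feature: centering already gives zeros, so skip the division.
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


varying = safe_scale([1.0, 3.0, 5.0])
constant = safe_scale([2.0, 2.0, 2.0])
print(constant)  # [0.0, 0.0, 0.0]
```

A simpler alternative is to drop features that cannot vary from the featurization function, since a constant feature is redundant once the intercept column is present.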
Source code in src/geostat/model.py
class NormalizingFeaturizer:
    """
    NormalizingFeaturizer class for producing normalized feature matrices (F matrix) with an intercept.

    The `NormalizingFeaturizer` takes raw location data and applies a specified featurization function.
    It normalizes the resulting features using the mean and standard deviation computed from the
    original data, stores these normalization parameters for reuse on new data, and prepends an
    intercept feature (a column of ones) to the matrix.

    Parameters:
        featurization (Callable):
            A function or callable that defines how the input location data is featurized.
        locs (array-like or Tensor):
            The location data used to compute the normalization parameters (mean and standard
            deviation). New data featurized later is normalized with these same parameters.

    Examples:
        Creating a `NormalizingFeaturizer` using a custom featurization function and location data:

        ```python
        import tensorflow as tf
        from geostat.model import NormalizingFeaturizer

        # Define a simple featurization function
        def custom_featurizer(x, y):
            return x, y, x * y

        # Sample location data
        locs = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

        # Create the NormalizingFeaturizer
        norm_featurizer = NormalizingFeaturizer(custom_featurizer, locs)
        ```

        Using the `NormalizingFeaturizer` to featurize new location data:

        ```python
        new_locs = tf.constant([[7.0, 8.0], [9.0, 10.0]])
        F_matrix = norm_featurizer(new_locs)
        print(F_matrix)  # F_matrix will contain normalized features with an additional intercept column
        # tf.Tensor(
        # [[1.        2.4494898 2.4494898 3.5676992]
        #  [1.        3.6742349 3.6742349 6.50242  ]], shape=(2, 4), dtype=float32)
        ```

    Notes:
        - The normalization parameters (`unnorm_mean` and `unnorm_std`) are computed once, per feature,
            from the featurized `locs` data provided during initialization.
        - The `__call__` method applies this stored normalization and adds an intercept feature when used
            to featurize new location data.
    """
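
    def __init__(self, featurization, locs):
        # Wrap the user-supplied function; Featurizer is defined or imported elsewhere in this module.
        self.unnorm_featurizer = Featurizer(featurization)
        F_unnorm = self.unnorm_featurizer(locs)
        # Per-feature statistics over the initial locations, reused on every call.
        self.unnorm_mean = tf.reduce_mean(F_unnorm, axis=0)
        self.unnorm_std = tf.math.reduce_std(F_unnorm, axis=0)

    def __call__(self, locs):
        # Intercept column: a single 1.0 for each input location.
        ones = tf.ones([tf.shape(locs)[0], 1], dtype=tf.float32)
        F_unnorm = self.unnorm_featurizer(locs)
        # Standardize with the stored statistics, not those of the new data.
        F_norm = (F_unnorm - self.unnorm_mean) / self.unnorm_std
        return tf.concat([ones, F_norm], axis=1)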