Featurizer class for producing feature matrices (F matrix) from location data.
The `Featurizer` applies a specified featurization function to the input location data
and generates the corresponding feature matrix. If no featurization function is provided,
it produces an empty feature matrix with one row per location and zero columns.
Parameters:
- featurization (Callable or None): A function that takes the individual components of the location data, one positional argument per coordinate column, and returns the features. If set to `None`, the featurizer produces an empty feature matrix (one row per location, zero columns).
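To make the calling convention concrete, here is a minimal sketch of how the location columns could be unpacked into positional arguments. It uses NumPy as a stand-in for the TensorFlow call in the source listing (`tf.unstack(locs, axis=1)`); the helper name `unpack_and_call` is hypothetical and is not part of geostat.

```python
import numpy as np

def unpack_and_call(featurization, locs):
    """Call `featurization` with one positional argument per coordinate column.

    Hypothetical helper for illustration only; NumPy stands in for TensorFlow.
    """
    locs = np.asarray(locs, dtype=np.float32)
    columns = list(locs.T)  # stand-in for tf.unstack(locs, axis=1)
    return featurization(*columns)

def simple_featurizer(x, y):
    # Receives the first column as x and the second column as y.
    return x, y, x * y

locs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
x, y, xy = unpack_and_call(simple_featurizer, locs)
print(xy.tolist())  # [2.0, 12.0, 30.0]
```

A featurization function therefore needs as many parameters as the location data has columns: two for 2-D locations, three for 3-D locations, and so on.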
Examples:
Creating a `Featurizer` using a custom featurization function:
import tensorflow as tf
from geostat.model import Featurizer
# Define a custom featurization function
def simple_featurizer(x, y):
    return x, y, x * y
# Initialize the Featurizer
featurizer = Featurizer(simple_featurizer)
Using the `Featurizer` to transform location data:
locs = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
F_matrix = featurizer(locs)
print(F_matrix) # F_matrix will contain the features: (x, y, x*y) for each location
# tf.Tensor(
# [[ 1. 2. 2.]
# [ 3. 4. 12.]
# [ 5. 6. 30.]], shape=(3, 3), dtype=float32)
Handling the case where no featurization is provided:
featurizer_no_feat = Featurizer(None)
F_matrix = featurizer_no_feat(locs)
print(F_matrix) # Since no featurization function is provided, F_matrix will have shape (3, 0)
# tf.Tensor([], shape=(3, 0), dtype=float32)
Notes:
- The `__call__` method is used to apply the featurization to input location data.
- If `featurization` returns a tuple, it is assumed to represent multiple features, which will be stacked to form the feature matrix.
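The stacking described in the notes can be sketched as follows. This is a NumPy approximation of the logic in the source listing, not the library's code: `stack_features` is a hypothetical name, and the handling of a single non-tuple feature (adding a trailing axis so it becomes one column) is an assumption, because the helper the source uses for that case is not shown.

```python
import numpy as np

def stack_features(feats, n):
    """Turn the return value of a featurization into an (n, k) matrix.

    Hypothetical NumPy sketch; the library itself uses TensorFlow ops.
    """
    if isinstance(feats, tuple):
        if len(feats) == 0:
            # No features: n rows, zero columns.
            return np.ones((n, 0), dtype=np.float32)
        # Broadcast each feature to length n, so scalar constants are allowed.
        cols = [np.broadcast_to(np.asarray(f, dtype=np.float32), (n,)) for f in feats]
        return np.stack(cols, axis=1)
    # Assumption: a single feature becomes a single column.
    return np.asarray(feats, dtype=np.float32)[..., np.newaxis]

x = np.array([1.0, 3.0, 5.0], dtype=np.float32)
F = stack_features((1.0, x, x * x), n=3)
print(F.shape)                        # (3, 3)
print(stack_features((), n=3).shape)  # (3, 0)
print(stack_features(x, n=3).shape)   # (3, 1)
```

Broadcasting each entry before stacking is what lets a featurization return a constant such as `1.0` as an intercept column alongside per-location features.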
Source code in src/geostat/model.py
class Featurizer:
    """
    Featurizer class for producing feature matrices (F matrix) from location data.

    The `Featurizer` applies a specified featurization function to the input location data
    and generates the corresponding feature matrix. If no featurization function is provided,
    it produces an empty feature matrix with one row per location and zero columns.

    Parameters:
        featurization (Callable or None):
            A function that takes in the individual components of location data and returns the features.
            If set to `None`, the featurizer will produce an empty feature matrix (zero columns).

    Examples:
        Creating a `Featurizer` using a custom featurization function:

        ```python
        import tensorflow as tf
        from geostat.model import Featurizer

        # Define a custom featurization function
        def simple_featurizer(x, y):
            return x, y, x * y

        # Initialize the Featurizer
        featurizer = Featurizer(simple_featurizer)
        ```

        Using the `Featurizer` to transform location data:

        ```python
        locs = tf.constant([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
        F_matrix = featurizer(locs)
        print(F_matrix) # F_matrix will contain the features: (x, y, x*y) for each location
        # tf.Tensor(
        # [[ 1. 2. 2.]
        # [ 3. 4. 12.]
        # [ 5. 6. 30.]], shape=(3, 3), dtype=float32)
        ```

        Handling the case where no featurization is provided:

        ```python
        featurizer_no_feat = Featurizer(None)
        F_matrix = featurizer_no_feat(locs)
        print(F_matrix) # Since no featurization function is provided, F_matrix will have shape (3, 0)
        # tf.Tensor([], shape=(3, 0), dtype=float32)
        ```

    Notes:
        - The `__call__` method is used to apply the featurization to input location data.
        - If `featurization` returns a tuple, it is assumed to represent multiple features,
          which will be stacked to form the feature matrix.
    """

    def __init__(self, featurization):
        self.featurization = featurization

    def __call__(self, locs):
        locs = tf.cast(locs, tf.float32)
        if self.featurization is None:  # No features.
            return tf.ones([tf.shape(locs)[0], 0], dtype=tf.float32)

        # Pass each coordinate column as a separate positional argument.
        feats = self.featurization(*tf.unstack(locs, axis=1))
        if isinstance(feats, tuple):  # Zero, one, or many features.
            if len(feats) == 0:
                return tf.ones([tf.shape(locs)[0], 0], dtype=tf.float32)
            # Broadcast every feature to one value per location, then stack as columns.
            feats = [tf.broadcast_to(tf.cast(f, tf.float32), [tf.shape(locs)[0]]) for f in feats]
            return tf.stack(feats, axis=1)
        else:  # One feature.
            # `e` is a helper defined or imported elsewhere in model.py (not shown in this listing).
            return e(feats)