tecton.Aggregation
Summary
This class describes a single aggregation that is applied in a batch or stream feature view.
Description
The Aggregation constructor accepts a function argument, which can be one of
the built-in aggregation functions. You can pass the name of a built-in
function as a string. Nulls are handled as in the corresponding Spark SQL
Function(column): for example, the sum of all nulls is null and the count of
all nulls is 0.
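The null-handling rule can be illustrated in plain Python. This is a minimal sketch of the semantics described above, not Tecton or Spark code; the helper names null_aware_sum and null_aware_count are hypothetical and exist only for this illustration.

```python
from typing import Optional, Sequence


def null_aware_sum(values: Sequence[Optional[float]]) -> Optional[float]:
    """Sum that ignores nulls and returns null when every input is null."""
    non_null = [v for v in values if v is not None]
    if not non_null:
        # No non-null inputs: the result is null, mirroring SUM(column).
        return None
    return sum(non_null)


def null_aware_count(values: Sequence[Optional[float]]) -> int:
    """Count of non-null inputs, mirroring COUNT(column)."""
    return sum(1 for v in values if v is not None)
```

With these helpers, an all-null input yields None for the sum and 0 for the count, while mixed input simply skips the nulls.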
In addition to numeric aggregations, Aggregation supports the last
non-distinct N and last distinct N aggregations, which compute the last N
non-distinct or distinct values of the column, ordered by timestamp. Currently
only string columns are supported as input to these aggregations, so the
resulting feature value is a list of strings. Values in the list appear in
ascending timestamp order. Nulls are not included in the aggregated list.
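The behavior of the two list aggregations can be sketched in plain Python. This is an illustration of the rules stated above (ascending timestamp order, nulls dropped), not the Tecton implementation. The function names last_n and last_distinct_n are hypothetical, and the sketch assumes that a repeated value is placed at its most recent occurrence in the distinct variant, which the text above does not specify.

```python
from datetime import datetime
from typing import List, Optional, Sequence, Tuple

# An event is a (timestamp, value) pair; the value may be null.
Event = Tuple[datetime, Optional[str]]


def last_n(events: Sequence[Event], n: int) -> List[str]:
    """Last n non-null values, in ascending timestamp order."""
    if n <= 0:
        raise ValueError("n must be positive")
    ordered = sorted(events, key=lambda e: e[0])
    values = [v for _, v in ordered if v is not None]
    return values[-n:]


def last_distinct_n(events: Sequence[Event], n: int) -> List[str]:
    """Last n distinct non-null values, in ascending timestamp order.

    Assumption: each distinct value is positioned by its latest occurrence.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    ordered = sorted(events, key=lambda e: e[0])
    seen = set()
    newest_first: List[str] = []
    # Walk from newest to oldest so the first sighting is the latest one.
    for _, value in reversed(ordered):
        if value is None or value in seen:
            continue
        seen.add(value)
        newest_first.append(value)
        if len(newest_first) == n:
            break
    return newest_first[::-1]
```

For events carrying the values a, b, null, b, c in timestamp order, last_n with n=3 keeps the duplicate and returns b, b, c, whereas last_distinct_n with n=3 returns a, b, c.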
Example
You can use these aggregations via the last() and last_distinct() helper functions like this:
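import datetime

from tecton import Aggregation, batch_feature_view
from tecton.aggregation_functions import last_distinct, last

@batch_feature_view(
    ...
    aggregations=[
        # Last 15 distinct values of my_column over a 7-day window.
        Aggregation(
            column='my_column',
            function=last_distinct(15),
            time_window=datetime.timedelta(days=7)),
        # Last 15 values of my_column (duplicates kept) over a 7-day window.
        Aggregation(
            column='my_column',
            function=last(15),
            time_window=datetime.timedelta(days=7)),
    ],
    ...
)
def my_fv(data_source):
    pass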
Attributes
The attributes are the same as the __init__ method parameters. See below.
Methods
__init__(...)
Method generated by attrs for class Aggregation.
Parameters
- column (str) – Column name of the feature we are aggregating.
- function (Union[str, <aggregation function>]) – One of the built-in aggregation functions, such as count. See the time-window aggregation functions reference for a list of aggregation functions.
- time_window (datetime.timedelta) – Duration to aggregate over. Example: datetime.timedelta(days=30).
- name (str) – The name of this feature. Defaults to an autogenerated name, e.g. transaction_count_7d_1d.