tecton.declarative.Aggregation
class tecton.declarative.Aggregation(column, function, time_window, name=None)
This class describes a single aggregation that is applied in a batch or stream feature view.
Parameters
column (str) – Column name of the feature we are aggregating.
function (Union[str, AggregationFunction]) – One of the built-in aggregation functions.
time_window (datetime.timedelta) – Duration to aggregate over. Example: datetime.timedelta(days=30).
name (str) – The name of this feature. Defaults to an autogenerated name, e.g. transaction_count_7d_1d.
function can be one of the predefined numeric aggregation functions: "count", "sum", "mean", "min", or "max". For these numeric aggregations, you can pass the function name as a string. Nulls are handled like Spark SQL Function(column), e.g. SUM/MEAN/MIN/MAX of all nulls is null and COUNT of all nulls is 0.

In addition to numeric aggregations, Aggregation supports “last-n” aggregations that compute the last N distinct values for the column by timestamp. Currently only string column types are supported as inputs to this aggregation, so the resulting feature value is a list of strings. Nulls are not included in the aggregated list.

You can use it via the last_distinct() helper function like this:

```python
import datetime

from tecton.aggregation_functions import last_distinct

@batch_feature_view(
    ...
    aggregations=[Aggregation(
        column='my_column',
        function=last_distinct(15),
        time_window=datetime.timedelta(days=7))],
    ...
)
def my_fv(data_source):
    pass
```
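The null-handling and last-n rules described above can be illustrated with a small pure-Python sketch. It does not use Tecton and is not how Tecton implements these aggregations; the names numeric_aggregate and last_distinct_values are hypothetical stand-ins, and the oldest-first ordering of the returned list is an assumption rather than documented behavior.

```python
from datetime import datetime
from typing import List, Optional, Sequence, Tuple


def numeric_aggregate(function: str, values: Sequence[Optional[float]]) -> Optional[float]:
    """Hypothetical stand-in that mimics Spark SQL null handling for one window."""
    present = [v for v in values if v is not None]  # nulls are ignored
    if function == "count":
        return len(present)  # COUNT of all nulls is 0
    if not present:
        return None  # SUM/MEAN/MIN/MAX of all nulls is null
    if function == "sum":
        return sum(present)
    if function == "mean":
        return sum(present) / len(present)
    if function == "min":
        return min(present)
    if function == "max":
        return max(present)
    raise ValueError(f"unsupported aggregation function: {function!r}")


def last_distinct_values(
    rows: Sequence[Tuple[datetime, Optional[str]]], n: int
) -> List[str]:
    """Hypothetical stand-in: up to n most recent distinct non-null strings.

    The result is ordered oldest to newest (an assumption for this sketch).
    """
    if n <= 0:
        return []
    seen = set()
    newest_first: List[str] = []
    # Walk rows from the most recent timestamp backwards.
    for _, value in sorted(rows, key=lambda row: row[0], reverse=True):
        if value is None or value in seen:
            continue  # skip nulls and values already collected
        seen.add(value)
        newest_first.append(value)
        if len(newest_first) == n:
            break
    return newest_first[::-1]
```

For example, given values "a", "b", null, "a", "c" in timestamp order, last_distinct_values(rows, 2) in this sketch keeps only the two most recent distinct non-null values, "a" and "c".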
Methods
__init__(column, function, time_window, name=None)
Method generated by attrs for class Aggregation.