Version: 0.5

Time-Window Aggregation Functions Reference

Time-window aggregation functions are built-in functions. To use one, define an Aggregation object in a Batch Feature View or a Stream Feature View.

This page lists the available time-window aggregation functions, with the supported data platforms, column types, and an example for each.

count

An aggregation function that returns, for a materialization time window, the number of non-null row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Tecton on Spark: All types
  • Tecton on Snowflake: All types

Output column type

  • Int64

Usage

To use this aggregation, define an Aggregation object, using function="count", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="transaction_id", function="count", time_window=timedelta(days=1))
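The semantics described above can be illustrated with a plain-Python sketch that does not use Tecton. The row layout, the helper name count_in_window, and the half-open window boundaries are assumptions made for illustration only; they are not taken from the Tecton implementation.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical rows: (user_id, timestamp, transaction_id). Not a Tecton data source.
rows = [
    ("u1", datetime(2023, 1, 1, 9), "t1"),
    ("u1", datetime(2023, 1, 1, 17), None),   # null value: not counted
    ("u1", datetime(2023, 1, 1, 20), "t2"),
    ("u2", datetime(2023, 1, 1, 12), "t3"),
    ("u1", datetime(2022, 12, 30, 8), "t0"),  # falls outside the 1-day window
]

def count_in_window(rows, window_end, time_window):
    """Count non-null values per entity in [window_end - time_window, window_end).

    The half-open boundary is an assumption of this sketch.
    """
    window_start = window_end - time_window
    counts = Counter()
    for entity, ts, value in rows:
        if window_start <= ts < window_end and value is not None:
            counts[entity] += 1
    return dict(counts)

result = count_in_window(rows, datetime(2023, 1, 2), timedelta(days=1))
print(result)
```

Each entity value gets its own count, and the null transaction_id for u1 does not contribute.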

last_distinct(n)

An aggregation function that returns, for a materialization time window, the last N distinct row values for a column, per entity value (such as a user_id value).

For example, with n = 2, if the last two distinct row values for a column in the materialization time window are "10" and "20", then the function returns ["10", "20"].

Note: The output array is ordered by ascending timestamp, so the most recent value comes last.

Supported data platforms

  • Tecton on Spark (Databricks and EMR)

Input column types

  • String

Output column type

  • Array[String]

Usage

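To use this aggregation, define an Aggregation object, using function=last_distinct(n), where n is an integer from 1 to 1000 (inclusive), in a Batch Feature View or a Stream Feature View. Unlike the other functions on this page, last_distinct is passed as a function call, not as a string.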

Example

Aggregation(column="amt", function=last_distinct(2), time_window=timedelta(days=1))
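To make the ordering rule concrete, here is a plain-Python sketch of one way to compute the last N distinct values. It does not use Tecton. The helper name last_distinct_sketch is hypothetical, and the handling of repeated values (a value is positioned by its most recent occurrence) is an assumption of this sketch that the description above does not spell out.

```python
def last_distinct_sketch(values_by_time, n):
    """Return the last n distinct values, oldest first.

    values_by_time: list of (timestamp, value) pairs, in any order.
    Assumption: a repeated value is positioned by its most recent occurrence.
    """
    latest = {}
    for ts, value in sorted(values_by_time):
        # Remove and re-insert so dict order tracks the most recent occurrence.
        latest.pop(value, None)
        latest[value] = ts
    return list(latest)[-n:]

# Hypothetical events: (timestamp, value), with "10" appearing twice.
events = [(1, "10"), (2, "20"), (3, "10"), (4, "30")]
result = last_distinct_sketch(events, 2)
print(result)
```

Under that assumption, the second occurrence of "10" moves it after "20", so the two most recent distinct values are "10" and "30", in that order.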

max

An aggregation function that returns, for a materialization time window, the maximum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="max", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="max", time_window=timedelta(days=1))
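Because String is an accepted input type, it helps to remember that string maximums are not numeric maximums. The plain-Python sketch below assumes String columns are compared lexicographically, as Python does; the sample values are hypothetical, and the exact collation on your data platform may differ.

```python
# Hypothetical column values within one time window for one entity.
amounts = [3, 12, 7]        # numeric column
labels = ["9", "10"]        # String column holding digits

max_amount = max(amounts)   # numeric comparison
max_label = max(labels)     # lexicographic comparison: "9" sorts after "10"

print(max_amount, max_label)
```

If a String column holds numeric text, casting it to a numeric type in the feature view transformation avoids this surprise.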

mean

An aggregation function that returns, for a materialization time window, the mean of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="mean", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="mean", time_window=timedelta(days=1))

min

An aggregation function that returns, for a materialization time window, the minimum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64, String

Output column type

  • Int64, Float64, String

Usage

To use this aggregation, define an Aggregation object, using function="min", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="min", time_window=timedelta(days=1))

stddev_pop

An aggregation function that returns, for a materialization time window, the population standard deviation of the row values for a column, per entity value (such as a user_id value). The sum of squared deviations from the mean is divided by N, the number of values.

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_pop", time_window=timedelta(days=1))

stddev_samp

An aggregation function that returns, for a materialization time window, the sample standard deviation of the row values for a column, per entity value (such as a user_id value). The sum of squared deviations from the mean is divided by N - 1, where N is the number of values.

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="stddev_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="stddev_samp", time_window=timedelta(days=1))
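The difference between stddev_pop and stddev_samp can be checked with Python's standard statistics module. This is a sketch of the standard definitions on hypothetical values, not of Tecton's implementation.

```python
import math
import statistics

# Hypothetical "amt" values within one time window for one entity.
amounts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

stddev_pop = statistics.pstdev(amounts)   # divides by N
stddev_samp = statistics.stdev(amounts)   # divides by N - 1

# The two differ by a factor of sqrt(N / (N - 1)).
n = len(amounts)
ratio = stddev_samp / stddev_pop
print(stddev_pop, stddev_samp, math.isclose(ratio, math.sqrt(n / (n - 1))))
```

The sample version is always the larger of the two, and the gap shrinks as the number of rows in the window grows.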

sum

An aggregation function that returns, for a materialization time window, the sum of the row values for a column, per entity value (such as a user_id value).

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Int64 or Float64

Usage

To use this aggregation, define an Aggregation object, using function="sum", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="sum", time_window=timedelta(days=1))

var_pop

An aggregation function that returns, for a materialization time window, the population variance of the row values for a column, per entity value (such as a user_id value). The sum of squared deviations from the mean is divided by N, the number of values.

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_pop", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_pop", time_window=timedelta(days=1))

var_samp

An aggregation function that returns, for a materialization time window, the sample variance of the row values for a column, per entity value (such as a user_id value). The sum of squared deviations from the mean is divided by N - 1, where N is the number of values.

Supported data platforms

  • Tecton on Spark (Databricks and EMR)
  • Tecton on Snowflake

Input column types

  • Int64, Int32, Float64

Output column type

  • Float64

Usage

To use this aggregation, define an Aggregation object, using function="var_samp", in a Batch Feature View or a Stream Feature View.

Example

Aggregation(column="amt", function="var_samp", time_window=timedelta(days=1))
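As with the standard deviation functions, var_pop and var_samp differ only in the divisor. The sketch below computes both by hand on hypothetical values and cross-checks them against Python's statistics module; it illustrates the standard definitions and is not Tecton's implementation.

```python
import math
import statistics

# Hypothetical "amt" values within one time window for one entity.
amounts = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(amounts)
mean = sum(amounts) / n
sum_sq_dev = sum((x - mean) ** 2 for x in amounts)

var_pop = sum_sq_dev / n          # population variance: divide by N
var_samp = sum_sq_dev / (n - 1)   # sample variance: divide by N - 1

# Cross-check against the standard library.
matches = (
    math.isclose(var_pop, statistics.pvariance(amounts))
    and math.isclose(var_samp, statistics.variance(amounts))
)
print(var_pop, var_samp, matches)
```

Note that the sample variance is undefined for a single value, because the divisor N - 1 is zero.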
