Version: 0.6

On-Demand Feature View

An On-Demand Feature View is used to run row-level, request-time transformations on data from Request Sources, Batch Feature Views, or Stream Feature Views. Unlike Batch and Stream Feature Views, On-Demand Feature Views do not precompute and materialize data to the Feature Store, but instead run transformations both online and offline at the time of the request.

Running transformations at request time can be useful for:

  1. Calculating features based on data that is only available at the time of the request such as a current transaction or user location
  2. Defining feature crosses that would be inefficient to precompute (example: compare embeddings between two users)
  3. Running additional transformations on Tecton-managed aggregations
  4. Defining new features without needing to rematerialize Feature Store data
  5. Post-processing feature data (example: imputing null values)

Common Examples

  • Turning a user's GPS coordinates into a geohash
  • Parsing a user's search string
  • Checking if a user's incoming transaction is larger than the user's average transaction amount over the last 30 days
  • Picking the maximum transaction of the past 10 transactions of a user (if combined with a last-n aggregation in a Stream Feature View)
  • Computing the cosine similarity between a pre-computed user embedding and a query embedding
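
As an illustration of the last item, here is a minimal sketch of the cosine-similarity computation, written as a function that takes one dictionary per source and returns a dictionary of features (the shape used by mode='python', described below). The function name, the keys query_embedding and user_embedding, and the output name cosine_similarity are hypothetical, and the Tecton decorator is left out so the function runs as plain Python.

```python
import math


def user_query_embedding_similarity(request, user_embedding_features):
    """Cosine similarity between a request-time query embedding and a
    precomputed user embedding. All key names here are hypothetical."""
    query = request["query_embedding"]
    user = user_embedding_features["user_embedding"]

    dot = sum(q * u for q, u in zip(query, user))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(u * u for u in user))

    # Return 0.0 for a zero-length vector instead of dividing by zero,
    # so the output is never null.
    similarity = dot / norm if norm else 0.0
    return {"cosine_similarity": similarity}


same = user_query_embedding_similarity(
    {"query_embedding": [1.0, 0.0]}, {"user_embedding": [1.0, 0.0]}
)
orthogonal = user_query_embedding_similarity(
    {"query_embedding": [1.0, 0.0]}, {"user_embedding": [0.0, 1.0]}
)
print(same, orthogonal)
```

Because the user embedding is precomputed, only the dot product and the norms are evaluated at request time.
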
info

On-Demand Feature View transformations add request-time latency that depends on the transformation being executed. For example, if your on-demand transformation calls time.sleep(1), the transformation cannot complete in less than 1 second.

On-Demand Feature Transformations

On-Demand Feature View transformations are written using Python.

When using mode='python', Tecton passes in a row of data for each source in the form of a dictionary. On-demand feature outputs are returned in a single dictionary of one or more feature values. Outputs from an OnDemandFeatureView must be non-null.

When using mode='pandas', Tecton passes in one or many rows of data in the form of a pandas DataFrame. At offline execution time, Tecton will pass in a batch of several rows. At online inference time, Tecton will typically pass in a single row. Tecton expects the function to return a pandas DataFrame.

Example

from tecton import on_demand_feature_view, RequestSource
from tecton.types import Float64, Bool, Field
from features.user_transaction_amount_averages import user_transaction_amount_averages


# Request-time input: the amount of the incoming transaction.
transaction_request = RequestSource(schema=[Field("amount", Float64)])


@on_demand_feature_view(
    sources=[transaction_request, user_transaction_amount_averages],
    mode="python",
    schema=[Field("transaction_amount_is_higher_than_average", Bool)],
)
def transaction_amount_is_higher_than_average(transaction_request, user_transaction_amount_averages):
    # Fall back to 0 when no average is available, so the output is never null.
    amount_mean = user_transaction_amount_averages["amount_mean_24h_10m"] or 0
    return {"transaction_amount_is_higher_than_average": transaction_request["amount"] > amount_mean}
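
To check this logic locally without a Tecton installation, the body of the function can be exercised as plain Python. This is only a sketch and not Tecton's testing API: the helper name is_higher_than_average and the sample dictionaries are made up for illustration, and only the feature name amount_mean_24h_10m is taken from the example above.

```python
def is_higher_than_average(transaction_request, user_transaction_amount_averages):
    # Same logic as the example, without the Tecton decorator.
    # `or 0` replaces a missing (None) average with 0.
    amount_mean = user_transaction_amount_averages["amount_mean_24h_10m"] or 0
    return {"transaction_amount_is_higher_than_average": transaction_request["amount"] > amount_mean}


# A user with a known 24-hour average.
above = is_higher_than_average({"amount": 100.0}, {"amount_mean_24h_10m": 50.0})

# A user with no transaction history: the average arrives as None.
no_history = is_higher_than_average({"amount": 5.0}, {"amount_mean_24h_10m": None})

print(above, no_history)
```

With the fallback, any positive amount for a user with no history is reported as higher than average. Whether that is the desired behavior depends on the model that consumes the feature.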

How to choose between pandas and python mode

mode='python' is significantly more performant than mode='pandas' during online inference, but slightly less performant when offline data is generated for training or offline prediction purposes.

Generally, for any online inference use case, use mode='python'. Only consider using mode='pandas' if you use an On-Demand Feature View solely to generate training data or offline inference data.
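
For comparison, the following sketch shows roughly what the transformation body from the example above might look like in mode='pandas', where each source arrives as a DataFrame and a DataFrame is returned. It assumes that each source DataFrame has one row per request and that the rows line up by position. The decorator is omitted so the code runs with only pandas installed, and the sample data is invented.

```python
import pandas as pd


def transaction_amount_is_higher_than_average(
    transaction_request: pd.DataFrame,
    user_transaction_amount_averages: pd.DataFrame,
) -> pd.DataFrame:
    # Vectorized equivalent of `... or 0`: replace missing averages with 0.
    amount_mean = user_transaction_amount_averages["amount_mean_24h_10m"].fillna(0)

    out = pd.DataFrame()
    out["transaction_amount_is_higher_than_average"] = (
        transaction_request["amount"] > amount_mean
    )
    return out


# Offline, several rows are passed at once; online, typically a single row.
requests = pd.DataFrame({"amount": [100.0, 5.0, 10.0]})
averages = pd.DataFrame({"amount_mean_24h_10m": [50.0, None, 20.0]})
result = transaction_amount_is_higher_than_average(requests, averages)
print(result)
```

The whole batch is handled in one vectorized comparison, which suits offline generation. For a single online row, the fixed cost of building DataFrames is a plausible reason for the slower online performance noted above.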

Parameters

See the API reference for the full list of parameters.
