Version: 0.4

Testing Batch Features

Import libraries and select your workspace

import tecton
import pandas
from datetime import datetime, timedelta

ws = tecton.get_workspace("prod")

Load a Batch Feature View

fv = ws.get_feature_view("user_transaction_counts")

Run a Feature View transformation pipeline

The BatchFeatureView::run function performs a dry run of a Feature View transformation pipeline over a given time range. This is useful for checking the output of your feature transformation logic or debugging a materialization job.


The output is not guaranteed to match the feature values that would be materialized for this time range, for example in the following cases:

  • When using incremental backfills, feature data for a given time range may depend on multiple executions of the Feature View transformation pipeline.
  • Feature values may depend on scheduling information (e.g. batch_schedule, data_delay, feature_start_time) that doesn't match the start_time and end_time you provide.
  • Aggregations may require more input data than the window you provide with start_time and end_time.

If you want to produce feature values for a given time range, use get_historical_features(start_time, end_time).
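To make the third caveat above concrete, here is a minimal pure-Python sketch that makes no Tecton calls. The event data, the helper `count_in_window`, and the half-open window convention are all assumptions chosen for illustration. It counts events in a trailing 3-day window and shows that the count is lower when only one day of input is supplied.

```python
from datetime import datetime, timedelta

# Hypothetical raw events: one transaction per day at noon, Jan 1 through Jan 5.
events = [datetime(2022, 1, day, 12) for day in range(1, 6)]


def count_in_window(timestamps, window_end, window):
    """Count timestamps in the half-open interval [window_end - window, window_end)."""
    window_start = window_end - window
    return sum(1 for ts in timestamps if window_start <= ts < window_end)


end_time = datetime(2022, 1, 6)
window = timedelta(days=3)

# Input restricted to a one-day range, as if only [Jan 5, Jan 6) had been read.
one_day_input = [ts for ts in events if datetime(2022, 1, 5) <= ts < end_time]

partial_count = count_in_window(one_day_input, end_time, window)
full_count = count_in_window(events, end_time, window)

print(partial_count, full_count)
```

The two counts differ because the one-day input leaves out events that still fall inside the 3-day window.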

result_dataframe = fv.run(start_time=datetime(2022, 1, 1), end_time=datetime(2022, 1, 2)).to_pandas()
0  user_600003278485  2021-01-01 06:25:57  other
1  user_469998441571  2021-01-01 07:16:06  Visa
2  user_502567604689  2021-01-01 04:39:10  Visa
3  user_930691958107  2021-01-01 10:52:31  Visa
4  user_782510788708  2021-01-01 20:15:25  other

Run with mock sources

Mock input data sources can be passed into the BatchFeatureView::run function using the same source names from the Feature View definition.

users_data = pandas.DataFrame(
    {
        "user_id": ["user_1", "user_1", "user_2"],
        "cc_num": ["423456789012", "567890123456", "678901234567"],
        "signup_timestamp": [
            datetime(2022, 1, 1, 2),
            datetime(2022, 1, 1, 4),
            datetime(2022, 1, 1, 3),
        ],
    }
)

result_dataframe = fv.run(
    start_time=datetime(2022, 1, 1),
    end_time=datetime(2022, 1, 2),
    users=users_data,  # `users` is the name of this Feature View input.
).to_pandas()

0  user_1  2022-01-01 02:00:00  Visa
1  user_1  2022-01-01 04:00:00  MasterCard
2  user_2  2022-01-01 03:00:00  Discover

Run a Batch Feature View with tiled aggregations

BatchFeatureView::run for feature views with aggregations works the same way as described above; the only difference is that it also supports the aggregation_level parameter.

When a feature view uses tiled aggregations, the query operates in three logical steps:

  1. The feature view query is run over the provided time range. The user-defined transformations are applied to the data source.
  2. The result of #1 is aggregated into tiles the size of the aggregation_interval.
  3. The tiles from #2 are combined to form the final feature values. The number of tiles that are combined is based on the time_window of the aggregation.

To see the output of #1, use aggregation_level="disabled". For #2, use aggregation_level="partial". For #3, use aggregation_level="full".

aggregation_level="full" is the default behavior.
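The three steps above can be sketched in plain Python. This is an illustration only and does not reflect how Tecton implements tiling. The event data, the helper names (`transform`, `to_tiles`, `combine_tiles`), and the half-open window convention are assumptions. The three intermediate results correspond loosely to the "disabled", "partial", and "full" aggregation levels.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical raw events as (user_id, timestamp) pairs.
raw_events = [
    ("user_1", datetime(2022, 5, 1, 1, 50)),
    ("user_1", datetime(2022, 5, 1, 7, 11)),
    ("user_1", datetime(2022, 5, 2, 15, 18)),
    ("user_2", datetime(2022, 5, 2, 19, 45)),
]

aggregation_interval = timedelta(days=1)
time_window = timedelta(days=2)


def transform(events):
    """Step 1: row-level transformation; each event becomes a value of 1."""
    return [(user, ts, 1) for user, ts in events]


def to_tiles(rows, interval):
    """Step 2: sum values into tiles keyed by (user, tile_start)."""
    epoch = datetime(1970, 1, 1)
    tiles = Counter()
    for user, ts, value in rows:
        # Round the timestamp down to the start of its tile.
        tile_start = epoch + ((ts - epoch) // interval) * interval
        tiles[(user, tile_start)] += value
    return dict(tiles)


def combine_tiles(tiles, user, as_of, window):
    """Step 3: add up the tiles that start inside [as_of - window, as_of)."""
    return sum(
        value
        for (tile_user, tile_start), value in tiles.items()
        if tile_user == user and as_of - window <= tile_start < as_of
    )


rows = transform(raw_events)                 # analogous to "disabled"
tiles = to_tiles(rows, aggregation_interval)  # analogous to "partial"
full_value = combine_tiles(tiles, "user_1", datetime(2022, 5, 3), time_window)  # analogous to "full"

print(rows)
print(tiles)
print(full_value)
```

Keeping the tiles as a separate intermediate result is what allows one set of partial aggregates to serve several time windows.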

For more details on aggregate_tiles, refer to Creating Features that use Time-Windowed Aggregations.

agg_fv = ws.get_feature_view("user_transaction_counts")

result_dataframe = agg_fv.run(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
    aggregation_level="disabled",
).to_pandas()

0  user_222506789984  1  2022-05-01 21:04:38
1  user_26990816968   1  2022-05-01 19:45:14
2  user_337750317412  1  2022-05-01 15:18:48
3  user_337750317412  1  2022-05-01 07:11:31
4  user_337750317412  1  2022-05-01 01:50:51

result_dataframe = agg_fv.run(
    start_time=datetime(2022, 5, 1),
    end_time=datetime(2022, 5, 2),
    aggregation_level="partial",
).to_pandas()

0  user_222506789984  1  2022-05-01 00:00:00  2022-05-02 00:00:00
1  user_26990816968   1  2022-05-01 00:00:00  2022-05-02 00:00:00
2  user_337750317412  4  2022-05-01 00:00:00  2022-05-02 00:00:00
3  user_402539845901  2  2022-05-01 00:00:00  2022-05-02 00:00:00
4  user_461615966685  1  2022-05-01 00:00:00  2022-05-02 00:00:00

end = datetime(2022, 5, 2)

result_dataframe = agg_fv.run(
    start_time=end - timedelta(days=90),  # Note: to get an interesting "full" aggregation, we need to provide adequate input data.
    end_time=end,
    aggregation_level="full",
).to_pandas()

0  user_131340471060  2022-04-30 00:00:00  1  6  22
1  user_131340471060  2022-04-23 00:00:00  1  6  21
2  user_131340471060  2022-04-18 00:00:00  1  7  20
3  user_131340471060  2022-04-15 00:00:00  2  7  19
4  user_131340471060  2022-04-08 00:00:00  1  6  17

Get a Range of Feature Values from the Offline Store

BatchFeatureView::get_historical_features reads a range of feature values from the offline store between a given start_time and end_time.

Pass from_source=True to bypass the offline store and compute features on the fly against the raw data source. This is useful for testing the expected output of feature values.

Use from_source=False (default) to see what data is materialized in the offline store.

result_dataframe = fv.get_historical_features(
    start_time=datetime(2022, 5, 1), end_time=datetime(2022, 5, 2)
).to_pandas()
0  user_205125746682  2022-05-01 00:00:00  2834    2022-05-01 00:00:00
1  user_222506789984  2022-05-01 00:00:00  142141  2022-05-01 00:00:00
2  user_268514844966  2022-05-01 00:00:00  12966   2022-05-01 00:00:00
3  user_394495759023  2022-05-01 00:00:00  12168   2022-05-01 00:00:00
4  user_459842889956  2022-05-01 00:00:00  11439   2022-05-01 00:00:00

Read the Latest Features from Online Feature Store


For performance reasons, this function should only be used for testing and not in a production environment. To read features online efficiently, see Reading Features for Inference.

fv.get_online_features({"user_id": "user_609904782486"}).to_dict()
Out: {
    "transaction_count_1d_1d": 1,
    "transaction_count_30d_1d": 17,
    "transaction_count_90d_1d": 56,
}

Read Historical Features from Offline Feature Store with Time-Travel

Create a spine DataFrame with events to look up. For more information on spines, check out Selecting Sample Keys and Timestamps.

spine_df = pandas.DataFrame(
    {
        "user_id": ["user_722584453020", "user_461615966685"],
        "timestamp": [datetime(2022, 5, 1, 3, 20, 0), datetime(2022, 6, 6, 2, 30, 0)],
    }
)

0  user_722584453020  2022-05-01 03:20:00
1  user_461615966685  2022-06-06 02:30:00

Pass from_source=True to bypass the offline store and compute features on the fly against the raw data source. However, this is slower than reading feature data that has already been materialized to the offline store.

result_dataframe = fv.get_historical_features(spine_df, from_source=True).to_pandas()
0  user_461615966685  2022-06-06 02:30:00  01340
1  user_722584453020  2022-05-01 03:20:00  02873
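
The idea behind a spine lookup with time-travel can be sketched in plain Python: for each spine row, return the most recent feature value known at or before that row's timestamp. This is a conceptual illustration under stated assumptions, not Tecton's implementation. The feature history, the helper `lookup_as_of`, and the "at or before" rule are hypothetical.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical feature history per user: (timestamp, value), sorted by time.
feature_history = {
    "user_1": [
        (datetime(2022, 5, 1), 3),
        (datetime(2022, 5, 2), 5),
        (datetime(2022, 5, 3), 4),
    ],
}


def lookup_as_of(history, user_id, event_time):
    """Return the latest value with timestamp <= event_time, or None if there is none."""
    rows = history.get(user_id, [])
    timestamps = [ts for ts, _ in rows]
    idx = bisect_right(timestamps, event_time)
    return rows[idx - 1][1] if idx > 0 else None


# Hypothetical spine: the (user_id, timestamp) events to look up.
spine = [
    ("user_1", datetime(2022, 5, 2, 3, 20)),
    ("user_1", datetime(2022, 4, 30)),
]

results = [(user, ts, lookup_as_of(feature_history, user, ts)) for user, ts in spine]
print(results)
```

A spine row whose timestamp precedes every known feature value yields None in this sketch, since no value was available at that time.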
