tecton.interactive.FeatureTable
class tecton.interactive.FeatureTable(proto, fco_container)
FeatureTable class.
To get a FeatureTable instance, call tecton.get_feature_table().
Methods
delete_keys – Deletes any materialized data that matches the specified join keys from the FeatureTable.
deletion_status – Displays information for deletion jobs created with the delete_keys() method, which may include past jobs, scheduled jobs, and job failures.
get_historical_features – Returns a Tecton TectonDataFrame of historical values for this feature table.
get_online_features – Returns a single Tecton FeatureVector from the Online Store.
ingest – Ingests a DataFrame into the FeatureTable.
materialization_status – Displays materialization information for the FeatureTable, which may include past jobs, scheduled jobs, and job failures.
summary – Returns various information about this feature definition, including the most critical metadata such as the name, owner, and features.
-
delete_keys(keys, online=True, offline=True)
Deletes any materialized data that matches the specified join keys from the FeatureTable. This method kicks off a job to delete the data in the offline and online stores. If a FeatureTable has multiple entities, the full set of join keys must be specified. Only the Dynamo online store is supported. A maximum of 10000 keys can be deleted per request.
- Parameters
- Returns
None if the deletion job was created successfully.
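Because a single request is limited to 10000 keys and every row must carry the full set of join keys, larger deletions need to be split up. The following is a minimal sketch of one way to do that. The helper names batch_join_keys and delete_in_batches and the to_frame argument are hypothetical, and the sketch assumes that keys accepts a DataFrame of join key columns, which this page does not state.

```python
from typing import Callable, Dict, Iterator, List, Sequence

# Limit taken from the delete_keys() description above.
MAX_KEYS_PER_REQUEST = 10_000


def batch_join_keys(
    rows: Sequence[Dict[str, object]],
    join_keys: Sequence[str],
    batch_size: int = MAX_KEYS_PER_REQUEST,
) -> Iterator[List[Dict[str, object]]]:
    """Yield lists of join-key rows, each small enough for one request."""
    required = set(join_keys)
    # Validate every row first so that no request is sent for bad input.
    for row in rows:
        missing = required.difference(row)
        if missing:
            raise ValueError(f"row {row!r} is missing join keys {sorted(missing)}")
    for start in range(0, len(rows), batch_size):
        yield list(rows[start:start + batch_size])


def delete_in_batches(ft, rows, to_frame: Callable) -> int:
    """Call ft.delete_keys() once per batch and return the number of calls.

    Assumptions: `ft` is a FeatureTable, and `to_frame` turns a list of dicts
    into whatever DataFrame type delete_keys() accepts (for example,
    pandas.DataFrame).
    """
    calls = 0
    for batch in batch_join_keys(rows, ft.join_keys):
        ft.delete_keys(to_frame(batch), online=True, offline=True)
        calls += 1
    return calls
```

Each call kicks off its own deletion job, so 25000 rows would result in three jobs under these assumptions.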
-
deletion_status(verbose=False, limit=1000, sort_columns=None, errors_only=False)
Displays information for deletion jobs created with the delete_keys() method, which may include past jobs, scheduled jobs, and job failures.
- Parameters
verbose – If set to true, the method displays additional low-level deletion information, useful for debugging.
limit – Maximum number of jobs to return.
sort_columns – A comma-separated list of column names by which to sort the rows.
errors_only – If set to true, the method returns only jobs that failed with an error.
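As an illustration of how these parameters combine, the sketch below displays only failed deletion jobs with verbose output. The helper name show_failed_deletions and its sort_by argument are hypothetical. The available column names are not listed on this page, so the caller has to supply them.

```python
from typing import Iterable, Optional


def show_failed_deletions(ft, sort_by: Optional[Iterable[str]] = None, limit: int = 20) -> None:
    """Display only failed deletion jobs, with verbose details.

    `ft` is assumed to be a FeatureTable. `sort_by` holds column names
    chosen by the caller; this sketch does not know which columns exist.
    """
    # sort_columns is documented as a comma-separated string, so join the names.
    sort_columns = ",".join(sort_by) if sort_by else None
    ft.deletion_status(
        verbose=True,
        limit=limit,
        sort_columns=sort_columns,
        errors_only=True,
    )
```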
-
get_historical_features(spine=None, timestamp_key=None, entities=None, start_time=None, end_time=None, save=False, save_as=None)
Returns a Tecton TectonDataFrame of historical values for this feature table. If no arguments are passed in, all feature values for this feature table are returned in a TectonDataFrame.
Note: The timestamp_key parameter is only applicable when a spine is passed in. The start_time, end_time, and entities parameters are only applicable when a spine is not passed in.
- Parameters
spine (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – The spine to join against, as a DataFrame. If present, the returned DataFrame will contain rollups for all (join key, temporal key) combinations that are required to compute a full frame from the spine. To distinguish between spine columns and feature columns, feature columns are labeled as feature_view_name.feature_name in the returned DataFrame. If spine is not specified, the method returns a DataFrame of feature values in the specified time range.
timestamp_key (str) – Name of the time column in spine. This method will fetch the latest features computed before the specified timestamps in this column. If unspecified, will default to the time column of the spine if there is only one present.
entities (Union[pyspark.sql.DataFrame, pandas.DataFrame, TectonDataFrame]) – A DataFrame that is used to filter down feature values. If specified, this DataFrame should only contain join key columns.
start_time (Union[pendulum.DateTime, datetime.datetime]) – The interval start time from when we want to retrieve features. If no timezone is specified, will default to using UTC.
end_time (Union[pendulum.DateTime, datetime.datetime]) – The interval end time until when we want to retrieve features. If no timezone is specified, will default to using UTC.
save (bool) – Whether to persist the DataFrame as a Dataset object. Default is False.
save_as (str) – Name to save the DataFrame as. If unspecified and save=True, a name will be generated.
Examples
A FeatureTable ft with join key user_id.
1) ft.get_historical_features(spine)
where spine=pandas.DataFrame({'user_id': [1,2,3], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps in the spine.
2) ft.get_historical_features(spine, save_as='my_dataset')
where spine=pandas.DataFrame({'user_id': [1,2,3], 'date': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps in the spine. Save the DataFrame as a dataset with the name my_dataset.
3) ft.get_historical_features(spine, timestamp_key='date_1')
where spine=pandas.DataFrame({'user_id': [1,2,3], 'date_1': [datetime(...), datetime(...), datetime(...)], 'date_2': [datetime(...), datetime(...), datetime(...)]})
Fetch historical features from the offline store for users 1, 2, and 3 for the specified timestamps in the 'date_1' column in the spine.
4) ft.get_historical_features(start_time=datetime(...), end_time=datetime(...))
Fetch all historical features from the offline store in the time range specified by start_time and end_time.
- Returns
A TectonDataFrame with feature values.
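The rules about which arguments may be combined can be checked before calling the method. The following is a minimal sketch of such a check. The helper name historical_feature_kwargs is hypothetical and is not part of the Tecton SDK. It also attaches UTC to naive datetimes, which only makes the documented default explicit.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def historical_feature_kwargs(
    spine: Any = None,
    timestamp_key: Optional[str] = None,
    entities: Any = None,
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
) -> Dict[str, Any]:
    """Check the documented argument combinations and return keyword arguments."""
    if spine is not None:
        # With a spine, only timestamp_key may accompany it.
        if entities is not None or start_time is not None or end_time is not None:
            raise ValueError("entities, start_time and end_time apply only without a spine")
        kwargs: Dict[str, Any] = {"spine": spine}
        if timestamp_key is not None:
            kwargs["timestamp_key"] = timestamp_key
        return kwargs

    if timestamp_key is not None:
        raise ValueError("timestamp_key applies only when a spine is passed in")

    kwargs = {}
    if entities is not None:
        kwargs["entities"] = entities
    for name, value in (("start_time", start_time), ("end_time", end_time)):
        if value is not None:
            # Naive datetimes are documented as UTC; make that explicit.
            if value.tzinfo is None:
                value = value.replace(tzinfo=timezone.utc)
            kwargs[name] = value
    return kwargs
```

The result would then be passed on as ft.get_historical_features(**kwargs), where ft is a FeatureTable obtained as described at the top of this page.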
-
get_online_features(join_keys, include_join_keys_in_response=False)
Returns a single Tecton FeatureVector from the Online Store.
- Parameters
Examples
A FeatureTable ft with join key user_id.
1) ft.get_online_features(join_keys={'user_id': 1})
Fetch the latest features from the online store for user 1.
2) ft.get_online_features(join_keys={'user_id': 1}, include_join_keys_in_response=True)
Fetch the latest features from the online store for user 1 and include the join key information (user_id=1) in the returned FeatureVector.
- Returns
A FeatureVector of the results.
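Since the method returns a single FeatureVector, looking up several entities takes one call per entity. A minimal sketch, assuming a FeatureTable ft with the single join key user_id as in the examples above (the helper name fetch_online_features is hypothetical):

```python
from typing import Any, Dict, Iterable


def fetch_online_features(ft, user_ids: Iterable[int]) -> Dict[int, Any]:
    """Return {user_id: FeatureVector}, issuing one online lookup per user."""
    return {
        user_id: ft.get_online_features(
            join_keys={"user_id": user_id},
            include_join_keys_in_response=True,
        )
        for user_id in user_ids
    }
```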
-
ingest(df)
Ingests a DataFrame into the FeatureTable. This method kicks off a materialization job to write the data into the offline and online stores, depending on the FeatureTable configuration.
-
materialization_status(verbose=False, limit=1000, sort_columns=None, errors_only=False)
Displays materialization information for the FeatureTable, which may include past jobs, scheduled jobs, and job failures.
- Parameters
verbose – If set to true, the method displays additional low-level materialization information, useful for debugging.
limit – Maximum number of jobs to return.
sort_columns – A comma-separated list of column names by which to sort the rows.
errors_only – If set to true, the method returns only jobs that failed with an error.
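Because ingest() only kicks off a materialization job, a natural follow-up is to look at the job list afterwards. A minimal sketch of that sequence, assuming ft is a FeatureTable and df is a DataFrame that matches its schema (the helper name ingest_and_report is hypothetical):

```python
def ingest_and_report(ft, df, limit: int = 10) -> None:
    """Ingest `df`, then display up to `limit` materialization jobs.

    The job started by ingest() may still be in progress when the status
    is displayed; this sketch does not wait for it to finish.
    """
    ft.ingest(df)
    ft.materialization_status(limit=limit)
```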
-
summary()
Returns various information about this feature definition, including the most critical metadata such as the name, owner, and features.
Attributes
created_at
Returns the creation date of this Tecton Object.
data_source_names
Returns the names of the data sources for this Feature View.
defined_in
Returns the filename where this Tecton Object is declared.
description
The description of this Tecton Object, set by user.
entity_names
Returns the names of entities for this Feature View.
family
Deprecated.
features
Returns the names of the (output) features.
id
Returns the ID of this object.
join_keys
Returns the join key column names.
name
The name of this Tecton Object.
online_serving_index
Returns the set of join keys that will be indexed and queryable during online serving.
owner
The owner of this Tecton Object (typically the email of the primary maintainer).
tags
Tags associated with this Tecton Object (key-value pairs of arbitrary metadata set by the user).
timestamp_field
Returns the timestamp_field of this FeatureView.
url
Returns a link to the Tecton Web UI.
wildcard_join_key
Returns the wildcard join key column name if it exists; otherwise returns None.
workspace
Returns the workspace this Tecton Object was created in.
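The attributes above are read like ordinary Python attributes. As a minimal sketch, the helper below gathers a few of them into a plain dict for logging or display. The names overview and OVERVIEW_FIELDS are hypothetical, and the selection of fields is arbitrary.

```python
from typing import Any, Dict

# A subset of the attributes documented above.
OVERVIEW_FIELDS = (
    "name",
    "workspace",
    "owner",
    "description",
    "join_keys",
    "features",
    "tags",
    "url",
)


def overview(ft) -> Dict[str, Any]:
    """Collect a few documented attributes of a FeatureTable into a dict."""
    return {field: getattr(ft, field) for field in OVERVIEW_FIELDS}
```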