Version: 0.6

Making Changes to Features

Introduction​

This guide explains the best practices for making changes to existing feature views and feature services in Tecton, without causing any disruptions in your production models.

note

Tecton does not recommend recreating an existing feature view that is a dependency for a live model, as this can lead to downtime when the feature view is recreated. We recommend adding the prevent_destroy tag.

There are four main types of changes that ML organizations need to account for while maintaining a feature store:

  • Adding features to an existing feature view.
  • Updating transformation logic in an existing feature view.
  • Deleting features in an existing feature view.
  • Handling an upstream data migration and/or a change in the schema.

note

This guide uses a basic end-to-end ML application architecture to illustrate how to make changes to features. In this architecture, a fraud_detection model is hosted on a generic model serving layer. The model depends on the feature service fd_service, which contains features from the feature view transaction_metrics. That feature view aggregates transaction data over a set time window and is sourced from the transaction_history table in the source data warehouse.

The model fraud_detection is live in production.

Basic Application Architecture

Adding features to an existing feature view​

Scenario​

The ML organization is onboarding a new customer churn use-case (or evaluating a new version of the existing model) and has identified 5 additional features, sourced from the transaction_history table, to add to the feature store. These features fit logically into the transaction_metrics feature view: they are similar to its other features, are sourced from the same data, and are updated on the same lifecycle.

note

This scenario assumes that the feature view has existing consumers in production. If there are no existing consumers, modifying the existing feature view is the best approach.

Option 1: Create variants of Tecton feature view and feature service​

note

Advantage: This option will replace the existing feature view with a new variant, reducing the amount of maintenance.

Disadvantage: This option recreates the entire feature view, not just the additional features, so the existing features are re-materialized as well and you pay the backfill cost for all of them.

  1. In Tecton, create new variants of the feature view and feature service leveraging Tecton’s variant functionality. Name the new feature view variant transaction_metrics:v2 and the new feature service variant fd_service:v2. The new feature view variant contains the existing features plus the additional features.

  2. Update the pre-processing logic within the model code currently deployed on the model endpoint to get features from the new feature service variant (fd_service:v2) and create a new version of the model endpoint fraud_detection:v2. Deploy the updated inference image on your model serving layer (model endpoint).

  3. Run integration tests, shadow traffic, and other deployment best practices to ensure that the model endpoint pointing to the new feature service is working as expected.

Variants of a Tecton feature view and feature service

  4. Leverage your go-to deployment strategy (e.g. blue-green) to deploy the new model endpoint to application traffic.

  5. Sunset previous objects.

When sunsetting an object, you can disable the feature view's writes to the online store and offline store (by setting online=False and offline=False) for a set period of time instead of deleting the previous variant outright. This lets you revert to the previous variant quickly, because you only need to re-materialize the data written since you disabled the feature view.

Sunset the Previous Objects
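
The rollout in steps 2 through 4 can be sketched in plain Python. This is a minimal illustration, not Tecton or model-serving API: the `Deployment` class, the `pick_deployment` function, and the percentage-based split are assumptions made for the example. Only the endpoint and feature service names come from this guide. The sketch pairs each model endpoint version with the feature service variant it reads from, then routes a stable share of requests to the new pair.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Deployment:
    """Pairs a model endpoint with the feature service it reads from."""
    model_endpoint: str
    feature_service: str


# Names taken from the scenario in this guide.
CURRENT = Deployment("fraud_detection", "fd_service")
CANDIDATE = Deployment("fraud_detection:v2", "fd_service:v2")


def pick_deployment(request_id: str, candidate_percent: int) -> Deployment:
    """Route a stable share of requests to the candidate deployment.

    Hypothetical helper: a real serving layer would normally provide
    its own traffic-splitting mechanism.
    """
    if not 0 <= candidate_percent <= 100:
        raise ValueError("candidate_percent must be between 0 and 100")
    # Hash the request id so the same id always lands in the same bucket.
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return CANDIDATE if bucket < candidate_percent else CURRENT
```

Setting `candidate_percent` to 0 keeps all traffic on the existing endpoint, and setting it to 100 completes the cutover. Because the endpoint and feature service travel together in one object, a request is never scored by the new model against the old feature service.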

Option 2: Create a new Tecton feature view with only new features​

note

Advantage: This option does not require re-creating (backfilling) the existing feature view, and does not require an updated deployment of the existing feature view, feature service, and model.

Disadvantage: This approach increases the number of feature views to maintain and adds complexity to the feature store. You avoid the backfill cost and time for the existing features, but incremental jobs will be more expensive because they write to two feature views on the same entity key instead of one merged feature view.

With this approach you will be creating a new feature view that calculates only the new features.

  1. Create a separate feature view, transaction_metrics_add, that contains the new features needed for the customer churn model.

  2. Build a new feature service, churn_service, that leverages features from both the transaction_metrics and transaction_metrics_add feature views.

    Create a new feature view and feature service
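
Conceptually, churn_service reads one feature vector per entity key from each of the two feature views and combines them. The sketch below shows that idea with plain dictionaries standing in for the feature views. It is not Tecton API: the function name, the feature names, and the `view.feature` key format are assumptions for illustration.

```python
from typing import Dict, Mapping

FeatureVector = Dict[str, float]


def build_churn_features(
    entity_key: str,
    transaction_metrics: Mapping[str, FeatureVector],
    transaction_metrics_add: Mapping[str, FeatureVector],
) -> FeatureVector:
    """Combine the features for one entity key from both feature views."""
    views = (
        ("transaction_metrics", transaction_metrics),
        ("transaction_metrics_add", transaction_metrics_add),
    )
    merged: FeatureVector = {}
    for view_name, store in views:
        # Prefix each feature with its view name so that features with the
        # same name in different views cannot overwrite each other.
        for feature, value in store.get(entity_key, {}).items():
            merged[f"{view_name}.{feature}"] = value
    return merged
```

Both views are keyed on the same entity, which is why a single lookup key is enough to assemble the combined vector.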

Summary​

  • Tecton highly recommends running a feature modeling exercise prior to building a feature view to identify potential features within a data source for the existing ML use-cases and potential future use-cases to mitigate this issue.

  • Adding more feature views to a feature service has a negligible impact on online read latency, but slows reads from the offline store.

  • The primary driver of cost for each materialization job is the number of writes to the online store. Creating a new feature view with only the new features will save the time and cost associated with re-materializing the existing features, but will have approximately the same number of writes for the backfill materialization jobs and will have approximately double the writes for forward-fill incremental jobs.

    The number of writes could vary between feature views based on how frequently the feature data changes for the respective features in each feature view.

  • Tecton recommends that you take option 1 if you expect the features to be re-used by multiple use-cases in the future and the added features have similar life cycles (you do not expect the logic to change in the future).

  • Tecton recommends option 2 if the new features are specific to the new use-case and there are concerns around the time to re-materialize the existing feature view and deploy a new version of the existing model.
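
The write-count comparison in the summary above can be made concrete with a small calculation. The sketch assumes a simplified model that is not stated in this guide: each materialized row costs one online-store write per feature view regardless of how many features it carries, and both feature views change on the same rows. The function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WriteEstimate:
    backfill_writes: int
    incremental_writes: int


def estimate_online_writes(
    option: int,
    backfill_rows: int,
    rows_per_incremental_job: int,
    incremental_jobs: int,
) -> WriteEstimate:
    """Estimate online-store writes for option 1 or option 2."""
    if option not in (1, 2):
        raise ValueError("option must be 1 or 2")
    # Option 1: once the old variant is sunset, one merged view writes.
    # Option 2: the existing view and the new view both keep writing.
    views_writing_forward = 1 if option == 1 else 2
    return WriteEstimate(
        # Either option backfills one view over the same history.
        backfill_writes=backfill_rows,
        incremental_writes=(
            views_writing_forward * rows_per_incremental_job * incremental_jobs
        ),
    )
```

With 1,000,000 backfill rows and 30 incremental jobs of 10,000 rows each, both options estimate 1,000,000 backfill writes, while incremental writes come to 300,000 for option 1 and 600,000 for option 2.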

Updating transformation logic in an existing feature view​

Scenario​

The ML organization identifies an edge case that the existing feature logic is not handling accurately for a feature within the transaction_metrics feature view. Similar scenarios may also exist if there is a change in the upstream data (e.g. a new possible value is added) and logic needs to be added to handle this change.

Both of the following options to address this scenario will functionally change the feature data. If the model is not re-trained on the new feature data, then this will cause feature data drift which can negatively impact model performance.
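
One way to judge how far a logic change shifts a feature's distribution is to compare values computed with the old and new logic on the same sample. The sketch below uses the population stability index (PSI), a common drift heuristic. It is not part of Tecton, and the function name, the default bin count, and the smoothing floor are choices made for this example.

```python
import math
from typing import List, Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], bins: int = 10
) -> float:
    """PSI between two samples, using equal-width bins over their joint range."""
    if not expected or not actual:
        raise ValueError("both samples must be non-empty")
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    if lo == hi:
        return 0.0  # every value is identical, so there is no shift
    width = (hi - lo) / bins

    def shares(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            # The maximum value would index past the end; clamp it.
            index = min(int((value - lo) / width), bins - 1)
            counts[index] += 1
        # A small floor avoids taking the log of zero for empty bins.
        return [max(count / len(values), 1e-6) for count in counts]

    expected_shares = shares(expected)
    actual_shares = shares(actual)
    return sum(
        (a - e) * math.log(a / e)
        for e, a in zip(expected_shares, actual_shares)
    )
```

A PSI of zero means the binned distributions match. Larger values indicate a larger shift, and the threshold at which re-training becomes necessary depends on the model.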

Option 1: Modify the existing feature view without re-materialization​

With this approach, no new feature views or feature services are created. Only the forward-fill incremental feature data is affected by the change; the historical offline data is not changed. As a result, the offline data reflects different feature definitions before and after the logic change is applied.

Modify the Existing Feature View Without Re-materialization

  1. Modify the existing feature view, transaction_metrics, with the new feature logic and save the change.
  2. Apply the change to Tecton leveraging the --suppress-recreates option to avoid re-materializing the historical data.
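
Because the offline data now mixes two feature definitions, it can help to separate training rows by the logic that produced them. The sketch below is plain Python rather than Tecton API. The cutoff time, the row layout, and the assumption that every row with a timestamp at or after the cutoff was computed with the new logic are all hypothetical.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, List

# Hypothetical moment at which the modified logic was applied.
LOGIC_CHANGE_TIME = datetime(2023, 6, 1, tzinfo=timezone.utc)

Row = Dict[str, Any]


def split_by_logic_version(
    rows: Iterable[Row], change_time: datetime = LOGIC_CHANGE_TIME
) -> Dict[str, List[Row]]:
    """Group rows by whether the old or the new feature logic produced them."""
    groups: Dict[str, List[Row]] = {"old_logic": [], "new_logic": []}
    for row in rows:
        # Assumption: rows at or after the cutoff used the new logic.
        key = "new_logic" if row["timestamp"] >= change_time else "old_logic"
        groups[key].append(row)
    return groups
```

A training job could then train only on the `new_logic` rows, or compare the two groups before deciding whether the mixed history is acceptable.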

Option 2: Create variants of Tecton feature view/feature service and re-train model

  1. In Tecton, create new variants of the feature view and feature service leveraging Tecton’s variant functionality. Name the new feature view variant transaction_metrics:v2 and the new feature service variant fd_service:v2. The new feature view variant contains updated feature transformation logic.
  2. Update the corresponding model
    • (a) Create a new training dataset with the updated feature view and train the model accordingly.
    • (b) Update the pre-processing logic within the inference image of the model endpoint to get features from the new feature service variant (fd_service:v2) and point to the newly trained model artifact. Create a new version of the model endpoint: fraud_detection:v2. Deploy the updated inference image on your model serving layer (model endpoint).
  3. Run integration tests, shadow traffic, and other deployment best practices to ensure that the model endpoint pointing to the new feature service is working as expected.

Create a New Variant with Updated Transformation Logic

  4. Leverage your go-to deployment strategy (e.g. blue-green) to deploy the new model endpoint to application traffic.
  5. Sunset previous objects.

Sunset the Previous Objects

Summary​

  • The size and impact of the logical change is the deciding factor between options 1 and 2. If this is an edge case for an unimportant feature then the time and cost of option 2 will not be worth it. If the logical change affects a significant portion of records for an important feature, then the potential model performance improvements gained from option 2 will be worth it.

  • Option 2 re-materializes all features within the feature view, so updating the feature logic this way has cost implications. Tecton highly recommends a test-case-driven development approach to reduce how often this scenario occurs.

  • When defining your features, if you have a set of features from a given data source where the business logic changes frequently, it may make sense to split these features into a separate feature view.
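
To gauge the size of a logic change before choosing between options 1 and 2, you can run the old and new logic over a sample of records and count how many outputs differ. The helper below is a hypothetical sketch in plain Python, and the refund example in the usage note is invented for illustration.

```python
from typing import Any, Callable, Iterable, Mapping

Record = Mapping[str, Any]


def affected_fraction(
    records: Iterable[Record],
    old_logic: Callable[[Record], Any],
    new_logic: Callable[[Record], Any],
) -> float:
    """Share of records whose feature value differs between the two logics."""
    sample = list(records)
    if not sample:
        return 0.0
    changed = sum(1 for record in sample if old_logic(record) != new_logic(record))
    return changed / len(sample)
```

For example, if the old logic counts a refund (a negative amount) as spend and the new logic clamps it to zero, a sample with amounts 10, -5, 20, and 30 has one affected record out of four, giving a fraction of 0.25.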

Deleting features in an existing feature view​

Scenario​

The ML organization identifies 3 features that need to be deleted from the feature store to prevent use by downstream ML models.

Implementation​

All models and feature services that depend on these 3 features need to be updated using the variant approach described under "Create variants of Tecton feature view and feature service".
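
Before deleting features, it helps to list which feature services use them. The sketch below assumes you can export a mapping from each feature service to the feature names it serves. How that mapping is obtained is outside the sketch, and the function and feature names are hypothetical.

```python
from typing import AbstractSet, Dict, Mapping, Set


def impacted_feature_services(
    service_features: Mapping[str, AbstractSet[str]],
    features_to_delete: AbstractSet[str],
) -> Dict[str, Set[str]]:
    """Return each affected service with the doomed features it depends on."""
    impacted: Dict[str, Set[str]] = {}
    for service, features in service_features.items():
        overlap = set(features) & set(features_to_delete)
        if overlap:
            impacted[service] = overlap
    return impacted
```

Every service in the result, and every model that reads from it, needs a new variant before the features can be removed.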

Upstream Data Change​

Scenario​

The upstream data source table, transaction_history, is migrated from Redshift to Databricks with a slightly different schema that requires casting, renaming, and small transformations to clean the data to ensure it is consistent with the previous source.

Implementation​

Leverage the approach outlined in modifying the existing feature view without re-materialization.
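
The casting, renaming, and cleaning can be thought of as a mapping from the migrated schema back onto the previous one. The sketch below shows the idea on a single row in plain Python. Every column name, rename, and cast in it is hypothetical, and in practice this logic would live in whatever transformation layer reads the new source.

```python
from datetime import datetime
from typing import Any, Callable, Dict

# Hypothetical column renames: migrated name -> previous name.
RENAMES: Dict[str, str] = {
    "userId": "user_id",
    "txn_amount": "amount",
    "txn_ts": "timestamp",
}

# Hypothetical casts and clean-ups, keyed by the previous column name.
CASTS: Dict[str, Callable[[Any], Any]] = {
    "user_id": str,
    "amount": float,
    "timestamp": datetime.fromisoformat,
    "transaction_type": lambda value: value.strip().lower(),
}


def to_previous_schema(row: Dict[str, Any]) -> Dict[str, Any]:
    """Rename and cast one migrated row so it matches the previous schema."""
    renamed = {RENAMES.get(name, name): value for name, value in row.items()}
    return {
        name: CASTS[name](value) if name in CASTS else value
        for name, value in renamed.items()
    }
```

If the normalized rows match what the previous source produced, the feature output is unchanged and no re-materialization is needed.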

Scenario: Breaking Change​

The owners of the data source table transaction_history change the data type of a field from numeric to string without notifying the downstream teams, causing the materialization job to fail. Later that day, the owners revert the change to the transaction_history table.

Implementation​

None; Tecton will self-heal and retry once the fix has been made to the upstream data source.

Scenario: Data Change​

The organization adds the ability to accept cryptocurrency for payments, and a new transaction_type value, cryptocurrency, is now being populated in the transaction_history table.

Implementation​

  1. If the upstream data changes in a way that functionally changes the outputted feature data, follow the same steps as when adding features to an existing feature view.

  2. If the upstream data change does not impact the outputted feature data, make the change to the existing feature view leveraging the --suppress-recreates option to minimize cost and downtime.
