Data Quality Validations
This feature is currently in Public Preview.
- Available for Tecton on Databricks and EMR. Coming to Tecton on Snowflake in a future release.
Data Quality Validations help detect feature data issues once a Feature View has
been materialized. If validation results indicate that feature data failed to
meet expectations during a materialization interval, an alert email is sent to
the address provided as `alert_email` in the Feature View declaration.
Terminology
- Data Quality Metrics are statistics that describe feature values output by a Feature View during materialization. See Data Quality Metrics for more information.
- Expectations are verifiable assertions about metrics. For example, “Expect that fewer than 100% of values for a given feature are null”.
- Validations check that the set of expectations was met when a Feature View was materialized. Each validation either passes or fails.
- Alerts notify the specified user when validation fails.
This document covers Data Quality Expectations, Validations, and Alerts.
Default Expectations
By default, Tecton defines the following expectations for all Batch and Stream Feature Views.
For Stream Feature Views, Data Quality Metrics and Expectations only apply to offline materialized feature data.
Expectation | Applicable to | Explanation |
---|---|---|
Feature View row count > 0 | Feature Views | Expect feature rows to be produced when a Feature View is materialized |
A feature has any non-null values | All types of features | Expect a feature to have at least one non-null value, when there are feature rows. |
A feature has any non-zero values | Numerical features | Expect a feature to have at least one non-zero value, when there are feature rows. |
A feature has any non-empty values | String or Array features | Expect a feature to have at least one non-empty-string/array value when there are feature rows. |
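The four default expectations in the table can be sketched in plain Python. This is an illustration only, not Tecton's implementation: the function name `check_default_expectations`, the list-of-dicts input, and the result keys are all hypothetical, and feature types are inferred from the values here, whereas a real system would presumably read them from the Feature View schema.

```python
from numbers import Number
from typing import Any, Dict, List


def check_default_expectations(rows: List[Dict[str, Any]]) -> Dict[str, bool]:
    """Evaluate the default expectations on one batch of feature rows.

    `rows` is a hypothetical in-memory stand-in for the rows produced in one
    materialization interval; each dict maps a feature name to its value.
    """
    results = {"row_count > 0": len(rows) > 0}
    if not rows:
        # The per-feature expectations only apply when there are feature rows.
        return results

    feature_names = sorted({name for row in rows for name in row})
    for name in feature_names:
        non_null = [row.get(name) for row in rows if row.get(name) is not None]
        results[f"{name}: any non-null"] = len(non_null) > 0
        if not non_null:
            # Assumption: without a schema, the type of an all-null feature
            # is unknown, so the type-specific checks are skipped.
            continue

        # bool is a Number subclass in Python, so exclude it explicitly.
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in non_null):
            results[f"{name}: any non-zero"] = any(v != 0 for v in non_null)
        elif all(isinstance(v, (str, list)) for v in non_null):
            results[f"{name}: any non-empty"] = any(len(v) > 0 for v in non_null)
    return results
```

A validation for the interval would pass only if every value in the returned dict is `True`; any `False` entry corresponds to a failed expectation.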
Enable Alert Emails
Email alerting is enabled when `alert_email` is specified in a Batch or Stream
Feature View definition. An alert email is sent at most once every 6 hours per
Feature View. To disable email alerts, leave this field unset.
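The "at most once every 6 hours per Feature View" behavior can be pictured with the small throttle below. It is a sketch under stated assumptions, not how Tecton implements alerting: the `AlertThrottle` class and its method are hypothetical, and the text does not say whether the 6-hour window is rolling or fixed, so a rolling window measured from the last email sent is assumed.

```python
from datetime import datetime, timedelta
from typing import Dict

ALERT_INTERVAL = timedelta(hours=6)  # from the text: at most one email per 6 hours


class AlertThrottle:
    """Tracks the last alert time per Feature View (hypothetical helper)."""

    def __init__(self, interval: timedelta = ALERT_INTERVAL) -> None:
        self.interval = interval
        self._last_sent: Dict[str, datetime] = {}

    def should_send(self, feature_view: str, now: datetime) -> bool:
        """Return True and record `now` if an alert may be sent for this Feature View."""
        last = self._last_sent.get(feature_view)
        if last is not None and now - last < self.interval:
            # Assumed rolling window: still within 6 hours of the last email.
            return False
        self._last_sent[feature_view] = now
        return True
```

Because the last-sent time is tracked per Feature View name, a failed validation on one Feature View does not suppress alerts for another.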
Viewing Validation Results
You can view the validation results for all Feature Views in a workspace by selecting Data Quality in the left navigation panel of the Tecton web UI.