Data Quality Validation
Data Quality Validation helps detect feature data issues once a Feature View has
been materialized. If validation results indicate that feature data failed to
meet expectations during a materialization interval, an alert email is sent
to the address specified as alert_email in the Feature View declaration.
Terminology
Tecton Data Quality is made up of several components:
- Metrics describe feature values output by a Feature View during materialization. For example, the percentage of values for a given feature that are null.
- Expectations are verifiable assertions about feature outputs. Expectations can be based on metrics. For example, “Expect that <100% of values for a given feature are null”.
- Validations check whether the set of expectations has been met for a Feature View. A validation either passes or fails.
- Alerts notify the specified user when validation fails.
This document covers Data Quality Expectations, Validations, and Alerts.
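To make the relationship between these terms concrete, here is a minimal sketch in plain Python. It is not Tecton's implementation or API: the function names `null_percentage`, `expect_not_all_null`, `validate`, and `alert_messages` are hypothetical and exist only to illustrate how a metric feeds an expectation, how expectations are evaluated in a validation, and how a failed validation leads to an alert.

```python
from typing import Dict, List, Optional, Sequence


def null_percentage(values: Sequence[Optional[object]]) -> float:
    """Metric: percentage of values that are None (0.0 for empty input)."""
    if not values:
        return 0.0
    nulls = sum(1 for v in values if v is None)
    return 100.0 * nulls / len(values)


def expect_not_all_null(values: Sequence[Optional[object]]) -> bool:
    """Expectation: fewer than 100% of the values are null."""
    return null_percentage(values) < 100.0


def validate(features: Dict[str, Sequence[Optional[object]]]) -> Dict[str, bool]:
    """Validation: evaluate the expectation per feature (True means passed)."""
    return {name: expect_not_all_null(vals) for name, vals in features.items()}


def alert_messages(results: Dict[str, bool], alert_email: str) -> List[str]:
    """Alert: build one message per failed feature (hypothetical format)."""
    return [
        f"To {alert_email}: expectation failed for feature '{name}'"
        for name, passed in results.items()
        if not passed
    ]


# Hypothetical feature columns from one materialization interval.
features = {"amount": [1.0, None, 4.0], "country": [None, None, None]}
results = validate(features)
messages = alert_messages(results, "owner@example.com")
```

In this sketch `amount` passes because only some values are null, while `country` fails because every value is null, so a single alert message is produced.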
Default Expectations
By default, Tecton defines the following expectations for all Batch and Stream Feature Views.
For Stream Feature Views, Data Quality Metrics and Expectations only apply to offline materialized feature data.
Expectation | Applicable to | Explanation |
---|---|---|
Feature View row count > 0 | Feature Views | Expect feature rows to be produced when a Feature View is materialized |
A feature has any non-null values | All types of features | Expect a feature to have at least one non-null value, when there are feature rows. |
A feature has any non-zero values | Numerical features | Expect a feature to have at least one non-zero value, when there are feature rows. |
A feature has any non-empty values | String or Array features | Expect a feature to have at least one non-empty-string/array value when there are feature rows. |
User-specified expectations are not supported. If you are interested in them, please let us know by filing a feature request.
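The default expectations in the table above can be approximated with a short check over materialized rows. The sketch below is an illustration under stated assumptions, not how Tecton evaluates them: rows are modeled as a list of dictionaries, the helper `check_default_expectations` is a hypothetical name, and a feature's type is inferred from its non-null values.

```python
from numbers import Number
from typing import Dict, List, Sequence


def check_default_expectations(
    rows: Sequence[Dict[str, object]], feature_names: Sequence[str]
) -> List[str]:
    """Return descriptions of failed expectations; an empty list means all passed."""
    # Feature View row count > 0; per-feature checks only apply when rows exist.
    if len(rows) == 0:
        return ["Feature View row count > 0"]

    failures = []
    for name in feature_names:
        non_null = [row.get(name) for row in rows if row.get(name) is not None]
        if not non_null:
            failures.append(f"{name}: has any non-null values")
            continue
        # Numerical features (bool is excluded even though it subclasses int).
        if all(isinstance(v, Number) and not isinstance(v, bool) for v in non_null):
            if all(v == 0 for v in non_null):
                failures.append(f"{name}: has any non-zero values")
        # String or Array features.
        elif all(isinstance(v, (str, list)) for v in non_null):
            if all(len(v) == 0 for v in non_null):
                failures.append(f"{name}: has any non-empty values")
    return failures


# Hypothetical materialized rows in which every feature violates an expectation.
bad_rows = [
    {"amount": 0, "tags": [], "name": None},
    {"amount": 0.0, "tags": [], "name": None},
]
bad_failures = check_default_expectations(bad_rows, ["amount", "tags", "name"])
```

Here `amount` fails the non-zero check, `tags` fails the non-empty check, and `name` fails the non-null check. A single row such as `{"amount": 3, "tags": ["a"], "name": "x"}` would produce no failures.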
Enable Validation Alert Email
Validation alert emails are enabled when alert_email is specified in a Batch or
Stream Feature View definition. Validation runs right after Feature View
materialization, but an alert email is sent at most once every 6 hours per
Feature View. To disable email alerts, leave this field unset.
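The "at most once every 6 hours per Feature View" behavior can be pictured as a per-Feature-View throttle. The following is a minimal sketch of that idea only: the `AlertThrottle` class and its `should_send` method are hypothetical names, and nothing here reflects how Tecton actually schedules or tracks alert emails.

```python
from datetime import datetime, timedelta
from typing import Dict

# Assumed interval, taken from the 6-hour limit described above.
ALERT_INTERVAL = timedelta(hours=6)


class AlertThrottle:
    """Allow at most one alert per Feature View within each interval."""

    def __init__(self, interval: timedelta = ALERT_INTERVAL) -> None:
        self.interval = interval
        self._last_sent: Dict[str, datetime] = {}

    def should_send(self, feature_view: str, now: datetime) -> bool:
        """Return True and record the send time if an alert is allowed now."""
        last = self._last_sent.get(feature_view)
        if last is not None and now - last < self.interval:
            return False  # an alert for this Feature View was sent too recently
        self._last_sent[feature_view] = now
        return True


throttle = AlertThrottle()
t0 = datetime(2024, 1, 1, 0, 0)
first = throttle.should_send("user_stats", t0)                            # allowed
repeat = throttle.should_send("user_stats", t0 + timedelta(hours=1))      # suppressed
other = throttle.should_send("merchant_stats", t0 + timedelta(hours=1))   # tracked separately
later = throttle.should_send("user_stats", t0 + timedelta(hours=6))       # allowed again
```

Because the throttle is keyed by Feature View name, a failure in one Feature View does not suppress alerts for another.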
Viewing Validation Results
You can access the validation results for applicable Feature Views from the left navigation panel in the Tecton web UI.