Defining Features
This section covers how to define and manage features in Tecton using the feature engineering framework. Tecton lets you define features in SQL, PySpark, Snowpark, or Python, and then handles the orchestration and maintenance of your batch, streaming, and real-time data pipelines.
We recommend starting with the Tecton Framework Overview to get a comprehensive understanding of Tecton's operating principles.
The key concepts covered in this section include:
Entities: The objects or concepts that have features associated with them, such as Customer or Product. Entities provide a way to organize and join features and to avoid duplicating them.
Feature Tables: Allow you to ingest pre-computed feature values into Tecton. Unlike Feature Views, you are responsible for transforming raw data into feature values and ingesting them into Tecton.
Feature Views: Package together a transformation pipeline, entities, configuration, and metadata needed to manage one or more related features. There are three types:
- Batch Feature Views: Transform a Batch Data Source and materialize features on a schedule.
- Stream Feature Views: Transform a Stream Data Source and materialize features in near real-time.
- On-Demand Feature Views: Run request-time transformations on Batch, Stream, or Request data sources.
Feature Services: Group together features from Feature Views and provide a REST API and methods for generating training data.
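To make the relationships between these concepts concrete, here is a minimal sketch in plain Python. It does not use the Tecton SDK: the `Entity`, `FeatureView`, and `FeatureService` classes below, along with names such as `spend_stats` and `fraud_service`, are hypothetical stand-ins invented for illustration, and the real SDK objects carry far more configuration (data sources, schedules, materialization settings).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


# Hypothetical stand-ins for illustration only; NOT Tecton SDK classes.
@dataclass(frozen=True)
class Entity:
    name: str
    join_keys: Tuple[str, ...]  # columns used to join features together


@dataclass
class FeatureView:
    name: str
    entity: Entity
    transform: Callable[[Dict], Dict]  # raw row -> feature values


@dataclass
class FeatureService:
    name: str
    feature_views: List[FeatureView]

    def get_features(self, raw_row: Dict) -> Dict:
        """Run every view's transformation and namespace the results."""
        out = {}
        for fv in self.feature_views:
            for feature, value in fv.transform(raw_row).items():
                out[f"{fv.name}.{feature}"] = value
        return out


customer = Entity(name="customer", join_keys=("customer_id",))


def spend_stats(row: Dict) -> Dict:
    amounts = row["amounts"]
    return {"total_spend": sum(amounts), "txn_count": len(amounts)}


def order_flags(row: Dict) -> Dict:
    return {"is_large": row["order_amount"] > 100}


service = FeatureService(
    name="fraud_service",
    feature_views=[
        FeatureView("customer_spend", customer, spend_stats),
        FeatureView("order_flags", customer, order_flags),
    ],
)

features = service.get_features(
    {"customer_id": "c1", "amounts": [20, 30, 50], "order_amount": 120}
)
print(features)
```

In this sketch, both views share one entity, so their features can be joined on `customer_id`, and the service is simply the named group of views that a model consumes. A Feature Table would correspond to skipping `transform` and supplying already-computed values directly.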
This documentation gives an overview of each concept, explains when and how to use it, and includes examples and best practices for building scalable feature pipelines on Tecton. Use the table of contents on the left to explore any topic.