tecton.PushSource
Summary
A Tecton PushSource, used to configure the Tecton Online Ingest API for use in a StreamFeatureView.
PushSource is currently in private preview; please contact Tecton support if
you are interested in participating in the preview.
A PushSource may also contain an optional batch config for backfilling and offline training data generation.
Example
```python
from tecton import HiveConfig, PushSource
from tecton.types import Field, Int64, String, Timestamp

# Declare a schema for the Push Source
input_schema = [
    Field(name="user_id", dtype=String),
    Field(name="event_timestamp", dtype=Timestamp),
    Field(name="clicked", dtype=Int64),
]

# Declare a PushSource with name, schema, and batch_config parameters.
# See the API documentation for BatchConfig.
click_event_source = PushSource(
    name="click_event_source",
    schema=input_schema,
    batch_config=HiveConfig(
        database="demo_ads",
        table="impressions_batch",
    ),
    description="Sample Push Source for click events",
    owner="pooja@tecton.ai",
    tags={"release": "staging"},
)
```
Attributes
| Name | Data Type | Description | 
|---|---|---|
| data_delay | Optional[datetime.timedelta] | Returns the duration that materialization jobs wait after the batch_schedule before starting, typically to ensure that all data has landed. | 
| description | Optional[str] | Returns the description of the Tecton object. | 
| id | str | Returns the unique id of the Tecton object. | 
| info | | | 
| is_streaming | | Deprecated. | 
| name | str | Returns the name of the Tecton object. | 
| owner | Optional[str] | Returns the owner of the Tecton object. | 
| tags | Dict[str,str] | Returns the tags of the Tecton object. | 
| workspace | Optional[str] | Returns the workspace that this Tecton object belongs to. | 
Methods
| Name | Description | 
|---|---|
| __init__(...) | Creates a new PushSource. | 
| get_columns() | Returns the column names of the data source’s push schema. | 
| get_dataframe(...) | Returns the data in this Data Source as a Tecton DataFrame. | 
| summary() | Displays a human readable summary of this Data Source. | 
| validate() | Validate this Tecton object and its dependencies (if any). | 
__init__(...)
Creates a new PushSource.
Parameters
- name (str) – A unique name of the DataSource.
- description (Optional[str]) – A human-readable description. (Default: None)
- tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Object (key-value pairs of arbitrary metadata). (Default: None)
- owner (Optional[str]) – Owner name (typically the email of the primary maintainer). (Default: None)
- prevent_destroy (bool) – If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply. To remove or update this object, prevent_destroy must first be set to False via a separate tecton apply. prevent_destroy can be used to prevent accidental changes, such as inadvertently deleting a Feature Service used in production or recreating a Feature View that triggers expensive rematerialization jobs. prevent_destroy also blocks changes to dependent Tecton objects that would trigger a re-create of the tagged object, e.g. if prevent_destroy is set on a Feature Service, that will also prevent deletions or re-creates of Feature Views used in that service. prevent_destroy is only enforced in live (i.e. non-dev) workspaces. (Default: False)
- schema (List[Field]) – A schema for the PushSource.
- batch_config (Union[FileConfig, HiveConfig, RedshiftConfig, SnowflakeConfig, SparkBatchConfig, None]) – An optional BatchConfig object containing the configuration of the Batch Data Source that backs this Tecton Push Source. The Batch Source's schema must contain a superset of all the columns defined in the Push Source schema. (Default: None)
get_columns()
Returns the column names of the data source’s push schema.
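For instance, get_columns() can serve as a quick sanity check on the declared push schema. A sketch, assuming the click_event_source object from the example above and a configured Tecton environment:

```python
# Inspect the push schema columns of click_event_source.
columns = click_event_source.get_columns()
print(columns)  # expected to list: user_id, event_timestamp, clicked
```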
get_dataframe(...)
Returns the data in this Data Source as a Tecton DataFrame.
Parameters
- start_time (Optional[datetime]) – The interval start time from when we want to retrieve source data. If no timezone is specified, defaults to UTC. Can only be defined if apply_translator is True. (Default: None)
- end_time (Optional[datetime]) – The interval end time until when we want to retrieve source data. If no timezone is specified, defaults to UTC. Can only be defined if apply_translator is True. (Default: None)
- apply_translator (bool) – If True, the transformation specified by post_processor will be applied to the dataframe for the data source. apply_translator is not applicable to batch sources configured with spark_batch_config because it does not have a post_processor. (Default: None)
Returns
A Tecton DataFrame containing the data source’s raw or translated source data.
Raises
TectonValidationError – If apply_translator is False, but start_time or
end_time filters are passed in.
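As an illustration, the batch data backing the Push Source can be pulled for a bounded time window. A sketch, assuming the click_event_source object from the example above; running it requires a configured Tecton environment:

```python
from datetime import datetime, timezone

# Retrieve source data for a one-day window. Timezone-aware
# datetimes make the UTC default explicit. Time filters
# require apply_translator=True.
df = click_event_source.get_dataframe(
    start_time=datetime(2023, 5, 1, tzinfo=timezone.utc),
    end_time=datetime(2023, 5, 2, tzinfo=timezone.utc),
    apply_translator=True,
)
```

Passing start_time or end_time with apply_translator=False raises the TectonValidationError described above.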
summary()
Displays a human readable summary of this Data Source.
validate()
Validate this Tecton object and its dependencies (if any).
Validation performs most of the same checks and operations as tecton plan.
- Check for invalid object configurations, e.g. setting conflicting fields. 
- For Data Sources and Feature Views, test query code and derive schemas, e.g. test that a Data Source’s specified S3 path exists or that a Feature View’s SQL code executes and produces supported feature data types. 
Objects already applied to Tecton do not need to be re-validated on retrieval
(e.g. my_workspace.get_feature_view('my_fv')) since they have already been
validated during tecton plan.
Locally defined objects (e.g. my_ds = BatchSource(name="my_ds", ...)) may need
to be validated before some of their methods can be called (e.g.
my_feature_view.get_historical_features()).
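For a locally defined object, validation can be triggered explicitly before interactive use. A sketch, assuming the click_event_source object from the example above and a configured Tecton environment:

```python
# Validate the locally defined Push Source before calling
# methods such as get_dataframe() in a notebook session.
click_event_source.validate()

# Once validated, display a human-readable summary.
click_event_source.summary()
```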