tecton.Transformation
Summary
A Tecton Transformation. Transformations are used to encapsulate and share transformation logic between Feature Views.
Use the tecton.transformation() decorator to create a Transformation.
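A minimal sketch of defining a Transformation with the decorator; the mode, function name, and SQL are illustrative, not taken from this reference:

```python
from tecton import transformation

# Illustrative spark_sql-mode Transformation. In this mode the function
# returns a SQL string, with inputs referenced as f-string placeholders.
@transformation(mode="spark_sql")
def double_amount(input_data):
    return f"SELECT user_id, amount * 2 AS amount_doubled FROM {input_data}"
```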
Attributes
Name | Data Type | Description |
---|---|---|
description | str | Returns the description of the Tecton object. |
id | str | Returns the unique id of the Tecton object. |
info | | |
name | str | Returns the name of the Tecton object. |
owner | Optional[str] | Returns the owner of the Tecton object. |
tags | Dict[str, str] | Returns the tags of the Tecton object. |
transformer | | The user function for this transformation. |
workspace | Optional[str] | Returns the workspace that this Tecton object belongs to. |
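These attributes are read directly off the object. A hedged usage sketch, continuing the illustrative transformation defined above:

```python
# Reading metadata attributes off a Transformation (values are illustrative).
print(double_amount.name)  # the Transformation's unique name
print(double_amount.tags)  # tags passed at definition time, if any
```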
Methods
Name | Description |
---|---|
__init__(...) | Creates a new Transformation. |
run(...) | Run the transformation against inputs. |
summary() | Displays a human-readable summary of this Transformation. |
validate() | Validate this Tecton object and its dependencies (if any). |
__init__(...)
Creates a new Transformation. Use the @transformation decorator to create a Transformation instead of directly using this constructor.
Parameters
- name (str) – A unique name of the Transformation.
- description (Optional[str]) – A human-readable description.
- tags (Optional[Dict[str, str]]) – Tags associated with this Tecton Transformation (key-value pairs of arbitrary metadata).
- owner (Optional[str]) – Owner name (typically the email of the primary maintainer).
- prevent_destroy (bool) – If True, this Tecton object will be blocked from being deleted or re-created (i.e. a destructive update) during tecton plan/apply.
- user_function (Callable[..., Union[str, DataFrame]]) – The user function for this transformation.
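Since the decorator is the recommended way to construct a Transformation, these parameters are normally supplied there. A hedged sketch with illustrative values (the tag keys, owner email, and SQL are assumptions, not from this reference):

```python
from tecton import transformation

@transformation(
    mode="spark_sql",
    description="Doubles the transaction amount.",
    tags={"team": "fraud-detection"},    # illustrative tag
    owner="maintainer@example.com",      # illustrative owner
    prevent_destroy=False,
)
def double_amount_described(input_data):
    return f"SELECT user_id, amount * 2 AS amount_doubled FROM {input_data}"
```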
run(...)
Run the transformation against inputs.
Currently, this method only supports spark_sql, pyspark, and pandas modes.
Parameters
- *inputs (Union[pandas.DataFrame, pandas.Series, TectonDataFrame, pyspark.sql.DataFrame, str, int, float, bool]) – Positional arguments to the transformation function. For PySpark and SQL transformations, these are either pandas.DataFrame or pyspark.sql.DataFrame objects. For on-demand transformations, these are pandas.DataFrame objects.
- context (Optional[BaseMaterializationContext]) – An optional materialization context object. (Default: None)
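A minimal usage sketch, assuming the illustrative spark_sql transformation defined earlier and a working Spark environment; the input column names and the to_pandas() call on the result are assumptions:

```python
import pandas as pd

# Illustrative input; column names match the SQL in the example above.
df = pd.DataFrame({"user_id": [1, 2], "amount": [10.0, 20.0]})

# Run the transformation directly against the in-memory input.
result = double_amount.run(df)
print(result.to_pandas())  # assumes the returned object exposes to_pandas()
```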
summary()
Displays a human-readable summary of this Transformation.
validate()
Validate this Tecton object and its dependencies (if any).
Validation performs most of the same checks and operations as tecton plan:
- Check for invalid object configurations, e.g. setting conflicting fields.
- For Data Sources and Feature Views, test query code and derive schemas, e.g. test that a Data Source’s specified S3 path exists or that a Feature View’s SQL code executes and produces supported feature data types.
Objects already applied to Tecton do not need to be re-validated on retrieval (e.g. fv = tecton.get_workspace('prod').get_feature_view('my_fv')) since they have already been validated during tecton plan. Locally defined objects (e.g. my_ds = BatchSource(name="my_ds", ...)) may need to be validated before some of their methods can be called, e.g. my_feature_view.get_historical_features().
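A short sketch of the local-validation flow described above, reusing the illustrative transformation from earlier; the point is simply that validate() is called on a locally defined object before interactive use:

```python
# Validate the locally defined transformation before calling run() on it.
# (double_amount is the illustrative transformation defined earlier.)
double_amount.validate()

# After validation succeeds, interactive methods such as run() can be used.
```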