Importing Python Modules and Objects into Transformations
Importing Python modules into transformations
Transformations support only the pandas and numpy modules, and these modules
can be used only in Pandas transformations.
Python modules must be imported inside the transformation function.
Avoid using aliases for imports (e.g. use import pandas instead of
import pandas as pd).
Any modules used for type annotations in function signatures must be imported outside the function.
In the following example, the pandas module is imported in two places:
- Inside of the transformation function, because the function uses the pandas module
- Outside of the transformation function, because pandas type annotations are used in the function signature (my_transformation(request: pandas.DataFrame) -> pandas.DataFrame:)
from tecton import transformation
import pandas  # required for type hints on my_transformation.

@transformation(mode="pandas")
def my_transformation(request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas  # required for pandas.DataFrame() below.

    df = pandas.DataFrame()
    df["amount_is_high"] = (request["amount"] >= 10000).astype("int64")
    return df
Importing Python objects into transformation functions
Objects must be imported outside of the transformation function.
The following imports of objects into transformation functions are allowed:
- Functions
- Constants
The following imports of objects into transformation functions are not allowed:
- Classes
- Class instances
- Enums
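Because enums cannot be imported but constants can, one way to carry a fixed set of named values into a transformation is to define them as plain module-level constants. The following is a minimal sketch under that assumption. The module name risk_levels and every name in it are hypothetical and are not part of Tecton.

```python
# risk_levels.py (hypothetical helper module, not part of Tecton)

# Plain module-level constants stand in for the members of an Enum.
RISK_LOW = "low"
RISK_HIGH = "high"
HIGH_AMOUNT_THRESHOLD = 10000


def risk_label(amount):
    """Return the risk constant that applies to a single amount."""
    return RISK_HIGH if amount >= HIGH_AMOUNT_THRESHOLD else RISK_LOW
```

A feature repository could then import risk_label and RISK_HIGH from this module outside of the transformation function, in line with the rules listed above.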
In the following example, my_func, my_int_const, my_string_const, and
my_dict_const are imported from my_local_module. The import takes place
outside of the transformation function.
from tecton import transformation
import pandas  # required for type hints on my_transformation.
from my_local_module import my_func, my_int_const, my_string_const, my_dict_const

@transformation(mode="pandas")
def my_transformation(request: pandas.DataFrame) -> pandas.DataFrame:
    import pandas  # required for pandas.DataFrame() below.

    df = pandas.DataFrame()
    df[my_dict_const["resultval"]] = my_func(request[my_string_const] >= my_int_const)
    return df
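The example does not show my_local_module itself. A minimal sketch of what it might contain follows. The values are assumptions chosen to reproduce the amount_is_high feature from the first example, and the module defines only functions and constants, in line with the import rules above.

```python
# my_local_module.py (hypothetical contents; all values are illustrative assumptions)

my_int_const = 10000                             # threshold compared against the column
my_string_const = "amount"                       # name of the input column to read
my_dict_const = {"resultval": "amount_is_high"}  # maps a key to the output column name


def my_func(flags):
    """Cast a boolean pandas Series (or any object with .astype) to int64."""
    return flags.astype("int64")
```

With these assumed values, the transformation writes 1 to the amount_is_high column for rows whose amount is at least 10000 and 0 otherwise.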