Version: Beta 🚧

Transformation Modes

What is a transformation mode?

A transformation mode specifies the language or API in which a transformation is written. For example, in spark_sql mode a transformation is written in SQL, while in pyspark mode it is written using the PySpark DataFrame API.

This page describes the transformation modes that are supported by transformations defined inside and outside of Feature Views.

The examples show transformations defined inside of Feature Views.

Modes for Batch Feature Views and Stream Feature Views

mode="spark_sql" and mode="snowflake_sql"

Characteristic | Description
Summary | Contains a SQL query
Supported Feature View types | Batch Feature View, Stream Feature View (mode="snowflake_sql" is not supported in Stream Feature Views)
Supported data platforms | Databricks, EMR, Snowflake
Input type | A string (the name of a view generated by Tecton)
Output type | A string containing the SQL query

Example

@batch_feature_view(
    mode="spark_sql",
    # ...
)
def user_has_good_credit(credit_scores):
    return f"""
        SELECT
            user_id,
            IF (credit_score > 670, 1, 0) as user_has_good_credit,
            date as timestamp
        FROM
            {credit_scores}
        """

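Because a spark_sql transformation only builds a string, its body can be exercised as plain Python. The following is a minimal sketch without the Tecton decorator; the view name `credit_scores_view` is a hypothetical stand-in for whatever name Tecton generates at run time.

```python
def user_has_good_credit(credit_scores: str) -> str:
    # credit_scores is the name of a view (a string), not a DataFrame.
    return f"""
        SELECT
            user_id,
            IF (credit_score > 670, 1, 0) AS user_has_good_credit,
            date AS timestamp
        FROM
            {credit_scores}
    """


# Hypothetical view name, used here only for illustration.
query = user_has_good_credit("credit_scores_view")
print(query)
```

The printed query shows the view name substituted into the FROM clause, which is the string the transformation returns.
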
mode="pyspark"

Characteristic | Description
Summary | Contains Python code that is executed within a Spark context
Supported Feature View types | Batch Feature View, Stream Feature View
Supported data platforms | Databricks, EMR
Input type | A Spark DataFrame or a Tecton constant
Output type | A Spark DataFrame
Notes | Third-party libraries can be included in user-defined PySpark functions if your cluster allows them.

Example

@batch_feature_view(
    mode="pyspark",
    # ...
)
def user_has_good_credit(credit_scores):
    from pyspark.sql import functions as F

    df = credit_scores.withColumn(
        "user_has_good_credit",
        F.when(credit_scores["credit_score"] > 670, 1).otherwise(0),
    )
    return df.select("user_id", df["date"].alias("timestamp"), "user_has_good_credit")

mode="snowpark"

Characteristic | Description
Summary | Contains Python code that is executed in Snowpark, using the Snowpark API for Python
Supported Feature View types | Batch Feature View
Supported data platforms | Snowflake
Input type | A snowflake.snowpark.DataFrame or a Tecton constant
Output type | A snowflake.snowpark.DataFrame
Notes | The transformation function can call functions that are defined in Snowflake.

Example

@batch_feature_view(
    mode="snowpark",
    # ...
)
def user_has_good_credit(credit_scores):
    from snowflake.snowpark.functions import when, col

    df = credit_scores.withColumn(
        "user_has_good_credit",
        when(col("credit_score") > 670, 1).otherwise(0),
    )
    return df.select("user_id", "user_has_good_credit", "timestamp")

Modes for On Demand Feature Views

mode="pandas"

Characteristic | Description
Summary | Contains Python code that operates on a Pandas DataFrame
Supported Feature View types | On Demand Feature View
Supported data platforms | Databricks, EMR, Snowflake
Input type | A Pandas DataFrame or a Tecton constant
Output type | A Pandas DataFrame

Example

@on_demand_feature_view(
    mode="pandas",
    # ...
)
def transaction_amount_is_high(transaction_request):
    import pandas as pd

    df = pd.DataFrame()
    df["transaction_amount_is_high"] = (transaction_request["amount"] >= 10000).astype("int64")
    return df
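
The logic of a pandas mode transformation can be checked locally before it is wrapped in a Feature View. The following is a minimal sketch that runs the same function body without the Tecton decorator; the sample amounts are made up for illustration.

```python
import pandas as pd


def transaction_amount_is_high(transaction_request: pd.DataFrame) -> pd.DataFrame:
    # Same body as the example above, minus the Tecton decorator.
    df = pd.DataFrame()
    df["transaction_amount_is_high"] = (transaction_request["amount"] >= 10000).astype("int64")
    return df


# Made-up request data: one row per request.
sample_request = pd.DataFrame({"amount": [50.0, 10000.0, 25000.0]})
result = transaction_amount_is_high(sample_request)
print(result["transaction_amount_is_high"].tolist())  # [0, 1, 1]
```

The input and the output are both Pandas DataFrames, and the output has one row per input row.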

mode="python"

Characteristic | Description
Summary | Contains Python code that operates on a dictionary
Supported Feature View types | On Demand Feature View
Supported data platforms | Databricks, EMR, Snowflake
Input type | A dictionary
Output type | A dictionary

Example

@on_demand_feature_view(
    mode="python",
    # ...
)
def user_age(request, user_date_of_birth):
    from datetime import datetime

    # Drop the time zone from the request timestamp so it can be subtracted
    # from the date of birth, which is assumed to carry no time zone.
    request_datetime = datetime.fromisoformat(request["timestamp"]).replace(tzinfo=None)
    dob_datetime = datetime.fromisoformat(user_date_of_birth["USER_DATE_OF_BIRTH"])

    td = request_datetime - dob_datetime

    # The age is returned in days.
    return {"user_age": td.days}
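
Since a python mode transformation takes dictionaries and returns a dictionary, it can be tried out with hand-written inputs. The following is a minimal sketch without the Tecton decorator; the sample timestamp and date of birth are made up, and both are assumed to be ISO 8601 strings.

```python
from datetime import datetime


def user_age(request: dict, user_date_of_birth: dict) -> dict:
    # Same logic as the example above, minus the Tecton decorator.
    request_datetime = datetime.fromisoformat(request["timestamp"]).replace(tzinfo=None)
    dob_datetime = datetime.fromisoformat(user_date_of_birth["USER_DATE_OF_BIRTH"])
    return {"user_age": (request_datetime - dob_datetime).days}


# Made-up inputs for illustration.
sample_request = {"timestamp": "2023-01-01T00:00:00+00:00"}
sample_features = {"USER_DATE_OF_BIRTH": "2000-01-01"}
result = user_age(sample_request, sample_features)
print(result)  # {'user_age': 8401}
```

The result is the age in days: 23 years of 365 days plus 6 leap days between the two dates.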
