Version: Beta 🚧

Transformation Modes

What is a transformation mode?

A transformation mode specifies the format in which a transformation must be written. For example, in `spark_sql` mode a transformation is written in SQL, while in `pyspark` mode it is written using the PySpark DataFrame API.

This page describes the transformation modes that are supported by transformations defined inside and outside of Feature Views.

The examples show transformations defined inside of Feature Views.

Modes for Batch Feature Views and Stream Feature Views

`mode="spark_sql"` and `mode="snowflake_sql"`

Characteristic	Description
Summary	Contains a SQL query
Supported Feature View types	Batch Feature View, Stream Feature View. `mode="snowflake_sql"` is not supported in Stream Feature Views.
Supported data platforms	Databricks, EMR, Snowflake
Input type	A string (the name of a view generated by Tecton)
Output type	A string (the SQL query)

Example

Spark

@batch_feature_view(
    mode="spark_sql",
    # ...
)
def user_has_good_credit(credit_scores):
    return f"""
        SELECT
            user_id,
            IF (credit_score > 670, 1, 0) as user_has_good_credit,
            date as timestamp
        FROM
            {credit_scores}
        """

Snowflake

@batch_feature_view(
    mode="snowflake_sql",
    # ...
)
def user_has_good_credit(credit_scores):
    return f"""
        SELECT
            user_id,
            IFF (credit_score > 670, 1, 0) as user_has_good_credit,
            date as timestamp
        FROM
            {credit_scores}
        """

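Because a SQL-mode transformation is an ordinary Python function that returns a query string, one way to inspect the query is to call an undecorated copy of the function locally. The sketch below assumes a placeholder view name, `credit_scores_view`, which is hypothetical; at run time Tecton supplies the name of the view it generates.

```python
def user_has_good_credit(credit_scores: str) -> str:
    # Same body as the spark_sql example above, without the Tecton decorator.
    return f"""
        SELECT
            user_id,
            IF (credit_score > 670, 1, 0) as user_has_good_credit,
            date as timestamp
        FROM
            {credit_scores}
        """


# "credit_scores_view" is a made-up placeholder, not a name Tecton produces.
sql = user_has_good_credit("credit_scores_view")
print(sql)
```

The printed text is the query with the view name substituted into the `FROM` clause, which makes it easy to spot templating mistakes before applying the Feature View.
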
`mode="pyspark"`

Characteristic	Description
Summary	Contains Python code that is executed within a Spark context.
Supported Feature View types	Batch Feature View, Stream Feature View
Supported data platforms	Databricks, EMR
Input type	A Spark `DataFrame` or a Tecton constant
Output type	A Spark `DataFrame`
Notes	Third-party libraries can be used in user-defined PySpark functions if your cluster allows them.

Example

@batch_feature_view(
    mode="pyspark",
    # ...
)
def user_has_good_credit(credit_scores):
    from pyspark.sql import functions as F

    df = credit_scores.withColumn(
        "user_has_good_credit",
        F.when(credit_scores["credit_score"] > 670, 1).otherwise(0),
    )
    return df.select("user_id", df["date"].alias("timestamp"), "user_has_good_credit")

`mode="snowpark"`

Characteristic	Description
Summary	Contains Python code that is executed in Snowpark, using the Snowpark API for Python.
Supported Feature View types	Batch Feature View
Supported data platforms	Snowflake
Input type	A `snowflake.snowpark.DataFrame` or a Tecton constant
Output type	A `snowflake.snowpark.DataFrame`
Notes	The transformation function can call functions that are defined in Snowflake.

Example

@batch_feature_view(
    mode="snowpark",
    # ...
)
def user_has_good_credit(credit_scores):
    from snowflake.snowpark.functions import when, col

    df = credit_scores.withColumn("user_has_good_credit", when(col("credit_score") > 670, 1).otherwise(0))
    return df.select("user_id", "user_has_good_credit", "timestamp")

Modes for On Demand Feature Views

`mode="pandas"`

Characteristic	Description
Summary	Contains Python code that operates on a Pandas `DataFrame`
Supported Feature View types	On Demand Feature View
Supported data platforms	Databricks, EMR, Snowflake
Input type	A Pandas `DataFrame` or a Tecton constant
Output type	A Pandas `DataFrame`

Example

@on_demand_feature_view(
    mode="pandas",
    # ...
)
def transaction_amount_is_high(transaction_request):
    import pandas as pd

    df = pd.DataFrame()
    df["transaction_amount_is_high"] = (transaction_request["amount"] >= 10000).astype("int64")
    return df
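
The logic of a pandas-mode transformation can be checked outside Tecton by running an undecorated copy of the function on a hand-built `DataFrame`. This is a minimal sketch: the sample `amount` values are made up, and it assumes pandas is installed locally.

```python
import pandas as pd


def transaction_amount_is_high(transaction_request: pd.DataFrame) -> pd.DataFrame:
    # Same body as the example above, without the Tecton decorator.
    df = pd.DataFrame()
    df["transaction_amount_is_high"] = (transaction_request["amount"] >= 10000).astype("int64")
    return df


# Hypothetical request data: one row per request, one column per request field.
sample = pd.DataFrame({"amount": [500, 10000, 25000]})
features = transaction_amount_is_high(sample)
print(features["transaction_amount_is_high"].tolist())
```

Because the comparison is `>=`, an amount of exactly 10000 is flagged as high.
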

`mode="python"`

Characteristic	Description
Summary	Contains Python code that operates on a dictionary
Supported Feature View types	On Demand Feature View
Supported data platforms	Databricks, EMR, Snowflake
Input type	A dictionary
Output type	A dictionary

Example

@on_demand_feature_view(
    mode="python",
    # ...
)
def user_age(request, user_date_of_birth):
    from datetime import datetime

    # Drop the timezone so the request time can be subtracted from the naive date of birth.
    request_datetime = datetime.fromisoformat(request["timestamp"]).replace(tzinfo=None)
    dob_datetime = datetime.fromisoformat(user_date_of_birth["USER_DATE_OF_BIRTH"])

    td = request_datetime - dob_datetime

    # Note: the age is returned in days, not years.
    return {"user_age": td.days}
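
A python-mode transformation takes dictionaries and returns a dictionary, so an undecorated copy can be exercised with plain literals. The sketch below uses made-up input values; the key names follow the example above, and the ISO 8601 string formats are an assumption about what the request and the upstream feature contain.

```python
from datetime import datetime


def user_age(request, user_date_of_birth):
    # Same body as the example above, without the Tecton decorator.
    request_datetime = datetime.fromisoformat(request["timestamp"]).replace(tzinfo=None)
    dob_datetime = datetime.fromisoformat(user_date_of_birth["USER_DATE_OF_BIRTH"])
    td = request_datetime - dob_datetime
    return {"user_age": td.days}


# Hypothetical inputs: a request timestamp with a UTC offset and a date-only birth date.
result = user_age(
    {"timestamp": "2024-01-01T00:00:00+00:00"},
    {"USER_DATE_OF_BIRTH": "2000-01-01"},
)
print(result)  # {'user_age': 8766}
```

The result is 8766 because `td.days` counts days: 24 years of 365 days plus 6 leap days between the two dates.
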
