Version: 0.4

Selecting Your Online Store

Introduction

DynamoDB and Redis are key-value stores that offer low-latency retrieval and the ability to perform high-throughput reads and writes. This makes them ideal for real-time online serving use cases.

Tecton supports the following for use as an online store:
  • DynamoDB
  • Redis, using Amazon ElastiCache or Redis Enterprise Cloud. This document covers Redis with Amazon ElastiCache, which is referred to as "Redis".

You can select which online store to use, per Feature View.

In the following sections, we compare different attributes of DynamoDB and Redis.

Functionality comparison

The following table lists the functionality that Tecton supports with DynamoDB and Redis.

DynamoDB | Redis
Entity deletion for GDPR
Data deletion when feature views are deleted
Monitoring in Tecton WebUI
Stream Feature Views
Batch Feature Views
Tecton built-in aggregations
Feature Tables
Customer Managed
Tecton Managed
Autoscaling
Provisioned Mode ❌ *
TTL Based deletion ❌ *
Global Replication ❌ *
Point in time restore
Cost attribution per feature view
Durability ❌ **

* → Functionality is available in DynamoDB or Redis but is not supported by Tecton. If you are interested in Tecton supporting the functionality, please file a feature request.

** → Redis is not durable, as it stores all its data in memory. However, we suggest that customers enable replication, so the cluster can fail over, and enable daily snapshots.

Latency comparison

DynamoDB

The following table shows the distribution of the read latency, between Tecton and DynamoDB, for a large number of read requests. Here, latency is defined as the sum of:

  • The time DynamoDB waits to receive the request from Tecton.
  • The time taken by DynamoDB to process the request.
  • The time Tecton waits to receive the response from DynamoDB.

| Percentile | Latency value per request |
| --- | --- |
| p50 | 3 - 4 ms |
| p90 | 6 - 8 ms |
| p95 | 8 - 10 ms |
| p99 | 20 - 25 ms |
| p999 | 60 - 120 ms |

Redis

The following table shows the distribution of the read latency, between Tecton and Redis, for a large number of read requests. Here, latency is defined as the sum of:

  • The time Redis waits to receive the request from Tecton.
  • The time taken by Redis to process the request.
  • The time Tecton waits to receive the response from Redis.

| Percentile | Latency value per request |
| --- | --- |
| p50 | 600 - 700 µs |
| p90 | 1.5 - 1.7 ms |
| p95 | 1.8 - 2.0 ms |
| p99 | 2.5 - 3.0 ms |
| p999 | 9.0 - 12.0 ms |

Compared to DynamoDB, Redis offers lower latency and significantly better tail latencies. Lower p99 and p999 latencies allow you to retrieve features for a larger candidate set in your latency budget, for use cases such as recommendation systems.
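To make the tail-latency point concrete, the sketch below estimates how many back-to-back online store requests fit in a fixed latency budget when each request is budgeted at its p99. The 50 ms budget, the strictly sequential request model, and the function name are assumptions made for illustration; the p99 values are the upper bounds from the two tables above. Real serving paths may batch or parallelize lookups, so this is only a rough comparison.

```python
def lookups_within_budget(budget_ms: float, per_request_ms: float) -> int:
    """Return how many sequential requests fit in the latency budget."""
    if per_request_ms <= 0:
        raise ValueError("per_request_ms must be positive")
    return int(budget_ms // per_request_ms)


# Assumed end-to-end budget for feature retrieval (illustrative only).
BUDGET_MS = 50.0

# Upper-bound p99 read latencies taken from the tables above.
P99_MS = {"DynamoDB": 25.0, "Redis": 3.0}

for store, p99_ms in P99_MS.items():
    count = lookups_within_budget(BUDGET_MS, p99_ms)
    print(f"{store}: about {count} sequential requests within {BUDGET_MS} ms")
```

Under these assumptions the lower Redis p99 leaves room for several times as many requests in the same budget.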

Cost comparison

Online store cost is typically affected by three factors:

  • Read volume
  • Write volume
  • Dataset size

DynamoDB

Tecton uses DynamoDB in on-demand mode, where you pay for the reads and writes performed and for the amount of data stored. Tecton uses eventually consistent reads, so one query from Tecton consumes 0.5 RRU (read request units) instead of 1 RRU. Find DynamoDB pricing details on this page, where you can calculate the online store cost for a desired read volume, write volume, and dataset size.
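The on-demand pricing model can be sketched as a small estimator. The prices below are assumptions (USD per million request units and per GB-month, with a free storage allowance); check the current AWS price list for your region. The sketch also assumes every read fits in one read request unit before the eventually consistent discount and every write fits in one write request unit; larger items consume more units. The function name is hypothetical.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600

# Assumed on-demand prices in USD; verify against the AWS price list.
PRICE_PER_MILLION_RRU = 0.25
PRICE_PER_MILLION_WRU = 1.25
PRICE_PER_GB_MONTH = 0.25
FREE_STORAGE_GB = 25.0  # assumed free storage allowance

# Eventually consistent reads consume half a read request unit (see text).
RRU_PER_READ = 0.5
WRU_PER_WRITE = 1.0  # assumes each written item is at most 1 KB


def dynamodb_yearly_cost(read_qps: float, write_qps: float, dataset_gb: float) -> float:
    """Estimate the yearly on-demand cost for steady average traffic."""
    rru = read_qps * SECONDS_PER_YEAR * RRU_PER_READ
    wru = write_qps * SECONDS_PER_YEAR * WRU_PER_WRITE
    read_cost = rru / 1_000_000 * PRICE_PER_MILLION_RRU
    write_cost = wru / 1_000_000 * PRICE_PER_MILLION_WRU
    billable_gb = max(dataset_gb - FREE_STORAGE_GB, 0.0)
    storage_cost = billable_gb * PRICE_PER_GB_MONTH * 12
    return read_cost + write_cost + storage_cost


print(round(dynamodb_yearly_cost(read_qps=100, write_qps=10, dataset_gb=50), 2))
```

With these assumed prices, 100 read QPS, 10 write QPS, and 50 GB come to about $863 per year, close to the first DynamoDB row of the scenario analysis in this document. Small differences from the published figures are expected because the exact rounding and storage conventions used there are not stated.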

Redis

Redis is priced based on the cluster size and uptime of the cluster. Additionally, while not required, we suggest customers have one replica per primary shard. This can double costs.

In our scenario analysis, we use Amazon ElastiCache in cluster mode with one replica for every primary shard. ElastiCache pricing details are linked here. While you can choose any node type, most of our customers use cache.m5.2xlarge and cache.m5.4xlarge.

Since Redis has node-based pricing, a precise online store cost cannot be calculated for a desired read volume, write volume, and dataset size. For our Redis cost calculations, we estimate that one cache.m5.2xlarge shard can handle up to 18,000 QPS of aggregate read and write traffic and hold up to 18 GB of data in memory; whichever limit is reached first determines the number of shards. This assumes that CPU and memory utilization should not go over 75%. We also strongly suggest one read replica per primary shard, and our cost calculations account for this.
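The sizing rule above can be sketched as follows: compute the shards needed for traffic and for memory, take the larger, and double the node count for replicas. The per-shard limits come from the text; the hourly node price is an assumption for cache.m5.2xlarge and should be checked against the current ElastiCache price list. The function and constant names are hypothetical.

```python
import math

# Per-shard capacity estimates from the text (cache.m5.2xlarge, 75% utilization cap).
QPS_PER_SHARD = 18_000
GB_PER_SHARD = 18.0
REPLICAS_PER_SHARD = 1

# Assumed on-demand hourly price in USD for one cache.m5.2xlarge node.
HOURLY_PRICE_USD = 0.623
HOURS_PER_YEAR = 365 * 24


def redis_cluster_estimate(read_qps: float, write_qps: float, dataset_gb: float):
    """Return (node_count, yearly_cost_usd) for the sizing rule above."""
    shards_for_traffic = math.ceil((read_qps + write_qps) / QPS_PER_SHARD)
    shards_for_memory = math.ceil(dataset_gb / GB_PER_SHARD)
    # The cluster is bound by whichever resource needs more shards.
    shards = max(1, shards_for_traffic, shards_for_memory)
    nodes = shards * (1 + REPLICAS_PER_SHARD)
    yearly_cost = nodes * HOURLY_PRICE_USD * HOURS_PER_YEAR
    return nodes, yearly_cost


nodes, cost = redis_cluster_estimate(read_qps=100_000, write_qps=10_000, dataset_gb=50)
print(nodes, round(cost, 2))
```

For 110,000 aggregate QPS and 50 GB, traffic requires 7 shards while memory requires only 3, so the cluster is CPU bound at 14 nodes.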

Scenario analysis

The following two tables and their accompanying graphs show scenarios for DynamoDB and Redis costs with two varying factors:

  • Query volume (read and write)
  • Dataset size

Read QPS, write QPS, and data size costs for DynamoDB were obtained from the AWS website (the DynamoDB pricing details linked above), on February 6, 2023.

The cost of the cache.m5.2xlarge node type for Redis was obtained from the AWS website (the ElastiCache pricing details linked above), on February 6, 2023.

Varying query volume with constant dataset size

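| Online Store | Average Read QPS | Average Write QPS | Dataset Size | Yearly Cost | Cost Details |
| --- | --- | --- | --- | --- | --- |
| DynamoDB | 100 | 10 | 50 GB | $864 | |
| DynamoDB | 1,000 | 100 | 50 GB | $7,960 | |
| DynamoDB | 10,000 | 1,000 | 50 GB | $78,916 | |
| DynamoDB | 100,000 | 10,000 | 50 GB | $788,476 | |
| Redis | 100 | 10 | 50 GB | $32,745 | Node Type: cache.m5.2xlarge. Number of Nodes: 6 (3 primary shards, each with 1 replica). Cluster is memory bound |
| Redis | 1,000 | 100 | 50 GB | $32,745 | Same cluster as above |
| Redis | 10,000 | 1,000 | 50 GB | $32,745 | Same cluster as above |
| Redis | 100,000 | 10,000 | 50 GB | $76,404 | Node Type: cache.m5.2xlarge. Number of Nodes: 14 (7 primary shards, each with 1 replica). Cluster is CPU bound |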

Figure: Varying query volume at constant dataset size (50 GB)

Varying dataset size with a constant query volume

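| Online Store | Average Read QPS | Average Write QPS | Dataset Size | Yearly Cost | Cost Details |
| --- | --- | --- | --- | --- | --- |
| DynamoDB | 1,000 | 100 | 10 GB | $7,838 | |
| DynamoDB | 1,000 | 100 | 100 GB | $8,112 | |
| DynamoDB | 1,000 | 100 | 500 GB | $9,329 | |
| Redis | 1,000 | 100 | 10 GB | $10,915 | Node Type: cache.m5.2xlarge. Number of Nodes: 2 (1 primary shard with 1 replica) |
| Redis | 1,000 | 100 | 100 GB | $65,490 | Node Type: cache.m5.2xlarge. Number of Nodes: 12 (6 primary shards, each with 1 replica) |
| Redis | 1,000 | 100 | 500 GB | $305,619 | Node Type: cache.m5.2xlarge. Number of Nodes: 56 (28 primary shards, each with 1 replica) |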

Figure: Varying dataset size at constant query volume (1,000 Read QPS and 100 Write QPS)

Cost analysis summary

  • For low query volumes: DynamoDB is significantly cheaper than Redis.
  • For medium query volumes: Redis is marginally cheaper than DynamoDB.
  • For high query volumes: Redis is significantly cheaper than DynamoDB.
  • For medium to large dataset sizes: DynamoDB is significantly cheaper than Redis.

In some situations, you can use a node type that is cheaper than cache.m5.2xlarge, such as one that is compute optimized, memory optimized, or has SSD storage.

Additional DynamoDB costs

Backfilling data in DynamoDB can be expensive due to heavy write traffic. For example, backfilling 100 GB of data spread over 10,000,000 rows costs roughly $150, because it requires 10,000,000 writes of 10 KB each.
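The reason large rows make backfills costly is that a standard DynamoDB write consumes one write request unit per 1 KB of item size, rounded up. The sketch below applies that rule to the example above. The price per million write request units is an assumption, and the function name is hypothetical. At the assumed price the writes alone come to $125, the same order of magnitude as the rough figure quoted above; this simple model does not account for the difference.

```python
import math

# Assumed on-demand price in USD per million write request units (WRU).
PRICE_PER_MILLION_WRU = 1.25

# A standard write consumes one WRU per 1 KB of item size, rounded up.
KB_PER_WRU = 1.0


def backfill_write_cost(rows: int, item_size_kb: float) -> float:
    """Estimate the on-demand write cost of backfilling `rows` items."""
    wru_per_write = math.ceil(item_size_kb / KB_PER_WRU)
    total_wru = rows * wru_per_write
    return total_wru / 1_000_000 * PRICE_PER_MILLION_WRU


# 10,000,000 rows of 10 KB each, which is 100 GB in total.
print(backfill_write_cost(rows=10_000_000, item_size_kb=10))
```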

Operational overhead

DynamoDB

DynamoDB is available in two capacity modes: provisioned and on-demand. Tecton supports only on-demand mode. In this mode, DynamoDB automatically scales to meet the needs of your workload as it increases or decreases; you do not have to manually provision or scale resources. For these reasons, operational overhead with DynamoDB is low compared to Redis.

Redis

Redis clusters need to be manually provisioned and scaled to meet changing workload needs. In addition, memory management is required to improve cluster performance. For more information, see Managing your Redis Cluster.

For these reasons, operational overhead with Redis is high compared to DynamoDB.

Comparison summary

  • Redis can provide lower latencies, as compared to DynamoDB, and is useful for workloads where single-digit ms latency is needed.
  • DynamoDB is cheaper than Redis for low query volumes as well as moderate to large data set sizes.
  • Redis is cheaper than DynamoDB only for workloads with very high query volumes and low to moderate data sizes.
  • DynamoDB has significantly less operational overhead than Redis.

Specifying the online store to use, per Feature View

In a Batch Feature View or a Stream Feature View, you can specify which online store to use by setting the online_store parameter to either a DynamoConfig() or RedisConfig() object. If online_store is not specified, the Feature View uses DynamoDB as the online store.
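As an illustration, a Batch Feature View that stores its online features in Redis might look like the fragment below. Only the online_store parameter and the DynamoConfig() and RedisConfig() objects come from the text above. The remaining decorator arguments, the transactions data source, the user entity, and the SQL are assumptions for illustration and may differ in your SDK version and feature repository, so this fragment is not runnable on its own.

```python
from datetime import datetime, timedelta

from tecton import RedisConfig, batch_feature_view

# `transactions` (a batch data source) and `user` (an entity) are hypothetical
# objects assumed to be defined elsewhere in the feature repository.


@batch_feature_view(
    sources=[transactions],
    entities=[user],
    mode="spark_sql",
    online=True,
    offline=True,
    feature_start_time=datetime(2023, 1, 1),
    batch_schedule=timedelta(days=1),
    ttl=timedelta(days=30),
    # Store this Feature View's online features in Redis.
    # Omit online_store, or pass DynamoConfig(), to use DynamoDB instead.
    online_store=RedisConfig(),
)
def user_last_transaction_amount(transactions):
    return f"""
        SELECT user_id, timestamp, amount
        FROM {transactions}
    """
```

Because the setting is per Feature View, latency-sensitive or high-volume Feature Views can use Redis while the rest stay on the DynamoDB default.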
