Latin Hypercube Design

Feature Store Freshness

Latin Hypercube of 4 feature store parameters for serving latency and feature freshness

Summary

This experiment uses a Latin Hypercube design over 4 feature store parameters to study how they affect serving latency and feature freshness.

The design varies 4 factors: materialization_interval_m (1 to 60 min), cache_ttl_s (10 to 300 s), batch_size (100 to 10000 rows), and online_replicas (1 to 6). The goal is to minimize 2 responses: serving_latency_ms (ms) and freshness_lag_min (min). Two conditions are held constant across all runs: offline_store = s3_parquet and online_store = redis.

Latin Hypercube Sampling was used to space 10 runs across the 4-dimensional factor space with good coverage and minimal gaps, making it ideal for computer experiments where the response surface may be complex.
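The stratification idea behind Latin Hypercube Sampling can be sketched in a few lines of standard-library Python. This is an illustrative sketch only, not the DOE Helper Tool's implementation: the function name latin_hypercube and the seed are assumptions, and only the factor ranges come from this page. Each factor's range is cut into 10 equal strata, and every stratum is used exactly once per factor.

```python
import random

# Factor ranges copied from the Factors table on this page.
FACTORS = {
    "materialization_interval_m": (1.0, 60.0),
    "cache_ttl_s": (10.0, 300.0),
    "batch_size": (100.0, 10000.0),
    "online_replicas": (1.0, 6.0),
}


def latin_hypercube(factors, n_runs, seed=None):
    """Return n_runs rows; each factor visits each of its n_runs strata once.

    Hypothetical helper for illustration, not part of the doe CLI.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in factors.items():
        strata = list(range(n_runs))
        rng.shuffle(strata)  # independent random stratum order per factor
        width = (high - low) / n_runs
        # One uniform draw inside each stratum.
        columns[name] = [low + (s + rng.random()) * width for s in strata]
    return [{name: columns[name][i] for name in factors} for i in range(n_runs)]


design = latin_hypercube(FACTORS, n_runs=10, seed=42)
for run_no, row in enumerate(design, start=1):
    print(run_no, {name: round(value, 4) for name, value in row.items()})
```

The values printed here will not match the matrix on this page, since the tool's sampler and seed handling may differ.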

Key Findings

For serving_latency_ms, the analysis assigns all four factors the same contribution (25.0% each), so it does not single out a most influential factor. The best observed value was 1.1 ms (at materialization_interval_m = 39.8455, cache_ttl_s = 221.076, batch_size = 7226.38, online_replicas = 5.69694).

For freshness_lag_min, all four factors again receive 25.0% each. The best observed value was -0.2 min (at materialization_interval_m = 11.1795, cache_ttl_s = 210.907, batch_size = 4240.79, online_replicas = 3.83861).
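The identical 25.0% shares look like a consequence of the design rather than of the system: every factor takes a distinct value in each run, so each "level" contains a single observation. The sketch below reproduces the reported effect, sum of squares, mean square, and standard error for serving_latency_ms from the ten response values listed in the full analysis output. The formulas are assumptions inferred from the reported numbers, not taken from the tool's source.

```python
import math

# serving_latency_ms for the 10 runs, as listed in the summary statistics
# of the full analysis output on this page.
latency = [7.3, 3.3, 1.2, 3.4, 13.2, 12.9, 1.4, 7.8, 6.1, 1.1]

n = len(latency)
mean = sum(latency) / n
total_ss = sum((y - mean) ** 2 for y in latency)
df = n - 1
mean_square = total_ss / df

# Assumption: effect = highest level mean - lowest level mean. With one run
# per level, the level means are the raw observations, so every factor gets
# the same value: the full range of the response.
effect = max(latency) - min(latency)

# Assumption: the reported Std Error is sqrt(MS / n).
std_error = math.sqrt(mean_square / n)

print(f"effect={effect:.4f} SS={total_ss:.4f} MS={mean_square:.4f} SE={std_error:.4f}")
# effect=12.1000 SS=186.1210 MS=20.6801 SE=1.4381
```

Because each factor's sum of squares equals the total sum of squares, no degrees of freedom remain for error, which is consistent with the blank F and p-value columns in the ANOVA tables.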

Recommended Next Steps

Experimental Setup

Factors

Factor                       Low   High    Unit
materialization_interval_m   1     60      min
cache_ttl_s                  10    300     s
batch_size                   100   10000   rows
online_replicas              1     6       count

Fixed: offline_store = s3_parquet, online_store = redis

Responses

Response             Direction    Unit
serving_latency_ms   ↓ minimize   ms
freshness_lag_min    ↓ minimize   min

Configuration

use_cases/46_feature_store_freshness/config.json
{
  "metadata": {
    "name": "Feature Store Freshness",
    "description": "Latin Hypercube of 4 feature store parameters for serving latency and feature freshness"
  },
  "factors": [
    { "name": "materialization_interval_m", "levels": ["1", "60"], "type": "continuous", "unit": "min" },
    { "name": "cache_ttl_s", "levels": ["10", "300"], "type": "continuous", "unit": "s" },
    { "name": "batch_size", "levels": ["100", "10000"], "type": "continuous", "unit": "rows" },
    { "name": "online_replicas", "levels": ["1", "6"], "type": "continuous", "unit": "count" }
  ],
  "fixed_factors": {
    "offline_store": "s3_parquet",
    "online_store": "redis"
  },
  "responses": [
    { "name": "serving_latency_ms", "optimize": "minimize", "unit": "ms" },
    { "name": "freshness_lag_min", "optimize": "minimize", "unit": "min" }
  ],
  "settings": {
    "operation": "latin_hypercube",
    "test_script": "use_cases/46_feature_store_freshness/sim.sh"
  }
}
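Note that the config stores each factor's levels as strings. A small sketch of how a script might read such a config and turn the levels into numeric bounds is shown here; the helper name factor_bounds and the trimmed inline config are illustrative assumptions, not part of the doe CLI.

```python
import json

# Trimmed copy of the config above (two of the four factors), inlined so the
# example is self-contained.
CONFIG_TEXT = """
{
  "factors": [
    {"name": "materialization_interval_m", "levels": ["1", "60"],
     "type": "continuous", "unit": "min"},
    {"name": "cache_ttl_s", "levels": ["10", "300"],
     "type": "continuous", "unit": "s"}
  ],
  "responses": [
    {"name": "serving_latency_ms", "optimize": "minimize", "unit": "ms"}
  ]
}
"""


def factor_bounds(config):
    """Map factor name -> (low, high) as floats, rejecting inverted ranges."""
    bounds = {}
    for factor in config["factors"]:
        low, high = (float(level) for level in factor["levels"])
        if low >= high:
            raise ValueError(f"{factor['name']}: low must be below high")
        bounds[factor["name"]] = (low, high)
    return bounds


config = json.loads(CONFIG_TEXT)
bounds = factor_bounds(config)
print(bounds)
```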

Experimental Matrix

The Latin Hypercube Design produces 10 runs. Each row is one experiment with specific factor settings.

Run   materialization_interval_m   cache_ttl_s   batch_size   online_replicas
1     23.7626                      34.7302       6312.9       4.85806
2     4.65909                      173.938       5056.2       1.34696
3     37.3373                      138.469       9952.39      3.76598
4     50.6236                      103.511       854.404      4.06335
5     48.1072                      77.393        2532.27      2.40346
6     33.8919                      56.4223       4505.17      3.13861
7     11.0284                      191.75        3222.72      5.92413
8     28.8017                      252.505       1183.33      5.25669
9     17.3975                      240.757       7101.59      1.55125
10    55.9749                      74.9791       8960.08      2.81131

Step-by-Step Workflow

1. Preview the design

Terminal
$ doe info --config use_cases/46_feature_store_freshness/config.json

2. Generate the runner script

Terminal
$ doe generate --config use_cases/46_feature_store_freshness/config.json \
    --output use_cases/46_feature_store_freshness/results/run.sh --seed 42

3. Execute the experiments

Terminal
$ bash use_cases/46_feature_store_freshness/results/run.sh

4. Analyze results

Terminal
$ doe analyze --config use_cases/46_feature_store_freshness/config.json

5. Get optimization recommendations

Terminal
$ doe optimize --config use_cases/46_feature_store_freshness/config.json

6. Multi-objective optimization

With 2 competing responses, use --multi to find the best compromise via Derringer–Suich desirability.

Terminal
$ doe optimize --config use_cases/46_feature_store_freshness/config.json --multi

7. Generate the HTML report

Terminal
$ doe report --config use_cases/46_feature_store_freshness/config.json \
    --output use_cases/46_feature_store_freshness/results/report.html

Features Exercised

Feature        Value
Design type    latin_hypercube
Factor types   continuous (all 4)
Arg style      double-dash
Responses      2 (serving_latency_ms ↓, freshness_lag_min ↓)
Total runs     10

Analysis Results

Generated from actual experiment runs using the DOE Helper Tool.

Response: serving_latency_ms

All four factors tie at 25.0% contribution: materialization_interval_m, cache_ttl_s, batch_size, and online_replicas.

ANOVA

Source                       DF   SS         MS        F   p-value
materialization_interval_m   9    186.1210   20.6801
cache_ttl_s                  9    186.1210   20.6801
batch_size                   9    186.1210   20.6801
online_replicas              9    186.1210   20.6801
Error (Lenth PSE)            0    0.0000     0.0000
Total                        9    186.1210   20.6801

Pareto Chart

Pareto chart for serving_latency_ms

Main Effects Plot

Main effects plot for serving_latency_ms

Normal Probability Plot of Effects

Normal probability plot for serving_latency_ms

Half-Normal Plot of Effects

Half-normal plot for serving_latency_ms

Model Diagnostics

Model diagnostics for serving_latency_ms

Response: freshness_lag_min

All four factors tie at 25.0% contribution: materialization_interval_m, cache_ttl_s, batch_size, and online_replicas.

ANOVA

Source                       DF   SS         MS        F   p-value
materialization_interval_m   9    143.4810   15.9423
cache_ttl_s                  9    143.4810   15.9423
batch_size                   9    143.4810   15.9423
online_replicas              9    143.4810   15.9423
Error (Lenth PSE)            0    0.0000     0.0000
Total                        9    143.4810   15.9423

Pareto Chart

Pareto chart for freshness_lag_min

Main Effects Plot

Main effects plot for freshness_lag_min

Normal Probability Plot of Effects

Normal probability plot for freshness_lag_min

Half-Normal Plot of Effects

Half-normal plot for freshness_lag_min

Model Diagnostics

Model diagnostics for freshness_lag_min

Response Surface Plots

3D surfaces fitted with quadratic RSM. Red dots are observed data points.

RSM surface: freshness_lag_min, batch_size vs online_replicas
RSM surface: freshness_lag_min, cache_ttl_s vs batch_size
RSM surface: freshness_lag_min, cache_ttl_s vs online_replicas
RSM surface: freshness_lag_min, materialization_interval_m vs batch_size
RSM surface: freshness_lag_min, materialization_interval_m vs cache_ttl_s
RSM surface: freshness_lag_min, materialization_interval_m vs online_replicas
RSM surface: serving_latency_ms, batch_size vs online_replicas
RSM surface: serving_latency_ms, cache_ttl_s vs batch_size
RSM surface: serving_latency_ms, cache_ttl_s vs online_replicas
RSM surface: serving_latency_ms, materialization_interval_m vs batch_size
RSM surface: serving_latency_ms, materialization_interval_m vs cache_ttl_s
RSM surface: serving_latency_ms, materialization_interval_m vs online_replicas

Multi-Objective Optimization

When responses compete, Derringer–Suich desirability finds the best compromise. Each response is scaled to a 0–1 desirability, then combined via a weighted geometric mean.
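The combination step can be checked by hand. The sketch below takes the per-response desirabilities and weights reported in the multi-objective output on this page (0.7893 with weight 1.0, 0.9545 with weight 1.5) and combines them with a weighted geometric mean. The function name overall_desirability is an assumption for illustration; how the tool scales each response to 0–1 is not shown on this page and is not reproduced here.

```python
import math


def overall_desirability(desirabilities, weights):
    """Weighted geometric mean: D = (prod d_i ** w_i) ** (1 / sum w_i)."""
    if any(d <= 0 for d in desirabilities):
        return 0.0  # one fully undesirable response drives D to zero
    total_weight = sum(weights)
    log_sum = sum(w * math.log(d) for d, w in zip(desirabilities, weights))
    return math.exp(log_sum / total_weight)


# Values as reported for serving_latency_ms and freshness_lag_min.
D = overall_desirability([0.7893, 0.9545], [1.0, 1.5])
print(round(D, 4))  # 0.8846
```

The geometric mean is what makes this a compromise measure: a response with desirability near zero pulls D toward zero no matter how good the other response is.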

Overall Desirability
D = 0.8846

Per-Response Desirability

Response             Weight   Desirability   Predicted   Dir
serving_latency_ms   1.0      0.7893         3.30 ms     ↓
freshness_lag_min    1.5      0.9545         -0.20 min   ↓

Recommended Settings

Factor                       Value
materialization_interval_m   50.8848 min
cache_ttl_s                  135.321 s
batch_size                   7640.26 rows
online_replicas              4.98895 count

Source: from observed run #1

Trade-off Summary

Sacrifice = how much worse than single-objective best.

Response             Predicted   Best Observed   Sacrifice
serving_latency_ms   3.30        1.10            +2.20
freshness_lag_min    -0.20       -0.20           +0.00
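The sacrifice column is plain arithmetic on the values reported in the multi-objective output. A minimal sketch, assuming sacrifice is measured so that a positive number always means "worse than the single-objective best" (the maximize branch is an assumption, since both responses here are minimized):

```python
def sacrifice(predicted, best_observed, direction="minimize"):
    """How much worse the compromise is than the single-objective best."""
    if direction == "minimize":
        return predicted - best_observed
    if direction == "maximize":
        return best_observed - predicted
    raise ValueError(f"unknown direction: {direction}")


# Values from the trade-off summary in the multi-objective output.
print(f"{sacrifice(3.30, 1.10):+.2f}")    # +2.20 (serving_latency_ms)
print(f"{sacrifice(-0.20, -0.20):+.2f}")  # +0.00 (freshness_lag_min)
```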

Top 3 Runs by Desirability

Run   D        Factor Settings
#1    0.8846   materialization_interval_m=50.8848, cache_ttl_s=135.321, batch_size=7640.26, online_replicas=4.98895
#6    0.8827   materialization_interval_m=39.2028, cache_ttl_s=44.4917, batch_size=401.765, online_replicas=2.39833
#3    0.8210   materialization_interval_m=8.10386, cache_ttl_s=185.242, batch_size=6454.55, online_replicas=2.69511

Model Quality

Response             R²       Type
serving_latency_ms   0.1848   linear
freshness_lag_min    0.3923   linear

Full Multi-Objective Output

doe optimize --multi
============================================================
MULTI-OBJECTIVE OPTIMIZATION
Method: Derringer-Suich Desirability Function
============================================================

Overall desirability: D = 0.8846

Response             Weight   Desirability   Predicted   Direction
---------------------------------------------------------------------
serving_latency_ms   1.0      0.7893         3.30 ms     ↓
freshness_lag_min    1.5      0.9545         -0.20 min   ↓

Recommended settings:
  materialization_interval_m = 50.8848 min
  cache_ttl_s = 135.321 s
  batch_size = 7640.26 rows
  online_replicas = 4.98895 count
  (from observed run #1)

Trade-off summary:
  serving_latency_ms: 3.30 (best observed: 1.10, sacrifice: +2.20)
  freshness_lag_min: -0.20 (best observed: -0.20, sacrifice: +0.00)

Model quality:
  serving_latency_ms: R² = 0.1848 (linear)
  freshness_lag_min: R² = 0.3923 (linear)

Top 3 observed runs by overall desirability:
  1. Run #1 (D=0.8846): materialization_interval_m=50.8848, cache_ttl_s=135.321, batch_size=7640.26, online_replicas=4.98895
  2. Run #6 (D=0.8827): materialization_interval_m=39.2028, cache_ttl_s=44.4917, batch_size=401.765, online_replicas=2.39833
  3. Run #3 (D=0.8210): materialization_interval_m=8.10386, cache_ttl_s=185.242, batch_size=6454.55, online_replicas=2.69511

Full Analysis Output

doe analyze
=== Main Effects: serving_latency_ms ===
Factor                       Effect    Std Error   % Contribution
--------------------------------------------------------------
materialization_interval_m   12.1000   1.4381      25.0%
cache_ttl_s                  12.1000   1.4381      25.0%
batch_size                   12.1000   1.4381      25.0%
online_replicas              12.1000   1.4381      25.0%

=== ANOVA Table: serving_latency_ms ===
Source                       DF   SS         MS        F   p-value
-----------------------------------------------------------------------------
materialization_interval_m   9    186.1210   20.6801
cache_ttl_s                  9    186.1210   20.6801
batch_size                   9    186.1210   20.6801
online_replicas              9    186.1210   20.6801
Error (Lenth PSE)            0    0.0000     0.0000
Total                        9    186.1210   20.6801

Note: Error estimated using Lenth's pseudo-standard-error (unreplicated design)

=== Summary Statistics: serving_latency_ms ===

materialization_interval_m:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
14.4806   1   7.3000    0.0000   7.3000    7.3000
2.05638   1   3.3000    0.0000   3.3000    3.3000
23.0759   1   1.2000    0.0000   1.2000    1.2000
26.1949   1   3.4000    0.0000   3.4000    3.4000
31.0427   1   13.2000   0.0000   13.2000   13.2000
37.6193   1   12.9000   0.0000   12.9000   12.9000
47.3515   1   1.4000    0.0000   1.4000    1.4000
49.5084   1   7.8000    0.0000   7.8000    7.8000
57.5698   1   6.1000    0.0000   6.1000    6.1000
8.63293   1   1.1000    0.0000   1.1000    1.1000

cache_ttl_s:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
104.795   1   7.8000    0.0000   7.8000    7.8000
138.348   1   12.9000   0.0000   12.9000   12.9000
158.475   1   1.2000    0.0000   1.2000    1.2000
187.818   1   1.1000    0.0000   1.1000    1.1000
226.288   1   7.3000    0.0000   7.3000    7.3000
265.043   1   6.1000    0.0000   6.1000    6.1000
292.057   1   13.2000   0.0000   13.2000   13.2000
37.0665   1   3.4000    0.0000   3.4000    3.4000
55.7061   1   1.4000    0.0000   1.4000    1.4000
73.0513   1   3.3000    0.0000   3.3000    3.3000

batch_size:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
1079.5    1   12.9000   0.0000   12.9000   12.9000
1376.18   1   1.1000    0.0000   1.1000    1.1000
2621.87   1   3.3000    0.0000   3.3000    3.3000
3692.19   1   6.1000    0.0000   6.1000    6.1000
4713.28   1   7.3000    0.0000   7.3000    7.3000
5078.94   1   1.4000    0.0000   1.4000    1.4000
6307.98   1   13.2000   0.0000   13.2000   13.2000
7401.28   1   3.4000    0.0000   3.4000    3.4000
8882.93   1   7.8000    0.0000   7.8000    7.8000
9898.7    1   1.2000    0.0000   1.2000    1.2000

online_replicas:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
1.31093   1   6.1000    0.0000   6.1000    6.1000
1.89122   1   13.2000   0.0000   13.2000   13.2000
2.46586   1   3.3000    0.0000   3.3000    3.3000
2.64077   1   1.1000    0.0000   1.1000    1.1000
3.02501   1   1.4000    0.0000   1.4000    1.4000
3.89754   1   7.8000    0.0000   7.8000    7.8000
4.30846   1   12.9000   0.0000   12.9000   12.9000
4.68266   1   3.4000    0.0000   3.4000    3.4000
5.20503   1   1.2000    0.0000   1.2000    1.2000
5.59373   1   7.3000    0.0000   7.3000    7.3000

=== Main Effects: freshness_lag_min ===
Factor                       Effect    Std Error   % Contribution
--------------------------------------------------------------
materialization_interval_m   10.9000   1.2626      25.0%
cache_ttl_s                  10.9000   1.2626      25.0%
batch_size                   10.9000   1.2626      25.0%
online_replicas              10.9000   1.2626      25.0%

=== ANOVA Table: freshness_lag_min ===
Source                       DF   SS         MS        F   p-value
-----------------------------------------------------------------------------
materialization_interval_m   9    143.4810   15.9423
cache_ttl_s                  9    143.4810   15.9423
batch_size                   9    143.4810   15.9423
online_replicas              9    143.4810   15.9423
Error (Lenth PSE)            0    0.0000     0.0000
Total                        9    143.4810   15.9423

Note: Error estimated using Lenth's pseudo-standard-error (unreplicated design)

=== Summary Statistics: freshness_lag_min ===

materialization_interval_m:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
14.4806   1   1.1000    0.0000   1.1000    1.1000
2.05638   1   -0.2000   0.0000   -0.2000   -0.2000
23.0759   1   2.6000    0.0000   2.6000    2.6000
26.1949   1   1.3000    0.0000   1.3000    1.3000
31.0427   1   4.2000    0.0000   4.2000    4.2000
37.6193   1   9.5000    0.0000   9.5000    9.5000
47.3515   1   2.2000    0.0000   2.2000    2.2000
49.5084   1   10.7000   0.0000   10.7000   10.7000
57.5698   1   8.7000    0.0000   8.7000    8.7000
8.63293   1   1.2000    0.0000   1.2000    1.2000

cache_ttl_s:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
104.795   1   10.7000   0.0000   10.7000   10.7000
138.348   1   9.5000    0.0000   9.5000    9.5000
158.475   1   2.6000    0.0000   2.6000    2.6000
187.818   1   1.2000    0.0000   1.2000    1.2000
226.288   1   1.1000    0.0000   1.1000    1.1000
265.043   1   8.7000    0.0000   8.7000    8.7000
292.057   1   4.2000    0.0000   4.2000    4.2000
37.0665   1   1.3000    0.0000   1.3000    1.3000
55.7061   1   2.2000    0.0000   2.2000    2.2000
73.0513   1   -0.2000   0.0000   -0.2000   -0.2000

batch_size:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
1079.5    1   9.5000    0.0000   9.5000    9.5000
1376.18   1   1.2000    0.0000   1.2000    1.2000
2621.87   1   -0.2000   0.0000   -0.2000   -0.2000
3692.19   1   8.7000    0.0000   8.7000    8.7000
4713.28   1   1.1000    0.0000   1.1000    1.1000
5078.94   1   2.2000    0.0000   2.2000    2.2000
6307.98   1   4.2000    0.0000   4.2000    4.2000
7401.28   1   1.3000    0.0000   1.3000    1.3000
8882.93   1   10.7000   0.0000   10.7000   10.7000
9898.7    1   2.6000    0.0000   2.6000    2.6000

online_replicas:
Level     N   Mean      Std      Min       Max
------------------------------------------------------------
1.31093   1   8.7000    0.0000   8.7000    8.7000
1.89122   1   4.2000    0.0000   4.2000    4.2000
2.46586   1   -0.2000   0.0000   -0.2000   -0.2000
2.64077   1   1.2000    0.0000   1.2000    1.2000
3.02501   1   2.2000    0.0000   2.2000    2.2000
3.89754   1   10.7000   0.0000   10.7000   10.7000
4.30846   1   9.5000    0.0000   9.5000    9.5000
4.68266   1   1.3000    0.0000   1.3000    1.3000
5.20503   1   2.6000    0.0000   2.6000    2.6000
5.59373   1   1.1000    0.0000   1.1000    1.1000

Optimization Recommendations

doe optimize
=== Optimization: serving_latency_ms ===
Direction: minimize

Best observed run: #6
  materialization_interval_m = 39.8455
  cache_ttl_s = 221.076
  batch_size = 7226.38
  online_replicas = 5.69694
  Value: 1.1

RSM Model (linear, R² = 0.5680, Adj R² = 0.2224):
Coefficients:
  intercept                    +5.7965
  materialization_interval_m   +1.2850
  cache_ttl_s                  -2.2339
  batch_size                   +1.1635
  online_replicas              -6.1432

Predicted optimum (from linear model, at observed points):
  materialization_interval_m = 33.4834
  cache_ttl_s = 96.7479
  batch_size = 5949.18
  online_replicas = 1.32179
  Predicted value: 12.3878

Surface optimum (via L-BFGS-B, linear model):
  materialization_interval_m = 1
  cache_ttl_s = 300
  batch_size = 100
  online_replicas = 6
  Predicted value: -5.0291

Model quality: Moderate fit — use predictions directionally, not precisely.

Factor importance:
  1. materialization_interval_m (effect: 12.1, contribution: 25.0%)
  2. cache_ttl_s (effect: 12.1, contribution: 25.0%)
  3. batch_size (effect: 12.1, contribution: 25.0%)
  4. online_replicas (effect: 12.1, contribution: 25.0%)

=== Optimization: freshness_lag_min ===
Direction: minimize

Best observed run: #1
  materialization_interval_m = 11.1795
  cache_ttl_s = 210.907
  batch_size = 4240.79
  online_replicas = 3.83861
  Value: -0.2

RSM Model (linear, R² = 0.2319, Adj R² = -0.3826):
Coefficients:
  intercept                    +4.1202
  materialization_interval_m   -0.4132
  cache_ttl_s                  -1.2667
  batch_size                   -2.3067
  online_replicas              -2.1018

Predicted optimum (from linear model, at observed points):
  materialization_interval_m = 16.6117
  cache_ttl_s = 263.863
  batch_size = 944.75
  online_replicas = 1.65859
  Predicted value: 6.8249

Surface optimum (via L-BFGS-B, linear model):
  materialization_interval_m = 60
  cache_ttl_s = 300
  batch_size = 10000
  online_replicas = 6
  Predicted value: -1.9681

Model quality: Weak fit — consider adding center points or using a different design.

Factor importance:
  1. materialization_interval_m (effect: 10.9, contribution: 25.0%)
  2. cache_ttl_s (effect: 10.9, contribution: 25.0%)
  3. batch_size (effect: 10.9, contribution: 25.0%)
  4. online_replicas (effect: 10.9, contribution: 25.0%)
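The surface optimum of a linear model always sits at a corner of the factor box, with each factor pushed to the bound that its coefficient's sign favors. The sketch below reproduces the reported serving_latency_ms surface optimum from the coefficients above, under the assumption that the coefficients are expressed in coded units (each factor scaled to -1 at its low bound and +1 at its high bound); that assumption is consistent with the reported predicted value but is not confirmed by the output. The adjusted R² helper uses the standard formula with 10 runs and 4 model terms. Function names are illustrative.

```python
# Coefficients reported for serving_latency_ms in the optimize output.
INTERCEPT = 5.7965
COEFFS = {
    "materialization_interval_m": 1.2850,
    "cache_ttl_s": -2.2339,
    "batch_size": 1.1635,
    "online_replicas": -6.1432,
}
BOUNDS = {
    "materialization_interval_m": (1, 60),
    "cache_ttl_s": (10, 300),
    "batch_size": (100, 10000),
    "online_replicas": (1, 6),
}


def corner_minimum(intercept, coeffs, bounds):
    """Minimize a linear model over a box, assuming coded units in [-1, +1]."""
    settings = {}
    predicted = intercept
    for name, beta in coeffs.items():
        low, high = bounds[name]
        # A positive slope is smallest at the low bound, a negative one at the high bound.
        coded = -1.0 if beta > 0 else 1.0
        settings[name] = low if coded < 0 else high
        predicted += beta * coded
    return settings, predicted


def adjusted_r2(r2, n_runs, n_terms):
    """Standard adjusted R² for a model with n_terms predictors plus intercept."""
    return 1.0 - (1.0 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


settings, predicted = corner_minimum(INTERCEPT, COEFFS, BOUNDS)
print(settings)
print(round(predicted, 4))                   # -5.0291
print(round(adjusted_r2(0.5680, 10, 4), 4))  # 0.2224
```

A predicted latency of -5.0291 ms is not physically meaningful. It is an extrapolation to a corner of the design space by a weakly fitted linear model, which is why the tool's own note says to use the predictions directionally rather than precisely.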