← All Use Cases
Central Composite Design

Kubernetes Pod Autoscaling

Central Composite design to optimize HPA target CPU, scale-up window, and replica bounds for request latency and cost

Summary

This experiment investigates kubernetes pod autoscaling. Central Composite design to optimize HPA target CPU, scale-up window, and replica bounds for request latency and cost.

The design varies 3 factors: target cpu pct (%), ranging from 40 to 80, scaleup window (s), ranging from 15 to 120, and max replicas (pods), ranging from 5 to 30. The goal is to optimize 2 responses: p99 latency ms (ms) (minimize) and hourly cost (USD) (minimize). Fixed conditions held constant across all runs include min replicas = 2, namespace = production.

A Central Composite Design (CCD) was selected to fit a full quadratic response surface model, including curvature and interaction effects. With 3 factors this produces 22 runs including center points and axial (star) points that extend beyond the factorial range.

Quadratic response surface models were fitted to capture potential curvature and factor interactions. The RSM contour plots below visualize how pairs of factors jointly affect each response.

Key Findings

For p99 latency ms, the most influential factors were max replicas (66.8%), scaleup window (20.4%), target cpu pct (12.8%). The best observed value was 85.0 (at target cpu pct = 40, scaleup window = 120, max replicas = 30).

For hourly cost, the most influential factors were max replicas (59.0%), target cpu pct (30.4%), scaleup window (10.6%). The best observed value was 0.73 (at target cpu pct = 60, scaleup window = 67.5, max replicas = 17.5).

Recommended Next Steps

Experimental Setup

Factors

FactorLowHighUnit
target_cpu_pct4080%
scaleup_window15120s
max_replicas530pods

Fixed: min_replicas = 2, namespace = production

Responses

ResponseDirectionUnit
p99_latency_ms↓ minimizems
hourly_cost↓ minimizeUSD

Configuration

use_cases/27_kubernetes_pod_autoscaling/config.json
{ "metadata": { "name": "Kubernetes Pod Autoscaling", "description": "Central Composite design to optimize HPA target CPU, scale-up window, and replica bounds for request latency and cost" }, "factors": [ { "name": "target_cpu_pct", "levels": [ "40", "80" ], "type": "continuous", "unit": "%" }, { "name": "scaleup_window", "levels": [ "15", "120" ], "type": "continuous", "unit": "s" }, { "name": "max_replicas", "levels": [ "5", "30" ], "type": "continuous", "unit": "pods" } ], "fixed_factors": { "min_replicas": "2", "namespace": "production" }, "responses": [ { "name": "p99_latency_ms", "optimize": "minimize", "unit": "ms" }, { "name": "hourly_cost", "optimize": "minimize", "unit": "USD" } ], "settings": { "operation": "central_composite", "test_script": "use_cases/27_kubernetes_pod_autoscaling/sim.sh" } }

Experimental Matrix

The Central Composite Design produces 22 runs. Each row is one experiment with specific factor settings.

Runtarget_cpu_pctscaleup_windowmax_replicas
16067.517.5
2801530
3401205
460163.35117.5
56067.517.5
623.485267.517.5
76067.5-5.32177
86067.517.5
9801205
1096.514867.517.5
116067.517.5
1260-28.351417.5
136067.517.5
14401530
156067.517.5
1680155
176067.540.3218
188012030
196067.517.5
2040155
214012030
226067.517.5

Step-by-Step Workflow

1

Preview the design

Terminal
$ doe info --config use_cases/27_kubernetes_pod_autoscaling/config.json
2

Generate the runner script

Terminal
$ doe generate --config use_cases/27_kubernetes_pod_autoscaling/config.json \ --output use_cases/27_kubernetes_pod_autoscaling/results/run.sh --seed 42
3

Execute the experiments

Terminal
$ bash use_cases/27_kubernetes_pod_autoscaling/results/run.sh
4

Analyze results

Terminal
$ doe analyze --config use_cases/27_kubernetes_pod_autoscaling/config.json
5

Get optimization recommendations

Terminal
$ doe optimize --config use_cases/27_kubernetes_pod_autoscaling/config.json
6

Multi-objective optimization

With 2 competing responses, use --multi to find the best compromise via Derringer–Suich desirability.

Terminal
$ doe optimize --config use_cases/27_kubernetes_pod_autoscaling/config.json --multi
7

Generate the HTML report

Terminal
$ doe report --config use_cases/27_kubernetes_pod_autoscaling/config.json \ --output use_cases/27_kubernetes_pod_autoscaling/results/report.html

Features Exercised

FeatureValue
Design typecentral_composite
Factor typescontinuous (all 3)
Arg styledouble-dash
Responses2 (p99_latency_ms ↓, hourly_cost ↓)
Total runs22

Analysis Results

Generated from actual experiment runs using the DOE Helper Tool.

Response: p99_latency_ms

Top factors: max_replicas (66.8%), scaleup_window (20.4%), target_cpu_pct (12.8%).

ANOVA

SourceDFSSMSFp-value
SourceDFSSMSFp-value
target_cpu_pct4467.0540116.76350.1210.9715
scaleup_window4920.1582230.03950.2380.9098
max_replicas48368.56402092.14102.1660.1543
LackofFit24007.53322003.7666
PureError76761.5687
Error910769.1020965.9384
Total2120524.8782977.3752

Pareto Chart

Pareto chart for p99_latency_ms

Main Effects Plot

Main effects plot for p99_latency_ms

Normal Probability Plot of Effects

Normal probability plot for p99_latency_ms

Half-Normal Plot of Effects

Half-normal plot for p99_latency_ms

Model Diagnostics

Model diagnostics for p99_latency_ms

Response: hourly_cost

Top factors: max_replicas (59.0%), target_cpu_pct (30.4%), scaleup_window (10.6%).

ANOVA

SourceDFSSMSFp-value
SourceDFSSMSFp-value
target_cpu_pct44.30341.07590.2330.9129
scaleup_window42.36530.59130.1280.9684
max_replicas449.305412.32642.6690.1019
LackofFit226.940113.4700
PureError732.3289
Error959.26904.6184
Total21115.24315.4878

Pareto Chart

Pareto chart for hourly_cost

Main Effects Plot

Main effects plot for hourly_cost

Normal Probability Plot of Effects

Normal probability plot for hourly_cost

Half-Normal Plot of Effects

Half-normal plot for hourly_cost

Model Diagnostics

Model diagnostics for hourly_cost

Response Surface Plots

3D surfaces fitted with quadratic RSM. Red dots are observed data points.

hourly cost scaleup window vs max replicas

RSM surface: hourly cost scaleup window vs max replicas

hourly cost target cpu pct vs max replicas

RSM surface: hourly cost target cpu pct vs max replicas

hourly cost target cpu pct vs scaleup window

RSM surface: hourly cost target cpu pct vs scaleup window

p99 latency ms scaleup window vs max replicas

RSM surface: p99 latency ms scaleup window vs max replicas

p99 latency ms target cpu pct vs max replicas

RSM surface: p99 latency ms target cpu pct vs max replicas

p99 latency ms target cpu pct vs scaleup window

RSM surface: p99 latency ms target cpu pct vs scaleup window

Multi-Objective Optimization

When responses compete, Derringer–Suich desirability finds the best compromise. Each response is scaled to a 0–1 desirability, then combined via a weighted geometric mean.

Overall Desirability
D = 0.6948

Per-Response Desirability

ResponseWeightDesirabilityPredictedDir
p99_latency_ms 1.5
0.7063
117.80 0.7063 117.80 ms
hourly_cost 1.0
0.6780
3.51 0.6780 3.51 USD

Recommended Settings

FactorValue
target_cpu_pct40 %
scaleup_window15 s
max_replicas5 pods

Source: from observed run #16

Trade-off Summary

Sacrifice = how much worse than single-objective best.

ResponsePredictedBest ObservedSacrifice
hourly_cost3.510.73+2.78

Top 3 Runs by Desirability

RunDFactor Settings
#120.6529target_cpu_pct=60, scaleup_window=67.5, max_replicas=40.3218
#190.6490target_cpu_pct=96.5148, scaleup_window=67.5, max_replicas=17.5

Model Quality

ResponseType
hourly_cost0.1883linear

Full Multi-Objective Output

doe optimize --multi
============================================================ MULTI-OBJECTIVE OPTIMIZATION Method: Derringer-Suich Desirability Function ============================================================ Overall desirability: D = 0.6948 Response Weight Desirability Predicted Direction --------------------------------------------------------------------- p99_latency_ms 1.5 0.7063 117.80 ms ↓ hourly_cost 1.0 0.6780 3.51 USD ↓ Recommended settings: target_cpu_pct = 40 % scaleup_window = 15 s max_replicas = 5 pods (from observed run #16) Trade-off summary: p99_latency_ms: 117.80 (best observed: 85.00, sacrifice: +32.80) hourly_cost: 3.51 (best observed: 0.73, sacrifice: +2.78) Model quality: p99_latency_ms: R² = 0.2777 (linear) hourly_cost: R² = 0.1883 (linear) Top 3 observed runs by overall desirability: 1. Run #16 (D=0.6948): target_cpu_pct=40, scaleup_window=15, max_replicas=5 2. Run #12 (D=0.6529): target_cpu_pct=60, scaleup_window=67.5, max_replicas=40.3218 3. Run #19 (D=0.6490): target_cpu_pct=96.5148, scaleup_window=67.5, max_replicas=17.5

Full Analysis Output

doe analyze
=== Main Effects: p99_latency_ms === Factor Effect Std Error % Contribution -------------------------------------------------------------- max_replicas 92.1000 6.6653 66.8% scaleup_window 28.1500 6.6653 20.4% target_cpu_pct 17.6000 6.6653 12.8% === ANOVA Table: p99_latency_ms === Source DF SS MS F p-value ----------------------------------------------------------------------------- target_cpu_pct 4 467.0540 116.7635 0.121 0.9715 scaleup_window 4 920.1582 230.0395 0.238 0.9098 max_replicas 4 8368.5640 2092.1410 2.166 0.1543 Lack of Fit 2 4007.5332 2003.7666 2.074 0.1961 Pure Error 7 6761.5687 965.9384 Error 9 10769.1020 965.9384 Total 21 20524.8782 977.3752 === Summary Statistics: p99_latency_ms === target_cpu_pct: Level N Mean Std Min Max ------------------------------------------------------------ 23.4852 1 117.5000 0.0000 117.5000 117.5000 40 4 135.1000 33.9439 94.1000 166.9000 60 12 131.3167 32.9425 100.6000 205.1000 80 4 134.3250 39.4290 85.0000 166.4000 96.5148 1 117.8000 0.0000 117.8000 117.8000 scaleup_window: Level N Mean Std Min Max ------------------------------------------------------------ -28.3514 1 119.5000 0.0000 119.5000 119.5000 120 4 141.4500 24.5678 120.0000 166.4000 15 4 127.9750 44.5265 85.0000 166.9000 163.351 1 113.3000 0.0000 113.3000 113.3000 67.5 12 131.5250 32.8165 100.6000 205.1000 max_replicas: Level N Mean Std Min Max ------------------------------------------------------------ -5.32177 1 100.6000 0.0000 100.6000 100.6000 17.5 12 126.4833 25.7949 106.1000 205.1000 30 4 114.4500 33.0166 85.0000 158.7000 40.3218 1 192.7000 0.0000 192.7000 192.7000 5 4 154.9750 22.8536 120.7000 166.9000 === Main Effects: hourly_cost === Factor Effect Std Error % Contribution -------------------------------------------------------------- max_replicas 4.8400 0.4994 59.0% target_cpu_pct 2.4900 0.4994 30.4% scaleup_window 0.8683 0.4994 10.6% === ANOVA Table: hourly_cost === Source DF SS MS F p-value ----------------------------------------------------------------------------- target_cpu_pct 4 4.3034 1.0759 0.233 0.9129 scaleup_window 4 2.3653 0.5913 0.128 0.9684 max_replicas 4 49.3054 12.3264 2.669 0.1019 Lack of Fit 2 26.9401 13.4700 2.917 0.1199 Pure Error 7 32.3289 4.6184 Error 9 59.2690 4.6184 Total 21 115.2431 5.4878 === Summary Statistics: hourly_cost === target_cpu_pct: Level N Mean Std Min Max ------------------------------------------------------------ 23.4852 1 6.0000 0.0000 6.0000 6.0000 40 4 5.0075 3.7598 0.7300 9.8700 60 12 4.4550 1.9446 1.0300 8.7300 80 4 4.8825 2.9963 1.7600 8.9500 96.5148 1 3.5100 0.0000 3.5100 3.5100 scaleup_window: Level N Mean Std Min Max ------------------------------------------------------------ -28.3514 1 4.5700 0.0000 4.5700 4.5700 120 4 4.5625 0.4403 4.1000 5.1000 15 4 5.3275 4.7477 0.7300 9.8700 163.351 1 4.8900 0.0000 4.8900 4.8900 67.5 12 4.4592 2.0153 1.0300 8.7300 max_replicas: Level N Mean Std Min Max ------------------------------------------------------------ -5.32177 1 6.5700 0.0000 6.5700 6.5700 17.5 12 4.5067 1.8044 1.0300 8.7300 30 4 7.1600 2.6297 4.7200 9.8700 40.3218 1 2.3200 0.0000 2.3200 2.3200 5 4 2.7300 1.7680 0.7300 4.3300

Optimization Recommendations

doe optimize
=== Optimization: p99_latency_ms === Direction: minimize Best observed run: #2 target_cpu_pct = 40 scaleup_window = 120 max_replicas = 30 Value: 85.0 RSM Model (linear, R² = 0.0275, Adj R² = -0.1346): Coefficients: intercept +131.3091 target_cpu_pct +4.4455 scaleup_window +1.8609 max_replicas -3.9050 RSM Model (quadratic, R² = 0.3070, Adj R² = -0.2127): Coefficients: intercept +142.8854 target_cpu_pct +4.4455 scaleup_window +1.8609 max_replicas -3.9050 target_cpu_pct*scaleup_window -5.9750 target_cpu_pct*max_replicas +14.0500 scaleup_window*max_replicas -12.5750 target_cpu_pct^2 -4.6982 scaleup_window^2 -10.7582 max_replicas^2 -1.9081 Curvature analysis: scaleup_window coef=-10.7582 concave (has a maximum) target_cpu_pct coef=-4.6982 concave (has a maximum) max_replicas coef=-1.9081 concave (has a maximum) Notable interactions: target_cpu_pct*max_replicas coef=+14.0500 (synergistic) scaleup_window*max_replicas coef=-12.5750 (antagonistic) target_cpu_pct*scaleup_window coef=-5.9750 (antagonistic) Predicted optimum (from linear model, at observed points): target_cpu_pct = 80 scaleup_window = 120 max_replicas = 5 Predicted value: 141.5206 Surface optimum (via L-BFGS-B, linear model): target_cpu_pct = 40 scaleup_window = 15 max_replicas = 30 Predicted value: 121.0976 Model quality: Weak fit — consider adding center points or using a different design. Factor importance: 1. target_cpu_pct (effect: 72.3, contribution: 44.2%) 2. max_replicas (effect: 51.6, contribution: 31.5%) 3. scaleup_window (effect: 39.8, contribution: 24.3%) === Optimization: hourly_cost === Direction: minimize Best observed run: #7 target_cpu_pct = 60 scaleup_window = 67.5 max_replicas = 17.5 Value: 0.73 RSM Model (linear, R² = 0.0949, Adj R² = -0.0560): Coefficients: intercept +4.6605 target_cpu_pct -0.4851 scaleup_window +0.0251 max_replicas +0.7139 RSM Model (quadratic, R² = 0.4391, Adj R² = 0.0184): Coefficients: intercept +3.2023 target_cpu_pct -0.4851 scaleup_window +0.0251 max_replicas +0.7139 target_cpu_pct*scaleup_window -0.5575 target_cpu_pct*max_replicas -0.2175 scaleup_window*max_replicas +0.7600 target_cpu_pct^2 +1.1026 scaleup_window^2 +0.6586 max_replicas^2 +0.4261 Curvature analysis: target_cpu_pct coef=+1.1026 convex (has a minimum) scaleup_window coef=+0.6586 convex (has a minimum) max_replicas coef=+0.4261 convex (has a minimum) Notable interactions: scaleup_window*max_replicas coef=+0.7600 (synergistic) target_cpu_pct*scaleup_window coef=-0.5575 (antagonistic) Predicted optimum (from quadratic model, at observed points): target_cpu_pct = 40 scaleup_window = 120 max_replicas = 30 Predicted value: 8.1487 Surface optimum (via L-BFGS-B, quadratic model): target_cpu_pct = 65.8767 scaleup_window = 103.319 max_replicas = 5 Predicted value: 2.6244 Model quality: Weak fit — consider adding center points or using a different design. Factor importance: 1. target_cpu_pct (effect: 6.0, contribution: 55.9%) 2. scaleup_window (effect: 2.5, contribution: 23.0%) 3. max_replicas (effect: 2.3, contribution: 21.2%)
← All Use Cases Next: Microservice Circuit Breaker →