Central Composite

Job Scheduler Packing

Find the optimal node count, tasks-per-node, and memory allocation for Slurm job throughput.

Summary

This experiment optimizes HPC job scheduler packing parameters to maximize throughput and resource efficiency.

The design varies 3 factors: nodes (count, 4 to 64), tasks per node (count, 8 to 48), and mem per task (GB, 1 to 8). The goal is to maximize 2 responses: throughput (jobs/h) and efficiency (%).

A Central Composite Design (CCD) was selected to fit a full quadratic response surface model, including curvature and interaction effects. With 3 factors this produces 22 runs: 8 factorial corners, 6 axial (star) points that extend beyond the factorial range, and 8 replicated center points.
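The construction can be sketched in a few lines of Python. This is an illustrative generator, not the tool's actual implementation; the axial distance alpha = 1.82574 is an assumption inferred from the star points in the experimental matrix (34 - 1.82574 × 30 reproduces the nodes axial point -20.7723).

```python
from itertools import product

def ccd(factors, alpha=1.82574, n_center=8):
    """Sketch of a central composite design generator.

    factors: {name: (low, high)} bounds.
    alpha: axial distance in coded units. 1.82574 is an assumption
    inferred from the star points in the matrix; the real tool may differ.
    """
    names = list(factors)
    center = {n: (lo + hi) / 2 for n, (lo, hi) in factors.items()}
    half = {n: (hi - lo) / 2 for n, (lo, hi) in factors.items()}

    runs = []
    # 2^k factorial corners at coded levels -1 / +1
    for signs in product((-1, 1), repeat=len(names)):
        runs.append({n: center[n] + s * half[n] for n, s in zip(names, signs)})
    # 2k axial (star) points: one factor at +/- alpha, the rest at center
    for n in names:
        for s in (-alpha, alpha):
            run = dict(center)
            run[n] = center[n] + s * half[n]
            runs.append(run)
    # replicated center points, used to estimate pure error
    runs.extend(dict(center) for _ in range(n_center))
    return runs

design = ccd({"nodes": (4, 64), "tasks_per_node": (8, 48), "mem_per_task": (1, 8)})
print(len(design))  # 8 + 6 + 8 = 22
```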

Quadratic response surface models were fitted to capture potential curvature and factor interactions. The RSM contour plots below visualize how pairs of factors jointly affect each response.
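As a sketch of what "fitting a quadratic response surface" means here, the model y = b0 + Σ bᵢxᵢ + Σ bᵢⱼxᵢxⱼ + Σ bᵢᵢxᵢ² can be fitted by ordinary least squares with numpy. The tool's actual estimator is not shown in this report, so this is an assumed, minimal version:

```python
import numpy as np

def quad_features(X):
    """Expand columns [x1..xk] into full quadratic model terms:
    intercept, linear, pairwise interactions, pure quadratics."""
    n, k = X.shape
    cols = [np.ones(n)]
    cols += [X[:, i] for i in range(k)]
    cols += [X[:, i] * X[:, j] for i in range(k) for j in range(i + 1, k)]
    cols += [X[:, i] ** 2 for i in range(k)]
    return np.column_stack(cols)

def fit_rsm(X, y):
    """Ordinary least squares fit of the quadratic surface; returns
    the coefficient vector and the model R^2."""
    A = quad_features(X)
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    r2 = 1 - float(resid @ resid) / float(((y - y.mean()) ** 2).sum())
    return beta, r2

# noise-free demo: the fit recovers a known quadratic almost exactly
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(30, 3))
y = 2 + X[:, 0] - 3 * X[:, 1] + 0.5 * X[:, 0] * X[:, 1] + X[:, 2] ** 2
beta, r2 = fit_rsm(X, y)
print(round(r2, 3))  # 1.0
```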

Key Findings

For throughput, the most influential factors were tasks per node (34.4%), mem per task (34.0%), and nodes (31.7%). The best observed value was 261.62 jobs/h (at nodes = 64, tasks per node = 48, mem per task = 1).

For efficiency, the most influential factors were tasks per node (40.1%), nodes (37.1%), and mem per task (22.8%). The best observed value was 86.93% (at nodes = 34, tasks per node = 28, mem per task = 4.5).


Experimental Setup

Factors

Factor          Levels  Type        Unit
nodes           4, 64   continuous  count
tasks_per_node  8, 48   continuous  count
mem_per_task    1, 8    continuous  GB

Fixed: none

Responses

Response    Direction   Unit
throughput  ↑ maximize  jobs/h
efficiency  ↑ maximize  %

Experimental Matrix

The Central Composite Design produces 22 runs. Each row is one experiment with specific factor settings.

Run  nodes     tasks_per_node  mem_per_task
1    34        28              4.5
2    64        8               8
3    4         48              1
4    34        64.5148         4.5
5    34        28              4.5
6    -20.7723  28              4.5
7    34        28              -1.8901
8    34        28              4.5
9    64        48              1
10   88.7723   28              4.5
11   34        28              4.5
12   34        -8.5148         4.5
13   34        28              4.5
14   4         8               8
15   34        28              4.5
16   64        8               1
17   34        28              10.8901
18   64        48              8
19   34        28              4.5
20   4         8               1
21   4         48              8
22   34        28              4.5

How to Run

terminal
$ doe info --config use_cases/12_job_scheduler_packing/config.json
$ doe generate --config use_cases/12_job_scheduler_packing/config.json --output results/run.sh --seed 42
$ bash results/run.sh
$ doe analyze --config use_cases/12_job_scheduler_packing/config.json
$ doe optimize --config use_cases/12_job_scheduler_packing/config.json
$ doe report --config use_cases/12_job_scheduler_packing/config.json --output report.html

Analysis Results

Generated from actual experiment runs.

Response: throughput

Pareto Chart

Pareto chart for throughput

Main Effects Plot

Main effects plot for throughput

Response: efficiency

Pareto Chart

Pareto chart for efficiency

Main Effects Plot

Main effects plot for efficiency

Response Surface Plots

3D surfaces fitted with quadratic RSM. Red dots are observed data points.
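For readers reproducing plots like these, a minimal sketch of how a fitted quadratic surface is evaluated over two factors in coded units, with the remaining factor held at its center. The coefficients below are placeholders for illustration, not the fitted values from this report:

```python
import numpy as np

# placeholder quadratic coefficients (b0, b1, b2, b12, b11, b22);
# NOT the fitted values from this report
b = dict(b0=80.0, b1=20.0, b2=12.0, b12=8.0, b11=-5.0, b22=-3.0)

def surface(x1, x2, b):
    """Predicted response over two coded factors, others held at center (0)."""
    return (b["b0"] + b["b1"] * x1 + b["b2"] * x2
            + b["b12"] * x1 * x2 + b["b11"] * x1 ** 2 + b["b22"] * x2 ** 2)

# 50x50 grid over the coded factorial region [-1, 1]
x1, x2 = np.meshgrid(np.linspace(-1, 1, 50), np.linspace(-1, 1, 50))
z = surface(x1, x2, b)
print(z.shape)  # (50, 50); pass x1, x2, z to e.g. matplotlib's plot_surface
```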


How to Read These Surfaces

Each plot shows predicted response (vertical axis) across two factors while other factors are held at center. Red dots are actual experimental observations.

  • Flat surface — these two factors have little effect on the response.
  • Tilted plane — strong linear effect; moving along one axis consistently changes the response.
  • Curved/domed surface — quadratic curvature; there is an optimum somewhere in the middle.
  • Saddle shape — significant interaction; the best setting of one factor depends on the other.
  • Red dots far from surface — poor model fit in that region; be cautious about predictions there.

throughput (jobs/h) — R² = 0.653, Adj R² = 0.393
Moderate fit — surface shows general trends but some noise remains.
Curvature detected in tasks_per_node, nodes — look for a peak or valley in the surface.
Strongest linear driver: nodes (decreases throughput).
Notable interaction: nodes × tasks_per_node — the effect of one depends on the level of the other. Look for a twisted surface.

efficiency (%) — R² = 0.785, Adj R² = 0.624
Moderate fit — surface shows general trends but some noise remains.
Curvature detected in tasks_per_node, mem_per_task — look for a peak or valley in the surface.
Strongest linear driver: nodes (increases efficiency).
Notable interaction: nodes × tasks_per_node — the effect of one depends on the level of the other. Look for a twisted surface.
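The Adj R² values quoted above can be checked with the standard adjustment formula, assuming n = 22 runs and p = 9 quadratic-model terms (3 linear, 3 interaction, 3 squared):

```python
def adjusted_r2(r2, n, p):
    """Penalize R^2 for model size: p predictors (excluding the
    intercept) fitted on n observations."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# throughput: R^2 = 0.653 over n = 22 runs, p = 9 quadratic terms
print(round(adjusted_r2(0.653, 22, 9), 3))  # 0.393
# efficiency: R^2 = 0.785
print(round(adjusted_r2(0.785, 22, 9), 3))  # 0.624
```

Both match the reported Adj R² values, which supports the n = 22, p = 9 assumption.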

efficiency: nodes vs mem per task

RSM surface: efficiency — nodes vs mem per task

efficiency: nodes vs tasks per node

RSM surface: efficiency — nodes vs tasks per node

efficiency: tasks per node vs mem per task

RSM surface: efficiency — tasks per node vs mem per task

throughput: nodes vs mem per task

RSM surface: throughput — nodes vs mem per task

throughput: nodes vs tasks per node

RSM surface: throughput — nodes vs tasks per node

throughput: tasks per node vs mem per task

RSM surface: throughput — tasks per node vs mem per task

Full Analysis Output

doe analyze
=== Main Effects: throughput ===
Factor          Effect     Std Error  % Contribution
--------------------------------------------------------------
tasks_per_node  254.1600   14.8504    55.2%
nodes           120.8200   14.8504    26.2%
mem_per_task     85.4775   14.8504    18.6%

=== Summary Statistics: throughput ===
nodes:
Level      N    Mean      Std      Min       Max
------------------------------------------------------------
-20.7723   1      5.0000   0.0000    5.0000    5.0000
34        12     85.3292  78.5758    7.4600  261.6200
4          4     91.3550  82.6748    7.7300  182.2200
64         4    109.5325  32.5750   60.6700  125.8200
88.7723    1    125.8200   0.0000  125.8200  125.8200

tasks_per_node:
Level      N    Mean      Std      Min       Max
------------------------------------------------------------
-8.51484   1      7.4600   0.0000    7.4600    7.4600
28        12     73.8075  58.2104    5.0000  137.1600
48         4    117.6150  60.1993   36.6000  182.2200
64.5148    1    261.6200   0.0000  261.6200  261.6200
8          4     83.2725  60.8794    7.7300  138.8700

mem_per_task:
Level      N    Mean      Std      Min       Max
------------------------------------------------------------
-1.8901    1     62.3800   0.0000   62.3800   62.3800
1          4     57.7050  50.3036    7.7300  125.8200
10.8901    1    125.8200   0.0000  125.8200  125.8200
4.5       12     80.5475  81.7799    5.0000  261.6200
8          4    143.1825  26.7422  125.8200  182.2200

=== Main Effects: efficiency ===
Factor          Effect    Std Error  % Contribution
--------------------------------------------------------------
mem_per_task    24.5550   2.6532     35.0%
tasks_per_node  23.0325   2.6532     32.9%
nodes           22.4800   2.6532     32.1%

=== Summary Statistics: efficiency ===
nodes:
Level      N    Mean     Std      Min      Max
------------------------------------------------------------
-20.7723   1    49.3900   0.0000  49.3900  49.3900
34        12    67.2883  12.6911  47.3400  86.9300
4          4    71.3800  14.7700  56.8400  84.8400
64         4    66.0150  11.7100  48.4500  71.8700
88.7723    1    71.8700   0.0000  71.8700  71.8700

tasks_per_node:
Level      N    Mean     Std      Min      Max
------------------------------------------------------------
-8.51484   1    67.0100   0.0000  67.0100  67.0100
28        12    67.8133  12.5615  47.3400  86.9300
48         4    70.9825  10.8711  56.8400  83.3500
64.5148    1    47.9500   0.0000  47.9500  47.9500
8          4    66.4125  15.5680  48.4500  84.8400

mem_per_task:
Level      N    Mean     Std      Min      Max
------------------------------------------------------------
-1.8901    1    47.3400   0.0000  47.3400  47.3400
1          4    65.5000  16.1277  48.4500  84.8400
10.8901    1    71.8700   0.0000  71.8700  71.8700
4.5       12    67.4592  12.4089  47.9500  86.9300
8          4    71.8950   9.3326  60.4900  83.3500

Optimization Recommendations

doe optimize
=== Optimization: throughput ===
Direction: maximize
Best observed run: #10
  nodes = 88.7723
  tasks_per_node = 28
  mem_per_task = 4.5
  Value: 261.62
RSM Model (linear, R² = 0.09):
  Coefficients:
    intercept: +89.0145
    nodes: +21.1838
    tasks_per_node: +12.8323
    mem_per_task: +3.8512
Predicted optimum:
  nodes = 88.7723
  tasks_per_node = 28
  mem_per_task = 4.5
  Predicted value: 127.6907
Factor importance:
  1. nodes (effect: 221.1, contribution: 55.2%)
  2. mem_per_task (effect: 93.3, contribution: 23.3%)
  3. tasks_per_node (effect: 86.2, contribution: 21.5%)

=== Optimization: efficiency ===
Direction: maximize
Best observed run: #3
  nodes = 4
  tasks_per_node = 48
  mem_per_task = 1
  Value: 86.93
RSM Model (linear, R² = 0.17):
  Coefficients:
    intercept: +67.1955
    nodes: -5.7499
    tasks_per_node: +1.4127
    mem_per_task: -1.6230
Predicted optimum:
  nodes = -20.7723
  tasks_per_node = 28
  mem_per_task = 4.5
  Predicted value: 77.6933
Factor importance:
  1. nodes (effect: 25.3, contribution: 41.0%)
  2. tasks_per_node (effect: 23.4, contribution: 37.9%)
  3. mem_per_task (effect: 13.1, contribution: 21.1%)

Multi-Objective Optimization

When responses compete, Derringer–Suich desirability finds the best compromise. Each response is scaled to a 0–1 desirability, then combined via a weighted geometric mean.
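The combination step can be sketched as follows. The per-response desirabilities and weights are the values reported in this section; the weighted geometric mean reproduces the reported overall D:

```python
import math

def overall_desirability(d, w):
    """Weighted geometric mean of per-response desirabilities:
    D = (prod d_i^w_i) ** (1 / sum(w))."""
    return math.prod(di ** wi for di, wi in zip(d, w)) ** (1 / sum(w))

# from this report: throughput d = 0.6733 (weight 1.5),
# efficiency d = 0.8723 (weight 1.0)
D = overall_desirability([0.6733, 0.8723], [1.5, 1.0])
print(round(D, 4))  # 0.7468
```

Because the geometric mean goes to zero if any single desirability is zero, a candidate that completely fails one response can never win, regardless of how well it does on the others.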

Overall Desirability
D = 0.7468

Per-Response Desirability

Response    Weight  Desirability  Predicted      Dir
throughput  1.5     0.6733        182.22 jobs/h  ↑
efficiency  1.0     0.8723        83.35 %        ↑

Recommended Settings

Factor          Value
nodes           4 count
tasks_per_node  48 count
mem_per_task    8 GB

Source: from observed run #7

Trade-off Summary

Sacrifice = how much worse the compromise setting performs than the single-objective best.

Response    Predicted  Best Observed  Sacrifice
throughput  182.22     261.62         +79.40
efficiency  83.35      86.93          +3.58

Top 3 Runs by Desirability

Run  D       Factor Settings
#7   0.7468  nodes=4, tasks_per_node=48, mem_per_task=8
#1   0.5235  nodes=64, tasks_per_node=8, mem_per_task=8
#5   0.5235  nodes=88.7723, tasks_per_node=28, mem_per_task=4.5

Model Quality

Response    R²      Type
throughput  0.1197  linear
efficiency  0.0177  linear

Full Multi-Objective Output

doe optimize --multi
============================================================
MULTI-OBJECTIVE OPTIMIZATION
Method: Derringer-Suich Desirability Function
============================================================
Overall desirability: D = 0.7468

Response     Weight  Desirability  Predicted      Direction
---------------------------------------------------------------------
throughput   1.5     0.6733        182.22 jobs/h  ↑
efficiency   1.0     0.8723        83.35 %        ↑

Recommended settings:
  nodes = 4 count
  tasks_per_node = 48 count
  mem_per_task = 8 GB
  (from observed run #7)

Trade-off summary:
  throughput: 182.22 (best observed: 261.62, sacrifice: +79.40)
  efficiency: 83.35 (best observed: 86.93, sacrifice: +3.58)

Model quality:
  throughput: R² = 0.1197 (linear)
  efficiency: R² = 0.0177 (linear)

Top 3 observed runs by overall desirability:
  1. Run #7 (D=0.7468): nodes=4, tasks_per_node=48, mem_per_task=8
  2. Run #1 (D=0.5235): nodes=64, tasks_per_node=8, mem_per_task=8
  3. Run #5 (D=0.5235): nodes=88.7723, tasks_per_node=28, mem_per_task=4.5