Everything you need to install, configure, run, and analyze experiments with the DOE Helper Tool.
Get up and running in under two minutes.
Python 3.10 or higher. Works on Linux, macOS, and Windows. Full-factorial designs use only the standard library — no external packages needed.
| Package | Version | Purpose |
|---|---|---|
| pyDOE3 | ≥1.0 | PB, fractional, LHS, CCD, Box-Behnken, Taguchi designs |
| numpy | ≥1.26 | Array operations, RSM, hat matrix, design evaluation |
| pandas | ≥2.0 | Data manipulation |
| matplotlib | ≥3.7 | Pareto, effects, diagnostic, normal/half-normal, RSM surface plots |
| scipy | ≥1.11 | ANOVA F-tests, confidence intervals, surface optimization, power analysis |
| Jinja2 | ≥3.1 | Runner script template rendering |
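The table above maps directly to a `requirements.txt` pinned at the minimum versions listed:

```text
pyDOE3>=1.0
numpy>=1.26
pandas>=2.0
matplotlib>=3.7
scipy>=1.11
Jinja2>=3.1
```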
Everything starts with a JSON config file. It defines your factors, responses, design type, and execution settings.
- **Categorical** — Discrete, unordered levels (e.g., "A", "B", "dark", "light"). Cannot be interpolated.
- **Continuous** — Numeric values that can be interpolated (e.g., temperature 150–200). Required for CCD and Box-Behnken star/center points.
- **Ordinal** — Ordered categorical levels (e.g., "low", "medium", "high"). Treated as categorical in most designs.
- `double-dash` (default): `--temperature 150 --pressure 2 --catalyst A`
- `env`: `TEMPERATURE=150 PRESSURE=2 CATALYST=A ./test.sh`
- `positional`: `./test.sh 150 2 A`
| Goal | Design | Why |
|---|---|---|
| Test everything | Full Factorial | All combinations, all interactions |
| Reduce runs (2-level) | Fractional Factorial | Half the runs, some aliasing |
| Screen many factors | Plackett-Burman | N+1 runs for N factors |
| Modern screening | Definitive Screening | 3-level, detects curvature, 2k+1 runs |
| Robust design (Taguchi) | Taguchi | Orthogonal arrays, S/N ratios |
| Continuous space filling | Latin Hypercube | Even coverage, configurable N |
| Find the optimum | Central Composite | Quadratic model, star points |
| Avoid corner points | Box-Behnken | Safe RSM, no extreme combos |
| Custom run count | D-Optimal | Algorithmic design, max information per run |
| Formulation/blending | Mixture (Simplex) | Components that sum to 1 |
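Full factorial — the one design that needs no external packages — is simply the Cartesian product of the factor level lists. A minimal stdlib sketch (factor names and levels are illustrative, not from a real config):

```python
from itertools import product

# Hypothetical factor definitions; level values are strings, as in the config.
factors = {
    "temperature": ["150", "175", "200"],
    "pressure": ["1", "2"],
    "catalyst": ["A", "B"],
}

# One run per combination: 3 * 2 * 2 = 12 runs.
runs = [dict(zip(factors, combo)) for combo in product(*factors.values())]
print(len(runs))  # 12
```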
The `factors` section is required: an array of factor objects, each with:
| Field | Required | Description |
|---|---|---|
| name | Yes | Unique factor name |
| levels | Yes | Array of at least 2 level values (strings) |
| type | No | categorical (default), continuous, or ordinal |
| unit | No | Unit of measurement (for display) |
| description | No | Human-readable description |
The `responses` section is optional (defaults to a single response named "response"). Each response has:
| Field | Required | Description |
|---|---|---|
| name | Yes | Must match keys in result JSON files |
| optimize | No | maximize (default) or minimize |
| unit | No | Unit of measurement |
| description | No | Human-readable description |
| weight | No | Relative importance for multi-objective optimization (default: 1.0) |
| bounds | No | [worst, best] for desirability scaling in --multi mode (auto-computed if omitted) |
The `settings` section supports these fields:

| Field | Default | Description |
|---|---|---|
| operation | full_factorial | Design type (11 supported — see table above) |
| test_script | — | Path to test script |
| block_count | 1 | Number of blocks (replicates) |
| out_directory | results | Directory for per-run JSON results |
| processed_directory | — | Directory for analysis outputs |
| lhs_samples | 0 (auto) | LHS sample count; 0 = max(10, 2×factors) |
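Putting the three sections together, a minimal config might look like this (factor names, levels, and paths are illustrative):

```json
{
  "factors": [
    { "name": "temperature", "levels": ["150", "175", "200"], "type": "continuous", "unit": "C" },
    { "name": "catalyst", "levels": ["A", "B"], "type": "categorical" }
  ],
  "responses": [
    { "name": "yield", "optimize": "maximize", "unit": "%" }
  ],
  "settings": {
    "operation": "full_factorial",
    "test_script": "./test.sh",
    "block_count": 1,
    "out_directory": "results"
  }
}
```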
Every command the tool supports, with all flags and annotated examples.
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --output | run_experiments.sh | Output script path |
| --format | sh | Script format: sh (Bash) or py (Python) |
| --seed | random | Seed for reproducible run order |
| --dry-run | off | Print design matrix without writing files |
Always use --dry-run first to preview your design matrix before committing to a full run. Use --seed for reproducible experiments.
| Flag | Description |
|---|---|
| --config | Path to JSON config file (required) |
| --results-dir | Override out_directory from config |
| --no-plots | Skip generating Pareto charts and effects plots (headless mode) |
| --csv | Export main effects and summary stats to CSV files |
| --partial | Analyze only completed runs (skip missing result files) |
| Flag | Description |
|---|---|
| --config | Path to JSON config file (required) |
| --results-dir | Override out_directory from config |
| --response | Optimize for a single response (default: all responses) |
| --partial | Use only completed runs for optimization |
| --multi | Multi-objective optimization using Derringer-Suich desirability functions |
| --steepest | Show steepest ascent/descent pathway for sequential experimentation |
The optimizer reports the best observed run, fits linear and quadratic RSM models, finds the true surface optimum using L-BFGS-B optimization with multi-start, and ranks factors by importance. Use --response to focus on a specific metric.
Use --steepest to generate a table of follow-up experiment points along the gradient direction (standard RSM Phase 1 methodology).
When your experiment has multiple responses that conflict (e.g., maximize yield AND minimize cost), use --multi to find the best compromise using Derringer-Suich desirability functions.
To prioritize certain responses, add weight and optional bounds to your config:
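For example (response names, weights, and bounds here are illustrative):

```json
"responses": [
  { "name": "yield", "optimize": "maximize", "weight": 3.0, "bounds": [60, 95] },
  { "name": "cost", "optimize": "minimize", "weight": 1.0 }
]
```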
weight (default: 1.0) controls relative importance — a weight of 3 means that response matters 3× more than a weight of 1. bounds (optional) define [worst, best] for desirability scaling; if omitted, bounds are auto-computed from observed data.
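Conceptually, Derringer-Suich scales each response to a desirability in [0, 1] between its worst and best bounds, then combines them with a weighted geometric mean. A simplified linear sketch of the idea — the tool's actual implementation may use more elaborate desirability shapes:

```python
import math

def desirability(y, worst, best):
    """Linear desirability: 0 at the worst bound, 1 at the best, clipped to [0, 1].
    Handles minimization too, since worst may be numerically larger than best."""
    d = (y - worst) / (best - worst)
    return min(1.0, max(0.0, d))

def overall_desirability(values, bounds, weights):
    """Weighted geometric mean of per-response desirabilities."""
    ds = [desirability(y, worst, best) for y, (worst, best) in zip(values, bounds)]
    log_sum = sum(w * math.log(max(d, 1e-12)) for d, w in zip(ds, weights))
    return math.exp(log_sum / sum(weights))

# Maximize yield (worst 60, best 95); minimize cost (worst 10, best 2).
score = overall_desirability([88.0, 4.0], [(60, 95), (10, 2)], [3.0, 1.0])
```

A weight of 3 enters the geometric mean as a larger exponent, so shortfalls in that response penalize the overall score more heavily.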
Generates a self-contained HTML file with:
Use --partial to generate reports from incomplete experiments. The report will note which runs are missing.
The HTML report has zero external dependencies. All plots are base64-encoded directly in the file. Share it via email, Slack, or drop it in a wiki — it just works.
Prints the experiment plan summary without writing any files: operation type, number of factors, base runs, total runs with blocking, response definitions, fixed factors, alias structure (for fractional factorials), and design evaluation metrics (D-efficiency, A-efficiency, G-efficiency). Use this to quickly verify your config before running anything.
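D-efficiency, one of the reported metrics, measures information per run via the determinant of the moment matrix. A sketch of one common definition, for a model matrix X with n runs and p model terms (the tool's exact formula may differ):

```python
import numpy as np

def d_efficiency(X):
    """D-efficiency (%) = 100 * |X'X|^(1/p) / n for an n-by-p model matrix."""
    n, p = X.shape
    return 100.0 * np.linalg.det(X.T @ X) ** (1.0 / p) / n

# A 2^2 full factorial with an intercept column is perfectly D-efficient.
X = np.array([
    [1, -1, -1],
    [1,  1, -1],
    [1, -1,  1],
    [1,  1,  1],
], dtype=float)
print(round(d_efficiency(X), 1))  # 100.0
```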
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --sigma | auto | Error standard deviation (estimated from results if omitted) |
| --delta | 2×sigma | Minimum detectable effect size |
| --alpha | 0.05 | Significance level |
| --results-dir | from config | Override out_directory from config |
Computes statistical power for detecting effects of a given size using the non-central F distribution. Power < 0.80 indicates you may need more runs or blocks to reliably detect the specified effect size.
Run this before your experiment to check if your design has enough runs. If power is low, consider adding blocks (replicates) or switching to a design with more runs. If results already exist, sigma is estimated automatically from residuals.
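The computation follows standard fixed-effects ANOVA power analysis: compare the critical F value at the chosen alpha against a non-central F distribution whose noncentrality grows with the effect size and run count. A simplified sketch for a single two-level factor — the noncentrality convention here is one common choice, not necessarily the tool's exact formula:

```python
from scipy.stats import f as f_dist, ncf

def anova_power(n_runs, delta, sigma, alpha=0.05, df_model=1):
    """Power to detect an effect of size delta with n_runs total observations."""
    df_error = n_runs - df_model - 1
    # Noncentrality for a two-level contrast of magnitude delta (one convention).
    lam = n_runs * (delta / (2.0 * sigma)) ** 2
    f_crit = f_dist.ppf(1.0 - alpha, df_model, df_error)
    return 1.0 - ncf.cdf(f_crit, df_model, df_error, lam)

# More runs -> more power for the same effect size.
print(anova_power(8, delta=2.0, sigma=1.0) < anova_power(16, delta=2.0, sigma=1.0))  # True
```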
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --type | required | Augmentation type: fold_over, star_points, or center_points |
| --output | run_experiments_augmented.sh | Output script path |
| --format | sh | Script format: sh or py |
Extends an existing design with additional runs without re-running completed experiments:
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --run | required | Run number to record, or all to enter all pending runs |
| --seed | 42 | Seed for consistent run ordering |
For real-world experiments without a test script: the tool displays each run's factor settings, prompts for response values, validates numeric input, and saves the result as run_N.json. If results already exist, shows current values and asks before overwriting.
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --seed | 42 | Seed for consistent run ordering |
Shows a progress bar, lists completed and pending runs, and displays full factor details for the next run to complete. Especially useful for long-running real-world experiments.
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --format | csv | Output format: csv or markdown |
| --output | stdout | Write to file instead of stdout |
| --seed | 42 | Seed for consistent run ordering |
Generates a worksheet with all runs, factor values, and empty columns for response measurements and notes. Pre-fills response values for any runs that already have results. Perfect for printing and taking to the lab or field.
Use --format markdown for documentation or wikis, and --format csv for importing into Excel or Google Sheets.
Your test script is the bridge between the DOE tool and your actual experiment. It must follow a simple protocol.
Your script must: (1) accept factor values via the configured arg_style, (2) accept --out <path> for the output file, and (3) write a JSON file with keys matching your response names.
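A minimal Python test script meeting this protocol, for the double-dash arg style (the factor names and the fake response computation are illustrative; a real script would run the actual experiment):

```python
#!/usr/bin/env python3
"""Example test script: accepts --<factor> value pairs plus --out, writes result JSON."""
import json
import sys

def parse_args(argv):
    """Collect --name value pairs into a dict."""
    vals, it = {}, iter(argv)
    for tok in it:
        if tok.startswith("--"):
            vals[tok[2:]] = next(it)
    return vals

def main(argv):
    args = parse_args(argv)
    out_path = args.pop("out")  # required: --out <path>
    # Run the real experiment here; this fake response is for illustration.
    response = float(args["temperature"]) * float(args["pressure"])
    # Keys must match the response names in the config.
    with open(out_path, "w") as fh:
        json.dump({"response": response}, fh)

if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1:])
```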
For real-world experiments (lab work, physical tests, field measurements), you don't need a test script at all. Leave test_script empty or omit it, and use the manual workflow instead:
The analysis pipeline doesn't care how results were produced. Whether you ran a simulation, measured something in a lab, or collected field data — as long as the response values end up in run_N.json files, everything works.
If your experiments span multiple days, sessions, or batches, use "block_count": 2 (or more) in your config. Each block is an independent replicate with its own randomized order. This lets you detect and account for day-to-day variation.
The --csv flag on the analyze command exports structured data files:
- `main_effects_{response}.csv` — Factor, effect magnitude, std error, % contribution, CI bounds
- `summary_stats_{response}.csv` — Per-factor, per-level statistics (N, mean, std, min, max)

Use --no-plots to skip matplotlib chart generation. This avoids display issues in SSH sessions and CI environments while still computing all statistical results.
The most efficient experimental strategy uses two (or three) stages:
Total: ~25 runs to fully optimize, compared to hundreds with grid search.
The --seed flag controls run-order randomization. The same seed always produces the same run order, making your experiments reproducible. The design matrix itself (which factor combinations are tested) is always deterministic — the seed only affects the order within each block.
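The idea can be sketched in a few lines — the design matrix is fixed, and the seed only drives a shuffle of run indices within each block (a hypothetical helper, not the tool's actual code):

```python
import random

def run_order(n_runs, seed, block_count=1):
    """Shuffle run indices within each block; the same seed gives the same order."""
    rng = random.Random(seed)
    order = []
    for block in range(block_count):
        block_runs = list(range(block * n_runs + 1, (block + 1) * n_runs + 1))
        rng.shuffle(block_runs)
        order.extend(block_runs)
    return order

print(run_order(4, seed=42) == run_order(4, seed=42))  # True
```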
End-to-end walkthroughs of real experimental workflows.
A Plackett-Burman design with 6 PostgreSQL configuration parameters, 2 blocks for replication, and CSV export for downstream analysis in R.
A Box-Behnken design with 3 continuous factors and 3 responses (yield, purity, cost). Demonstrates multi-response analysis, RSM, and the trade-offs inherent in multi-objective optimization.
A hands-on walkthrough for running physical experiments without a test script. Uses the record, status, and export-worksheet commands to manage a manual workflow.
Not sure how to set up your experiment? Use the AI Prompts page to generate experiment configurations with the help of an AI assistant like Claude or ChatGPT.
The AI Prompts page provides ready-to-use prompts that guide an AI assistant through the process of creating a DOE configuration file for your specific problem. Describe your experiment in plain English, and get a complete config.json, simulation script, and analysis workflow.
The prompts cover common scenarios: