Quick Start & Reference

Everything you need to install, configure, run, and analyze experiments with the DOE Helper Tool.

1 Installation & Setup

Get up and running in under two minutes.

System Requirements

Python 3.10 or higher. Works on Linux, macOS, and Windows. Full-factorial designs use only the standard library — no external packages needed.

Terminal
```bash
# Install from PyPI
$ pip install doehelper

# Verify installation
$ doe --version
```

Dependencies

| Package | Version | Purpose |
|---|---|---|
| pyDOE3 | ≥1.0 | PB, fractional, LHS, CCD, Box-Behnken, Taguchi designs |
| numpy | ≥1.26 | Array operations, RSM, hat matrix, design evaluation |
| pandas | ≥2.0 | Data manipulation |
| matplotlib | ≥3.7 | Pareto, effects, diagnostic, normal/half-normal, RSM surface plots |
| scipy | ≥1.11 | ANOVA F-tests, confidence intervals, surface optimization, power analysis |
| Jinja2 | ≥3.1 | Runner script template rendering |

2 Writing a Configuration File

Everything starts with a JSON config file. It defines your factors, responses, design type, and execution settings.

config.json
{ "metadata": { "name": "My Experiment", "description": "Testing 3 factors at 2 levels" }, "factors": [ {"name": "temperature", "levels": ["150", "200"], "type": "continuous", "unit": "°C"}, {"name": "pressure", "levels": ["2", "6"], "type": "continuous", "unit": "bar"}, {"name": "catalyst", "levels": ["A", "B"], "type": "categorical"} ], "fixed_factors": { "duration": "60" }, "responses": [ {"name": "yield", "optimize": "maximize", "unit": "%"}, {"name": "cost", "optimize": "minimize", "unit": "USD"} ], "runner": { "arg_style": "double-dash", "result_file": "json" }, "settings": { "operation": "full_factorial", "test_script": "test.sh", "out_directory": "results", "block_count": 1 } }
Factor Types Explained

Categorical — Discrete, unordered levels (e.g., "A", "B", "dark", "light"). Cannot be interpolated.

Continuous — Numeric values that can be interpolated (e.g., temperature 150–200). Required for CCD and Box-Behnken star/center points.

Ordinal — Ordered categorical levels (e.g., "low", "medium", "high"). Treated as categorical in most designs.
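The three types can appear side by side in one config. A short illustrative `factors` excerpt (the `agitation` factor is a hypothetical example, not part of the config shown above):

```json
"factors": [
  {"name": "catalyst", "levels": ["A", "B"], "type": "categorical"},
  {"name": "temperature", "levels": ["150", "175", "200"], "type": "continuous", "unit": "°C"},
  {"name": "agitation", "levels": ["low", "medium", "high"], "type": "ordinal"}
]
```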

Argument Styles

double-dash (default): --temperature 150 --pressure 2 --catalyst A

env: TEMPERATURE=150 PRESSURE=2 CATALYST=A ./test.sh

positional: ./test.sh 150 2 A
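To make the mapping concrete, here is a small Python sketch of how a hypothetical `build_command` helper could assemble each style (illustrative only — the generated runner script does this for you):

```python
# Sketch: how each arg_style maps factor settings onto a command line.
factors = {"temperature": "150", "pressure": "2", "catalyst": "A"}

def build_command(style, script="./test.sh", factors=factors):
    if style == "double-dash":
        args = [a for name, v in factors.items() for a in (f"--{name}", v)]
        return [script, *args]
    if style == "positional":
        return [script, *factors.values()]
    if style == "env":
        # values passed as environment variables instead of arguments
        env = {name.upper(): v for name, v in factors.items()}
        return env, [script]
    raise ValueError(f"unknown arg_style: {style}")

print(build_command("double-dash"))
# -> ['./test.sh', '--temperature', '150', '--pressure', '2', '--catalyst', 'A']
```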

Choosing a Design Type
| Goal | Design | Why |
|---|---|---|
| Test everything | Full Factorial | All combinations, all interactions |
| Reduce runs (2-level) | Fractional Factorial | Half the runs, some aliasing |
| Screen many factors | Plackett-Burman | N+1 runs for N factors |
| Modern screening | Definitive Screening | 3-level, detects curvature, 2k+1 runs |
| Robust design | Taguchi | Orthogonal arrays, S/N ratios |
| Continuous space filling | Latin Hypercube | Even coverage, configurable N |
| Find the optimum | Central Composite | Quadratic model, star points |
| Avoid corner points | Box-Behnken | Safe RSM, no extreme combos |
| Custom run count | D-Optimal | Algorithmic design, max information per run |
| Formulation/blending | Mixture (Simplex) | Components that sum to 1 |

The factors Section

Required. Array of factor objects, each with:

| Field | Required | Description |
|---|---|---|
| name | Yes | Unique factor name |
| levels | Yes | Array of at least 2 level values (strings) |
| type | No | categorical (default), continuous, or ordinal |
| unit | No | Unit of measurement (for display) |
| description | No | Human-readable description |

The responses Section

Optional (defaults to a single response named "response"). Each response has:

| Field | Required | Description |
|---|---|---|
| name | Yes | Must match keys in result JSON files |
| optimize | No | maximize (default) or minimize |
| unit | No | Unit of measurement |
| description | No | Human-readable description |
| weight | No | Relative importance for multi-objective optimization (default: 1.0) |
| bounds | No | [worst, best] for desirability scaling in --multi mode (auto-computed if omitted) |

The settings Section

| Field | Default | Description |
|---|---|---|
| operation | full_factorial | Design type (11 supported — see table above) |
| test_script | | Path to test script |
| block_count | 1 | Number of blocks (replicates) |
| out_directory | results | Directory for per-run JSON results |
| processed_directory | | Directory for analysis outputs |
| lhs_samples | 0 (auto) | LHS sample count; 0 = max(10, 2×factors) |
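The `lhs_samples` auto rule is worth spelling out. A one-function sketch of the documented default (not the tool's source):

```python
# lhs_samples = 0 means "auto": use max(10, 2 x number of factors).
def lhs_sample_count(lhs_samples: int, n_factors: int) -> int:
    if lhs_samples == 0:  # auto
        return max(10, 2 * n_factors)
    return lhs_samples

print(lhs_sample_count(0, 3))   # -> 10 (floor of 10 samples)
print(lhs_sample_count(0, 8))   # -> 16 (2 x 8 factors)
```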

Validation Rules

3 CLI Command Reference

Every command the tool supports, with all flags and annotated examples.

Generate a Design & Runner Script

Usage
doe generate --config FILE [--output FILE] [--format sh|py] [--seed N] [--dry-run]
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --output | run_experiments.sh | Output script path |
| --format | sh | Script format: sh (Bash) or py (Python) |
| --seed | random | Seed for reproducible run order |
| --dry-run | off | Print design matrix without writing files |

Pro Tip

Always use --dry-run first to preview your design matrix before committing to a full run. Use --seed for reproducible experiments.

Analyze Experiment Results

Usage
doe analyze --config FILE [--results-dir DIR] [--no-plots] [--csv DIR] [--partial]
| Flag | Description |
|---|---|
| --config | Path to JSON config file (required) |
| --results-dir | Override out_directory from config |
| --no-plots | Skip generating Pareto charts and effects plots (headless mode) |
| --csv | Export main effects and summary stats to CSV files |
| --partial | Analyze only completed runs (skip missing result files) |

What It Computes

  • ANOVA table — Full analysis of variance with SS decomposition, F-tests, and p-values. Uses Lenth's pseudo-standard-error for unreplicated designs (same approach as R's FrF2 package). Includes lack-of-fit test when replicates are available. Significant terms (p < 0.05) are highlighted.
  • Main effects — For 2-level factors: mean(high) − mean(low). For 3+ levels: max(means) − min(means).
  • Interaction effects — Two-factor interactions for all pairs of 2-level factors.
  • 95% Confidence intervals — Using the t-distribution on effect estimates.
  • Summary statistics — Per-factor, per-level: count, mean, std, min, max.
  • Model diagnostics — 2×2 diagnostic panel: residuals vs fitted values, normal probability plot of residuals, residuals vs run order, and predicted vs actual. Includes PRESS statistic and predicted R².
  • Pareto chart — Ranked bar chart with cumulative contribution line.
  • Main effects plot — Grid of line plots showing mean response at each factor level.
  • Normal probability plot — Effects plotted against normal quantiles; significant effects deviate from the reference line and are labeled.
  • Half-normal plot — Absolute effects against half-normal quantiles for screening.
  • Response surface plots — 3D surface plots for each pair of continuous factors.
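The main-effect definition above — mean(high) − mean(low) for a 2-level factor — is simple to state in code. A toy sketch (illustrative data, not the tool's implementation):

```python
from statistics import mean

# Toy results: (temperature level, yield) pairs from a 2-level design
runs = [("150", 60.0), ("150", 62.0), ("200", 70.0), ("200", 74.0)]

def main_effect_2level(runs, low="150", high="200"):
    # mean(high) - mean(low), as defined above
    hi = mean(y for lvl, y in runs if lvl == high)
    lo = mean(y for lvl, y in runs if lvl == low)
    return hi - lo

print(main_effect_2level(runs))  # -> 11.0
```

For factors with 3+ levels the tool instead reports max(means) − min(means) across all levels.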

Get Optimization Recommendations

Usage
doe optimize --config FILE [--results-dir DIR] [--response NAME] [--partial] [--multi] [--steepest]
| Flag | Description |
|---|---|
| --config | Path to JSON config file (required) |
| --results-dir | Override out_directory from config |
| --response | Optimize for a single response (default: all responses) |
| --partial | Use only completed runs for optimization |
| --multi | Multi-objective optimization using Derringer-Suich desirability functions |
| --steepest | Show steepest ascent/descent pathway for sequential experimentation |

The optimizer reports the best observed run, fits linear and quadratic RSM models, finds the true surface optimum using L-BFGS-B optimization with multi-start, and ranks factors by importance. Use --response to focus on a specific metric.

Use --steepest to generate a table of follow-up experiment points along the gradient direction (standard RSM Phase 1 methodology).

Multi-Objective Optimization

When your experiment has multiple responses that conflict (e.g., maximize yield AND minimize cost), use --multi to find the best compromise using Derringer-Suich desirability functions.

Terminal
```
$ doe optimize --config config.json --multi
============================================================
MULTI-OBJECTIVE OPTIMIZATION
Method: Derringer-Suich Desirability Function
============================================================
Overall desirability: D = 0.7008

Response    Weight   Desirability   Predicted   Direction
---------------------------------------------------------------------
yield       1.0      0.5648         67.55 %     ↑
purity      1.0      0.9545         97.99 %     ↑
cost        1.0      0.6383         47.76 USD   ↓

Recommended settings:
  temperature = 200 °C
  pressure    = 4 bar
  catalyst    = 2 g/L

Trade-off summary:
  yield:  67.55 (best observed: 78.52, sacrifice: +10.97)
  purity: 97.99 (best observed: 97.99, sacrifice: +0.00)
  cost:   47.76 (best observed: 34.05, sacrifice: +13.71)
```

To prioritize certain responses, add weight and optional bounds to your config:

config.json excerpt
"responses": [ {"name": "yield", "optimize": "maximize", "weight": 3, "bounds": [60, 95]}, {"name": "cost", "optimize": "minimize", "weight": 1, "bounds": [20, 80]} ]

Weights & Bounds

weight (default: 1.0) controls relative importance — a weight of 3 means that response matters 3× more than a weight of 1. bounds (optional) define [worst, best] for desirability scaling; if omitted, bounds are auto-computed from observed data.
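The desirability arithmetic is easy to sketch. Below is a minimal stdlib approximation of the scaling described above (the tool's exact implementation may differ): each response is mapped to [0, 1] using its [worst, best] bounds, then combined with a weighted geometric mean. Because bounds are ordered [worst, best], a minimized response simply has worst > best:

```python
def desirability(y, worst, best):
    # Linear scaling to [0, 1]; bounds order handles maximize vs minimize.
    d = (y - worst) / (best - worst)
    return min(1.0, max(0.0, d))

def overall_D(ds_and_weights):
    # Weighted geometric mean: D = (prod d_i^w_i)^(1 / sum w_i)
    prod, wsum = 1.0, 0.0
    for d, w in ds_and_weights:
        prod *= d ** w
        wsum += w
    return prod ** (1.0 / wsum)

d_yield = desirability(77.5, worst=60, best=95)  # maximize: bounds [60, 95]
d_cost  = desirability(50.0, worst=80, best=20)  # minimize: bounds [80, 20]
print(overall_D([(d_yield, 3), (d_cost, 1)]))
```

With a weight of 3 on yield, a shortfall in yield desirability drags D down three times as hard as the same shortfall in cost.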

Generate an Interactive HTML Report

Usage
doe report --config FILE [--results-dir DIR] [--output FILE] [--partial]

Generates a self-contained HTML file with:

  • Design summary (factors, levels, operation type)
  • ANOVA tables with F-tests, p-values, and significance highlighting
  • Main effects and interaction tables for each response
  • Pareto charts, main effects plots, and normal/half-normal probability plots (base64 embedded)
  • Model diagnostic panels (residuals vs fitted, normal probability, etc.)
  • 3D response surface plots for continuous factor pairs
  • Optimization results with true surface optimum
  • Full design matrix as an interactive table
  • Collapsible sections for easy navigation

Use --partial to generate reports from incomplete experiments. The report will note which runs are missing.

Self-Contained

The HTML report has zero external dependencies. All plots are base64-encoded directly in the file. Share it via email, Slack, or drop it in a wiki — it just works.

Display Design Summary

Usage
doe info --config FILE

Prints the experiment plan summary without writing any files: operation type, number of factors, base runs, total runs with blocking, response definitions, fixed factors, alias structure (for fractional factorials), and design evaluation metrics (D-efficiency, A-efficiency, G-efficiency). Use this to quickly verify your config before running anything.

Compute Statistical Power

Usage
doe power --config FILE [--sigma FLOAT] [--delta FLOAT] [--alpha FLOAT] [--results-dir DIR]
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --sigma | auto | Error standard deviation (estimated from results if omitted) |
| --delta | 2×sigma | Minimum detectable effect size |
| --alpha | 0.05 | Significance level |
| --results-dir | from config | Override out_directory from config |

Computes statistical power for detecting effects of a given size using the non-central F distribution. Power < 0.80 indicates you may need more runs or blocks to reliably detect the specified effect size.

When to Use Power Analysis

Run this before your experiment to check if your design has enough runs. If power is low, consider adding blocks (replicates) or switching to a design with more runs. If results already exist, sigma is estimated automatically from residuals.
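To build intuition for how run count drives power, here is a rough stdlib sketch using a normal approximation (the tool itself uses the non-central F distribution; the standard-error formula below is an assumed textbook form for a single 2-level effect, not the tool's code):

```python
from math import sqrt
from statistics import NormalDist

def approx_power(n_runs, delta, sigma, alpha=0.05):
    # SE of a 2-level effect estimate (assumed form): 2*sigma/sqrt(N)
    se = 2 * sigma / sqrt(n_runs)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability the effect estimate clears the critical value
    return 1 - NormalDist().cdf(z_crit - delta / se)

print(round(approx_power(8, delta=2.0, sigma=1.0), 3))
```

Doubling the run count shrinks the standard error by √2, which is why adding blocks is the usual fix for low power.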

Augment an Existing Design

Usage
doe augment --config FILE --type TYPE [--output FILE] [--format sh|py]
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --type | required | Augmentation type: fold_over, star_points, or center_points |
| --output | run_experiments_augmented.sh | Output script path |
| --format | sh | Script format: sh or py |

Extends an existing design with additional runs without re-running completed experiments:

  • fold_over — Mirrors all runs (swaps high/low levels) to de-alias confounded effects in fractional factorials.
  • star_points — Adds axial (star) points for continuous factors, converting a factorial design into a CCD for response surface modeling.
  • center_points — Adds 3 center-point replicates to detect curvature and estimate pure error.
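The fold-over operation is simple in coded units: negate every ±1 entry of the original design. A sketch on a toy 4-run half fraction (illustrative, not the tool's code):

```python
# A 4-run half fraction of a 2^3 design in coded (-1/+1) units.
design = [
    (-1, -1, -1),
    (+1, -1, +1),
    (-1, +1, +1),
    (+1, +1, -1),
]

def fold_over(design):
    # Mirror every run: swap high and low levels across all factors.
    return [tuple(-x for x in run) for run in design]

augmented = design + fold_over(design)
print(len(augmented))  # -> 8 runs: the original fraction plus its mirror
```

The combined 8 runs form a full factorial in these three factors, which is what breaks the aliasing between main effects and two-factor interactions.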

Record Results Interactively

Usage
doe record --config FILE --run N|all [--seed N]
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --run | required | Run number to record, or all to enter all pending runs |
| --seed | 42 | Seed for consistent run ordering |

For real-world experiments without a test script: the tool displays each run's factor settings, prompts for response values, validates numeric input, and saves the result as run_N.json. If results already exist, the tool shows the current values and asks before overwriting.

Example session
```
$ doe record --config config.json --run 3

Run 3 / 8 (Block 1)
  temperature = 200 °C
  pressure = 6 bar
  catalyst = B

Enter value for 'yield' (%): 87.3
Enter value for 'cost' (USD): 42.10

Saved → results/run_3.json
```

Check Experiment Progress

Usage
doe status --config FILE [--seed N]
| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --seed | 42 | Seed for consistent run ordering |

Shows a progress bar, lists completed and pending runs, and displays full factor details for the next run to complete. Especially useful for long-running real-world experiments.

Example output
```
Experiment: Chemical Reactor Optimization
Design: box_behnken | 15 runs | 3 factors | 3 responses

Progress: 9/15 complete
[############........] 60%

Pending runs:
  Run 10: temperature=200, pressure=6, catalyst=B
  Run 11: temperature=150, pressure=4, catalyst=A
  ...

Next run to complete: Run 10
  temperature = 200 °C
  pressure = 6 bar
  catalyst = B

Record results with:
  doe record --config config.json --run 10
```

Export a Printable Worksheet

Usage
doe export-worksheet --config FILE [--format csv|markdown] [--output FILE] [--seed N]

| Flag | Default | Description |
|---|---|---|
| --config | required | Path to JSON config file |
| --format | csv | Output format: csv or markdown |
| --output | stdout | Write to file instead of stdout |
| --seed | 42 | Seed for consistent run ordering |

Generates a worksheet with all runs, factor values, and empty columns for response measurements and notes. Pre-fills response values for any runs that already have results. Perfect for printing and taking to the lab or field.

Pro Tip

Use --format markdown for documentation or wikis, and --format csv for importing into Excel or Google Sheets.

4 Writing a Test Script

Your test script is the bridge between the DOE tool and your actual experiment. It must follow a simple protocol.

The Protocol

Your script must: (1) accept factor values via the configured arg_style, (2) accept --out <path> for the output file, and (3) write a JSON file with keys matching your response names.

test.sh (double-dash style)
```bash
#!/bin/bash
# Parse arguments
while [[ $# -gt 0 ]]; do
  case $1 in
    --temperature) TEMP=$2; shift 2;;
    --pressure)    PRES=$2; shift 2;;
    --catalyst)    CAT=$2;  shift 2;;
    --out)         OUT=$2;  shift 2;;
    *) shift;;
  esac
done

# Run your experiment here...
YIELD=$(your_experiment $TEMP $PRES $CAT)
COST=$(calculate_cost $TEMP $PRES $CAT)

# Write results as JSON
echo "{\"yield\": $YIELD, \"cost\": $COST}" > "$OUT"
```
test.py (double-dash style)
```python
import argparse, json

parser = argparse.ArgumentParser()
parser.add_argument("--temperature", required=True)
parser.add_argument("--pressure", required=True)
parser.add_argument("--catalyst", required=True)
parser.add_argument("--out", required=True)
args = parser.parse_args()

# Run your experiment here...
result = {
    "yield": run_experiment(args.temperature, args.pressure, args.catalyst),
    "cost": calculate_cost(args.temperature, args.pressure, args.catalyst),
}

with open(args.out, "w") as f:
    json.dump(result, f)
```

No Test Script? No Problem.

For real-world experiments (lab work, physical tests, field measurements), you don't need a test script at all. Leave test_script empty or omit it, and use the manual workflow instead:

Manual experiment workflow
```bash
# 1. Design your experiment
$ doe generate --config config.json --seed 42

# 2. Print a worksheet for the lab
$ doe export-worksheet --config config.json --format csv --output worksheet.csv

# 3. Check what to run next
$ doe status --config config.json

# 4. After each physical experiment, record the result
$ doe record --config config.json --run 1

# 5. Analyze as you go (partial results OK)
$ doe analyze --config config.json --partial

# 6. When all runs are done, get the full analysis
$ doe analyze --config config.json
$ doe optimize --config config.json
$ doe report --config config.json --output report.html
```

Works With Any Experiment

The analysis pipeline doesn't care how results were produced. Whether you ran a simulation, measured something in a lab, or collected field data — as long as the response values end up in run_N.json files, everything works.

5 Advanced Tips & Patterns

Use blocking for multi-day experiments

If your experiments span multiple days, sessions, or batches, use "block_count": 2 (or more) in your config. Each block is an independent replicate with its own randomized order. This lets you detect and account for day-to-day variation.

config.json excerpt
"settings": { "operation": "plackett_burman", "block_count": 2, // base runs x 2 = total ... }
Export to CSV for custom analysis in R or pandas

The --csv flag on the analyze command exports structured data files:

  • main_effects_{response}.csv — Factor, effect magnitude, std error, % contribution, CI bounds
  • summary_stats_{response}.csv — Per-factor, per-level statistics (N, mean, std, min, max)
Terminal
```bash
$ doe analyze --config config.json --csv results/csv/
$ ls results/csv/
main_effects_yield.csv   summary_stats_yield.csv
main_effects_cost.csv    summary_stats_cost.csv
```
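The exported files are plain CSV, so any analysis stack can consume them. A stdlib sketch of sorting factors by effect size (the column names here are illustrative — check the header row of your exported file):

```python
import csv, io

# Stand-in for open("results/csv/main_effects_yield.csv")
demo = """factor,effect,pct_contribution
temperature,11.0,62.5
pressure,4.0,22.7
catalyst,2.6,14.8
"""

rows = list(csv.DictReader(io.StringIO(demo)))
rows.sort(key=lambda r: float(r["effect"]), reverse=True)
print(rows[0]["factor"])  # -> temperature (largest effect)
```

The same files load directly into pandas (`pd.read_csv`) or R (`read.csv`).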
Run headless in CI/CD pipelines

Use --no-plots to skip matplotlib chart generation. This avoids display issues in SSH sessions and CI environments while still computing all statistical results.

GitHub Actions example
```yaml
- name: Run DOE analysis
  run: |
    doe generate --config config.json --seed 42
    bash run_experiments.sh
    doe analyze --config config.json --no-plots --csv results/csv/
```
The screening-to-optimization pipeline

The most efficient experimental strategy uses two (or three) stages:

  1. Screen with Plackett-Burman (N+1 runs): identify the 2–3 factors that matter most out of many candidates.
  2. Optimize with Box-Behnken or CCD (15–20 runs): fit a quadratic model to the important factors and find the true optimum.
  3. Confirm with 3–5 runs at the predicted optimum to validate.

Total: ~25 runs to fully optimize, compared to hundreds with grid search.
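The arithmetic behind that claim, sketched for a hypothetical 7-factor problem where 3 factors survive screening:

```python
n_candidates = 7                  # factors worth screening
screening = n_candidates + 1      # Plackett-Burman: N+1 runs
optimization = 15                 # Box-Behnken on the 3 surviving factors
confirmation = 3                  # runs at the predicted optimum

print(screening + optimization + confirmation)  # -> 26 runs total
print(2 ** n_candidates)  # -> 128 runs for even a 2-level full factorial
```

The gap widens quickly: at 3 levels per factor a full grid over 7 factors would need 3^7 = 2187 runs.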

Reproducible experiments with --seed

The --seed flag controls run-order randomization. The same seed always produces the same run order, making your experiments reproducible. The design matrix itself (which factor combinations are tested) is always deterministic — the seed only affects the order within each block.
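The behavior is the same as seeding a shuffle: the set of runs is fixed by the design, and the seed only fixes the permutation. A minimal sketch of the idea (not the tool's code):

```python
import random

design = ["run_1", "run_2", "run_3", "run_4"]  # deterministic design matrix

def run_order(seed):
    order = design.copy()
    random.Random(seed).shuffle(order)  # seed controls only the ordering
    return order

print(run_order(42) == run_order(42))  # -> True: same seed, same order
print(sorted(run_order(7)) == sorted(design))  # same runs, possibly new order
```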

6 Complete Tutorials

End-to-end walkthroughs of real experimental workflows.

Tutorial 1: Database Performance Tuning

A Plackett-Burman design with 6 PostgreSQL configuration parameters, 2 blocks for replication, and CSV export for downstream analysis in R.

Complete workflow
```bash
# Step 1: Preview
$ doe info --config use_cases/04_database_performance_tuning/config.json

# Step 2: Generate
$ doe generate --config use_cases/04_database_performance_tuning/config.json \
    --output results/run.sh --seed 42

# Step 3: Execute
$ bash results/run.sh

# Step 4: Analyze (headless + CSV)
$ doe analyze --config use_cases/04_database_performance_tuning/config.json \
    --no-plots --csv results/csv/

# Step 5: Optimize
$ doe optimize --config use_cases/04_database_performance_tuning/config.json

# Step 6: Report
$ doe report --config use_cases/04_database_performance_tuning/config.json \
    --output results/report.html
```

Tutorial 2: Chemical Reactor Optimization

A Box-Behnken design with 3 continuous factors and 3 responses (yield, purity, cost). Demonstrates multi-response analysis, RSM, and the trade-offs inherent in multi-objective optimization.

Complete workflow
```bash
# Preview the Box-Behnken design (15 runs)
$ doe info --config use_cases/01_reactor_optimization/config.json

# Generate, run, and analyze
$ doe generate --config use_cases/01_reactor_optimization/config.json \
    --output results/run.sh --seed 42
$ bash results/run.sh
$ doe analyze --config use_cases/01_reactor_optimization/config.json

# Get optimization recommendations
$ doe optimize --config use_cases/01_reactor_optimization/config.json

# Generate interactive HTML report
$ doe report --config use_cases/01_reactor_optimization/config.json \
    --output results/report.html
```

Tutorial 3: Real-World Lab Experiment (Manual Entry)

A hands-on walkthrough for running physical experiments without a test script. Uses the record, status, and export-worksheet commands to manage a manual workflow.

Complete manual workflow
```bash
# Create a config with no test_script
$ cat config.json
{
  "factors": [
    {"name": "temperature", "levels": ["150", "200"], "type": "continuous", "unit": "°C"},
    {"name": "pressure", "levels": ["2", "6"], "type": "continuous", "unit": "bar"},
    {"name": "catalyst", "levels": ["A", "B"], "type": "categorical"}
  ],
  "responses": [
    {"name": "yield", "optimize": "maximize", "unit": "%"},
    {"name": "cost", "optimize": "minimize", "unit": "USD"}
  ],
  "settings": {"operation": "full_factorial", "out_directory": "results"}
}

# Preview the design
$ doe info --config config.json

# Print worksheet for lab notebook
$ doe export-worksheet --config config.json --format markdown

# Check progress
$ doe status --config config.json

# Record results after each experiment
$ doe record --config config.json --run 1
$ doe record --config config.json --run 2

# Peek at partial results
$ doe analyze --config config.json --partial

# Record remaining runs and finalize
$ doe record --config config.json --run all
$ doe analyze --config config.json
$ doe report --config config.json --output report.html
```

7 Using AI to Help Design Experiments

Not sure how to set up your experiment? Use the AI Prompts page to generate experiment configurations with the help of an AI assistant like Claude or ChatGPT.

How It Works

The AI Prompts page provides ready-to-use prompts that guide an AI assistant through the process of creating a DOE configuration file for your specific problem. Describe your experiment in plain English, and get a complete config.json, simulation script, and analysis workflow.

The prompts cover common scenarios:

Example: asking AI to design your experiment
"I'm optimizing a PostgreSQL database. I want to test shared_buffers (256MB to 4GB), work_mem (4MB to 256MB), effective_cache_size (1GB to 8GB), and max_parallel_workers (2 to 8). I care about query throughput and p99 latency. Generate a DOE config.json for me."

Browse all AI prompts →