Central Composite Design

Distillation Column Optimization

Full pipeline with CCD: star points, center replicates, quadratic RSM, and single-response optimization.

Summary

This experiment applies a Central Composite Design to model and optimize the separation efficiency of a distillation column.

The design varies 3 factors: reflux ratio (1.5 to 4.5), feed rate (50 to 150 L/h), and column pressure (1.0 to 3.0 atm). The goal is to optimize 2 responses: separation efficiency (%), to be maximized, and energy cost (USD/h), to be minimized. Two conditions are held constant across all runs: feed_temp = 80°C and n_trays = 20.

A Central Composite Design (CCD) was selected to fit a full quadratic response surface model, including curvature and interaction effects. With 3 factors this produces 22 runs including center points and axial (star) points that extend beyond the factorial range.

Quadratic response surface models were fitted to capture potential curvature and factor interactions. The RSM contour plots below visualize how pairs of factors jointly affect each response.

Key Findings

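For separation efficiency, the most influential factors were feed rate (39.9%), reflux ratio (30.2%), and column pressure (29.9%). The best observed value was 91.68; the per-level summary statistics in the full analysis output place that run at the upper reflux-ratio star point (reflux ratio = 5.73861, feed rate = 100, column pressure = 2).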

For energy cost, the most influential factors were feed rate (43.9%), column pressure (41.8%), and reflux ratio (14.3%). The best observed value was 10.03; the per-level summary statistics in the full analysis output place that run at reflux ratio = 1.5, feed rate = 50, column pressure = 1.

Recommended Next Steps

The Scenario

You are optimizing a distillation column for maximum separation efficiency at minimum energy cost. You suspect the response surface is curved (quadratic), so you need a design with star points and center points to fit a second-order model.

Why Central Composite Design?

CCD = factorial (2³ = 8) + star (2×3 = 6) + center (8) = 22 runs. Star points at ±α extend beyond the factorial cube to probe curvature. Center replicates estimate pure error. Orthogonal alpha ensures balanced estimation of all model terms.
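The run count and the star-point positions can be reproduced with a few lines of arithmetic. The sketch below is not the tool's own code. It assumes one common block-orthogonal formula for α and assumes the 8 center points are split 4 + 4 between the factorial and axial blocks; neither detail is stated on this page, but together they give α ≈ 1.826, which matches the star values in the experimental matrix.

```python
import math

k = 3                     # number of factors
n_factorial = 2 ** k      # 8 corner points
n_star = 2 * k            # 6 axial (star) points
n_center = 8              # center replicates reported for this design
total_runs = n_factorial + n_star + n_center

# ASSUMPTION: center points split evenly between the two blocks (4 + 4),
# and alpha computed with a block-orthogonal formula.
center_in_factorial_block = 4
center_in_axial_block = 4
alpha = math.sqrt(
    k * (1 + center_in_axial_block / n_star)
    / (1 + center_in_factorial_block / n_factorial)
)

# Factor ranges from the Factors table.
FACTORS = {
    "reflux_ratio": (1.5, 4.5),
    "feed_rate": (50, 150),
    "column_pressure": (1.0, 3.0),
}


def star_levels(low, high, alpha):
    """Return the (lower, upper) star-point values for one factor."""
    center = (low + high) / 2
    half_range = (high - low) / 2
    return center - alpha * half_range, center + alpha * half_range


print(f"total runs = {total_runs}, alpha = {alpha:.5f}")
for name, (low, high) in FACTORS.items():
    lower, upper = star_levels(low, high, alpha)
    print(f"{name}: star points at {lower:.6g} and {upper:.6g}")
```

Because α is greater than 1, every star point lies outside the [low, high] range of its factor, which is why the matrix contains values such as reflux_ratio = 0.261387 and 5.73861.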

Experimental Setup

Factors

Factor            Low   High   Unit
reflux_ratio      1.5   4.5    L/D
feed_rate         50    150    L/h
column_pressure   1.0   3.0    atm

Fixed: feed_temp = 80°C, n_trays = 20

Responses

Response                Direction    Unit
separation_efficiency   ↑ maximize   %
energy_cost             ↓ minimize   USD/h

CCD Run Structure

  • 8 factorial points: all corner combinations
  • 6 star (axial) points: probe curvature
  • 8 center replicates: estimate pure error

Experimental Matrix

The Central Composite Design produces 22 runs. Each row is one experiment with specific factor settings.

Run   reflux_ratio   feed_rate   column_pressure
1     3              100         2
2     4.5            50          3
3     1.5            150         1
4     3              191.287     2
5     3              100         2
6     0.261387       100         2
7     3              100         0.174258
8     3              100         2
9     4.5            150         1
10    5.73861        100         2
11    3              100         2
12    3              8.71291     2
13    3              100         2
14    1.5            50          3
15    3              100         2
16    4.5            50          1
17    3              100         3.82574
18    4.5            150         3
19    3              100         2
20    1.5            50          1
21    1.5            150         3
22    3              100         2

Step-by-Step Workflow

This use case demonstrates the full pipeline — every command the tool offers:

Full pipeline: info → generate → run → analyze → optimize → report → csv
# 1. Preview the CCD design
$ doe info --config use_cases/06_distillation_column/config.json

# 2. Generate runner script
$ doe generate --config use_cases/06_distillation_column/config.json \
    --output results/run.sh --seed 55

# 3. Execute all 22 experiments
$ bash results/run.sh

# 4. Analyze (with plots)
$ doe analyze --config use_cases/06_distillation_column/config.json

# 5a. Optimize ALL responses
$ doe optimize --config use_cases/06_distillation_column/config.json
$ doe optimize --config use_cases/06_distillation_column/config.json --multi  # multi-objective

# 5b. Optimize a SINGLE response
$ doe optimize --config use_cases/06_distillation_column/config.json \
    --response separation_efficiency

# 6. Generate HTML report
$ doe report --config use_cases/06_distillation_column/config.json \
    --output results/report.html

# 7. Export CSV for custom quadratic modeling
$ doe analyze --config use_cases/06_distillation_column/config.json \
    --csv results/csv/

Star points extend beyond [low, high]

Some runs have factor values outside the [low, high] range (e.g., reflux_ratio below 1.5 or above 4.5). This is the CCD's circumscribed design: the factorial cube is inscribed within the star points. Make sure your equipment can handle these extended ranges.
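Before scheduling plant time, it can help to list exactly which runs leave the nominal window. The following is a minimal sketch that copies the 22 runs from the experimental matrix and flags any factor value outside its [low, high] range; the names RANGES, RUNS, and out_of_range are illustrative and not part of the tool.

```python
# Factor ranges from the Factors table.
RANGES = {
    "reflux_ratio": (1.5, 4.5),
    "feed_rate": (50, 150),
    "column_pressure": (1.0, 3.0),
}

# The 22 runs of the experimental matrix, in run order:
# (reflux_ratio, feed_rate, column_pressure)
RUNS = [
    (3, 100, 2), (4.5, 50, 3), (1.5, 150, 1), (3, 191.287, 2),
    (3, 100, 2), (0.261387, 100, 2), (3, 100, 0.174258), (3, 100, 2),
    (4.5, 150, 1), (5.73861, 100, 2), (3, 100, 2), (3, 8.71291, 2),
    (3, 100, 2), (1.5, 50, 3), (3, 100, 2), (4.5, 50, 1),
    (3, 100, 3.82574), (4.5, 150, 3), (3, 100, 2), (1.5, 50, 1),
    (1.5, 150, 3), (3, 100, 2),
]


def out_of_range(run):
    """Return the names of factors whose value lies outside [low, high]."""
    return [
        name
        for (name, (low, high)), value in zip(RANGES.items(), run)
        if not low <= value <= high
    ]


# Map run number -> list of out-of-range factors (star-point runs only).
flagged = {}
for number, run in enumerate(RUNS, start=1):
    names = out_of_range(run)
    if names:
        flagged[number] = names

for number, names in flagged.items():
    print(f"run {number}: {', '.join(names)} outside nominal range -> {RUNS[number - 1]}")
```

Exactly the six star-point runs are flagged. Two of them deserve a second look on real equipment: run 7 calls for a column pressure of about 0.17 atm, which is below atmospheric, and run 12 calls for a feed rate of about 8.7 L/h.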

Real-World Plant Workflow

Running on Real Equipment? Use the Manual Workflow

Distillation column optimization involves adjusting physical equipment — reflux valves, reboiler temperature, feed rates — and taking samples for analysis. Each run may take hours to reach steady state. The simulation above is for demonstration; for real plant trials, use the manual workflow.

Plant experiments often run one or two conditions per shift. Here's how to manage the process:

Manual workflow for plant trials
# 1. Print a run sheet for the control room
$ doe export-worksheet --config use_cases/06_distillation_column/config.json \
    --format csv --output distillation_runs.csv

# 2. Check today's runs
$ doe status --config use_cases/06_distillation_column/config.json

# 3. After each steady-state measurement, record the results
$ doe record --config use_cases/06_distillation_column/config.json --run 1
# Enter separation_efficiency and energy_cost when prompted

# 4. After the first shift, peek at partial results
$ doe analyze --config use_cases/06_distillation_column/config.json --partial

# 5. After all runs are complete
$ doe analyze --config use_cases/06_distillation_column/config.json
$ doe optimize --config use_cases/06_distillation_column/config.json
$ doe report --config use_cases/06_distillation_column/config.json \
    --output distillation_report.html

Built for Multi-Day Experiments

Distillation trials often span multiple shifts or days. The status command tracks progress across sessions, record saves results one at a time as each steady-state condition is reached, and --partial analysis lets the process engineer evaluate trends before all conditions have been tested — critical when plant time is expensive.

Interpreting the Results

Trade-offs

Single vs. Multi-Response Optimization

The --response flag

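Compare optimize --response separation_efficiency (ignores cost) with plain optimize (all responses). The settings that maximize efficiency often increase energy cost, so the two commands can recommend different operating points. Each recommendation is one point on the trade-off curve; the Pareto frontier is the full set of settings where neither response can be improved without worsening the other.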
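To make the idea of a Pareto frontier concrete, the sketch below filters a handful of observed runs down to the non-dominated ones. It uses only the six star-point runs, because each appears as an N = 1 row in the summary statistics of the Full Analysis Output, so both responses for those runs can be read off unambiguously. This is an illustration on a subset of the data, not the frontier for the whole experiment, and the function names are hypothetical.

```python
# (separation_efficiency in %, energy_cost in USD/h) for the six star-point
# runs, read from the N = 1 rows of the summary statistics.
STAR_RUNS = {
    "reflux_ratio=0.261387": (75.55, 30.24),
    "reflux_ratio=5.73861": (91.68, 51.08),
    "feed_rate=8.71291": (85.34, 29.49),
    "feed_rate=191.287": (82.45, 37.94),
    "column_pressure=0.174258": (84.53, 28.43),
    "column_pressure=3.82574": (71.94, 26.48),
}


def dominates(a, b):
    """True if run a is no worse than b on both responses and better on one.

    Efficiency is maximized, cost is minimized.
    """
    eff_a, cost_a = a
    eff_b, cost_b = b
    no_worse = eff_a >= eff_b and cost_a <= cost_b
    strictly_better = eff_a > eff_b or cost_a < cost_b
    return no_worse and strictly_better


def pareto_front(runs):
    """Keep only the runs that no other run dominates."""
    return {
        name: values
        for name, values in runs.items()
        if not any(dominates(other, values) for other in runs.values())
    }


front = pareto_front(STAR_RUNS)
# Print the front from cheapest to most expensive.
for name, (eff, cost) in sorted(front.items(), key=lambda item: item[1][1]):
    print(f"{name}: efficiency {eff:.2f} %, cost {cost:.2f} USD/h")
```

Among these six runs, the low reflux-ratio star point and the high feed-rate star point drop out because the low feed-rate star point beats both on efficiency and on cost. The remaining four runs trade efficiency against cost, which is the pattern a desirability function has to resolve.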

Next Steps

  1. Fit a full quadratic RSM model using the CCD data
  2. Construct a desirability function weighting efficiency vs. cost
  3. Find the Pareto-optimal frontier
  4. Run confirmation experiments at the predicted optimum
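For the first of these steps, it is useful to know how many coefficients a full quadratic model needs and how many degrees of freedom the 22-run CCD leaves for judging the fit. The sketch below does that bookkeeping with standard regression accounting; it does not fit anything and is not taken from the tool.

```python
from itertools import combinations

FACTORS = ["reflux_ratio", "feed_rate", "column_pressure"]

# Terms of a full second-order (quadratic) response surface model.
terms = (
    ["intercept"]
    + FACTORS                                             # linear terms
    + [f"{a}*{b}" for a, b in combinations(FACTORS, 2)]   # two-factor interactions
    + [f"{f}^2" for f in FACTORS]                         # pure quadratic terms
)

k = len(FACTORS)
n_runs = 22                          # 8 factorial + 6 star + 8 center
n_distinct = 2 ** k + 2 * k + 1      # the 8 center replicates share one design point

df_residual = n_runs - len(terms)          # left over after fitting all terms
df_pure_error = n_runs - n_distinct        # from the replicated center point
df_lack_of_fit = n_distinct - len(terms)   # distinct points beyond the model size

print(f"{len(terms)} model terms: {', '.join(terms)}")
print(f"residual df = {df_residual} "
      f"(pure error {df_pure_error} + lack of fit {df_lack_of_fit})")
```

With 10 coefficients and 15 distinct design points, the design leaves 7 degrees of freedom for pure error and 5 for a lack-of-fit test, so the center replicates carry most of the information about run-to-run noise.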

Features Exercised

Feature             Value
Design type         central_composite (circumscribed, orthogonal α)
Factor types        continuous (all 3)
Star points         Yes (extends beyond [low, high])
Center replicates   8 center points for error estimation
--response          Single-response optimization
--csv               Export for custom modeling
Full pipeline       info → generate → run → analyze → optimize → report → csv
Total runs          22 (8 factorial + 6 star + 8 center)

Analysis Results

Generated from actual experiment runs using the DOE Helper Tool.

Response: separation_efficiency

The Pareto chart identifies which column parameters most strongly influence separation efficiency.

Pareto Chart

Pareto chart for separation efficiency

Main Effects Plot

Main effects plot for separation efficiency

Response: energy_cost

Energy cost responds to a different set of column parameters, requiring careful optimization against separation efficiency.

Pareto Chart

Pareto chart for energy cost

Main Effects Plot

Main effects plot for energy cost

Response Surface Plots

3D surfaces fitted with quadratic RSM. Red dots are observed data points.

How to Read These Surfaces

Each plot shows predicted response (vertical axis) across two factors while other factors are held at center. Red dots are actual experimental observations.

  • Flat surface — these two factors have little effect on the response.
  • Tilted plane — strong linear effect; moving along one axis consistently changes the response.
  • Curved/domed surface — quadratic curvature; there is an optimum somewhere in the middle.
  • Saddle shape — significant interaction; the best setting of one factor depends on the other.
  • Red dots far from surface — poor model fit in that region; be cautious about predictions there.

separation_efficiency (%) — R² = 0.312, Adj R² = -0.204
Weak fit — interpret the surface shape with caution.
Curvature detected in reflux_ratio, column_pressure — look for a peak or valley in the surface.
Strongest linear driver: column_pressure (decreases separation_efficiency).
Notable interaction: reflux_ratio × feed_rate — the effect of one depends on the level of the other. Look for a twisted surface.

energy_cost (USD/h) — R² = 0.277, Adj R² = -0.265
Weak fit — interpret the surface shape with caution.
Curvature detected in column_pressure, reflux_ratio — look for a peak or valley in the surface.
Strongest linear driver: reflux_ratio (decreases energy_cost).
Notable interaction: reflux_ratio × feed_rate — the effect of one depends on the level of the other. Look for a twisted surface.
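A negative adjusted R² can look like a bug, but it follows directly from the standard adjustment formula when a model with many terms explains little variance. The sketch below assumes the surfaces were fitted with the 9 non-intercept terms of a full quadratic model on all 22 runs; under that assumption it reproduces both adjusted values reported above.

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R^2 for a model with n_terms predictors (intercept excluded)."""
    return 1 - (1 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


N_RUNS = 22   # total CCD runs
N_TERMS = 9   # ASSUMPTION: 3 linear + 3 interaction + 3 squared terms

REPORTED_R2 = {
    "separation_efficiency": 0.312,
    "energy_cost": 0.277,
}

for response, r2 in REPORTED_R2.items():
    adj = adjusted_r2(r2, N_RUNS, N_TERMS)
    print(f"{response}: R2 = {r2:.3f}, adjusted R2 = {adj:.3f}")
```

The penalty factor here is 21/12 = 1.75, so any R² below about 0.43 gives a negative adjusted value. In practical terms, these quadratic surfaces explain no more than would be expected from fitting 9 terms to noise, which supports the "weak fit" warning.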

RSM surface for energy cost: feed rate vs column pressure

RSM surface for energy cost: reflux ratio vs column pressure

RSM surface for energy cost: reflux ratio vs feed rate

RSM surface for separation efficiency: feed rate vs column pressure

RSM surface for separation efficiency: reflux ratio vs column pressure

RSM surface for separation efficiency: reflux ratio vs feed rate

Full Analysis Output

doe analyze
=== Main Effects: separation_efficiency ===
Factor            Effect    Std Error   % Contribution
--------------------------------------------------------------
reflux_ratio      16.1300   1.6045      38.5%
feed_rate         13.2175   1.6045      31.5%
column_pressure   12.5900   1.6045      30.0%

=== Summary Statistics: separation_efficiency ===

reflux_ratio:
Level      N    Mean      Std       Min       Max
------------------------------------------------------------
0.261387   1    75.5500   0.0000    75.5500   75.5500
1.5        4    79.7475   10.9426   63.4000   86.1400
3          12   81.7875   5.9172    71.9400   88.5400
4.5        4    75.6250   7.7099    67.1900   85.1700
5.73861    1    91.6800   0.0000    91.6800   91.6800

feed_rate:
Level      N    Mean      Std       Min       Max
------------------------------------------------------------
100        12   81.7408   6.8013    71.9400   91.6800
150        4    83.2500   3.7214    77.8500   86.1400
191.287    1    82.4500   0.0000    82.4500   82.4500
50         4    72.1225   9.7014    63.4000   85.6100
8.71291    1    85.3400   0.0000    85.3400   85.3400

column_pressure:
Level      N    Mean      Std       Min       Max
------------------------------------------------------------
0.174258   1    84.5300   0.0000    84.5300   84.5300
1          4    73.6450   10.3350   63.4000   86.1400
2          12   82.6842   6.0885    72.1700   91.6800
3          4    81.7275   6.3365    72.2900   85.6100
3.82574    1    71.9400   0.0000    71.9400   71.9400

=== Main Effects: energy_cost ===
Factor            Effect    Std Error   % Contribution
--------------------------------------------------------------
reflux_ratio      26.3000   2.1251      44.5%
feed_rate         19.4875   2.1251      33.0%
column_pressure   13.3392   2.1251      22.6%

=== Summary Statistics: energy_cost ===

reflux_ratio:
Level      N    Mean      Std       Min       Max
------------------------------------------------------------
0.261387   1    30.2400   0.0000    30.2400   30.2400
1.5        4    24.7800   9.8810    10.0300   31.0100
3          12   30.9358   7.0655    20.4800   45.9900
4.5        4    27.2325   14.7182   14.0200   48.0900
5.73861    1    51.0800   0.0000    51.0800   51.0800

feed_rate:
Level      N    Mean      Std       Min       Max
------------------------------------------------------------
100        12   32.0933   8.9857    20.4800   51.0800
150        4    33.5600   9.9320    25.7600   48.0900
191.287    1    37.9400   0.0000    37.9400   37.9400
50         4    18.4525   8.2137    10.0300   28.7000
8.71291    1    29.4900   0.0000    29.4900   29.4900

column_pressure:
Level      N    Mean      Std       Min       Max
------------------------------------------------------------
0.174258   1    28.4300   0.0000    28.4300   28.4300
1          4    19.7975   9.2405    10.0300   29.3800
2          12   33.1367   8.8992    20.4800   51.0800
3          4    32.2150   11.4055   21.0600   48.0900
3.82574    1    26.4800   0.0000    26.4800   26.4800
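The Effect and % Contribution columns can be traced back to the summary statistics. How the tool defines them is not documented here, so the following is an inference from the numbers: each effect equals the spread between the largest and smallest level mean of that factor, and each contribution is that effect's share of the sum of effects. The sketch below reproduces the reported values on that assumption.

```python
# Level means copied from the summary statistics, in the order printed.
LEVEL_MEANS = {
    "separation_efficiency": {
        "reflux_ratio": [75.5500, 79.7475, 81.7875, 75.6250, 91.6800],
        "feed_rate": [81.7408, 83.2500, 82.4500, 72.1225, 85.3400],
        "column_pressure": [84.5300, 73.6450, 82.6842, 81.7275, 71.9400],
    },
    "energy_cost": {
        "reflux_ratio": [30.2400, 24.7800, 30.9358, 27.2325, 51.0800],
        "feed_rate": [32.0933, 33.5600, 37.9400, 18.4525, 29.4900],
        "column_pressure": [28.4300, 19.7975, 33.1367, 32.2150, 26.4800],
    },
}


def main_effects(means_by_factor):
    """Return {factor: (effect, percent contribution)}.

    ASSUMPTION: effect = max level mean - min level mean,
    contribution = effect / sum of all effects.
    """
    effects = {f: max(m) - min(m) for f, m in means_by_factor.items()}
    total = sum(effects.values())
    return {f: (e, 100 * e / total) for f, e in effects.items()}


for response, means in LEVEL_MEANS.items():
    print(response)
    for factor, (effect, pct) in main_effects(means).items():
        print(f"  {factor:16s} effect = {effect:8.4f}  contribution = {pct:4.1f}%")
```

Read this way, several effects are set by single star-point runs. For separation efficiency, both the highest and the lowest reflux-ratio level means come from N = 1 levels, so the 16.13 effect rests on two observations and should be interpreted with care.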

Optimization Recommendations

doe optimize
=== Optimization: separation_efficiency ===
Direction: maximize
Best observed run: #18
  reflux_ratio = 3
  feed_rate = 100
  column_pressure = 2
  Value: 91.68

RSM Model (linear, R² = 0.21):
Coefficients:
  intercept: +80.4623
  reflux_ratio: +0.0513
  feed_rate: -3.1826
  column_pressure: -2.5962
Predicted optimum:
  reflux_ratio = 4.5
  feed_rate = 50
  column_pressure = 1
  Predicted value: 86.2923

Factor importance:
  1. column_pressure (effect: 14.1, contribution: 47.5%)
  2. feed_rate (effect: 8.5, contribution: 28.6%)
  3. reflux_ratio (effect: 7.1, contribution: 23.9%)

=== Optimization: energy_cost ===
Direction: minimize
Best observed run: #6
  reflux_ratio = 4.5
  feed_rate = 150
  column_pressure = 3
  Value: 10.03

RSM Model (linear, R² = 0.22):
Coefficients:
  intercept: +30.0273
  reflux_ratio: -0.4562
  feed_rate: -5.3215
  column_pressure: -1.8471
Predicted optimum:
  reflux_ratio = 3
  feed_rate = 8.71291
  column_pressure = 2
  Predicted value: 39.7429

Factor importance:
  1. feed_rate (effect: 16.0, contribution: 55.4%)
  2. column_pressure (effect: 10.8, contribution: 37.2%)
  3. reflux_ratio (effect: 2.1, contribution: 7.4%)
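The predicted values in this output can be checked by hand. The sketch below assumes the linear coefficients are expressed in coded units, where low maps to -1 and high maps to +1; that is an inference, not something the output states, but it reproduces both printed predictions to within coefficient rounding. The helper names are illustrative.

```python
# Factor ranges from the Factors table.
RANGES = {
    "reflux_ratio": (1.5, 4.5),
    "feed_rate": (50, 150),
    "column_pressure": (1.0, 3.0),
}

# Linear coefficients copied from the optimize output.
MODELS = {
    "separation_efficiency": {
        "intercept": 80.4623, "reflux_ratio": 0.0513,
        "feed_rate": -3.1826, "column_pressure": -2.5962,
    },
    "energy_cost": {
        "intercept": 30.0273, "reflux_ratio": -0.4562,
        "feed_rate": -5.3215, "column_pressure": -1.8471,
    },
}


def to_coded(factor, value):
    """Map a natural-unit value to coded units (low -> -1, high -> +1)."""
    low, high = RANGES[factor]
    return (value - (low + high) / 2) / ((high - low) / 2)


def predict(response, settings):
    """Evaluate the linear model at natural-unit factor settings."""
    model = MODELS[response]
    return model["intercept"] + sum(
        model[factor] * to_coded(factor, value)
        for factor, value in settings.items()
    )


eff_at_optimum = predict(
    "separation_efficiency",
    {"reflux_ratio": 4.5, "feed_rate": 50, "column_pressure": 1},
)
cost_at_optimum = predict(
    "energy_cost",
    {"reflux_ratio": 3, "feed_rate": 8.71291, "column_pressure": 2},
)
cost_at_high_corner = predict(
    "energy_cost",
    {"reflux_ratio": 4.5, "feed_rate": 150, "column_pressure": 3},
)

print(f"separation_efficiency at printed optimum: {eff_at_optimum:.4f}")
print(f"energy_cost at printed optimum:           {cost_at_optimum:.4f}")
print(f"energy_cost at the all-high corner:       {cost_at_high_corner:.4f}")
```

Under this reading, all three energy_cost coefficients are negative, so the linear model itself predicts a lower cost at the all-high corner (about 22.40) than at the point printed as the predicted optimum (39.74). With R² around 0.2 for both linear models, neither figure is reliable enough to act on without confirmation runs.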

Multi-Objective Optimization

When responses compete, Derringer–Suich desirability finds the best compromise. Each response is scaled to a 0–1 desirability, then combined via a weighted geometric mean.
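The combination step is short enough to verify directly. The sketch below takes the per-response desirabilities and weights reported in this section and computes their weighted geometric mean; the function name is illustrative and the code is not the tool's implementation. How each individual desirability is scaled from the raw response is not shown here, because the bounds used are not given.

```python
import math


def overall_desirability(desirability, weight):
    """Weighted geometric mean: D = (prod d_i ** w_i) ** (1 / sum w_i)."""
    if any(d <= 0 for d in desirability.values()):
        # One unacceptable response makes the whole setting unacceptable.
        return 0.0
    total_weight = sum(weight[name] for name in desirability)
    log_sum = sum(weight[name] * math.log(d) for name, d in desirability.items())
    return math.exp(log_sum / total_weight)


# Values from the Per-Response Desirability table.
d = {"separation_efficiency": 0.7855, "energy_cost": 0.5426}
w = {"separation_efficiency": 2.0, "energy_cost": 1.0}

D = overall_desirability(d, w)
print(f"D = {D:.4f}")
```

With weights 2.0 and 1.0 this agrees with the reported overall desirability of 0.6944. Because the mean is geometric, a desirability of zero on either response drives D to zero regardless of the other, which is what prevents a setting with excellent efficiency but unacceptable cost from being recommended.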

Overall Desirability
D = 0.6944

Per-Response Desirability

Response                Weight   Desirability   Predicted     Dir
separation_efficiency   2.0      0.7855         86.42 %       ↑
energy_cost             1.0      0.5426         28.63 USD/h   ↓

Recommended Settings

Factor            Value
reflux_ratio      4.5
feed_rate         150 L/h
column_pressure   1 atm

Source: from observed run #5

Trade-off Summary

Sacrifice = how much worse than single-objective best.

Response                Predicted   Best Observed   Sacrifice
separation_efficiency   86.42       91.68           +5.26
energy_cost             28.63       10.03           +18.60

Top 3 Runs by Desirability

Run   D        Factor Settings
#5    0.6944   reflux_ratio=4.5, feed_rate=150, column_pressure=1
#19   0.6819   reflux_ratio=3, feed_rate=100, column_pressure=2
#15   0.6783   reflux_ratio=3, feed_rate=100, column_pressure=2

Model Quality

Response                R²       Type
separation_efficiency   0.0559   linear
energy_cost             0.0694   linear

Full Multi-Objective Output

doe optimize --multi
============================================================
MULTI-OBJECTIVE OPTIMIZATION
Method: Derringer-Suich Desirability Function
============================================================

Overall desirability: D = 0.6944

Response                Weight   Desirability   Predicted     Direction
---------------------------------------------------------------------
separation_efficiency   2.0      0.7855         86.42 %       ↑
energy_cost             1.0      0.5426         28.63 USD/h   ↓

Recommended settings:
  reflux_ratio = 4.5
  feed_rate = 150 L/h
  column_pressure = 1 atm
  (from observed run #5)

Trade-off summary:
  separation_efficiency: 86.42 (best observed: 91.68, sacrifice: +5.26)
  energy_cost: 28.63 (best observed: 10.03, sacrifice: +18.60)

Model quality:
  separation_efficiency: R² = 0.0559 (linear)
  energy_cost: R² = 0.0694 (linear)

Top 3 observed runs by overall desirability:
  1. Run #5 (D=0.6944): reflux_ratio=4.5, feed_rate=150, column_pressure=1
  2. Run #19 (D=0.6819): reflux_ratio=3, feed_rate=100, column_pressure=2
  3. Run #15 (D=0.6783): reflux_ratio=3, feed_rate=100, column_pressure=2