Summary
This experiment investigates SIEM alert correlation, using a Box-Behnken design to tune the alert correlation window, similarity threshold, and minimum event count for alert reduction.
The design varies 3 factors: correlation_window_sec (30 to 600 sec), similarity_threshold (0.3 to 0.9, ratio), and min_event_count (2 to 10 events). The goal is to optimize 2 responses: alert_reduction_pct (%, maximize) and missed_incident_rate (%, minimize). Two conditions are held constant across all runs: siem_platform = elastic_security and log_sources = 12.
A Box-Behnken design was chosen because it efficiently fits quadratic models with 3 continuous factors while avoiding extreme corner combinations. It requires only 15 runs (12 edge midpoints plus 3 center points) instead of the 27 needed for a full factorial at three levels.
Quadratic response surface models were fitted to capture potential curvature and factor interactions. The response surface plots below visualize how pairs of factors jointly affect each response.
Key Findings
For alert_reduction_pct, the most influential factors were similarity_threshold (47.0%), correlation_window_sec (36.1%), and min_event_count (16.9%). The best observed value was 77.0 (at correlation_window_sec = 315, similarity_threshold = 0.3, min_event_count = 2).
For missed_incident_rate, the most influential factors were similarity_threshold (38.2%), correlation_window_sec (32.9%), and min_event_count (29.0%). The best observed value was 0.11 (at correlation_window_sec = 315, similarity_threshold = 0.9, min_event_count = 10).
Recommended Next Steps
- Run confirmation experiments at the predicted optimal settings to validate the model.
- Consider whether any fixed factors should be varied in a future study.
Experimental Setup
Factors
| Factor | Low | High | Unit |
| correlation_window_sec | 30 | 600 | sec |
| similarity_threshold | 0.3 | 0.9 | ratio |
| min_event_count | 2 | 10 | events |
Fixed: siem_platform = elastic_security, log_sources = 12
Responses
| Response | Direction | Unit |
| alert_reduction_pct | ↑ maximize | % |
| missed_incident_rate | ↓ minimize | % |
Configuration
{
  "metadata": {
    "name": "SIEM Alert Correlation",
    "description": "Box-Behnken design to tune alert correlation window, similarity threshold, and event count for alert reduction"
  },
  "factors": [
    {
      "name": "correlation_window_sec",
      "levels": [
        "30",
        "600"
      ],
      "type": "continuous",
      "unit": "sec"
    },
    {
      "name": "similarity_threshold",
      "levels": [
        "0.3",
        "0.9"
      ],
      "type": "continuous",
      "unit": "ratio"
    },
    {
      "name": "min_event_count",
      "levels": [
        "2",
        "10"
      ],
      "type": "continuous",
      "unit": "events"
    }
  ],
  "fixed_factors": {
    "siem_platform": "elastic_security",
    "log_sources": "12"
  },
  "responses": [
    {
      "name": "alert_reduction_pct",
      "optimize": "maximize",
      "unit": "%"
    },
    {
      "name": "missed_incident_rate",
      "optimize": "minimize",
      "unit": "%"
    }
  ],
  "settings": {
    "operation": "box_behnken",
    "test_script": "use_cases/59_siem_alert_correlation/sim.sh"
  }
}
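As a rough illustration of how the two-level ranges in this configuration turn into the three levels used by the design, the sketch below parses a trimmed copy of the `factors` array and derives the low, center, and high settings. The helper name `three_levels` is hypothetical and is not part of the DOE tool. Note that the levels are stored as strings and must be converted to numbers first.

```python
import json

# Trimmed copy of the "factors" array from the configuration above.
CONFIG_TEXT = """
{
  "factors": [
    {"name": "correlation_window_sec", "levels": ["30", "600"]},
    {"name": "similarity_threshold", "levels": ["0.3", "0.9"]},
    {"name": "min_event_count", "levels": ["2", "10"]}
  ]
}
"""


def three_levels(factor):
    """Return (low, center, high) for one factor entry (hypothetical helper)."""
    low, high = (float(v) for v in factor["levels"])
    return low, (low + high) / 2, high


config = json.loads(CONFIG_TEXT)
levels = {f["name"]: three_levels(f) for f in config["factors"]}
for name, (low, center, high) in levels.items():
    print(f"{name}: low={low:g}, center={center:g}, high={high:g}")
```

The derived centers (315 sec, 0.6, and 6 events) are the middle levels that appear in the experimental matrix.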
Experimental Matrix
The Box-Behnken design produces 15 runs. Each row is one experiment with specific factor settings.
| Run | correlation_window_sec | similarity_threshold | min_event_count |
| 1 | 315 | 0.3 | 2 |
| 2 | 315 | 0.6 | 6 |
| 3 | 600 | 0.6 | 10 |
| 4 | 600 | 0.6 | 2 |
| 5 | 315 | 0.6 | 6 |
| 6 | 315 | 0.6 | 6 |
| 7 | 30 | 0.6 | 10 |
| 8 | 600 | 0.3 | 6 |
| 9 | 315 | 0.3 | 10 |
| 10 | 600 | 0.9 | 6 |
| 11 | 30 | 0.6 | 2 |
| 12 | 315 | 0.9 | 10 |
| 13 | 30 | 0.3 | 6 |
| 14 | 30 | 0.9 | 6 |
| 15 | 315 | 0.9 | 2 |
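The structure of this matrix can be reproduced with a short sketch. For three factors, a Box-Behnken design sets each pair of factors to its four low/high combinations while the third factor stays at its center, then adds center points. The function below is an illustrative construction for the three-factor case only, not the DOE tool's own generator, and it does not reproduce the randomized run order shown above.

```python
from itertools import combinations, product


def box_behnken(n_factors, n_center):
    """Coded (-1, 0, +1) runs: edge midpoints for every factor pair, plus centers.

    This pairwise construction matches the standard design for three factors.
    """
    runs = []
    for i, j in combinations(range(n_factors), 2):
        for a, b in product((-1, 1), repeat=2):
            point = [0] * n_factors
            point[i], point[j] = a, b
            runs.append(tuple(point))
    runs.extend([(0,) * n_factors] * n_center)
    return runs


# Low/high ranges from the Factors table.
RANGES = [(30, 600), (0.3, 0.9), (2, 10)]


def decode(point):
    """Map coded levels to natural units: center + coded * half-range."""
    return tuple(
        round((lo + hi) / 2 + c * (hi - lo) / 2, 6)
        for c, (lo, hi) in zip(point, RANGES)
    )


design = [decode(p) for p in box_behnken(3, n_center=3)]
print(len(design), "Box-Behnken runs vs", 3 ** 3, "for a three-level full factorial")
```

Sorting `design` and the matrix above gives the same 15 settings: 12 edge midpoints and 3 replicated center points (runs 2, 5, and 6).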
Step-by-Step Workflow
1. Preview the design
$ doe info --config use_cases/59_siem_alert_correlation/config.json
2. Generate the runner script
$ doe generate --config use_cases/59_siem_alert_correlation/config.json \
    --output use_cases/59_siem_alert_correlation/results/run.sh --seed 42
3. Execute the experiments
$ bash use_cases/59_siem_alert_correlation/results/run.sh
4. Analyze results
$ doe analyze --config use_cases/59_siem_alert_correlation/config.json
5. Get optimization recommendations
$ doe optimize --config use_cases/59_siem_alert_correlation/config.json
6. Multi-objective optimization
With 2 competing responses, use --multi to find the best compromise via Derringer–Suich desirability.
$ doe optimize --config use_cases/59_siem_alert_correlation/config.json --multi
7. Generate the HTML report
$ doe report --config use_cases/59_siem_alert_correlation/config.json \
    --output use_cases/59_siem_alert_correlation/results/report.html
Features Exercised
| Feature | Value |
| Design type | box_behnken |
| Factor types | continuous (all 3) |
| Arg style | double-dash |
| Responses | 2 (alert_reduction_pct ↑, missed_incident_rate ↓) |
| Total runs | 15 |
Analysis Results
Generated from actual experiment runs using the DOE Helper Tool.
Response: alert_reduction_pct
Top factors: similarity_threshold (47.0%), correlation_window_sec (36.1%), min_event_count (16.9%).
ANOVA
| Source | DF | SS | MS | F | p-value |
| correlation_window_sec | 2 | 843.8733 | 421.9366 | 7.705 | 0.0136 |
| similarity_threshold | 2 | 1438.1000 | 719.0500 | 13.130 | 0.0030 |
| min_event_count | 2 | 198.8133 | 99.4066 | 1.815 | 0.2239 |
| Lack of Fit | 6 | 1219.9561 | 203.3260 | 3.713 | 0.2273 |
| Pure Error | 2 | 109.5267 | 54.7633 | | |
| Error | 8 | 1329.4828 | 54.7633 | | |
| Total | 14 | 3810.2693 | 272.1621 | | |
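To make the F column easier to audit, here is a minimal sketch that recomputes the ratios from the mean squares. The reported F values are consistent with dividing each mean square by the pure-error mean square (54.7633, from the three replicated center points) rather than by the 8-degree-of-freedom residual mean square (1329.4828 / 8, about 166.19). That is an inference from the numbers, not a statement about how the tool is implemented.

```python
# Mean squares copied from the ANOVA output for alert_reduction_pct.
MEAN_SQUARES = {
    "correlation_window_sec": 421.9366,
    "similarity_threshold": 719.0500,
    "min_event_count": 99.4066,
    "Lack of Fit": 203.3260,
}
MS_PURE_ERROR = 54.7633  # pure-error SS 109.5267 divided by its 2 degrees of freedom


def f_ratios(mean_squares, ms_error):
    """Divide each mean square by the error mean square."""
    return {source: ms / ms_error for source, ms in mean_squares.items()}


ratios = f_ratios(MEAN_SQUARES, MS_PURE_ERROR)
for source, f in ratios.items():
    print(f"{source:<24} F = {f:.3f}")

# The residual sum of squares splits into lack of fit plus pure error.
ss_error = 1219.9561 + 109.5267
print(f"Error SS = {ss_error:.4f}")
```

Because pure error rests on only 2 degrees of freedom, these F tests are sensitive to how well the three center runs represent run-to-run noise.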
Figures for alert_reduction_pct: Pareto chart, main effects plot, normal probability plot of effects, half-normal plot of effects, and model diagnostics.
Response: missed_incident_rate
Top factors: similarity_threshold (38.2%), correlation_window_sec (32.9%), min_event_count (29.0%).
ANOVA
| Source | DF | SS | MS | F | p-value |
| correlation_window_sec | 2 | 23.3573 | 11.6787 | 34.796 | 0.0001 |
| similarity_threshold | 2 | 32.7878 | 16.3939 | 48.845 | 0.0000 |
| min_event_count | 2 | 18.1332 | 9.0666 | 27.013 | 0.0003 |
| Lack of Fit | 6 | 113.6247 | 18.9375 | 56.423 | 0.0175 |
| Pure Error | 2 | 0.6713 | 0.3356 | | |
| Error | 8 | 114.2960 | 0.3356 | | |
| Total | 14 | 188.5743 | 13.4696 | | |
Figures for missed_incident_rate: Pareto chart, main effects plot, normal probability plot of effects, half-normal plot of effects, and model diagnostics.
Response Surface Plots
3D surfaces fitted with quadratic RSM. Red dots are observed data points.
- alert_reduction_pct: correlation_window_sec vs min_event_count
- alert_reduction_pct: correlation_window_sec vs similarity_threshold
- alert_reduction_pct: similarity_threshold vs min_event_count
- missed_incident_rate: correlation_window_sec vs min_event_count
- missed_incident_rate: correlation_window_sec vs similarity_threshold
- missed_incident_rate: similarity_threshold vs min_event_count
Multi-Objective Optimization
When responses compete, Derringer–Suich desirability finds the best compromise.
Each response is scaled to a 0–1 desirability, then combined via a weighted geometric mean.
Overall Desirability
D = 0.6594
Per-Response Desirability
| Response | Weight | Desirability | Predicted | Dir |
| alert_reduction_pct | 1.5 | 0.9545 | 77.00 % | ↑ |
| missed_incident_rate | 1.0 | 0.3787 | 8.51 % | ↓ |
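The combination step can be sketched as follows. Assuming the overall score is the weighted geometric mean of the per-response desirabilities, as described above, plugging in the reported desirabilities and weights should land close to the reported D. The function name is illustrative and is not taken from the DOE tool.

```python
import math


def overall_desirability(desirabilities, weights):
    """Weighted geometric mean: D = (prod d_i ** w_i) ** (1 / sum w_i)."""
    if any(d <= 0 for d in desirabilities):
        return 0.0  # one fully unacceptable response zeroes the overall score
    log_sum = sum(w * math.log(d) for d, w in zip(desirabilities, weights))
    return math.exp(log_sum / sum(weights))


# Per-response desirabilities and weights reported for the recommended run.
d = [0.9545, 0.3787]  # alert_reduction_pct, missed_incident_rate
w = [1.5, 1.0]
D = overall_desirability(d, w)
print(f"D = {D:.3f}")
```

The result agrees with the reported D = 0.6594 to about three decimal places; the small remaining difference is what one would expect from the inputs being rounded to four decimals. The geometric mean also explains why the low desirability of missed_incident_rate (0.3787) pulls the overall score well below the 0.9545 achieved for alert_reduction_pct.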
Recommended Settings
| Factor | Value |
| correlation_window_sec | 600 sec |
| similarity_threshold | 0.3 ratio |
| min_event_count | 6 events |
Source: from observed run #10
Trade-off Summary
Sacrifice = how much worse than single-objective best.
| Response | Predicted | Best Observed | Sacrifice |
| alert_reduction_pct | 77.00 | 77.00 | +0.00 |
| missed_incident_rate | 8.51 | 0.11 | +8.40 |
Top 3 Runs by Desirability
| Run | D | Factor Settings |
| #10 | 0.6594 | correlation_window_sec=600, similarity_threshold=0.3, min_event_count=6 |
| #4 | 0.6528 | correlation_window_sec=315, similarity_threshold=0.9, min_event_count=10 |
| #6 | 0.6413 | correlation_window_sec=30, similarity_threshold=0.3, min_event_count=6 |
Model Quality
| Response | R² | Type |
| alert_reduction_pct | 0.2435 | linear |
| missed_incident_rate | 0.3289 | linear |
Full Multi-Objective Output
============================================================
MULTI-OBJECTIVE OPTIMIZATION
Method: Derringer-Suich Desirability Function
============================================================
Overall desirability: D = 0.6594
Response Weight Desirability Predicted Direction
---------------------------------------------------------------------
alert_reduction_pct 1.5 0.9545 77.00 % ↑
missed_incident_rate 1.0 0.3787 8.51 % ↓
Recommended settings:
correlation_window_sec = 600 sec
similarity_threshold = 0.3 ratio
min_event_count = 6 events
(from observed run #10)
Trade-off summary:
alert_reduction_pct: 77.00 (best observed: 77.00, sacrifice: +0.00)
missed_incident_rate: 8.51 (best observed: 0.11, sacrifice: +8.40)
Model quality:
alert_reduction_pct: R² = 0.2435 (linear)
missed_incident_rate: R² = 0.3289 (linear)
Top 3 observed runs by overall desirability:
1. Run #10 (D=0.6594): correlation_window_sec=600, similarity_threshold=0.3, min_event_count=6
2. Run #4 (D=0.6528): correlation_window_sec=315, similarity_threshold=0.9, min_event_count=10
3. Run #6 (D=0.6413): correlation_window_sec=30, similarity_threshold=0.3, min_event_count=6
Full Analysis Output
=== Main Effects: alert_reduction_pct ===
Factor Effect Std Error % Contribution
--------------------------------------------------------------
similarity_threshold 23.6964 4.2596 47.0%
correlation_window_sec 18.1607 4.2596 36.1%
min_event_count 8.5143 4.2596 16.9%
=== ANOVA Table: alert_reduction_pct ===
Source DF SS MS F p-value
-----------------------------------------------------------------------------
correlation_window_sec 2 843.8733 421.9366 7.705 0.0136
similarity_threshold 2 1438.1000 719.0500 13.130 0.0030
min_event_count 2 198.8133 99.4066 1.815 0.2239
Lack of Fit 6 1219.9561 203.3260 3.713 0.2273
Pure Error 2 109.5267 54.7633
Error 8 1329.4828 54.7633
Total 14 3810.2693 272.1621
=== Summary Statistics: alert_reduction_pct ===
correlation_window_sec:
Level N Mean Std Min Max
------------------------------------------------------------
30 4 50.7000 20.8610 27.5000 77.0000
315 7 56.0857 13.7421 41.0000 75.8000
600 4 37.9250 13.2638 22.4000 53.0000
similarity_threshold:
Level N Mean Std Min Max
------------------------------------------------------------
0.3 4 51.0750 18.9167 32.7000 77.0000
0.6 7 40.7286 11.9991 22.4000 54.7000
0.9 4 64.4250 12.0384 53.0000 75.8000
min_event_count:
Level N Mean Std Min Max
------------------------------------------------------------
10 4 43.8000 23.6850 22.4000 73.8000
2 4 51.4250 16.2514 43.1000 75.8000
6 7 52.3143 13.7610 32.7000 77.0000
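The main-effects figures for alert_reduction_pct can be cross-checked against these level means. The reported effects are consistent with taking the spread between the highest and lowest level mean for each factor, and the percent contribution with each effect's share of the summed effects. This is inferred from the numbers in this report rather than from the tool's source, so treat the sketch below as an assumption about the calculation.

```python
# Level means copied from the summary statistics for alert_reduction_pct.
LEVEL_MEANS = {
    "correlation_window_sec": [50.7000, 56.0857, 37.9250],
    "similarity_threshold": [51.0750, 40.7286, 64.4250],
    "min_event_count": [43.8000, 51.4250, 52.3143],
}


def main_effects(level_means):
    """Effect of a factor = spread between its highest and lowest level mean."""
    return {name: max(means) - min(means) for name, means in level_means.items()}


def contributions(effects):
    """Each effect as a percentage of the sum of all effects."""
    total = sum(effects.values())
    return {name: 100 * effect / total for name, effect in effects.items()}


effects = main_effects(LEVEL_MEANS)
shares = contributions(effects)
for name in sorted(effects, key=effects.get, reverse=True):
    print(f"{name:<24} effect={effects[name]:.4f}  share={shares[name]:.1f}%")
```

Note that for similarity_threshold the lowest mean sits at the middle level (0.6), so the "effect" here reflects curvature as much as a low-to-high trend.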
=== Main Effects: missed_incident_rate ===
Factor Effect Std Error % Contribution
--------------------------------------------------------------
similarity_threshold 3.5143 0.9476 38.2%
correlation_window_sec 3.0286 0.9476 32.9%
min_event_count 2.6668 0.9476 29.0%
=== ANOVA Table: missed_incident_rate ===
Source DF SS MS F p-value
-----------------------------------------------------------------------------
correlation_window_sec 2 23.3573 11.6787 34.796 0.0001
similarity_threshold 2 32.7878 16.3939 48.845 0.0000
min_event_count 2 18.1332 9.0666 27.013 0.0003
Lack of Fit 6 113.6247 18.9375 56.423 0.0175
Pure Error 2 0.6713 0.3356
Error 8 114.2960 0.3356
Total 14 188.5743 13.4696
=== Summary Statistics: missed_incident_rate ===
correlation_window_sec:
Level N Mean Std Min Max
------------------------------------------------------------
30 4 4.7250 3.7093 0.4200 8.5100
315 7 5.7686 3.9438 2.5900 13.3700
600 4 2.7400 3.1947 0.1100 7.0300
similarity_threshold:
Level N Mean Std Min Max
------------------------------------------------------------
0.3 4 4.1850 3.4471 0.5100 8.5100
0.6 7 3.5857 2.7824 0.1100 7.0300
0.9 4 7.1000 4.9561 2.9500 13.3700
min_event_count:
Level N Mean Std Min Max
------------------------------------------------------------
10 4 4.7575 6.1840 0.1100 13.3700
2 4 6.3525 2.6398 2.5900 8.7700
6 7 3.6857 2.4086 0.5100 8.5100
Optimization Recommendations
=== Optimization: alert_reduction_pct ===
Direction: maximize
Best observed run: #10
correlation_window_sec = 315
similarity_threshold = 0.3
min_event_count = 2
Value: 77.0
RSM Model (linear, R² = 0.1508, Adj R² = -0.0808):
Coefficients:
intercept +49.8067
correlation_window_sec +1.9375
similarity_threshold -3.4375
min_event_count -7.5000
RSM Model (quadratic, R² = 0.5383, Adj R² = -0.2929):
Coefficients:
intercept +57.3000
correlation_window_sec +1.9375
similarity_threshold -3.4375
min_event_count -7.5000
correlation_window_sec*similarity_threshold +8.4500
correlation_window_sec*min_event_count +6.6250
similarity_threshold*min_event_count -1.5750
correlation_window_sec^2 -8.0250
similarity_threshold^2 +6.5250
min_event_count^2 -12.5500
Curvature analysis:
min_event_count coef=-12.5500 concave (has a maximum)
correlation_window_sec coef=-8.0250 concave (has a maximum)
similarity_threshold coef=+6.5250 convex (has a minimum)
Notable interactions:
correlation_window_sec*similarity_threshold coef=+8.4500 (synergistic)
correlation_window_sec*min_event_count coef=+6.6250 (synergistic)
similarity_threshold*min_event_count coef=-1.5750 (antagonistic)
Predicted optimum (from linear model, at observed points):
correlation_window_sec = 315
similarity_threshold = 0.3
min_event_count = 2
Predicted value: 60.7442
Surface optimum (via L-BFGS-B, linear model):
correlation_window_sec = 600
similarity_threshold = 0.3
min_event_count = 2
Predicted value: 62.6817
Model quality: Weak fit — consider adding center points or using a different design.
Factor importance:
1. min_event_count (effect: 19.9, contribution: 48.8%)
2. similarity_threshold (effect: 11.4, contribution: 27.9%)
3. correlation_window_sec (effect: 9.5, contribution: 23.3%)
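The two predicted values above can be reproduced from the linear coefficients, assuming the coefficients are expressed in coded units (-1 at the low level, 0 at the center, +1 at the high level). The report does not state this explicitly, but the numbers support it. The helper names below are hypothetical.

```python
# Linear model for alert_reduction_pct, copied from the output above.
INTERCEPT = 49.8067
COEFFS = {
    "correlation_window_sec": 1.9375,
    "similarity_threshold": -3.4375,
    "min_event_count": -7.5000,
}
# Low/high ranges from the Factors table.
RANGES = {
    "correlation_window_sec": (30, 600),
    "similarity_threshold": (0.3, 0.9),
    "min_event_count": (2, 10),
}


def to_coded(name, value):
    """Scale a natural-unit setting to the coded range [-1, +1] (assumed coding)."""
    low, high = RANGES[name]
    return (value - (low + high) / 2) / ((high - low) / 2)


def predict_linear(settings):
    """Evaluate intercept + sum(coefficient * coded level)."""
    return INTERCEPT + sum(COEFFS[n] * to_coded(n, v) for n, v in settings.items())


at_observed = predict_linear(
    {"correlation_window_sec": 315, "similarity_threshold": 0.3, "min_event_count": 2}
)
at_surface = predict_linear(
    {"correlation_window_sec": 600, "similarity_threshold": 0.3, "min_event_count": 2}
)
print(f"{at_observed:.4f} {at_surface:.4f}")
```

Under that coding assumption the two results match the reported 60.7442 and 62.6817. Both fall well short of the best observed value of 77.0, which is in line with the weak fit noted above.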
=== Optimization: missed_incident_rate ===
Direction: minimize
Best observed run: #13
correlation_window_sec = 315
similarity_threshold = 0.9
min_event_count = 10
Value: 0.11
RSM Model (linear, R² = 0.1462, Adj R² = -0.0866):
Coefficients:
intercept +4.6827
correlation_window_sec +0.9575
similarity_threshold -0.0475
min_event_count -1.5900
RSM Model (quadratic, R² = 0.5258, Adj R² = -0.3277):
Coefficients:
intercept +5.2967
correlation_window_sec +0.9575
similarity_threshold -0.0475
min_event_count -1.5900
correlation_window_sec*similarity_threshold +1.1175
correlation_window_sec*min_event_count +2.1725
similarity_threshold*min_event_count +0.5525
correlation_window_sec^2 +0.3879
similarity_threshold^2 +1.5079
min_event_count^2 -3.0471
Curvature analysis:
min_event_count coef=-3.0471 concave (has a maximum)
similarity_threshold coef=+1.5079 convex (has a minimum)
correlation_window_sec coef=+0.3879 convex (has a minimum)
Notable interactions:
correlation_window_sec*min_event_count coef=+2.1725 (synergistic)
correlation_window_sec*similarity_threshold coef=+1.1175 (synergistic)
similarity_threshold*min_event_count coef=+0.5525 (synergistic)
Predicted optimum (from linear model, at observed points):
correlation_window_sec = 600
similarity_threshold = 0.6
min_event_count = 2
Predicted value: 7.2302
Surface optimum (via L-BFGS-B, linear model):
correlation_window_sec = 30
similarity_threshold = 0.9
min_event_count = 10
Predicted value: 2.0877
Model quality: Weak fit — consider adding center points or using a different design.
Factor importance:
1. min_event_count (effect: 4.8, contribution: 56.6%)
2. correlation_window_sec (effect: 1.9, contribution: 22.7%)
3. similarity_threshold (effect: 1.7, contribution: 20.7%)
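The negative adjusted R² values reported for both responses follow directly from the small run count. A minimal sketch using the standard adjusted R² formula, with the number of model terms (3 for linear, 9 for quadratic, intercept excluded) taken from the coefficient lists above:

```python
def adjusted_r2(r2, n_runs, n_terms):
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1), with p excluding the intercept."""
    return 1 - (1 - r2) * (n_runs - 1) / (n_runs - n_terms - 1)


N_RUNS = 15
# (R², number of model terms) as reported for each fitted model.
MODELS = {
    "alert_reduction_pct linear": (0.1508, 3),
    "alert_reduction_pct quadratic": (0.5383, 9),
    "missed_incident_rate linear": (0.1462, 3),
    "missed_incident_rate quadratic": (0.5258, 9),
}
adjusted = {name: adjusted_r2(r2, N_RUNS, p) for name, (r2, p) in MODELS.items()}
for name, value in adjusted.items():
    print(f"{name:<32} Adj R² = {value:.4f}")
```

The results agree with the reported values to within rounding of the R² inputs. With 15 runs and 9 quadratic terms only 5 residual degrees of freedom remain, so the penalty factor is 14 / 5 = 2.8 and an R² of roughly 0.53 is not enough to keep the adjusted value above zero. A negative adjusted R² indicates the model explains less than would be expected from adding that many terms by chance.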