← All Use Cases
📊
Plackett-Burman Design

Query Engine Join Strategy

Plackett-Burman screening of 6 query engine parameters for join performance and memory pressure

Summary

This experiment investigates query engine join strategy. Plackett-Burman screening of 6 query engine parameters for join performance and memory pressure.

The design varies 6 factors: join algorithm, ranging from hash to sort_merge, hash table mb (MB), ranging from 256 to 4096, sort buffer mb (MB), ranging from 64 to 512, broadcast threshold mb (MB), ranging from 10 to 256, adaptive execution, ranging from off to on, and partitions (count), ranging from 50 to 400. The goal is to optimize 2 responses: query time s (s) (minimize) and peak memory gb (GB) (minimize). Fixed conditions held constant across all runs include engine = spark_sql, data size gb = 100.

A Plackett-Burman screening design was used to efficiently test 6 factors in only 8 runs. This design assumes interactions are negligible and focuses on identifying the most influential main effects.

Key Findings

For query time s, the most influential factors were hash table mb (46.4%), sort buffer mb (16.4%), adaptive execution (16.2%). The best observed value was 34.5 (at join algorithm = hash, hash table mb = 256, sort buffer mb = 512).

For peak memory gb, the most influential factors were join algorithm (29.9%), hash table mb (28.1%), broadcast threshold mb (16.3%). The best observed value was 1.8 (at join algorithm = hash, hash table mb = 256, sort buffer mb = 512).

Recommended Next Steps

Experimental Setup

Factors

FactorLowHighUnit
join_algorithmhashsort_merge
hash_table_mb2564096MB
sort_buffer_mb64512MB
broadcast_threshold_mb10256MB
adaptive_executionoffon
partitions50400count

Fixed: engine = spark_sql, data_size_gb = 100

Responses

ResponseDirectionUnit
query_time_s↓ minimizes
peak_memory_gb↓ minimizeGB

Configuration

use_cases/42_query_engine_join_strategy/config.json
{ "metadata": { "name": "Query Engine Join Strategy", "description": "Plackett-Burman screening of 6 query engine parameters for join performance and memory pressure" }, "factors": [ { "name": "join_algorithm", "levels": [ "hash", "sort_merge" ], "type": "categorical", "unit": "" }, { "name": "hash_table_mb", "levels": [ "256", "4096" ], "type": "continuous", "unit": "MB" }, { "name": "sort_buffer_mb", "levels": [ "64", "512" ], "type": "continuous", "unit": "MB" }, { "name": "broadcast_threshold_mb", "levels": [ "10", "256" ], "type": "continuous", "unit": "MB" }, { "name": "adaptive_execution", "levels": [ "off", "on" ], "type": "categorical", "unit": "" }, { "name": "partitions", "levels": [ "50", "400" ], "type": "continuous", "unit": "count" } ], "fixed_factors": { "engine": "spark_sql", "data_size_gb": "100" }, "responses": [ { "name": "query_time_s", "optimize": "minimize", "unit": "s" }, { "name": "peak_memory_gb", "optimize": "minimize", "unit": "GB" } ], "settings": { "operation": "plackett_burman", "test_script": "use_cases/42_query_engine_join_strategy/sim.sh" } }

Experimental Matrix

The Plackett-Burman Design produces 8 runs. Each row is one experiment with specific factor settings.

Runjoin_algorithmhash_table_mbsort_buffer_mbbroadcast_threshold_mbadaptive_executionpartitions
1sort_merge409651210off50
2hash256512256off50
3hash409664256off400
4sort_merge4096512256on400
5hash40966410on50
6sort_merge25664256on50
7hash25651210on400
8sort_merge2566410off400

Step-by-Step Workflow

1

Preview the design

Terminal
$ doe info --config use_cases/42_query_engine_join_strategy/config.json
2

Generate the runner script

Terminal
$ doe generate --config use_cases/42_query_engine_join_strategy/config.json \ --output use_cases/42_query_engine_join_strategy/results/run.sh --seed 42
3

Execute the experiments

Terminal
$ bash use_cases/42_query_engine_join_strategy/results/run.sh
4

Analyze results

Terminal
$ doe analyze --config use_cases/42_query_engine_join_strategy/config.json
5

Get optimization recommendations

Terminal
$ doe optimize --config use_cases/42_query_engine_join_strategy/config.json
6

Multi-objective optimization

With 2 competing responses, use --multi to find the best compromise via Derringer–Suich desirability.

Terminal
$ doe optimize --config use_cases/42_query_engine_join_strategy/config.json --multi
7

Generate the HTML report

Terminal
$ doe report --config use_cases/42_query_engine_join_strategy/config.json \ --output use_cases/42_query_engine_join_strategy/results/report.html

Features Exercised

FeatureValue
Design typeplackett_burman
Factor typescontinuous (4), categorical (2)
Arg styledouble-dash
Responses2 (query_time_s ↓, peak_memory_gb ↓)
Total runs8

Analysis Results

Generated from actual experiment runs using the DOE Helper Tool.

Response: query_time_s

Top factors: hash_table_mb (46.4%), sort_buffer_mb (16.4%), adaptive_execution (16.2%).

ANOVA

SourceDFSSMSFp-value
SourceDFSSMSFp-value
join_algorithm172.601272.60120.5650.4767
hash_table_mb1810.0312810.03126.3050.0403
sort_buffer_mb1101.5312101.53120.7900.4035
broadcast_threshold_mb13.25123.25120.0250.8781
adaptive_execution198.701298.70120.7680.4098
partitions16.30126.30120.0490.8311
join_algorithm*hash_table_mb1101.5312101.53120.7900.4035
join_algorithm*sort_buffer_mb1810.0312810.03126.3050.0403
join_algorithm*broadcast_threshold_mb198.701298.70120.7680.4098
join_algorithm*adaptive_execution13.25123.25120.0250.8781
join_algorithm*partitions1162.9012162.90121.2680.2973
hash_table_mb*sort_buffer_mb172.601272.60120.5650.4767
hash_table_mb*broadcast_threshold_mb16.30136.30130.0490.8311
hash_table_mb*adaptive_execution1162.9013162.90131.2680.2973
hash_table_mb*partitions13.25133.25130.0250.8781
sort_buffer_mb*broadcast_threshold_mb1162.9012162.90121.2680.2973
sort_buffer_mb*adaptive_execution16.30136.30130.0490.8311
sort_buffer_mb*partitions198.701298.70120.7680.4098
broadcast_threshold_mb*adaptive_execution172.601272.60120.5650.4767
broadcast_threshold_mb*partitions1810.0313810.03136.3050.0403
adaptive_execution*partitions1101.5313101.53130.7900.4035
Error(LenthPSE)7899.3381128.4769
Total71255.3187179.3312

Pareto Chart

Pareto chart for query_time_s

Main Effects Plot

Main effects plot for query_time_s

Normal Probability Plot of Effects

Normal probability plot for query_time_s

Half-Normal Plot of Effects

Half-normal plot for query_time_s

Model Diagnostics

Model diagnostics for query_time_s

Response: peak_memory_gb

Top factors: join_algorithm (29.9%), hash_table_mb (28.1%), broadcast_threshold_mb (16.3%).

ANOVA

SourceDFSSMSFp-value
SourceDFSSMSFp-value
join_algorithm1170.2012170.20129.2620.0188
hash_table_mb1150.5113150.51138.1900.0243
sort_buffer_mb16.30136.30130.3430.5765
broadcast_threshold_mb150.501350.50132.7480.1413
adaptive_execution112.251212.25120.6670.4411
partitions127.751227.75121.5100.2588
join_algorithm*hash_table_mb16.30126.30120.3430.5765
join_algorithm*sort_buffer_mb1150.5112150.51128.1900.0243
join_algorithm*broadcast_threshold_mb112.251212.25120.6670.4411
join_algorithm*adaptive_execution150.501250.50122.7480.1413
join_algorithm*partitions10.15130.15130.0080.9303
hash_table_mb*sort_buffer_mb1170.2012170.20129.2620.0188
hash_table_mb*broadcast_threshold_mb127.751227.75121.5100.2588
hash_table_mb*adaptive_execution10.15120.15120.0080.9303
hash_table_mb*partitions150.501250.50122.7480.1413
sort_buffer_mb*broadcast_threshold_mb10.15120.15120.0080.9303
sort_buffer_mb*adaptive_execution127.751227.75121.5100.2588
sort_buffer_mb*partitions112.251212.25120.6670.4411
broadcast_threshold_mb*adaptive_execution1170.2012170.20129.2620.0188
broadcast_threshold_mb*partitions1150.5112150.51128.1900.0243
adaptive_execution*partitions16.30136.30130.3430.5765
Error(LenthPSE)7128.638118.3769
Total7417.668759.6670

Pareto Chart

Pareto chart for peak_memory_gb

Main Effects Plot

Main effects plot for peak_memory_gb

Normal Probability Plot of Effects

Normal probability plot for peak_memory_gb

Half-Normal Plot of Effects

Half-normal plot for peak_memory_gb

Model Diagnostics

Model diagnostics for peak_memory_gb

Response Surface Plots

3D surfaces fitted with quadratic RSM. Red dots are observed data points.

peak memory gb broadcast threshold mb vs partitions

RSM surface: peak memory gb broadcast threshold mb vs partitions

peak memory gb hash table mb vs broadcast threshold mb

RSM surface: peak memory gb hash table mb vs broadcast threshold mb

peak memory gb hash table mb vs partitions

RSM surface: peak memory gb hash table mb vs partitions

peak memory gb hash table mb vs sort buffer mb

RSM surface: peak memory gb hash table mb vs sort buffer mb

peak memory gb sort buffer mb vs broadcast threshold mb

RSM surface: peak memory gb sort buffer mb vs broadcast threshold mb

peak memory gb sort buffer mb vs partitions

RSM surface: peak memory gb sort buffer mb vs partitions

query time s broadcast threshold mb vs partitions

RSM surface: query time s broadcast threshold mb vs partitions

query time s hash table mb vs broadcast threshold mb

RSM surface: query time s hash table mb vs broadcast threshold mb

query time s hash table mb vs partitions

RSM surface: query time s hash table mb vs partitions

query time s hash table mb vs sort buffer mb

RSM surface: query time s hash table mb vs sort buffer mb

query time s sort buffer mb vs broadcast threshold mb

RSM surface: query time s sort buffer mb vs broadcast threshold mb

query time s sort buffer mb vs partitions

RSM surface: query time s sort buffer mb vs partitions

Multi-Objective Optimization

When responses compete, Derringer–Suich desirability finds the best compromise. Each response is scaled to a 0–1 desirability, then combined via a weighted geometric mean.

Overall Desirability
D = 0.8408

Per-Response Desirability

ResponseWeightDesirabilityPredictedDir
query_time_s 1.5
1.0000
31.15 1.0000 31.15 s
peak_memory_gb 1.0
0.6482
9.92 0.6482 9.92 GB

Recommended Settings

FactorValue
join_algorithmsort_merge
hash_table_mb4065 MB
sort_buffer_mb73.95 MB
broadcast_threshold_mb50.15 MB
adaptive_executionon
partitions60.19 count

Source: from RSM model prediction

Trade-off Summary

Sacrifice = how much worse than single-objective best.

ResponsePredictedBest ObservedSacrifice
peak_memory_gb9.921.80+8.12

Top 3 Runs by Desirability

RunDFactor Settings
#50.7174join_algorithm=sort_merge, hash_table_mb=4096, sort_buffer_mb=512, broadcast_threshold_mb=10, adaptive_execution=off, partitions=50
#40.6940join_algorithm=sort_merge, hash_table_mb=256, sort_buffer_mb=64, broadcast_threshold_mb=10, adaptive_execution=off, partitions=400

Model Quality

ResponseType
peak_memory_gb0.9547linear

Full Multi-Objective Output

doe optimize --multi
============================================================ MULTI-OBJECTIVE OPTIMIZATION Method: Derringer-Suich Desirability Function ============================================================ Overall desirability: D = 0.8408 Response Weight Desirability Predicted Direction --------------------------------------------------------------------- query_time_s 1.5 1.0000 31.15 s ↓ peak_memory_gb 1.0 0.6482 9.92 GB ↓ Recommended settings: join_algorithm = sort_merge hash_table_mb = 4065 MB sort_buffer_mb = 73.95 MB broadcast_threshold_mb = 50.15 MB adaptive_execution = on partitions = 60.19 count (from RSM model prediction) Trade-off summary: query_time_s: 31.15 (best observed: 34.50, sacrifice: -3.35) peak_memory_gb: 9.92 (best observed: 1.80, sacrifice: +8.12) Model quality: query_time_s: R² = 0.9931 (linear) peak_memory_gb: R² = 0.9547 (linear) Top 3 observed runs by overall desirability: 1. Run #7 (D=0.8012): join_algorithm=hash, hash_table_mb=4096, sort_buffer_mb=64, broadcast_threshold_mb=10, adaptive_execution=on, partitions=50 2. Run #5 (D=0.7174): join_algorithm=sort_merge, hash_table_mb=4096, sort_buffer_mb=512, broadcast_threshold_mb=10, adaptive_execution=off, partitions=50 3. Run #4 (D=0.6940): join_algorithm=sort_merge, hash_table_mb=256, sort_buffer_mb=64, broadcast_threshold_mb=10, adaptive_execution=off, partitions=400

Full Analysis Output

doe analyze
=== Main Effects: query_time_s === Factor Effect Std Error % Contribution -------------------------------------------------------------- hash_table_mb -20.1250 4.7346 46.4% sort_buffer_mb 7.1250 4.7346 16.4% adaptive_execution 7.0250 4.7346 16.2% join_algorithm 6.0250 4.7346 13.9% partitions -1.7750 4.7346 4.1% broadcast_threshold_mb -1.2750 4.7346 2.9% === ANOVA Table: query_time_s === Source DF SS MS F p-value ----------------------------------------------------------------------------- join_algorithm 1 72.6012 72.6012 0.565 0.4767 hash_table_mb 1 810.0312 810.0312 6.305 0.0403 sort_buffer_mb 1 101.5312 101.5312 0.790 0.4035 broadcast_threshold_mb 1 3.2512 3.2512 0.025 0.8781 adaptive_execution 1 98.7012 98.7012 0.768 0.4098 partitions 1 6.3012 6.3012 0.049 0.8311 join_algorithm*hash_table_mb 1 101.5312 101.5312 0.790 0.4035 join_algorithm*sort_buffer_mb 1 810.0312 810.0312 6.305 0.0403 join_algorithm*broadcast_threshold_mb 1 98.7012 98.7012 0.768 0.4098 join_algorithm*adaptive_execution 1 3.2512 3.2512 0.025 0.8781 join_algorithm*partitions 1 162.9012 162.9012 1.268 0.2973 hash_table_mb*sort_buffer_mb 1 72.6012 72.6012 0.565 0.4767 hash_table_mb*broadcast_threshold_mb 1 6.3013 6.3013 0.049 0.8311 hash_table_mb*adaptive_execution 1 162.9013 162.9013 1.268 0.2973 hash_table_mb*partitions 1 3.2513 3.2513 0.025 0.8781 sort_buffer_mb*broadcast_threshold_mb 1 162.9012 162.9012 1.268 0.2973 sort_buffer_mb*adaptive_execution 1 6.3013 6.3013 0.049 0.8311 sort_buffer_mb*partitions 1 98.7012 98.7012 0.768 0.4098 broadcast_threshold_mb*adaptive_execution 1 72.6012 72.6012 0.565 0.4767 broadcast_threshold_mb*partitions 1 810.0313 810.0313 6.305 0.0403 adaptive_execution*partitions 1 101.5313 101.5313 0.790 0.4035 Error (Lenth PSE) 7 899.3381 128.4769 Total 7 1255.3187 179.3312 Note: Error estimated using Lenth's pseudo-standard-error (unreplicated design) === Interaction Effects: query_time_s === Factor A Factor B Interaction % Contribution ------------------------------------------------------------------------ join_algorithm sort_buffer_mb 20.1250 17.7% broadcast_threshold_mb partitions 20.1250 17.7% join_algorithm partitions 9.0250 7.9% hash_table_mb adaptive_execution -9.0250 7.9% sort_buffer_mb broadcast_threshold_mb 9.0250 7.9% join_algorithm hash_table_mb -7.1250 6.3% adaptive_execution partitions 7.1250 6.3% join_algorithm broadcast_threshold_mb 7.0250 6.2% sort_buffer_mb partitions 7.0250 6.2% hash_table_mb sort_buffer_mb -6.0250 5.3% broadcast_threshold_mb adaptive_execution 6.0250 5.3% hash_table_mb broadcast_threshold_mb 1.7750 1.6% sort_buffer_mb adaptive_execution -1.7750 1.6% join_algorithm adaptive_execution -1.2750 1.1% hash_table_mb partitions 1.2750 1.1% === Summary Statistics: query_time_s === join_algorithm: Level N Mean Std Min Max ------------------------------------------------------------ hash 4 42.8500 10.8709 35.1000 58.9000 sort_merge 4 48.8750 16.6151 34.5000 69.0000 hash_table_mb: Level N Mean Std Min Max ------------------------------------------------------------ 256 4 55.9250 12.1082 39.8000 69.0000 4096 4 35.8000 1.3491 34.5000 37.6000 sort_buffer_mb: Level N Mean Std Min Max ------------------------------------------------------------ 512 4 42.3000 11.2892 34.5000 58.9000 64 4 49.4250 16.0359 35.1000 69.0000 broadcast_threshold_mb: Level N Mean Std Min Max ------------------------------------------------------------ 10 4 46.5000 12.7046 35.1000 58.9000 256 4 45.2250 15.9984 34.5000 69.0000 adaptive_execution: Level N Mean Std Min Max ------------------------------------------------------------ off 4 42.3500 9.2324 36.0000 56.0000 on 4 49.3750 17.3292 34.5000 69.0000 partitions: Level N Mean Std Min Max ------------------------------------------------------------ 400 4 46.7500 12.4762 34.5000 58.9000 50 4 44.9750 16.1457 35.1000 69.0000 === Main Effects: peak_memory_gb === Factor Effect Std Error % Contribution -------------------------------------------------------------- join_algorithm -9.2250 2.7310 29.9% hash_table_mb 8.6750 2.7310 28.1% broadcast_threshold_mb 5.0250 2.7310 16.3% partitions -3.7250 2.7310 12.1% adaptive_execution -2.4750 2.7310 8.0% sort_buffer_mb -1.7750 2.7310 5.7% === ANOVA Table: peak_memory_gb === Source DF SS MS F p-value ----------------------------------------------------------------------------- join_algorithm 1 170.2012 170.2012 9.262 0.0188 hash_table_mb 1 150.5113 150.5113 8.190 0.0243 sort_buffer_mb 1 6.3013 6.3013 0.343 0.5765 broadcast_threshold_mb 1 50.5013 50.5013 2.748 0.1413 adaptive_execution 1 12.2512 12.2512 0.667 0.4411 partitions 1 27.7512 27.7512 1.510 0.2588 join_algorithm*hash_table_mb 1 6.3012 6.3012 0.343 0.5765 join_algorithm*sort_buffer_mb 1 150.5112 150.5112 8.190 0.0243 join_algorithm*broadcast_threshold_mb 1 12.2512 12.2512 0.667 0.4411 join_algorithm*adaptive_execution 1 50.5012 50.5012 2.748 0.1413 join_algorithm*partitions 1 0.1513 0.1513 0.008 0.9303 hash_table_mb*sort_buffer_mb 1 170.2012 170.2012 9.262 0.0188 hash_table_mb*broadcast_threshold_mb 1 27.7512 27.7512 1.510 0.2588 hash_table_mb*adaptive_execution 1 0.1512 0.1512 0.008 0.9303 hash_table_mb*partitions 1 50.5012 50.5012 2.748 0.1413 sort_buffer_mb*broadcast_threshold_mb 1 0.1512 0.1512 0.008 0.9303 sort_buffer_mb*adaptive_execution 1 27.7512 27.7512 1.510 0.2588 sort_buffer_mb*partitions 1 12.2512 12.2512 0.667 0.4411 broadcast_threshold_mb*adaptive_execution 1 170.2012 170.2012 9.262 0.0188 broadcast_threshold_mb*partitions 1 150.5112 150.5112 8.190 0.0243 adaptive_execution*partitions 1 6.3013 6.3013 0.343 0.5765 Error (Lenth PSE) 7 128.6381 18.3769 Total 7 417.6687 59.6670 Note: Error estimated using Lenth's pseudo-standard-error (unreplicated design) === Interaction Effects: peak_memory_gb === Factor A Factor B Interaction % Contribution ------------------------------------------------------------------------ hash_table_mb sort_buffer_mb 9.2250 14.7% broadcast_threshold_mb adaptive_execution -9.2250 14.7% join_algorithm sort_buffer_mb -8.6750 13.9% broadcast_threshold_mb partitions -8.6750 13.9% join_algorithm adaptive_execution 5.0250 8.0% hash_table_mb partitions -5.0250 8.0% hash_table_mb broadcast_threshold_mb 3.7250 5.9% sort_buffer_mb adaptive_execution -3.7250 5.9% join_algorithm broadcast_threshold_mb -2.4750 4.0% sort_buffer_mb partitions -2.4750 4.0% join_algorithm hash_table_mb 1.7750 2.8% adaptive_execution partitions -1.7750 2.8% join_algorithm partitions 0.2750 0.4% hash_table_mb adaptive_execution -0.2750 0.4% sort_buffer_mb broadcast_threshold_mb 0.2750 0.4% === Summary Statistics: peak_memory_gb === join_algorithm: Level N Mean Std Min Max ------------------------------------------------------------ hash 4 16.7000 6.3209 11.5000 25.9000 sort_merge 4 7.4750 6.5220 1.8000 15.7000 hash_table_mb: Level N Mean Std Min Max ------------------------------------------------------------ 256 4 7.7500 6.5200 1.8000 15.0000 4096 4 16.4250 6.8222 9.7000 25.9000 sort_buffer_mb: Level N Mean Std Min Max ------------------------------------------------------------ 512 4 12.9750 2.8535 9.7000 15.7000 64 4 11.2000 11.3569 1.8000 25.9000 broadcast_threshold_mb: Level N Mean Std Min Max ------------------------------------------------------------ 10 4 9.5750 4.9755 2.7000 14.4000 256 4 14.6000 9.8810 1.8000 25.9000 adaptive_execution: Level N Mean Std Min Max ------------------------------------------------------------ off 4 13.3250 9.7804 2.7000 25.9000 on 4 10.8500 6.2836 1.8000 15.7000 partitions: Level N Mean Std Min Max ------------------------------------------------------------ 400 4 13.9500 9.6338 2.7000 25.9000 50 4 10.2250 6.0961 1.8000 15.0000

Optimization Recommendations

doe optimize
=== Optimization: query_time_s === Direction: minimize Best observed run: #4 join_algorithm = hash hash_table_mb = 256 sort_buffer_mb = 512 broadcast_threshold_mb = 256 adaptive_execution = off partitions = 50 Value: 34.5 RSM Model (linear, R² = 0.9985, Adj R² = 0.9894): Coefficients: intercept +45.8625 join_algorithm -3.4125 hash_table_mb +2.2125 sort_buffer_mb +4.6875 broadcast_threshold_mb -4.3375 adaptive_execution +8.8875 partitions +4.5125 Predicted optimum (from linear model, at observed points): join_algorithm = hash hash_table_mb = 256 sort_buffer_mb = 512 broadcast_threshold_mb = 10 adaptive_execution = on partitions = 400 Predicted value: 69.4875 Surface optimum (via L-BFGS-B, linear model): join_algorithm = sort_merge hash_table_mb = 256 sort_buffer_mb = 64 broadcast_threshold_mb = 256 adaptive_execution = off partitions = 50 Predicted value: 17.8125 Model quality: Excellent fit — surface predictions are reliable. Factor importance: 1. adaptive_execution (effect: 17.8, contribution: 31.7%) 2. sort_buffer_mb (effect: -9.4, contribution: 16.7%) 3. partitions (effect: -9.0, contribution: 16.1%) 4. broadcast_threshold_mb (effect: -8.7, contribution: 15.5%) 5. join_algorithm (effect: -6.8, contribution: 12.2%) 6. hash_table_mb (effect: 4.4, contribution: 7.9%) === Optimization: peak_memory_gb === Direction: minimize Best observed run: #8 join_algorithm = hash hash_table_mb = 256 sort_buffer_mb = 512 broadcast_threshold_mb = 10 adaptive_execution = on partitions = 400 Value: 1.8 RSM Model (linear, R² = 0.9083, Adj R² = 0.3584): Coefficients: intercept +12.0875 join_algorithm +0.5625 hash_table_mb +1.6875 sort_buffer_mb -1.0875 broadcast_threshold_mb +4.7875 adaptive_execution -4.4875 partitions +0.1375 Predicted optimum (from linear model, at observed points): join_algorithm = hash hash_table_mb = 4096 sort_buffer_mb = 64 broadcast_threshold_mb = 256 adaptive_execution = off partitions = 400 Predicted value: 23.7125 Surface optimum (via L-BFGS-B, linear model): join_algorithm = hash hash_table_mb = 256 sort_buffer_mb = 512 broadcast_threshold_mb = 10 adaptive_execution = on partitions = 50 Predicted value: -0.6625 Model quality: Excellent fit — surface predictions are reliable. Factor importance: 1. broadcast_threshold_mb (effect: 9.6, contribution: 37.5%) 2. adaptive_execution (effect: -9.0, contribution: 35.2%) 3. hash_table_mb (effect: 3.4, contribution: 13.2%) 4. sort_buffer_mb (effect: 2.2, contribution: 8.5%) 5. join_algorithm (effect: 1.1, contribution: 4.4%) 6. partitions (effect: -0.3, contribution: 1.1%)
← Previous: Columnar Compression All Use Cases →