In practice an experiment is rarely designed from scratch. A researcher may already have data collected at certain conditions and want to add new observations to improve estimation — without discarding what has already been measured. The key question is: where can new points be placed so that the efficiency of the augmented design stays above an acceptable threshold?
optedr answers this question with two functions used in sequence:
get_augment_region() — computes the candidate region:
the set of design points whose addition keeps the D-efficiency of the
augmented design above a user-specified threshold
delta_val.augment_design() — adds a chosen point to the initial
design and rescales the weights.Both functions support the same optimality criteria as
opt_des() and work for any number of factors.
| Parameter | Role |
|---|---|
init_design |
Current design (data frame with
Point/Weight in 1D, or factor columns +
Weight in multi-D) |
alpha |
Fraction of total weight assigned to the new point after augmentation |
delta_val |
Minimum acceptable D-efficiency of the augmented design |
calc_optimal_design |
If TRUE, also computes the optimal design and uses it
as the reference for efficiency |
new_points |
Data frame of points to add (non-interactive mode); omit for interactive mode |
par_int |
Indices of parameters of interest (Ds-Optimality only) |
n_lhs |
Number of Latin-Hypercube candidates for the region search (multi-D) |
We start with a uniform three-point design for Antoine’s equation and look for points that keep the D-efficiency of the augmented design above 85 %.
init_des <- data.frame(
Point = c(30, 60, 90),
Weight = c(1/3, 1/3, 1/3)
)
region <- get_augment_region(
criterion = "D-Optimality",
init_design = init_des,
alpha = 0.25,
model = y ~ 10^(a - b / (c + x)),
parameters = c("a", "b", "c"),
par_values = c(8.07131, 1730.63, 233.426),
design_space = c(1, 100),
calc_optimal_design = FALSE,
delta_val = 0.85
)region$region is a data frame of candidate intervals.
Each row gives a lower and upper bound on the design space where the new
point can be placed.
new_pt <- mean(region$region[1:2])
augmented <- augment_design(
criterion = "D-Optimality",
init_design = init_des,
alpha = 0.25,
model = y ~ 10^(a - b / (c + x)),
parameters = c("a", "b", "c"),
par_values = c(8.07131, 1730.63, 233.426),
design_space = c(1, 100),
calc_optimal_design = FALSE,
delta_val = 0.85,
new_points = data.frame(Point = new_pt, Weight = 1)
)result_opt <- opt_des(
"D-Optimality",
y ~ 10^(a - b / (c + x)), c("a", "b", "c"),
c(8.07131, 1730.63, 233.426), c(1, 100)
)
#>
#> ℹ Stop condition not reached, max iterations performed
#>
⠙ Calculating optimal design 22 done (27/s) | 802ms
ℹ The lower bound for efficiency is 99.9986187558838%
#>
eff_before <- design_efficiency(init_des, result_opt)
#> ℹ The efficiency of the design is 38.5312233962882%
eff_after <- design_efficiency(augmented, result_opt)
#> ℹ The efficiency of the design is 34.2933573610529%
cat("Efficiency before augmenting:", round(eff_before * 100, 2), "%\n")
#> Efficiency before augmenting: 38.53 %
cat("Efficiency after augmenting: ", round(eff_after * 100, 2), "%\n")
#> Efficiency after augmenting: 34.29 %
cat("Gain: ", round((eff_after - eff_before) * 100, 2),
"percentage points\n")
#> Gain: -4.24 percentage pointscalc_optimal_design = TRUE)When calc_optimal_design = TRUE, the function internally
computes the optimal design and uses it to define the efficiency
threshold. This is the recommended mode when no optimal design has been
computed yet:
In multi-dimensional spaces get_augment_region() samples
candidate points with a Latin Hypercube (controlled by
n_lhs) and returns a data frame of candidates together with
their estimated efficiency gain. A heatmap of the efficiency function is
displayed automatically.
init_2d <- data.frame(
x1 = c(0.8, 10, 5),
x2 = c(10, 0.8, 5),
Weight = c(1/3, 1/3, 1/3)
)
result_2D <- opt_des(
criterion = "D-Optimality",
model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
parameters = c("Vmax", "K1", "K2"),
par_values = c(1, 1, 1),
design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10))
)
#>
#> ℹ Stop condition not reached, max iterations performed
#>
⠙ Calculating optimal design 22 done (19/s) | 1.2s
ℹ The lower bound for efficiency is 99.9989441805938%
#>
region_2d <- get_augment_region(
criterion = "D-Optimality",
init_design = init_2d,
alpha = 0.25,
model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
parameters = c("Vmax", "K1", "K2"),
par_values = c(1, 1, 1),
design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
calc_optimal_design = FALSE,
delta_val = 0.85
)#> ℹ 1893 candidate points with efficiency >= 0.85 (from LHS sample of 2000)
region_2d$region is a data frame of sampled candidates,
each with an efficiency column. Pick the candidate that
maximises efficiency:
best_2d <- region_2d$region[which.max(region_2d$region$efficiency), ]
eff_antes <- suppressMessages(design_efficiency(init_2d, result_2D))
aug_2d <- augment_design(
criterion = "D-Optimality",
init_design = init_2d,
alpha = 0.25,
model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
parameters = c("Vmax", "K1", "K2"),
par_values = c(1, 1, 1),
design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
calc_optimal_design = FALSE,
delta_val = 0.85,
new_points = data.frame(x1 = best_2d$x1, x2 = best_2d$x2, Weight = 1)
)#> ℹ 1907 candidate points with efficiency >= 0.85 (from LHS sample of 2000)
#> Sample of candidate points:
#> x1 x2 efficiency
#> 1 5.0800948 0.565028 0.9395912
#> 2 6.9302761 7.020460 1.0890899
#> 3 1.0567895 3.294382 0.9257834
#> 4 3.8720654 3.746035 0.8635142
#> 5 2.1816247 5.556157 0.8668131
#> 6 1.9701608 4.457731 0.8617009
#> 7 3.9151828 8.208373 0.9847953
#> 8 9.0526676 9.713675 1.2208120
#> 9 0.4385586 6.797933 0.9286939
#> 10 6.7020570 4.107606 0.9600131
#> 11 5.0907975 3.073397 0.8737682
#> 12 9.4098767 8.629254 1.2035761
#> 13 8.8662268 7.580767 1.1613383
#> 14 9.5557108 9.520544 1.2282312
#> 15 1.8981463 5.497884 0.8705040
eff_despues <- suppressMessages(design_efficiency(aug_2d, result_2D))
cat("Efficiency before:", round(eff_antes * 100, 2), "%\n")
#> Efficiency before: 68.4 %
cat("Efficiency after: ", round(eff_despues * 100, 2), "%\n")
#> Efficiency after: 85.21 %
print(aug_2d)
#> x1 x2 Weight
#> 1 0.800000 10.000000 0.25
#> 2 10.000000 0.800000 0.25
#> 3 5.000000 5.000000 0.25
#> 4 9.985942 9.905381 0.25
For three or more factors the candidate region is displayed as a scatter-matrix coloured by candidate/non-candidate status, with the current design shown as triangles.
init_3d <- data.frame(
x1 = c(0.8, 10, 10, 0.8, 10),
x2 = c(10, 0.8, 10, 10, 0.8),
x3 = c(10, 10, 0.8, 0.8, 10),
Weight = rep(0.2, 5)
)
region_3d <- get_augment_region(
criterion = "D-Optimality",
init_design = init_3d,
alpha = 0.45,
model = y ~ Vmax * x1 * x2 * x3 / ((K1+x1) * (K2+x2) * (K3+x3)),
parameters = c("Vmax", "K1", "K2", "K3"),
par_values = c(1, 1, 1, 1),
design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10), x3 = c(0.1, 10)),
calc_optimal_design = FALSE,
delta_val = 0.93
)#> ℹ 948 candidate points with efficiency >= 0.93 (from LHS sample of 2000)
cat("Number of candidate points:", nrow(region_3d$region), "\n")
#> Number of candidate points: 948
plot(region_3d$plot)
When the goal is to augment while preserving estimation quality for a
subset of parameters, use
criterion = "Ds-Optimality" and pass
par_int:
region_ds <- get_augment_region(
criterion = "Ds-Optimality",
init_design = init_2d,
alpha = 0.25,
model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
parameters = c("Vmax", "K1", "K2"),
par_values = c(1, 1, 1),
design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
calc_optimal_design = FALSE,
par_int = c(1),
delta_val = 0.85,
n_lhs = 5000
)#> ℹ 3493 candidate points with efficiency >= 0.85 (from LHS sample of 5000)
best_ds <- region_ds$region[which.max(region_ds$region$efficiency), ]
aug_ds <- augment_design(
criterion = "Ds-Optimality",
init_design = init_2d,
alpha = 0.25,
model = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
parameters = c("Vmax", "K1", "K2"),
par_values = c(1, 1, 1),
design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
calc_optimal_design = FALSE,
par_int = c(1),
delta_val = 0.85,
new_points = data.frame(x1 = best_ds$x1, x2 = best_ds$x2, Weight = 1),
n_lhs = 5000
)
#> ℹ 3483 candidate points with efficiency >= 0.85 (from LHS sample of 5000)
#> Sample of candidate points:
#> x1 x2 efficiency
#> 1 2.8650576 9.9589864 1.0325916
#> 2 3.9641865 0.3004131 0.9416510
#> 3 4.8414839 6.6097326 1.3811784
#> 4 8.8811449 3.6413103 1.2404576
#> 5 4.9945051 3.3432554 0.8884957
#> 6 8.9748253 8.1732178 2.5510765
#> 7 5.0966627 3.8485719 0.9822186
#> 8 3.9080466 1.0166707 0.8757124
#> 9 2.4809849 0.2282649 0.9329601
#> 10 5.7111956 8.1552418 1.8190827
#> 11 0.2588158 1.3963485 0.9684463
#> 12 7.8919712 3.4886202 1.1313313
#> 13 9.6786526 6.4234792 2.2182760
#> 14 9.4158933 3.3784509 1.1810324
#> 15 8.2251447 9.7007518 2.6831869
print(aug_ds)
#> x1 x2 Weight
#> 1 0.800000 10.000000 0.25
#> 2 10.000000 0.800000 0.25
#> 3 5.000000 5.000000 0.25
#> 4 9.890709 9.955085 0.25
Omitting new_points (and delta_val) from
both functions triggers an interactive session where the package plots
the candidate region and asks the user to type a point. This mode is
documented in ?augment_design.