missingmed runs SEM-based mediation across multiply imputed datasets and pools with Rubin’s rules. It is a thin orchestration layer: it fits each imputation with medfit and delegates inference to RMediation. The S7 pipeline is four verbs:
set_md_mediation() -> run() -> pool() -> infer()
MDMediationData MDMediationFit MDMediationResult CI / MBCO
We simulate a simple mediation model X -> M -> Y
with a confounder C, impose some MAR missingness, and
impute with mice.
library(missingmed)
set.seed(2026)
n <- 300
C <- rnorm(n)
X <- rbinom(n, 1, plogis(0.3 * C))
M <- 0.5 * X + 0.3 * C + rnorm(n)
Y <- 0.2 * X + 0.4 * M + 0.3 * C + rnorm(n)
d <- data.frame(X = X, M = M, Y = Y, C = C)
d$M[sample(n, 45)] <- NA
d$Y[sample(n, 30)] <- NA
imp <- mice::mice(d, m = 20, method = "norm", printFlag = FALSE)set_md_mediation() records the outcome and mediator
formulas plus the treatment/mediator roles.
md <- set_md_mediation(
imp,
formula_y = Y ~ X + M + C,
formula_m = M ~ X + C,
treatment = "X",
mediator = "M"
)
md
#> <MDMediationData>
#> estimator (method): mi | mechanism: mar
#> imputations (m) : 20
#> treatment / mediator: X / M
#> outcome model : Y ~ X + M + C
#> mediator model: M ~ X + C
#> engine: glmrun() fits every imputed dataset with
medfit::fit_mediation(), yielding a list of
named medfit::MediationData objects.
pool() combines the per-imputation estimates and
variance-covariance into a single named pooled
medfit::MediationData.
res <- pool(fit)
summary(res)
#> Pooled mediation result (Rubin's rules)
#> imputations (m): 20 | engine: glm
#>
#> term estimate std_error var_w var_b var_tot
#> m_(Intercept) -0.16522317 0.08238139 0.005746617 0.0009905497 0.006786694
#> m_X 0.93925967 0.11601493 0.012281822 0.0011215648 0.013459465
#> m_C 0.28887608 0.06017413 0.003393052 0.0002170228 0.003620926
#> y_(Intercept) -0.06574674 0.08649197 0.006525312 0.0009100461 0.007480860
#> y_X 0.25996102 0.14419632 0.017044636 0.0035694682 0.020792577
#> y_M 0.36924532 0.06955480 0.003766726 0.0010201381 0.004837871
#> y_C 0.34579037 0.06714072 0.004105318 0.0003833889 0.004507877
#> a 0.93925967 0.11601493 0.012281822 0.0011215648 0.013459465
#> b 0.36924532 0.06955480 0.003766726 0.0010201381 0.004837871
#> c_prime 0.25996102 0.14419632 0.017044636 0.0035694682 0.020792577
#>
#> a*b = 0.3468The Monte-Carlo / distribution-of-the-product CI is computed by RMediation on the pooled object:
infer(res, type = "mc")
#> $CI
#> [1] 0.2040696 0.5107304
#>
#> $Estimate
#> [1] 0.34706
#>
#> $SE
#> [1] 0.07851401
#>
#> $MC.Error
#> [1] 0.0002482831For the MBCO likelihood-ratio test of
H0: a*b = 0, use the fit object (see
vignette("mbco-mi") for why pooling does not commute with
MBCO):
The S4 entry points (set_sem(), run_sem(),
pool_sem()) are deprecated in favour of
the S7 verbs above. They still work for one release cycle but emit a
deprecation warning.