The self-adapting mixture prior (SAMprior) package is designed to enhance the effectiveness and practicality of clinical trials by leveraging historical information or real-world data [1]. The package incorporate historical data into a new trial using an informative prior constructed based on historical data while mixing a non-informative prior to enhance the robustness of information borrowing. It utilizes a data-driven way to determine a self-adapting mixture weight that dynamically favors the informative (non-informative) prior component when there is little (substantial) evidence of prior-data conflict. Operating characteristics are evaluated and compared to the robust Meta-Analytic-Predictive (rMAP) prior [2], which assigns a fixed weight of 0.5.
SAM prior is constructed by mixing an informative prior π1(θ),
constructed based on historical data, with a non-informative prior π0(θ) using the
mixture weight w determined by
SAM_weight
function to achieve the degree
of prior-data conflict [1]. The following sections describe how to
construct SAM prior in details.
We assume three historical data as follows:
study | n | mean | se |
---|---|---|---|
1 | 40 | 0.136 | 0.426 |
2 | 50 | 0.172 | 0.381 |
3 | 60 | -0.416 | 0.398 |
To construct informative priors based on the aforementioned three
historical data, we apply gMAP
function
from RBesT to perform meta-analysis. This informative prior results in a
representative form from a large MCMC samples, and it can be converted
to a parametric representation with the
automixfit
function using
expectation-maximization (EM) algorithm [3]. This informative prior is
also called MAP prior.
sigma = 3
# load R packages
library(ggplot2)
theme_set(theme_bw()) # sets up plotting theme
set.seed(22)
map_mcmc <- gMAP(cbind(mean, se) ~ 1 | study,
weights=n,data=dat,
family=gaussian,
beta.prior=cbind(0, sigma),
tau.dist="HalfNormal",tau.prior=cbind(0,sigma/2))
## Warning: There were 3 divergent transitions after warmup. See
## https://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
## to find out why this is a problem and how to eliminate them.
## Warning: Examine the pairs() plot to diagnose sampling problems
## Warning: Bulk Effective Samples Size (ESS) is too low, indicating posterior means and medians may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#bulk-ess
## Warning: Tail Effective Samples Size (ESS) is too low, indicating posterior variances and tail quantiles may be unreliable.
## Running the chains for more iterations may help. See
## https://mc-stan.org/misc/warnings.html#tail-ess
## Warning in gMAP(cbind(mean, se) ~ 1 | study, weights = n, data = dat, family = gaussian, : In total 3 divergent transitions occured during the sampling phase.
## Please consider increasing adapt_delta closer to 1 with the following command prior to gMAP:
## options(RBesT.MC.control=list(adapt_delta=0.999))
## EM for Normal Mixture Model
## Log-Likelihood = -578.91
##
## Univariate normal mixture
## Reference scale: 2.831279
## Mixture Components:
## comp1 comp2 comp3
## w 0.44346849 0.38384724 0.17268428
## m 0.13614139 -0.07204568 0.35734348
## s 0.75399840 0.21438353 1.95206197
The resulting MAP prior is approximated by a mixture of conjugate priors.
Let θ and θh denote the treatment effects associated with the current arm data D and historical Dh, respectively. Let δ denote the clinically significant difference such that is |θh − θ| ≥ δ, then θh is regarded as clinically distinct from θ, and it is therefore inappropriate to borrow any information from Dh. Consider two hypotheses:
H0 : θ = θh, H1 : θ = θh + δ or θ = θh − δ. H0 represents that Dh and D are consistent (i.e., no prior-data conflict) and thus information borrowing is desirable, whereas H1 represents that the treatment effect of D differs from Dh to such a degree that no information should be borrowed.
The SAM prior uses the likelihood ratio test (LRT) statistics R to quantify the degree of prior-data conflict and determine the extent of information borrowing. $$ R = \frac{P(D | H_0, \theta_h)}{P(D | H_1, \theta_h)} = \frac{P(D | \theta = \theta_h)}{\max \{ P(D | \theta = \theta_h + \delta), P(D | \theta = \theta_h - \delta) \}} , $$ where P(D|⋅) denotes the likelihood function. An alternative Bayesian choice is the posterior probability ratio (PPR): $$ R = \frac{P(D | H_0, \theta_h)}{P(D | H_1, \theta_h)} = \frac{P(H_0)}{P(H_1)} \times BF , $$ where P(H0) and P(H1) is the prior probabilities of H0 and H1 being true. BF is the Bayes Factor that in this case is the same as LRT.
The SAM prior, denoted as πsam(θ), is then defined as a mixture of an informative prior π1(θ), constructed based on Dh, with a non-informative prior π0(θ): πsam(θ) = wπ1(θ) + (1 − w)π0(θ) where the mixture weight w is calculated as: $$w = \frac{R}{1 + R}.$$ As the level of prior-data conflict increases, the likelihood ratio R decreases, resulting in a decrease in the weight w assigned to the informative prior and a decrease in information borrowing. As a result, πsam(θ) is data-driven and has the ability to self-adapt the information borrowing based on the degree of prior-data conflict.
To calculate the SAM weight w, we first assume the sample size
enrolled in the control arm is n = 30, with θ = 0.4 and σ = 3. Additionally, we assume the
effective size is $d = \frac{\theta -
\theta_h}{\sigma} = 0.5$, then we can apply function
SAM_weight
in SAMprior as follows:
set.seed(234)
sigma <- 3 ## Standard deviation in the current trial
data.crt <- rnorm(30, mean = 0.4, sd = sigma)
wSAM <- SAM_weight(if.prior = map_automix,
delta = 0.5 * sigma,
data = data.crt)
cat('SAM weight: ', wSAM)
## SAM weight: 0.96382
The default method to calculate w is using LRT, which is fully data-driven. However, if investigators want to incorporate prior information on prior-data conflict to determine the mixture weight w, this can be achieved by using PPR method as follows:
wSAM <- SAM_weight(if.prior = map_automix,
delta = 0.5 * sigma,
method.w = 'PPR',
prior.odds = 3/7,
data = data.crt)
cat('SAM weight: ', wSAM)
## SAM weight: 0.9194651
The prior.odds
indicates the prior
probability of H0
over the prior probability of H1. In this case (e.g.,
prior.odds = 3/7
), the prior information
favors the presence prior-data conflict and it results in a decreased
mixture weight.
When historical information is congruent with the current control arm, SAM weight reaches to the highest peak. As the level of prior-data conflict increases, SAM weight decreases. This demonstrates that SAM prior is data-driven and self-adapting, favoring the informative (non-informative) prior component when there is little (substantial) evidence of prior-data conflict.
To construct the SAM prior, we mix the derived informative prior
π1(θ) with
a vague prior π0(θ) using
pre-determined mixture weight by SAM_prior
function in SAMprior as follows:
unit_prior <- mixnorm(nf.prior = c(1, summary(map_automix)['mean'], sigma))
SAM.prior <- SAM_prior(if.prior = map_automix,
nf.prior = unit_prior,
weight = wSAM, sigma = sigma)
SAM.prior
## Univariate normal mixture
## Reference scale: 2.831279
## Mixture Components:
## comp1 comp2 comp3 nf.prior
## w 0.40775378 0.35293412 0.15877716 0.08053494
## m 0.13614139 -0.07204568 0.35734348 0.09442748
## s 0.75399840 0.21438353 1.95206197 3.00000000
where the non-informative prior π0(θ) follows an unit-information prior.
In this section, we aim to investigate the operating characteristics of the SAM prior, constructed based on the historical data, via simulation. The incorporation of historical information is expected to be beneficial in reducing the required sample size for the current arms. To achieve this, we assume a 1:2 ratio between the control and treatment arms.
We compare the operating characteristics of the SAM prior and rMAP prior with pre-specified fixed weight under various scenarios. Specifically, we will evaluate the relative bias and relative mean squared error (MSE) of these methods. The relative bias and relative MSE are defined as the differences between the bias/MSE of a given method and the bias/MSE obtained when using a non-informative prior.
Additionally, we investigate the type I error and power of the methods under different degrees of prior-data conflicts. The decision regarding whether a treatment is superior or inferior to a standard control will be based on the condition: Pr (θt − θ > 0) > 0.95.
In SAMprior, the operating characteristics can be considered in following steps:
Specify priors: This step involves constructing informative prior based on historical data and non-informative prior.
Specify the decision criterion: The
decision2S
function is used to initialize
the decision criterion. This criterion determines whether a treatment is
considered superior or inferior to a standard control.
Specify design parameters for the
get_OC
function: This step involves
defining the design parameters for evaluating the operating
characteristics. These parameters include the clinically significant
difference (CSD) used in SAM prior calculation, the method used to
determine the mixture weight for the SAM prior, the sample sizes for the
control and treatment arms, the number of trials used for simulation,
the choice of weight for the robust MAP prior used as a benchmark, and
the vector of mean for both the control and treatment arms.
To compute the type I error, we consider four scenarios, with the first and last two scenarios representing minimal and substantial prior-data conflicts, respectively. In general, the results show that both methods effectively control the type I error.
set.seed(123)
# weak_prior <- mixnorm(c(1, summary(map_automix)[1], 1e4))
TypeI <- get_OC(if.prior = map_automix, ## MAP prior from historical data
nf.prior = unit_prior, ## Weak-informative prior for treatment arm
delta = 0.5*sigma, ## CSD for SAM prior
n = 35, n.t = 70, ## Sample size for control and treatment arms
## Decisions
decision = decision2S(0.95, 0, lower.tail=FALSE),
ntrial = 1000, ## Number of trials simulated
if.MAP = TRUE, ## Output robust MAP prior for comparison
weight = 0.5, ## Weight for robust MAP prior
## Mean for control and treatment arms
theta = c(0, 0, -2, 4),
theta.t = c(0, -0.1, -2, 4),
sigma = sigma
)
kable(TypeI)
scenarios | Bias_SAM | Bias_rMAP | RMSE_SAM | RMSE_rMAP | wSAM | res_SAM | res_rMAP | res_NP |
---|---|---|---|---|---|---|---|---|
1 | -0.0107078 | 0.0012848 | -0.1036480 | -0.1168595 | 0.8007164 | 0.040 | 0.033 | 0.045 |
2 | -0.0111150 | 0.0012622 | -0.1075805 | -0.1220022 | 0.7893358 | 0.028 | 0.021 | 0.040 |
3 | 0.0084721 | 0.0980393 | 0.0121919 | 0.0844466 | 0.0113622 | 0.039 | 0.032 | 0.039 |
4 | 0.0028979 | -0.0111321 | 0.0003541 | 0.0071032 | 0.0000034 | 0.058 | 0.060 | 0.058 |
For power evaluation, we also consider four scenarios, with the first and last two scenarios representing minimal and substantial prior-data conflicts, respectively. In general, it is observed that the SAM prior achieves comparable or better power compared to rMAP.
set.seed(123)
Power <- get_OC(if.prior = map_automix, ## MAP prior based on historical data
nf.prior = unit_prior, ## Non-informative prior for treatment arm
delta = 0.5*sigma, ## CSD for SAM prior
n = 35, n.t = 70, ## Sample size for control and treatment arms
## Decisions
decision = decision2S(0.95, 0, lower.tail=FALSE),
ntrial = 1000, ## Number of trials simulated
if.MAP = TRUE, ## Output robust MAP prior for comparison
weight = 0.5, ## Weight for robust MAP prior
## Mean for control and treatment arms
theta = c(0, 0.1, 0.5, -3),
theta.t = c(1, 1.1, 2.0, -1.5),
sigma = sigma
)
kable(Power)
scenarios | Bias_SAM | Bias_rMAP | RMSE_SAM | RMSE_rMAP | wSAM | res_SAM | res_rMAP | res_NP |
---|---|---|---|---|---|---|---|---|
1 | -0.0107078 | 0.0012848 | -0.1036480 | -0.1168595 | 0.8007164 | 0.620 | 0.586 | 0.490 |
2 | -0.0375138 | -0.0269056 | -0.1030944 | -0.1171336 | 0.7958508 | 0.653 | 0.613 | 0.495 |
3 | -0.1158161 | -0.1189963 | -0.0174726 | -0.0533351 | 0.6941143 | 0.845 | 0.861 | 0.786 |
4 | 0.0025318 | 0.0262925 | 0.0009209 | 0.0202713 | 0.0001947 | 0.766 | 0.745 | 0.766 |
Finally, we present an example of how to make a final decision on
whether the treatment is superior or inferior to a standard control once
the trial has been completed and data has been collected. This step can
be accomplished using the postmix
function
from RBesT, as shown below:
## Simulate data for treatment arm
data.trt <- rnorm(60, mean = 3, sd = sigma)
## first obtain posterior distributions...
post_SAM <- postmix(priormix = SAM.prior, ## SAM Prior
data = data.crt)
post_trt <- postmix(priormix = unit_prior, ## Non-informative prior
data = data.trt)
## Define the decision function
decision = decision2S(0.95, 0, lower.tail=FALSE)
## Decision-making
decision(post_trt, post_SAM)
## [1] 1
[1] Yang P. et al., Biometrics, 2023; 00, 1–12. https://doi.org/10.1111/biom.13927
[2] Schmidli H. et al., Biometrics, 2014;
70(4):1023-1032.
[3] Neuenschwander B. et al., Clin Trials, 2010; 7(1):5-18.
## R version 4.4.1 (2024-06-14)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.1 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
## LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.26.so; LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=C
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## time zone: Etc/UTC
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] knitr_1.48 SAMprior_1.1.1 ggplot2_3.5.1 Metrics_0.1.4
## [5] checkmate_2.3.2 assertthat_0.2.1 RBesT_1.7-3 rmarkdown_2.28
##
## loaded via a namespace (and not attached):
## [1] gtable_0.3.5 tensorA_0.36.2.1 xfun_0.48
## [4] bslib_0.8.0 QuickJSR_1.4.0 inline_0.3.19
## [7] vctrs_0.6.5 tools_4.4.1 generics_0.1.3
## [10] stats4_4.4.1 curl_5.2.3 parallel_4.4.1
## [13] tibble_3.2.1 fansi_1.0.6 highr_0.11
## [16] pkgconfig_2.0.3 distributional_0.5.0 RcppParallel_5.1.9
## [19] lifecycle_1.0.4 farver_2.1.2 compiler_4.4.1
## [22] stringr_1.5.1 munsell_0.5.1 codetools_0.2-20
## [25] htmltools_0.5.8.1 sys_3.4.3 buildtools_1.0.0
## [28] sass_0.4.9 bayesplot_1.11.1 yaml_2.3.10
## [31] Formula_1.2-5 crayon_1.5.3 pillar_1.9.0
## [34] jquerylib_0.1.4 cachem_1.1.0 StanHeaders_2.32.10
## [37] abind_1.4-8 posterior_1.6.0 rstan_2.32.6
## [40] tidyselect_1.2.1 digest_0.6.37 mvtnorm_1.3-1
## [43] stringi_1.8.4 dplyr_1.1.4 reshape2_1.4.4
## [46] maketools_1.3.1 labeling_0.4.3 fastmap_1.2.0
## [49] grid_4.4.1 colorspace_2.1-1 cli_3.6.3
## [52] magrittr_2.0.3 loo_2.8.0 pkgbuild_1.4.4
## [55] utf8_1.2.4 withr_3.0.1 scales_1.3.0
## [58] backports_1.5.0 matrixStats_1.4.1 gridExtra_2.3
## [61] evaluate_1.0.1 V8_5.0.1 rstantools_2.4.0
## [64] rlang_1.1.4 Rcpp_1.0.13 glue_1.8.0
## [67] jsonlite_1.8.9 R6_2.5.1 plyr_1.8.9