SPYRAL

Bayesian Analysis of Blood Pressure Change Using Conjugate Priors

Bayesian design and prespecified interim analyses

A Bayesian trial design was proposed that allows for prespecified interim analyses to take place in addition to leveraging pilot study data. For SPYRAL HTN-OFF MED Pivotal, assuming 15% attrition, interim analyses are expected to occur first when approximately 210 and second when approximately 240 subjects have evaluable data, with a maximum sample size of 300 evaluable subjects if the trial does not stop at either the first or second interim look.

The time from randomization of the first cohort to the second cohort and final cohort, if applicable, is lengthy due to stringent eligibility criteria and subsequent slow randomization rates.

For SPYRAL HTN-ON MED Expansion, the expected sample sizes are 149 and 187 evaluable subjects, with a maximum sample size of 221 evaluable subjects. Actual numbers of evaluable subjects will be determined by the actual attrition, which is currently expected to be 15%. At each prespecified interim analysis, enrollment may be stopped for efficacy or expected futility.

For SPYRAL HTN-OFF MED Pivotal, the primary and secondary efficacy endpoints will be evaluated during these prespecified interim looks, and enrollment will only stop at an interim analysis if both endpoints meet the following stopping criteria. For SPYRAL HTN-ON MED Expansion, the primary efficacy endpoint will be evaluated at prespecified interim looks. A distinction between stopping for efficacy and stopping for futility is that the former is based solely on the observed/evaluable data, whereas the latter involves the observed/evaluable data as well as imputation for subjects without evaluable data or not yet randomized.

The underlying model is a Bayesian analogue of the analysis of covariance (ANCOVA) model for SBP change from baseline, y_i, adjusted for baseline blood pressure and treatment arm. Because a Bayesian power prior approach is used, a non-standard parameterization of the ANCOVA model allows informative prior distributions to be placed separately on the RDN and control arm effects:

Model Assumptions

Here's a breakdown of the Bayesian clinical trial methodology described in the article, along with guidance on how to build an R simulation to demonstrate it.

Model

  • Bayesian ANCOVA: The core model is a Bayesian version of Analysis of Covariance (ANCOVA). It's expressed in a slightly different form to facilitate the use of informative prior distributions: y_i = mu_t * I(i ∈ t) + mu_c * I(i ∈ c) + x_i * beta + epsilon_i
    • y_i: Change from baseline in blood pressure at follow-up for subject i.
    • I(i ∈ t): Indicator of whether subject i belongs to the treatment group (1 if yes, 0 if no).
    • I(i ∈ c): Indicator of whether subject i belongs to the control group.
    • mu_t: Baseline-adjusted treatment effect for the renal denervation group.
    • mu_c: Baseline-adjusted treatment effect for the sham control group.
    • x_i: Mean-centered baseline blood pressure for subject i.
    • beta: Regression coefficient adjusting for baseline blood pressure.
    • epsilon_i: Error term, assumed to be normally distributed.
  • Priors:
    • Pilot Study Data: Data from previous, similar trials are integrated using Bayesian power priors.
    • Power Parameters: alpha_t and alpha_c control how heavily the pilot data is weighted. These can be fixed or estimated dynamically.
    • Other Priors: Diffuse Normal for beta, flat prior for log(sigma).
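As a sanity check on the cell-means parameterization (separate mu_t and mu_c rather than an intercept plus a treatment contrast), here is a minimal sketch that simulates data from the model above and recovers the parameters with ordinary least squares; all numeric values are illustrative assumptions, not trial parameters.

```r
# Simulate data under the ANCOVA parameterization above (illustrative values)
set.seed(1)
n <- 200
arm <- factor(rep(c("t", "c"), each = n / 2))      # treatment / control
x   <- rnorm(n, mean = 165, sd = 10)               # baseline SBP
x_c <- x - mean(x)                                 # mean-centered baseline
mu_t <- -8; mu_c <- -3; beta <- -0.3; sigma <- 12  # assumed true values
y <- ifelse(arm == "t", mu_t, mu_c) + beta * x_c + rnorm(n, 0, sigma)

# Cell-means (no-intercept) fit mirrors the non-standard parameterization:
# one coefficient per arm plus the baseline slope.
fit <- lm(y ~ 0 + arm + x_c)
coef(fit)   # armc ~ mu_c, armt ~ mu_t, x_c ~ beta
```

In a Bayesian fit the same parameterization lets separate informative priors sit directly on the arm coefficients.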

Bayesian ANCOVA Model Simulation

This simulation will reflect the Bayesian ANCOVA model structure you provided. We'll simulate treatment effect analysis incorporating both baseline and treatment effects, reflecting your method of adjusting for mean-centered baseline blood pressure.

Example usage with hypothetical p-values for treatment and control groups

[1] 0.6324573
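The code that produced this output is not shown; a plausible sketch of the kind of function such a call might use, mapping hypothetical p-values to a discount weight through a Weibull CDF (the shape, scale, and alpha_max values here are assumptions, not the trial's prespecified ones):

```r
# Hypothetical discount-weight function: transforms a similarity statistic
# p_hat into a power parameter alpha via the Weibull CDF.
# shape, scale, and alpha_max are illustrative assumptions.
discount_weight <- function(p_hat, shape = 3, scale = 0.5, alpha_max = 1) {
  alpha_max * pweibull(p_hat, shape = shape, scale = scale)
}

discount_weight(p_hat = 0.46)   # hypothetical treatment-group p-value
discount_weight(p_hat = 0.50)   # hypothetical control-group p-value
```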

Understanding the Discount Function

Type I Error Assessment

  • Extensive Simulation: The described methodology relies heavily on simulation studies to assess Type I error rates (false positives) under various scenarios.

Discount Function

  • Purpose: The discount function downweights the influence of pilot data if there's evidence of dissimilarity between the pilot and the current trial.
  • Weibull Function: A Weibull function is used for discounting, with shape and scale parameters carefully chosen through simulations.

R Simulation: Key Steps

  1. Data Generation:
    • Create a function to simulate realistic blood pressure change data, including baseline, treatment allocation, and potential covariate effects.
    • Set parameters for the true treatment effect, variability, and other relevant factors.
  2. Prior Specification:
    • Obtain pilot study data (if available).
    • Define functions to create prior distributions for mu_t, mu_c, beta, and log(sigma).
    • Implement methods for calculating or setting power parameters.
  3. Bayesian Model Fitting:
    • Choose a Bayesian modeling package in R (such as rstanarm or brms).
    • Define the ANCOVA model using the specified parameterization.
    • Set up the model to include prior distributions and handle discount parameters.
  4. Interim Analyses:
    • Specify points at which interim analyses will occur based on subject numbers.
    • Implement Bayesian decision rules:
      • Calculate the posterior probability of mu < 0.
      • Stop for efficacy if the probability exceeds a threshold.
      • Stop for futility if the probability falls below a threshold.
  5. Simulations:
    • Create a loop to run the simulation multiple times under varying conditions:
      • Different true treatment effects
      • Variations in pilot data similarity
    • Calculate Type I error, power, and other operating characteristics.
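The steps above can be sketched end to end. This is a minimal sketch that substitutes a simple conjugate normal model with known sigma and flat priors for the full power-prior ANCOVA; the looks, thresholds, and effect sizes are illustrative, and futility stopping with imputation is omitted.

```r
set.seed(42)

# One simulated trial with interim looks; returns TRUE if stopped for
# efficacy at any look (futility stopping and imputation omitted).
sim_trial <- function(true_diff, looks = c(210, 240, 300),
                      sigma = 12, eff_thresh = 0.975) {
  n_max <- max(looks)
  arm <- rep(c(1, 0), length.out = n_max)   # 1 = RDN, 0 = sham control
  y <- rnorm(n_max, mean = ifelse(arm == 1, true_diff, 0), sd = sigma)
  for (n in looks) {
    yt <- y[seq_len(n)][arm[seq_len(n)] == 1]
    yc <- y[seq_len(n)][arm[seq_len(n)] == 0]
    est <- mean(yt) - mean(yc)
    se  <- sigma * sqrt(1 / length(yt) + 1 / length(yc))
    # Posterior P(mu_t - mu_c < 0) under a flat prior is pnorm(0, est, se)
    if (pnorm(0, mean = est, sd = se) > eff_thresh) return(TRUE)
  }
  FALSE
}

power_hat <- mean(replicate(2000, sim_trial(true_diff = -4)))  # power
type1_hat <- mean(replicate(2000, sim_trial(true_diff = 0)))   # Type I error
c(power = power_hat, type1 = type1_hat)
```

Note that with repeated looks at a fixed threshold the overall Type I error exceeds the per-look level, which is why the thresholds in the actual design are calibrated by simulation.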

Let me know if you'd like specific R code examples for any of these steps. Building this simulation can be quite involved, so it's best to break it down into parts.

  1. Likelihood (Data Model): The change in blood pressure, denoted \(\Delta BP\), is assumed to be normally distributed with an unknown mean \(\mu\) and known variance \(\sigma^2\).
  2. Prior: The prior for \(\mu\) is normal, expressed as \(N(\mu_0, \tau^2)\), where \(\mu_0\) is the prior mean (based on previous studies) and \(\tau^2\) is the prior variance.

Bayesian Updating

The posterior distribution for \(\mu\), after observing data with a sample mean \(\bar{x}\) and sample variance \(s^2\) from \(n\) observations, remains normal: \[ \mu \mid \text{data} \sim N\left(\frac{\frac{\bar{x}}{s^2/n} + \frac{\mu_0}{\tau^2}}{\frac{1}{s^2/n} + \frac{1}{\tau^2}}, \frac{1}{\frac{1}{s^2/n} + \frac{1}{\tau^2}}\right)\]
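The conjugate update can be written as a small R function; the inputs in the example call are illustrative.

```r
# Conjugate normal update for mu with known data variance; implements the
# posterior mean and variance formulas above.
post_normal <- function(xbar, s2, n, mu0, tau2) {
  w_data  <- n / s2      # precision of the sample mean, 1 / (s^2 / n)
  w_prior <- 1 / tau2    # prior precision
  list(mean = (xbar * w_data + mu0 * w_prior) / (w_data + w_prior),
       var  = 1 / (w_data + w_prior))
}

# Illustrative inputs: the posterior mean is a precision-weighted average
# that falls between the sample mean and the prior mean.
post_normal(xbar = -3.6, s2 = 19^2, n = 204, mu0 = -4.9, tau2 = 4)
```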

Simulation


    One-armed bdp normal

data:
  Current treatment: mu_t = -3.6, sigma_t = 18.9466468512361, N_t = 204
  Historical treatment: mu0_t = -4.9, sigma0_t = 19.9906104312602, N0_t = 70
Stochastic comparison (p_hat) - treatment (current vs. historical data): 0.4575
Discount function value (alpha) - treatment: 0.5352
95 percent CI: 
 -6.1963  -1.272
posterior sample estimate:
mean of treatment group
 -3.7796
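The stochastic comparison and discount value in this output can be sketched by Monte Carlo. The direction of the comparison and the Weibull shape/scale below are assumptions about the package internals, so the printed values will not be reproduced exactly.

```r
set.seed(7)
draws <- 1e5

# Flat-prior posteriors for the mean in each data source (values taken
# from the one-armed output above, historical sigma taken as positive)
mu_cur  <- rnorm(draws, mean = -3.6, sd = 18.95 / sqrt(204))
mu_hist <- rnorm(draws, mean = -4.9, sd = 19.99 / sqrt(70))

p_hat <- mean(mu_cur < mu_hist)                 # stochastic comparison (assumed direction)
alpha <- pweibull(p_hat, shape = 3, scale = 1)  # assumed discount parameters
c(p_hat = p_hat, alpha = alpha)
```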

Two sample


    Two-armed bdp normal

data:
  Current treatment: mu_t = -4.6, sigma_t = 9.49966570358137, N_t = 112
  Current control: mu_c = -0.8, sigma_c = 7.53739117484396, N_c = 95
  Historical treatment: mu0_t = -5.3, sigma0_t = 9.76153164211437, N0_t = 35
  Historical control: mu0_c = -0.74, sigma0_c = 9.72, N0_c = 36
Stochastic comparison (p_hat) - treatment (current vs. historical data): 0.4663
Stochastic comparison (p_hat) - control (current vs. historical data): 0.4961
Discount function value (alpha) - treatment: 0.5557
Discount function value (alpha) - control: 0.6235
alternative hypothesis: two.sided
95 percent CI: 
 -6.0515  -1.7134
posterior sample estimates:
treatment group  control group
          -4.69          -0.79


Power Parameter

In a Bayesian adaptive trial, the "power parameter" is not related to statistical power. Instead, it is a term used to describe how much the data from a prior study (like a pilot study) should influence the analysis of the current trial. It is a measure of "borrowing strength" from the prior data.

  • If the pilot and current pivotal data are very similar, you would use more of the pilot data in your analysis (power parameter closer to 1).

  • If they are quite different, you would use less of the pilot data (power parameter closer to 0).

  • (1) Adaptive allocation rule: change in the randomization procedure to modify the allocation proportion.

  • (2) Adaptive sampling rule: change in the number of study subjects (sample size) or change in the study population (the entry criteria for patients change).

  • (3) Adaptive stopping rule: during the course of the trial, a data-dependent rule dictates whether and when to stop for harm/futility/efficacy.

  • (4) Adaptive enrichment: during the trial, treatments are added or dropped.

  • The significance level in a clinical trial is the threshold at which the results of the trial are deemed statistically significant. It represents the probability of rejecting the null hypothesis when it is actually true (a Type I error, or false positive). In conventional frequentist statistics, common significance levels are 0.05 or 0.01. For the SPYRAL HTN-OFF MED Pivotal trial, a one-sided Type I error rate of 2.9% is specified, which means they use a significance level of 0.029 for their primary efficacy endpoint.

    In the Bayesian context, the significance level can be used to determine the decision thresholds for interim analyses. For instance, a posterior probability below a certain significance level may indicate futility, or if it is sufficiently high above 1 minus the significance level, it may indicate efficacy.
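A decision rule of this kind can be encoded directly; the 0.975 efficacy and 0.05 futility thresholds below are illustrative, not the trial's prespecified values.

```r
# Map a posterior probability of benefit to an interim decision.
# Thresholds are illustrative assumptions.
interim_decision <- function(p_benefit, eff_thresh = 0.975, fut_thresh = 0.05) {
  if (p_benefit > eff_thresh) {
    "stop for efficacy"
  } else if (p_benefit < fut_thresh) {
    "stop for futility"
  } else {
    "continue"
  }
}

interim_decision(0.99)   # -> "stop for efficacy"
interim_decision(0.50)   # -> "continue"
```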

Similarity

1.  **Power Prior**: In a Bayesian framework, prior information (like results from pilot studies) is typically incorporated into the analysis through the prior distribution. A power prior is a method to adjust the influence of this prior information on the current analysis.

2.  **Similarity Statistic**: Before the pilot data is fully incorporated into the pivotal trial analysis, a measure of similarity between the pilot and pivotal data is calculated. This helps to understand how comparable the two datasets are.

3.  **Power Parameter**: The power parameter (denoted typically by Ī± in Bayesian statistics) adjusts how much weight is given to the pilot data in the analysis. If the pilot and pivotal data are very similar, the power parameter is closer to 1, meaning that more information from the pilot study is retained. If the data are dissimilar, the power parameter is closer to 0, meaning the pilot data is down-weighted.

4.  **Transformation using Weibull Function**: The similarity statistic is transformed into a power parameter using a Weibull function. The shape and scale of this function determine how the similarity statistic is translated into the power parameter. This transformation is based on predefined parameters that are determined through simulations to optimize the trial's design.

The purpose of this is to dynamically borrow information from the pilot study based on the level of consistency with the ongoing pivotal trial data. This process allows the analysis to be adaptive and more informative by integrating the amount of evidence that is considered appropriate from the prior study into the current analysis.

In summary, the significance level in the context of the SPYRAL trials is used as a decision-making threshold during interim analyses, while the similarity statistic and its transformation into a power parameter determine the degree to which prior data informs the current analysis.
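A normal power prior with weight alpha can be sketched as follows: the pilot data enter the posterior as if they were alpha * N0 observations. The numeric inputs in the example call are illustrative.

```r
# Posterior for a normal mean under a power prior: pilot-data precision
# is down-weighted by alpha. All inputs are illustrative.
power_prior_post <- function(xbar, s2, n, xbar0, s02, n0, alpha) {
  w  <- n / s2               # current-data precision
  w0 <- alpha * n0 / s02     # alpha-discounted pilot-data precision
  list(mean = (xbar * w + xbar0 * w0) / (w + w0),
       var  = 1 / (w + w0),
       prior_ess = alpha * n0)   # effective prior sample size
}

power_prior_post(xbar = -4.6, s2 = 9.5^2, n = 112,
                 xbar0 = -5.3, s02 = 9.8^2, n0 = 35, alpha = 0.56)
```

With alpha = 1 this reduces to pooling the two data sets; with alpha = 0 the pilot data are ignored entirely.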

Power Simulation

  • Power Estimation (94% to detect a -4.0 mm Hg difference): When they simulated 8,000 trials assuming the true effect is -4.0 mm Hg, 94% of those simulations rejected the null hypothesis of no effect (an effect of 0 mm Hg) at the 2.9% significance level.

  • Type I Error Estimation (2.9% for the primary endpoint): This is the probability of rejecting the null hypothesis when it is true, estimated from 15,000 simulations under the null hypothesis of no effect. Only about 2.9% of those simulations should reject the null hypothesis, given that there is actually no difference.

They assumed a true treatment effect and simulated data accordingly.

Here's what's happening in their methodology:

  • They used a statistical method (likely a Bayesian approach, given the context) to analyze the simulated data.
  • They checked against a significance level (2.9% one-sided Type I error rate) to see whether the null hypothesis could be rejected.
  • They repeated this process across many simulations and calculated what percentage of those simulations resulted in rejecting the null hypothesis. If it is around 94%, that is the power estimate.

In both cases, each simulation is binary in outcome: either the null is rejected (counted as a success for power, or as a false positive for Type I error) or it is not. The final reported percentages (power of 94%, Type I error of 2.9%) are the proportions of these binary outcomes across all simulations.
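The proportion-of-rejections computation can be sketched directly; the rejection indicators below are simulated placeholders, not trial results.

```r
set.seed(3)
# Placeholder 0/1 rejection indicators for 8,000 simulated trials
rejected  <- rbinom(8000, size = 1, prob = 0.94)
power_hat <- mean(rejected)   # proportion of simulations rejecting the null
# Monte Carlo standard error of the estimated proportion
mc_se <- sqrt(power_hat * (1 - power_hat) / length(rejected))
c(power = power_hat, mc_se = mc_se)
```

The Monte Carlo standard error shrinks with the number of simulations, which is why thousands of replications are used for the power and Type I error estimates.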

Simulation

  sample_size    power
1         210 0.772375
2         240 0.827000
3         300 0.882125
[1] 0.564875

Type 1 error: