By Michael Willis, MSc, Senior Statistician, and Lisa Gibbons, MSc, Senior Statistician
Introduction
The landscape of clinical trials is changing, with sponsors looking for more ways to make trials quicker and more efficient, thus shortening drug development timelines and bringing important new treatments to patients faster. Innovations exist in the study set-up, recruitment, delivery, and the programming and statistical analyses that underpin the trial, but also in the design of clinical trials.
We are seeing an evolution away from the traditional two-group randomized controlled trial, in favour of methodologies that reduce study timelines and maximize the knowledge gained from data.
Innovative trial designs include adaptive designs; use of historical data; multiple arms; novel endpoints; targeted or stratified trials; and Bayesian methods.
Three adaptive design methodologies were discussed at the first Contemporary Clinical Trials Methodology Meeting hosted by the Imperial Clinical Trials Unit and the NIHR RSS Imperial College London and Partners Hub on 25th April in London.
Group sequential designs, response adaptive randomization, and sample size re-estimation all offer opportunities to learn from trials as they are ongoing, to modify trial elements during conduct whilst preserving statistical integrity.
Our statisticians attended as part of their continued professional development and to contribute to the discussion. This article summarizes the key findings and discussion points from the day and provides an overview of the state of the art in innovative clinical trial methodologies.
Key Takeaways from the Contemporary Clinical Trials Meeting
Group Sequential Design: Enhancing Approvals of Well Performing Interventions
The session began with an introduction to Group Sequential Designs by Leila Janani of ICTU along with an overview of considerations to make when implementing these designs in practice.
When used well, GSDs provide an opportunity to expedite study results with a reduced sample size by analysing data incrementally at prespecified information fractions, whilst maintaining the desired type I error rate through a variety of alpha-spending methods. These methods produce boundaries at each interim analysis which determine the effect size that needs to be observed to recommend stopping (for efficacy or futility) or continuing the trial. This in turn allows for accelerated approval of an overperforming intervention or, conversely, reduced exposure to an underperforming intervention.
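As a rough illustration of alpha spending, the sketch below evaluates two widely used Lan-DeMets spending functions, the O'Brien-Fleming-type and the Pocock-type, at a set of information fractions. The fractions and the overall two-sided alpha of 0.05 are assumptions chosen for illustration, not values presented at the meeting, and the sketch reports only the cumulative alpha spent; deriving the stopping boundaries themselves needs dedicated software.

```python
from math import e, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obf_type_spending(t: float, alpha: float = 0.05) -> float:
    """Lan-DeMets O'Brien-Fleming-type spending function.

    Returns the cumulative two-sided alpha spent by information fraction t.
    """
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / sqrt(t)))


def pocock_type_spending(t: float, alpha: float = 0.05) -> float:
    """Lan-DeMets Pocock-type spending function (cumulative alpha at t)."""
    return alpha * log(1 + (e - 1) * t)


# Hypothetical schedule: three interim looks plus the final analysis.
information_fractions = [0.25, 0.50, 0.75, 1.00]

for t in information_fractions:
    print(
        f"t={t:.2f}  OBF-type: {obf_type_spending(t):.5f}  "
        f"Pocock-type: {pocock_type_spending(t):.5f}"
    )
```

Both functions spend the full alpha by the final analysis, but the O'Brien-Fleming-type function spends very little early on, so stopping early requires a much larger observed effect.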
Stuart Pocock of LSHTM delved deeper into the differences between commonly used boundaries (such as Pocock, O’Brien-Fleming, Peto) and their role in DMC/DSMB recommendations, insisting that boundaries are to be treated as guidelines to aid DMC/DSMB recommendations along with a totality of evidence and not as strict rules. After a tongue-in-cheek critique of his own boundary, explaining why boundaries such as O’Brien-Fleming are much more commonly used in practice, Stuart then shared applied examples of GSDs from his wealth of experience in Cardiology trials.
During the final part of the session, Laurent Billot of The George Institute for Global Health spoke on the other considerations that need to be made when designing GSDs (i.e. number of interim analyses, timing of interim analyses) and shared example simulations demonstrating that it is possible to optimize these considerations to reduce the required sample size further. A useful tool for an exploratory look into the effect of altering these considerations is RPACT’s publicly accessible Trial Design App.
The speakers also discussed the Bayesian alternative to GSDs, which incorporates prior knowledge of the expected treatment effect. Bayesian Adaptive Designs (BAD) offer another opportunity to optimize the study design. However, whether the benefits of the design (reduced cost and sample size) outweigh its complexity (to produce, run, and explain to others) was a topic for debate.
The panel agreed that regardless of personal preference, the discussion of Frequentist vs Bayesian designs was invaluable when initially planning the study to build a better understanding of the scenario and decisions that are being made.
Response Adaptive Randomization: Maximizing Effective Treatments
The second session began with Lukas Pin of the MRC Biostatistics Unit, Cambridge, introducing response adaptive randomization (RAR). In clinical trials, the most frequently used randomization scheme uses fixed allocation that is often, but not always, equal. This randomization scheme remains unchanged throughout the lifetime of the study.
RAR, by contrast, uses the responses accrued so far to update the allocation probabilities during the trial in pursuit of particular objectives. These objectives may be, for example, to maximize the power of a specific treatment comparison, or to assign more patients to an effective treatment. RAR trials have received an abundance of theoretical attention but there are few high-impact applications of the methodology in clinical trials.
The simplest example of RAR is called “Play the Winner”, where the first patient enters the trial with a 50:50 chance of being allocated to either of two treatment groups. If you observe a success, then the next patient is allocated to the same arm. If you observe a failure, then the next patient is allocated to the alternative arm. This randomization scheme assigns, on average, more patients to the superior arm, but there are examples where this method has led to extreme scenarios. Take, for example, a 1985 study of extracorporeal membrane oxygenation that randomized 11 out of 12 infants to the intervention arm. [1]
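The Play the Winner rule described above can be sketched as a short simulation. The success probabilities, trial size, and seed are hypothetical, and the sketch assumes each patient's outcome is known before the next patient is allocated.

```python
import random


def play_the_winner(n_patients, p_success, rng):
    """Simulate the simple Play the Winner rule for two arms.

    p_success maps each arm label to an assumed true success probability.
    Returns the number of patients allocated to each arm.
    """
    counts = {"A": 0, "B": 0}
    arm = rng.choice(["A", "B"])  # first patient: 50:50 allocation
    for _ in range(n_patients):
        counts[arm] += 1
        success = rng.random() < p_success[arm]
        if not success:
            # A failure sends the next patient to the other arm;
            # a success keeps the next patient on the same arm.
            arm = "B" if arm == "A" else "A"
    return counts


rng = random.Random(2024)  # fixed seed so the run is repeatable
n_patients = 2000  # deliberately large to show the long-run behaviour
counts = play_the_winner(n_patients, {"A": 0.8, "B": 0.4}, rng)
share_a = counts["A"] / n_patients
print(f"Allocated to A: {counts['A']}, to B: {counts['B']} (share A = {share_a:.2f})")
```

With a large trial the share allocated to the better arm settles well above one half. Re-running with a small `n_patients`, such as 12, shows how easily the rule can produce very lopsided allocations.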
Alternative RAR schemes include Bayesian approaches, which were discussed by the second speaker, Sofia Villar, also from the MRC Biostatistics Unit, who took us through the ARREST trial. [2] This Phase II trial was designed to assess the safety and efficacy of extracorporeal membrane oxygenation (ECMO)-facilitated resuscitation versus standard advanced cardiac life support (ACLS) treatment and included a Bayesian stopping rule based on the posterior probability of superiority at interim analyses. If an interim analysis did not lead to stopping the study, the posterior probability of superiority would be used to update the allocation probabilities.
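To make the idea of a posterior probability of superiority concrete, here is a minimal sketch for a binary outcome using independent Beta(1, 1) priors and Monte Carlo draws. It is not the ARREST algorithm: the interim counts, the stopping threshold, and the clipping of allocation probabilities are all assumptions made for illustration.

```python
import random


def prob_superiority(succ_t, n_t, succ_c, n_c, draws=20000, seed=1):
    """Estimate P(theta_treatment > theta_control | data) by Monte Carlo.

    Assumes independent Beta(1, 1) priors on each arm's success probability,
    so each posterior is Beta(1 + successes, 1 + failures).
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        theta_t = rng.betavariate(1 + succ_t, 1 + n_t - succ_t)
        theta_c = rng.betavariate(1 + succ_c, 1 + n_c - succ_c)
        wins += theta_t > theta_c
    return wins / draws


def updated_allocation(p_sup, lower=0.1, upper=0.9):
    """Allocation probability for the treatment arm, clipped to [lower, upper]
    so that neither arm is starved of patients (a hypothetical safeguard)."""
    return min(max(p_sup, lower), upper)


# Hypothetical interim data: 12/20 successes on treatment, 5/20 on control.
p_sup = prob_superiority(12, 20, 5, 20)
STOP_THRESHOLD = 0.99  # hypothetical efficacy threshold

if p_sup > STOP_THRESHOLD:
    print(f"P(superiority) = {p_sup:.3f}: recommend stopping for efficacy")
else:
    print(
        f"P(superiority) = {p_sup:.3f}: continue, allocating to treatment "
        f"with probability {updated_allocation(p_sup):.2f}"
    )
```

Using the posterior probability directly as the allocation probability is only one of several possible rules. In practice the rule, priors, and thresholds would be chosen by simulating the design's operating characteristics.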
It was clear from this presentation that one of the biggest barriers to the widespread uptake of RAR methods is the computational intensity of these techniques and the need for readily available and accessible code or apps. This point was made by the discussant, John Norrie, from the University of Edinburgh. He also highlighted the need to carefully consider the outcome for response adaptation, which needs to be quickly available so that timely modifications can be made. For example, 5-year mortality would be a bad choice of outcome as you would likely have randomized all participants before you observed your first outcome. Similarly, care needs to be taken when deciding how often you should look at the data to update randomization allocation probabilities.
A final discussion from this session concerned the ethics of such methods. The benefits of RAR are clear for rare diseases, where the study sample could comprise a large proportion of all current patients with a particular condition. Here, it would be advantageous to get patients onto the superior treatment as quickly as possible. If, on the other hand, you are considering the treatment of a more widespread condition, such as diabetes, where the study sample makes up just a tiny proportion of those affected, there are arguments that these methodologies, compared to fixed ones, affect the ethical nature of the clinical research and that, for two-arm trials, a fixed randomization allocation remains the most efficient.
Sample Size Re-estimation: Recalculating Sample Size for Optimum Outcomes
In the final session of the day, the topic of interest was sample size re-estimation (SSR). Here, each of the talks provided a practical insight into applying this methodology in various clinical trial settings.
The session began with Alan Montgomery of NCTU, providing an overview of the SSR conducted in the HI-Light vitiligo trial, a 3-arm randomized clinical trial. [3] It was highlighted that when conducting a sample size calculation for a 3-arm study it is important to consider not only the global comparison across arms but also the pairwise comparisons between arms, and the overall impact this can have if a pairwise comparison is of interest but has not been accounted for within the original calculation.
During the HI-Light trial it was agreed to perform an SSR before the end of recruitment to check the assumptions used in the original sample size calculation, which resulted in an increased sample size. The talk emphasized that SSR ideally requires outcomes that become available quickly enough to influence the overall sample size before the trial ends.
Dominique Couturier of the University of Cambridge presented modelling possibilities for a blinded and an unblinded SSR for Days Alive and at Home (DAH). Rather than considering overall time spent in hospital, the model splits time in hospital into subcategories: post-intervention protocol stays, extended initial stay, and length of readmission. Separating the time spent in hospital in this way prompted an interesting discussion on where to apply the treatment effect within the model. Monte Carlo simulation was used within both the blinded and unblinded SSR procedures, with the unblinded model maintaining a form of blinding by using the treatment allocation without revealing which treatment group is which.
Ruth Goodall of UCL closed out the event, presenting an overview of the SSR conducted within Stage 2 of the STREAM trial. [4] Due to the results of Stage 1 and changes to the protocol, two of the four treatment arms were dropped, leading to a revised sample size. A blinded SSR was conducted to maintain integrity of the study, pooling data from both treatment arms without considering treatment assignment. The sample size was to only increase should the estimate of the pooled favourable efficacy outcome fall below a pre-defined cut-off. Additionally, Ruth discussed the simulation study which was performed to assess various scenarios of the SSR and to optimize the pre-defined cut-off used.
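A blinded SSR for a binary outcome can be sketched as follows. This is a generic illustration using the standard two-proportion superiority formula, not the non-inferiority calculation or the cut-off rule used in STREAM. The pooled proportions, assumed difference, alpha, and power are hypothetical, and the sample size is only ever allowed to increase.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_pooled, delta, alpha=0.05, power=0.9):
    """Sample size per arm for comparing two proportions (1:1 allocation).

    The arm-level proportions are reconstructed from the blinded pooled
    proportion and an assumed difference delta, so no treatment labels
    are needed.
    """
    z = NormalDist().inv_cdf
    p1 = p_pooled + delta / 2
    p2 = p_pooled - delta / 2
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z(1 - alpha / 2) + z(power)) ** 2 * variance / delta ** 2)


ASSUMED_DELTA = 0.10  # hypothetical difference between arms

# Design stage: pooled favourable-outcome proportion assumed to be 0.80.
planned = n_per_arm(0.80, ASSUMED_DELTA)

# Interim: the blinded pooled estimate turns out to be lower, at 0.70.
reestimated = n_per_arm(0.70, ASSUMED_DELTA)

# Only allow the sample size to go up, never down.
final = max(planned, reestimated)
print(f"Planned per arm: {planned}, re-estimated: {reestimated}, final: {final}")
```

Because the variance of a proportion grows as it moves towards 0.5, a lower-than-expected pooled favourable proportion increases the required sample size, which is the situation a pre-defined cut-off is designed to catch.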
Conclusion
In conclusion, the adoption of innovative clinical trial designs is already revolutionizing clinical trials by enhancing efficiency, ethical conduct, and the scientific robustness of trials. These methodologies, including GSDs, RAR, and SSR, offer substantial improvements over traditional RCTs, particularly in their ability to learn from and adapt to ongoing trial data.
The discussions and insights from the meeting underscore the critical role of these adaptive designs in the future of clinical trials, aiming to maximize the knowledge gained from data while preserving statistical integrity and ethical standards.
References:
[1] Bartlett RH, Roloff DW, Cornell RG, Andrews AF, Dillon PW, Zwischenberger JB. Extracorporeal circulation in neonatal respiratory failure: a prospective randomized study. Pediatrics. 1985;76(4):479-87.
[2] Yannopoulos D, et al. Advanced reperfusion strategies for patients with out-of-hospital cardiac arrest and refractory ventricular fibrillation (ARREST): a phase 2, single centre, open-label, randomised controlled trial. Lancet. 2020;396(10265):1807-1816.
[3] Thomas KS, et al; UK Dermatology Clinical Trials Network’s HI-Light Vitiligo Trial Team. Randomized controlled trial of topical corticosteroid and home-based narrowband ultraviolet B for active and limited vitiligo: results of the HI-Light Vitiligo Trial. Br J Dermatol. 2021;184(5):828-839.
[4] Goodall RL, et al. Evaluation of two short standardised regimens for the treatment of rifampicin-resistant tuberculosis (STREAM stage 2): an open-label, multicentre, randomised, non-inferiority trial. Lancet. 2022;400(10366):1858-1868.