View from the summit: The Best Laid Plans...Navigating Uncertainty During COVID-19

Day 2 of PHASTAR’s Life Science Summit began with an insightful and engaging plenary panel discussion with Dr. Janet Wittes, Prof. Stephen Senn and Prof. Frank Harrell, focusing on “The Best Laid Plans...Navigating Uncertainty During COVID-19”. Each panelist gave a short presentation, followed by a lively Q&A session chaired by PHASTAR’s Head of Statistical Research, Prof. Jennifer Rogers.

Dr. Wittes kicked off day 2 by delivering a candid assessment of how the COVID-19 pandemic has changed her thinking on a variety of aspects of clinical trials, while pondering whether, and how, future clinical trials will change post-COVID-19. She highlighted the fact that there are presently over 500 COVID-19-related clinical trials and the number is growing. A key aspect of her talk focused on how we might use this unfortunate situation to learn and adapt our approaches so that, in the post-COVID-19 setting, clinical trials might be more efficient and better designed.

Discussing how we should be thinking about current clinical trials impacted by the pandemic, Dr. Wittes stressed the importance of assessing the level of information collected in a trial when deciding how to proceed. She presented some recent examples of affected clinical trials that she has been involved with, including one trial in an oncology setting in which 95% of data had been collected at the start of the pandemic, allowing for the decision to stop the trial early. A second example she described is in an orphan disease setting in which the outcome is ventilator-free survival, an outcome expected to be impacted in the US by the lack of ventilators that occurred early in the pandemic. A third example is a trial for a serious disease that has a continuous outcome with 12-month follow-up and is being conducted in a population in which depression occurs frequently. With the social distancing and isolation enacted to curb the spread of COVID-19, Dr. Wittes commented on how social factors may introduce bias into assessment in this trial. Increasing the 12-month window to 15 months and using multiple imputation are some of the approaches suggested to deal with missing data.

A fourth example was a CV outcome trial with a first-occurrence-of-MACE+ outcome, for which fewer than 20% of events had occurred when COVID-19 struck. Trial enrollment was paused; however, data collection in enrolled patients continues. The biggest concern over how the pandemic might impact this trial is the possible undercounting of CV-related deaths, as deaths might be assessed differently during the pandemic.

Dr. Wittes also raised the point that during the COVID-19 period, adherence to study drug and procedures might be low and data collection might vary (compared to the pre- and post-COVID-19 periods), complicating inference from the data. Overall, she recommends conducting and detailing many sensitivity analyses when analyzing clinical trials impacted by this pandemic.

Following Dr. Wittes’s presentation, Prof. Stephen Senn opened his talk, “Clinical trials: quo vadis in age of COVID-19?”, by remarking that any damned fool can analyse a randomized clinical trial, and frequently does. He began by discussing trade-offs between one- and two-sample tests in the analysis of a single-arm study: the one-sample test nominates a target response rate based on historical data, whereas the two-sample test finds a set of historical controls and treats them as if they were a concurrent control. He views the one-sample approach as cheating, as the comparison against a historical target implicitly assumes a large concurrent control group. For the two-sample test, given that the parallel-group analogy fails, it might be better to think of the historical controls as having come from a cluster RCT, as patients from the same center are treated the same but may differ from those at other centers. His thoughts on the planning and integration of historical controls include the need to pre-specify a suitable historical control set with a finalized analysis plan prior to study start, the use of a hierarchical model, and an emphasis on the number of historical trials over the number of patients.
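To make the one-sample versus two-sample contrast concrete, here is a minimal Python sketch. It is not taken from Prof. Senn's talk: the response counts (24/40 in the single-arm study, 40/100 among historical controls) and the function names are hypothetical, and the two-sample test shown is a simple pooled z-test rather than the hierarchical model he recommends.

```python
from math import comb, erf, sqrt

def one_sample_binomial_p(successes, n, target_rate):
    """One-sided exact binomial p-value: P(X >= successes) if the true rate equals target_rate."""
    return sum(
        comb(n, k) * target_rate**k * (1 - target_rate) ** (n - k)
        for k in range(successes, n + 1)
    )

def two_sample_z_p(s1, n1, s0, n0):
    """One-sided pooled two-proportion z-test p-value for rate1 > rate0 (normal approximation)."""
    pooled = (s1 + s0) / (n1 + n0)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n0))
    z = (s1 / n1 - s0 / n0) / se
    return 0.5 * (1 - erf(z / sqrt(2)))  # upper-tail standard normal probability

# Hypothetical data: single-arm study and a set of historical controls
study_successes, study_n = 24, 40
hist_successes, hist_n = 40, 100

# One-sample test: the historical rate is treated as a fixed, exactly known target
p_one = one_sample_binomial_p(study_successes, study_n, hist_successes / hist_n)

# Two-sample test: the historical controls are treated as if they were concurrent
p_two = two_sample_z_p(study_successes, study_n, hist_successes, hist_n)

print(f"one-sample p = {p_one:.4f}, two-sample p = {p_two:.4f}")
```

With these illustrative numbers the one-sample p-value comes out smaller, because the target rate is treated as if it carried no sampling error. Neither test accounts for between-study variation, which is the reason for thinking of historical controls as clusters.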

The second part of Prof. Senn’s presentation covered examples relevant to COVID-19, touching on some recently published studies. These include the single-arm HCQ study that has been critiqued for incorrect use of statistical methods, the RECOVERY trial, and a 2009 study of the effect of face masks on virus transmission. He concluded that not all studies during COVID-19 have been done well and that there is a continued need for careful thinking, control and decision making in current and future research around COVID-19.

Prof. Frank Harrell was the final of the three plenary speakers on day 2, speaking on “Bayes for Flexibility in Urgent Times”. Prof. Harrell began with his view of the big picture, which is that efficacy is not a hypothesis but a matter of degree, that hypothesis testing and thresholds hurt science, and that probabilities conditioning backward in time or information are often not actionable. He asked, ‘would we rather know the chance of making an assertion of efficacy when the treatment has no effect, or the chance the treatment is effective?’.

Specific to COVID-19, Prof. Harrell spoke about the fact that in the beginning little was known about the virus, leading to the need for rapid decision making by researchers and DSMBs. This led to his insightful presentation on the advantages of Bayes, particularly in sequential or adaptive trial designs. Using a Bayes approach in a fully sequential design, Prof. Harrell demonstrated in a simulation of 50,000 different trials that the posterior probabilities at the time of stopping early for efficacy or futility were nearly the same as the proportions of trials with true means greater than 0. He summarized the advantages of Bayesian approaches as the ability to compute probabilities on an actionable scale, flexibility, a simultaneous view of the totality of evidence, and the freedom to continue experimentation until sufficient evidence has accumulated.
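The calibration property described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a reproduction of Prof. Harrell's code: it assumes a normal outcome with known unit variance, a N(0, 1) prior that is also used to generate the true effects, symmetric stopping thresholds of 0.95 and 0.05, a look after every patient, and 4,000 trials rather than 50,000 to keep the run time short. All function and parameter names are hypothetical.

```python
import random
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def simulate_trial(rng, prior_sd=1.0, max_n=200, upper=0.95, lower=0.05):
    """Run one fully sequential trial; return (true_mean, P(mean > 0 | data) at stopping)."""
    true_mean = rng.gauss(0.0, prior_sd)  # true effect drawn from the analysis prior
    total = 0.0
    post_prob = 0.5
    for n in range(1, max_n + 1):
        total += rng.gauss(true_mean, 1.0)  # one new observation, unit variance
        precision = 1.0 / prior_sd**2 + n   # posterior precision of the mean
        # posterior mean is total / precision, posterior sd is 1 / sqrt(precision)
        post_prob = normal_cdf(total / sqrt(precision))
        if post_prob > upper or post_prob < lower:
            break  # stop early for efficacy or futility
    return true_mean, post_prob

def calibration(n_trials=4000, seed=2020, upper=0.95):
    """Compare the mean posterior probability with the share of truly positive effects."""
    rng = random.Random(seed)
    results = [simulate_trial(rng, upper=upper) for _ in range(n_trials)]
    stopped = [(m, p) for m, p in results if p > upper]  # trials stopped for efficacy
    mean_post = sum(p for _, p in stopped) / len(stopped)
    prop_positive = sum(m > 0 for m, _ in stopped) / len(stopped)
    return mean_post, prop_positive, len(stopped)

mean_post, prop_positive, n_efficacy = calibration()
print(f"trials stopped for efficacy: {n_efficacy}")
print(f"mean posterior P(mean > 0):  {mean_post:.3f}")
print(f"proportion with true mean > 0: {prop_positive:.3f}")
```

Among trials stopped for efficacy, the two printed quantities should agree up to simulation error, even though the data were examined after every observation. The agreement relies on the true effects being drawn from the same prior used in the analysis.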

Prof. Harrell stated that he believes the number one hindrance to adopting Bayes is the fear of not preserving the type I error, but he explained that type I error is not an error but a probability (independent of the data) of asserting efficacy when the true efficacy equals 0. Type I error is not ‘regulator’s regret’, i.e. approving a drug that does not work, but rather the probability of an assertion, assuming the drug does not work.
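The two quantities being contrasted can be written side by side. The notation here is an assumption for illustration and not from the talk: \Delta denotes the true treatment effect.

```latex
\begin{aligned}
\text{Type I error:} \quad & \alpha = \Pr(\text{assert efficacy} \mid \Delta = 0) \\
\text{Posterior probability of efficacy:} \quad & \Pr(\Delta > 0 \mid \text{data})
\end{aligned}
```

The first conditions on an unknowable state of nature and does not depend on the observed data; the second conditions on the data in hand, which is the sense in which Prof. Harrell calls it actionable.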

Questions submitted to Dr. Wittes, Prof. Senn and Prof. Harrell: 

Q: How might we, as researchers and statisticians, better collaborate and coordinate clinical trials for the COVID-19 pandemic?

A: Prof. Harrell responded that there is a new initiative at the NIH intended to function as an overseer of ongoing COVID-19 RCTs and to provide a dashboard of findings side by side. Dr. Wittes addressed this question by noting that treatment effect/benefit may present at different stages of the disease and vary according to patient population, which is important to keep in mind when interpreting results across trials. Prof. Senn mentioned that, given the large number of ongoing clinical trials in COVID-19, it is hard for a researcher to keep track of them all. That said, there is increased acknowledgement that trial replication is essential when deciding whether a treatment is effective. He also mentioned that, in general, he is less concerned about trials that are stopped early for futility than those stopped early for efficacy.

Q: Prof. Senn then commented on Prof. Harrell’s presentation, noting that it is incorrect to say that Bayes analyses do not need to adapt for randomization.

A: Prof. Harrell noted that the biggest issue his talk ignored is that of exchangeability, especially the challenges arising from later-enrolled patients in a trial having shorter follow-up than the first-enrolled patients, from shifts in baseline demographics and characteristics as the trial goes on, such that the earlier-enrolled patients tend to be sicker than the later-enrolled ones, and from the tendency for the treatment effect to shift over the course of a trial due to changes in care.

Q: Please comment on the use of propensity scores to adjust for baseline imbalances in a clinical trial. 

A: Prof. Senn: Using propensity scores (PSs) for this purpose is a mistake. PSs adjust for things that are predictive of allocation but miss things regarding baseline factors. It is therefore better to model and adjust for these imbalances directly rather than use PSs.

Q: Please comment on if you think that changes to the primary endpoint and analyses due to COVID-19 are appropriate.  

A: Dr. Wittes: There are certainly situations when this is necessary. What is important here is who is making the change and why. For example, a change in the primary endpoint might be necessary because of new challenges in measuring the original endpoint, or changes in the analyses might be needed to account for censoring. Any changes require careful review, documentation, and description. Prof. Harrell commented that changes in the endpoint, e.g. a change in the levels of an ordinal scale, might be needed as data accumulate during the pandemic.

Q: What are your thoughts as to how the COVID-19 pandemic will negatively impact ongoing RCTs?

A: Prof. Senn commented that he is more concerned about the potential for bias versus changes to the effect size. He sees no reason to think that the type I error rate would be impacted; however, he does anticipate a loss of power in certain situations. He also commented that there is a need for more public engagement with the people involved in the mathematical modeling of this pandemic, to better understand and communicate what the models are telling us.