Technology, AI and real world evidence

Real-world evidence (RWE) in medicine is clinical evidence about the use and potential benefits or risks of a medical product, derived from the analysis of real-world data (RWD). RWD are data collected outside of a clinical trial that relate to patient health status and/or the delivery of healthcare. RWD are routinely collected from a range of digital health sources, for example electronic health records (EHRs), product and disease registries, patient-generated data, medical claims and billing databases, and mobile devices.

Increasing volumes of RWD are being produced following the development of specialist devices and sophisticated data collection techniques. Together with advances in computing power and storage, this creates an opportunity to apply powerful artificial intelligence (AI) approaches to these data and extract valuable insights for patient benefit. In drug development, applying AI to RWD to generate RWE has huge potential; examples include analysing patient treatment pathways, estimating patients’ risk of developing disease, and tracking patient behaviours and adherence.

Two aspects of AI are particularly important for RWD and RWE: natural language processing (NLP) and machine learning (ML). NLP offers an automated way to process unstructured text, which is particularly useful because a large amount of RWD is unstructured yet potentially rich in information, for example clinician notes, patient diary entries or even social media posts. Processing unstructured text in this way is useful for many applications, including preparing the data as input to an ML algorithm that predicts an outcome.
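As an illustration of how unstructured text can be prepared for an ML algorithm, the sketch below builds a simple bag-of-words representation, in which each note becomes a vector of word counts. It is one basic NLP technique among many; the function names and the example notes are invented for illustration and are not taken from any real dataset.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def bag_of_words(notes):
    """Turn free-text notes into count vectors over a shared vocabulary."""
    counts = [Counter(tokenize(note)) for note in notes]
    vocabulary = sorted(set().union(*counts))
    vectors = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, vectors


# Invented example notes, not real patient data.
notes = [
    "Patient stable, discharged home.",
    "Patient short of breath; chest pain on discharge.",
]
vocabulary, vectors = bag_of_words(notes)

# Each note is now a fixed-length list of numbers, one per vocabulary word.
for note, vector in zip(notes, vectors):
    print(note, "->", vector)
```

Once every note is a fixed-length numeric vector, it can be passed to most standard ML models. Real pipelines typically add further steps, such as removing very common words or weighting rare ones more heavily.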

ML refers to computer algorithms that build a mathematical model from a set of training data in order to make predictions on unseen data (test data), without being explicitly programmed to do so. There are different categories of ML, including supervised (where the desired output is known) and unsupervised (where the desired output is not known), and different types of model within each category. The category and model chosen for an ML approach depend on the problem, the data and the constraints.
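The difference between the two categories can be sketched with a toy example. The supervised model below (a nearest-centroid classifier) learns from labelled training data and then predicts the label of an unseen point; the unsupervised routine (k-means with two clusters) groups the same points without ever seeing the labels. Both models, the feature values and the "low risk" and "high risk" labels are assumptions chosen for brevity, not methods or data described in this article.

```python
def centroid(points):
    """Mean of a list of equal-length numeric vectors."""
    return [sum(col) / len(points) for col in zip(*points)]


def squared_distance(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_nearest_centroid(X, y):
    """Supervised: labels y are known, so learn one centroid per label."""
    return {
        label: centroid([x for x, l in zip(X, y) if l == label])
        for label in set(y)
    }


def predict(model, x):
    """Assign x to the label whose centroid is closest."""
    return min(model, key=lambda label: squared_distance(model[label], x))


def two_means(X, iterations=10):
    """Unsupervised: no labels, so group points into two clusters."""
    centres = [X[0], X[-1]]  # naive initialisation, an assumption for brevity
    for _ in range(iterations):
        groups = [[], []]
        for x in X:
            nearest = min((0, 1), key=lambda i: squared_distance(centres[i], x))
            groups[nearest].append(x)
        # Keep the old centre if a cluster ends up empty.
        centres = [centroid(g) if g else c for g, c in zip(groups, centres)]
    return [min((0, 1), key=lambda i: squared_distance(centres[i], x)) for x in X]


# Invented toy data: two numeric features per patient.
X_train = [[1.0, 1.2], [0.8, 1.0], [4.0, 4.2], [4.2, 3.9]]
y_train = ["low risk", "low risk", "high risk", "high risk"]

model = fit_nearest_centroid(X_train, y_train)
prediction = predict(model, [3.9, 4.0])  # unseen data point
clusters = two_means(X_train)            # labels are not used here

print(prediction)
print(clusters)
```

The supervised model returns a named label because it was shown labels during training. The unsupervised routine can only return anonymous cluster numbers, which then have to be interpreted by a person.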

At PHASTAR we have applied NLP and ML to the MIMIC-III (Medical Information Mart for Intensive Care) dataset, a large, single-center database comprising information relating to patients admitted to critical care units at a large tertiary hospital. We looked at whether we could predict readmission to the critical care unit using the patient’s discharge notes, which consist of unstructured blocks of text. By applying simple NLP methods, we were able to represent the clinician notes in a form that could be used as input to a machine learning algorithm. With this approach we were able to build predictive models of hospital readmission and gain initial insights into the aspects of the text that were most important to that prediction.
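The general shape of such a workflow can be sketched as follows. This is a minimal illustration only and not the PHASTAR analysis: it assumes a binary bag-of-words representation and a logistic regression model trained by gradient descent, and it uses four invented notes with invented labels rather than MIMIC-III data. All function and variable names are hypothetical.

```python
import math
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def vectorize(note, vocabulary):
    """Binary bag-of-words: 1.0 if the word appears in the note, else 0.0."""
    words = set(tokenize(note))
    return [1.0 if w in words else 0.0 for w in vocabulary]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain batch gradient descent."""
    weights, bias = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(bias + sum(w * v for w, v in zip(weights, xi))) - yi
            grad_b += error
            grad_w = [g + error * v for g, v in zip(grad_w, xi)]
        bias -= lr * grad_b / len(X)
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
    return weights, bias


def predict_proba(note, vocabulary, weights, bias):
    """Estimated probability of readmission for a new note."""
    x = vectorize(note, vocabulary)
    return sigmoid(bias + sum(w * v for w, v in zip(weights, x)))


# Invented notes and labels (1 = readmitted), purely for illustration.
notes = [
    "stable and comfortable at discharge",
    "comfortable, eating well, stable",
    "persistent fever and low oxygen at discharge",
    "low oxygen overnight, fever unresolved",
]
readmitted = [0, 0, 1, 1]

vocabulary = sorted({w for note in notes for w in tokenize(note)})
X = [vectorize(note, vocabulary) for note in notes]
weights, bias = train_logistic(X, readmitted)

# Words with the largest positive weights push the prediction towards readmission.
top_words = sorted(zip(vocabulary, weights), key=lambda p: p[1], reverse=True)[:3]
risk = predict_proba("fever and low oxygen", vocabulary, weights, bias)

print(top_words)
print(risk)
```

Inspecting the largest weights is one simple way to see which words drive a prediction. With only four toy notes the result shows the mechanics and nothing more; a real analysis would need far more data, a held-out test set and careful validation.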

RWD and RWE potentially offer enormous value to drug development and healthcare, and at PHASTAR we would welcome the opportunity to work with you to understand how RWD and RWE may benefit your business.