A majority of studies investigating the use of patient-reported outcomes (PROs) in prognostic models in oncology report that their inclusion has independent and statistically significant predictive value for overall survival. PROs frequently outperform physician-recorded performance status, and this finding generalises across cancer clinical domains as well as study methodologies.
Although the evidence supporting their value for prognostication is compelling, the mechanistic relationships between PROs and the outcomes of interest are nuanced, multivariate, and frequently latent. Understanding these relationships requires characterising interactions across different time scales and domains, both within and external to the health system. We also see a wide gap between the theoretical impact of PRO-based predictive modelling and what is practically deliverable in routine care, owing to both data and engagement barriers, meaning that this promise is yet to be borne out in practice. It follows that the effective use of PROs for predictive modelling, and the subsequent translation of these models into care, requires a maturation of data capture, harmonisation, and analytic practices. These improvements must overcome challenges both inherent to the general use of routinely collected observational data to drive care and specific to the integration of PROs in clinical workflows and systems.
In this session we will present generalised requirements for the data supply chain specific to the use of PROs in machine learning and AI models. We will specifically address issues shaped by the two models of PRO capture (i.e. prospectively captured registry data vs. real-time clinically deployed instruments) and their respective use cases. We will also detail the relationship between patient-reported variables and the content of clinical letters, notes, and extended naturalistic data capture, as made accessible by advances in natural language processing.