Sunday 27 July 2025
Scientists have developed a new approach to predicting clinical risks based on electronic health records (EHRs). The method, called Temporal-Hierarchical Causal Modelling with Conformal Calibration (THCM-CAL), can accurately identify patients at risk of developing serious conditions such as sepsis and pneumonia.
Traditional methods for analyzing EHRs rely on simplistic fusion strategies that ignore the complex causal relationships between different types of patient data. THCM-CAL, on the other hand, constructs a multimodal causal graph that takes into account both structured diagnostic codes and unstructured narrative notes from patients’ medical records.
The model uses hierarchical causal discovery to identify three clinically relevant interactions: intra-slice same-modality sequencing, intra-slice cross-modality triggers, and inter-slice risk propagation. This allows it to capture the nuanced relationships between different patient data points over time.
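To make the three interaction types concrete, here is a minimal sketch in Python of how an edge in such a graph could be labelled. It assumes that a "slice" is one time window (for example, a single hospital visit) and that each node belongs to one of two modalities, codes or notes. The names `Node` and `edge_type` are hypothetical and are not taken from THCM-CAL itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    time_slice: int  # index of the time window (assumed: one visit per slice)
    modality: str    # "code" for structured ICD codes, "note" for narrative text
    name: str        # illustrative label for the clinical concept


def edge_type(src: Node, dst: Node) -> str:
    """Label a directed edge with one of the three interaction types."""
    if src.time_slice == dst.time_slice:
        if src.modality == dst.modality:
            return "intra-slice same-modality sequencing"
        return "intra-slice cross-modality trigger"
    if src.time_slice < dst.time_slice:
        return "inter-slice risk propagation"
    # A cause cannot follow its effect, so backward edges are rejected.
    raise ValueError("edges must not point backwards in time")


fever_note = Node(0, "note", "fever mentioned")
infection_code = Node(0, "code", "infection code")
sepsis_code = Node(1, "code", "sepsis code")

print(edge_type(fever_note, infection_code))
print(edge_type(infection_code, sepsis_code))
```

The sketch only classifies edges that are already given. Discovering which edges exist is the harder part and is what the causal discovery step in the model addresses.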
To make predictions more reliable, THCM-CAL extends conformal prediction to multi-label ICD coding, calibrating per-code confidence intervals under complex co-occurrences. The aim is for each predicted code to carry a calibrated measure of confidence rather than a bare score.
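For readers unfamiliar with conformal prediction, the sketch below shows the basic idea for a single code, using the standard split-conformal recipe. It is not the calibration procedure from THCM-CAL, which also has to handle co-occurring codes. The function names, the example codes, and the calibration values are illustrative assumptions.

```python
import math


def conformal_threshold(cal_probs, alpha):
    """Minimum probability at which a code enters the prediction set.

    cal_probs: model probabilities for held-out calibration patients who
    truly have the code. With nonconformity defined as 1 - probability,
    the k-th smallest nonconformity is the k-th largest probability.
    """
    n = len(cal_probs)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration examples for this alpha: always include the code.
        return 0.0
    return sorted(cal_probs, reverse=True)[k - 1]


def prediction_set(probs, thresholds):
    """Keep every code whose probability reaches its own threshold."""
    return {code for code, p in probs.items() if p >= thresholds.get(code, 0.0)}


# Hypothetical calibration probabilities for two codes.
thresholds = {
    "sepsis": conformal_threshold([0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1], alpha=0.5),
    "pneumonia": conformal_threshold([0.75, 0.5, 0.25], alpha=0.25),
}
print(prediction_set({"sepsis": 0.55, "pneumonia": 0.2}, thresholds))
```

If calibration and test patients are exchangeable, each code that is truly present lands in the prediction set with probability at least 1 - alpha. That guarantee holds for each code separately and says nothing about all of a patient's codes being captured at once.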
The researchers tested THCM-CAL on two large datasets: MIMIC-III and MIMIC-IV. The results showed that the model outperformed existing approaches in predicting clinical risks, with significant improvements in area under the receiver operating characteristic curve (AUROC), precision at 10 and 20 codes (P@10 and P@20), and recall at 10 and 20 codes (R@10 and R@20).
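Precision and recall at k are computed per patient from the model's k highest-scoring codes. The sketch below follows the usual convention in ICD coding work of dividing the number of hits by k for precision and by the number of true codes for recall. The function name and the example scores are assumptions made for illustration.

```python
def precision_recall_at_k(scores, true_codes, k):
    """Return (P@k, R@k) for one patient.

    scores: dict mapping each candidate code to the model's score.
    true_codes: set of codes actually assigned to the patient.
    """
    top_k = sorted(scores, key=scores.get, reverse=True)[:k]
    hits = sum(1 for code in top_k if code in true_codes)
    precision = hits / k
    recall = hits / len(true_codes) if true_codes else 0.0
    return precision, recall


# Hypothetical scores for four codes; the patient truly has A and C.
scores = {"A": 0.9, "B": 0.8, "C": 0.1, "D": 0.7}
print(precision_recall_at_k(scores, {"A", "C"}, k=2))  # (0.5, 0.5)
```

Reported figures are typically the average of these per-patient values over the whole test set.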
The implications of THCM-CAL are far-reaching. It has the potential to improve patient outcomes by enabling healthcare professionals to make more informed decisions about treatment and care. Additionally, it could reduce healthcare costs by identifying high-risk patients earlier, allowing for targeted interventions.
In a related development, researchers have also developed new thresholding strategies for calculating F1 scores in ICD code prediction tasks. These strategies aim to provide a fair comparison between different models by accounting for the varying methods used to convert model scores into binary labels.
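The reason thresholding matters is easy to show. The same model scores can yield very different F1 values depending on the cut-off used to turn them into yes/no labels. The sketch below compares a fixed cut-off of 0.5 with one chosen from a list of candidates. These two strategies are common choices picked here for illustration and are not necessarily the ones the researchers proposed. All names and numbers are assumptions.

```python
def micro_f1(scores, labels, threshold):
    """Micro-averaged F1 over all (patient, code) pairs at one threshold."""
    tp = fp = fn = 0
    for score_row, label_row in zip(scores, labels):
        for score, label in zip(score_row, label_row):
            predicted = score >= threshold
            if predicted and label:
                tp += 1
            elif predicted:
                fp += 1
            elif label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(scores, labels, candidates):
    """Pick the candidate threshold with the highest micro-F1."""
    return max(candidates, key=lambda t: micro_f1(scores, labels, t))


# Two hypothetical patients, three codes each.
scores = [[0.9, 0.4, 0.2], [0.3, 0.6, 0.1]]
labels = [[1, 1, 0], [1, 0, 0]]

tuned = best_threshold(scores, labels, [0.3, 0.5, 0.7])
print(micro_f1(scores, labels, 0.5), tuned, micro_f1(scores, labels, tuned))
```

In practice the threshold should be tuned on a validation set and then applied unchanged to the test set. Tuning on the test data itself inflates the score, which is one way comparisons between models become unfair.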
Cite this article: “Predictive Model Enhances Clinical Risk Identification with Electronic Health Records”, The Science Archive, 2025.
Electronic Health Records, Clinical Risks, Temporal-Hierarchical Causal Modelling, Conformal Calibration, Sepsis, Pneumonia, Multi-Label ICD Coding, Receiver Operating Characteristic Curve, Healthcare Costs, Patient Outcomes