
Development of a predictive model of venous thromboembolism recurrence in anticoagulated cancer patients using machine learning

Aug 1, 2023 | Journal: Thrombosis Research

Andres J Muñoz, Juan Carlos Souto, Ramón Lecumberri, Berta Obispo, Antonio Sanchez, Jorge Aparicio, Cristina Aguayo, David Gutierrez, Andrés García Palomo, Victor Fanjul, Carlos Del Rio-Bermudez, María Carmen Viñuela-Benéitez, Miguel Ángel Hernández-Presa

Introduction: Patients with cancer and venous thromboembolism (VTE) are at high risk of VTE recurrence during anticoagulant treatment. This study aimed to develop a predictive model that assesses, at the time of primary VTE diagnosis, the risk of VTE recurrence within 6 months in these patients.

Materials and methods: Unstructured data were extracted from the electronic health records of 9 Spanish hospitals between 2014 and 2018 using the EHRead® technology, which is based on Natural Language Processing (NLP) and machine learning (ML). Both clinically driven and ML-driven feature selection were performed to identify predictors of VTE recurrence. Logistic regression (LR), decision tree (DT), and random forest (RF) algorithms were used to train different prediction models, which were subsequently validated in a hold-out data set.
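As an illustration of the hold-out validation step, the following is a minimal sketch of a stratified train/hold-out split. It is not the authors' pipeline: the abstract does not report the split proportion, so the 20 % hold-out fraction, the function name `stratified_holdout_split`, and the synthetic labels are all assumptions made for the example.

```python
import random


def stratified_holdout_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping the class balance of `labels`.

    Hypothetical helper: the 20 % default is an assumption, not a value
    taken from the study.
    """
    rng = random.Random(seed)

    # Group row indices by outcome class (e.g. 1 = recurrence, 0 = none).
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)

    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])   # held out, never used for training
        train_idx.extend(idx[n_test:])  # used to fit LR / DT / RF models
    return sorted(train_idx), sorted(test_idx)


# Synthetic labels for illustration only: 20 recurrences among 100 patients.
labels = [1] * 20 + [0] * 80
train_idx, test_idx = stratified_holdout_split(labels, test_fraction=0.2, seed=42)
```

Stratifying by outcome keeps the recurrence rate similar in both partitions, which matters when the event is relatively rare.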

Results: A total of 16,407 anticoagulated cancer patients with a diagnosis of VTE were identified (54.4 % male; median age 70 years). Deep vein thrombosis, pulmonary embolism, and metastases were observed in 67.2 %, 26.6 %, and 47.7 % of the patients, respectively. During the study follow-up, 11.4 % of the patients developed a recurrent VTE, which was more frequent in patients with lung cancer. Feature selection and model training based on ML identified primary pulmonary embolism, deep vein thrombosis, metastasis, adenocarcinoma, hemoglobin and serum creatinine levels, platelet and leukocyte count, family history of VTE, and patient age as predictors of VTE recurrence within 6 months of VTE diagnosis. The LR model had an AUC-ROC (95 % CI) of 0.66 (0.61, 0.70), the DT model 0.69 (0.65, 0.72), and the RF model 0.68 (0.63, 0.72).
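To make the reported metric concrete, the sketch below computes an AUC-ROC and a 95 % confidence interval on synthetic data. The abstract does not state how the intervals were obtained; the percentile bootstrap used here is an assumption, and the function names `auc_roc` and `bootstrap_ci` are hypothetical.

```python
import random


def auc_roc(y_true, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(y_true, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, see lead-in)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; AUC undefined, draw again
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic hold-out set: 10 recurrences, 30 non-recurrences, noisy scores.
rng = random.Random(1)
y_true = [1] * 10 + [0] * 30
scores = [rng.gauss(0.6 if y == 1 else 0.4, 0.2) for y in y_true]

auc = auc_roc(y_true, scores)
ci_low, ci_high = bootstrap_ci(y_true, scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, so values in the 0.66 to 0.69 range indicate modest discrimination.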

Conclusions: This is the first ML-based predictive model designed to predict 6-month VTE recurrence in patients with cancer. These results hold great potential to help clinicians identify high-risk patients and improve their clinical management.

CITATION  Thromb Res. 2023 Aug:228:181-188.  doi: 10.1016/j.thromres.2023.06.015
