Text mining and word embedding for classification of decision making variables in breast cancer surgery
G Catanuto 1, N Rocco 2, A Maglia 3, P Barry 4, A Karakatsanis 5, G Sgroi 6, G Russo 7, F Pappalardo 7, M B Nava 8, ETHOS Collaborative Group
Introduction: Decision making in surgical oncology of the breast has grown more complex over the last twenty years. This Delphi survey investigates the opinion of an expert panel on the decision-making process for oncological breast surgery.
Methods: Twenty-seven experts were invited to take part in a Delphi survey. In the first round they were asked to provide a list of features involved in the decision-making process (patient characteristics, disease characteristics, surgical techniques, outcomes) and to comment on it. Using text-mining techniques, we extracted a list of mono-, bi- and trigrams potentially representative of decision drivers.
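The n-gram extraction step can be illustrated with a minimal sketch. This is not the authors' pipeline: the tokenisation rule, the function name `extract_ngrams` and the sample answer are assumptions made up for illustration, and a real workflow would probably also remove stop words and filter by frequency.

```python
import re
from collections import Counter


def extract_ngrams(text, max_n=3):
    """Count every 1-gram up to max_n-gram in a free-text answer.

    Hypothetical helper: tokens are lowercase alphabetic runs, and
    n-grams are allowed to cross punctuation for simplicity.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Invented panel answer, for illustration only.
answer = "Breast size and tumour size guide the choice; breast size matters most."
counts = extract_ngrams(answer)

# Candidate decision drivers are n-grams that recur across answers.
recurring = sorted(term for term, c in counts.items() if c > 1)
print(recurring)
```

In this toy input, "breast size" recurs as a bigram, which is the kind of candidate decision driver that would then be passed on for validation.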
Word2vec, a natural language processing technique, was used to validate changes made to the texts with synonyms and plesionyms. Word2vec was also used to test the semantic relevance of the n-grams within a corpus of knowledge made up of books edited by panel members. The final list of extracted variables was submitted to the panel for final validation in the second round of the Delphi survey, using closed-ended questions.
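Word embeddings such as those produced by Word2vec are usually compared by cosine similarity, which is how a synonym or plesionym can be checked against the term it replaces. The sketch below shows only that comparison step. It does not train a Word2vec model, the three-dimensional vectors are invented for illustration, and the names `cosine_similarity`, `rank_by_similarity` and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(term, vectors):
    """Return the other terms sorted from most to least similar to `term`."""
    scored = [
        (cosine_similarity(vectors[term], vec), other)
        for other, vec in vectors.items()
        if other != term
    ]
    return [other for _, other in sorted(scored, reverse=True)]


# Toy vectors, assumed for illustration; real Word2vec embeddings are
# learned from a corpus and typically have hundreds of dimensions.
toy_vectors = {
    "mastectomy": [0.9, 0.1, 0.2],
    "breast removal": [0.8, 0.2, 0.3],
    "weather": [0.0, 0.9, 0.1],
}

ranking = rank_by_similarity("mastectomy", toy_vectors)
print(ranking)
```

A replacement term would be accepted only if it ranks close to the original; the threshold for "close" is a design choice that the abstract does not specify.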
Results: Of 59 features, 52 were approved by the panel. The overall consensus was 87.1%.
Conclusions: Text mining and natural language processing allowed the extraction of a number of decision drivers and outcomes that form part of the decision-making process in surgical oncology of the breast. This result was obtained by transforming narrative texts into structured data. The high level of consensus among experts validated this process.