Prospective validation of a new imaging scorecard to assess leptomeningeal metastasis: a joint EORTC BTG and RANO effort
Emilie Le Rhun 1 2 , Patrick Devos 3 , Sebastian Winklhofer 4 , Hafida Lmalen 5 , Dieta Brandsma 6 , Priya Kumthekar 7 , Antonella Castellano 8 , Annette Compter 6 , Frederic Dhermain 9 , Enrico Franceschi 10 , Peter Forsyth 11 , Julia Furtner 12 , Norbert Galldiks 13 , Jaime Gállego Pérez-Larraya 14 , Jens Gempt 15 , Elke Hattingen 16 , Johann Martin Hempel 17 , Slavka Lukacova 18 , Giuseppe Minniti 19 , Barbara O'Brien 20 , Tjeerd J Postma 21 , Patrick Roth 1 , Roberta Rudà 22 , Niklas Schaefer 23 , Nils O Schmidt 24 , Tom J Snijders 25 , Steffi Thust 26 , Martin van den Bent 27 , Anouk van der Hoorn 28 , Guillaume Vogin 29 , Marion Smits 27 30 , Joerg C Tonn 31 , Kurt Jaeckle 32 , Matthias Preusser 33 , Michael Glantz 34 , Patrick Y Wen 35 , Martin Bendzsus 36 , Michael Weller 1
Background: Validation of the 2016 RANO MRI scorecard for leptomeningeal metastasis failed for multiple reasons. Accordingly, this joint EORTC Brain Tumor Group and RANO effort sought to prospectively validate a revised MRI scorecard for response assessment in leptomeningeal metastasis.
Methods: Coded paired cerebrospinal MRI scans of 20 patients with leptomeningeal metastases from solid cancers, acquired at baseline and at follow-up after treatment, were provided via the EORTC imaging platform together with instructions for assessment. Inter-observer pairwise agreement was evaluated using the Kappa coefficient.
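As an illustration of the pairwise agreement analysis described in the Methods, the following is a minimal sketch, not the study's actual analysis code. It assumes the unweighted Cohen's Kappa for two raters (the text specifies only "the Kappa coefficient"), complete ratings for every patient, and hypothetical names (`cohen_kappa`, `median_pairwise_kappa`, and the `ratings` mapping of rater to per-patient scores).

```python
from collections import Counter
from itertools import combinations
from statistics import median


def cohen_kappa(a, b):
    """Unweighted Cohen's Kappa for two raters scoring the same cases.

    Assumption: the variant is Cohen's Kappa; the source does not say.
    """
    if len(a) != len(b) or not a:
        raise ValueError("both raters must score the same non-empty case list")
    n = len(a)
    # Observed agreement: fraction of cases with identical scores.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the raters' marginal category frequencies.
    count_a, count_b = Counter(a), Counter(b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both raters used one identical category throughout; Kappa is
        # undefined (0/0) and is reported here as 1.0 by convention.
        return 1.0
    return (observed - expected) / (1.0 - expected)


def median_pairwise_kappa(ratings):
    """Median Kappa over all unordered rater pairs.

    `ratings` (hypothetical layout) maps a rater id to a list of scores,
    one per patient, in the same patient order for every rater.
    """
    kappas = [
        cohen_kappa(ratings[r1], ratings[r2])
        for r1, r2 in combinations(sorted(ratings), 2)
    ]
    return median(kappas)


if __name__ == "__main__":
    # Toy data: 3 raters, 4 patients, binary finding (1 = present).
    toy = {
        "rater_A": [1, 1, 0, 0],
        "rater_B": [1, 0, 0, 0],
        "rater_C": [1, 1, 0, 0],
    }
    print(median_pairwise_kappa(toy))
```

With 35 raters this yields 595 rater pairs per imaging item, and the median of those pairwise values corresponds to the "median concordance" figures reported in the Results, under the assumptions stated above.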
Results: Thirty-five raters participated, including 9 neuroradiologists, 17 neurologists, 4 radiation oncologists, 3 neurosurgeons and 2 medical oncologists. Among single leptomeningeal metastases-related imaging findings at baseline, the best median concordance was noted for hydrocephalus (Kappa=0.63), and the worst median concordance for spinal linear enhancing disease (Kappa=0.46).
The median concordance of raters for the overall response assessment was moderate (Kappa=0.44). Notably, the inter-observer agreement for the presence of parenchymal brain metastases at baseline was fair (Kappa=0.29) and virtually absent for their response to treatment. Of 700 ratings (20 patients × 35 raters), 394 (56%) were fully completed. In 308 of these 394 fully completed ratings (78%), the overall response assessment perfectly matched the summary interpretation of the single ratings as proposed in the scorecard instructions.
Conclusion: This study confirms, in principle, the utility of the new scorecard, but also indicates the need for training in MRI assessment with a dedicated reviewer panel in clinical trials. Electronic case report forms with "blocking options" may be required to enforce completeness and quality of scoring.