Top ten errors of statistical analysis in observational studies for cancer research
A Carmona-Bayonas 1 , P Jimenez-Fonseca 2 , A Fernández-Somoano 3 4 , F Álvarez-Manceñido 5 , E Castañón 6 , A Custodio 7 , F A de la Peña 8 , R M Payo 9 , L P Valiente 10
Observational studies using registry data make it possible to compile high-quality information and can surpass clinical trials in some contexts. However, data heterogeneity, analytical complexity, and the diversity of aspects to consider when interpreting results make mistakes easy and call for mastery of statistical methodology.
Questionable research practices, including poor analytical data management, are responsible for the low reproducibility of some results; yet the literature offers little information on the specific statistical pitfalls of cancer studies.
This article seeks to expose ten common problematic situations in the analysis of cancer registries and to propose how to avoid or solve them: convenience, dichotomization, stratification, regression to the mean, impact of sample size, competing risks, immortal time and survivor bias, management of missing values, and data dredging.