Unbiased data analytic strategies to improve biomarker discovery in precision medicine.


Endocrine and Diabetes Platform, Department of Physiology, University of Toronto, Medical Sciences Building, Room 3352, 1 King's College Circle, Toronto, ON M5S 1A8, Canada; Advanced Diagnostics, Metabolism, Toronto General Hospital Research Institute, Toronto, ON, Canada.


Omics technologies promised improved biomarker discovery for precision medicine. The foremost problem with discovered biomarkers is their irreproducibility between patient cohorts. From a data analytics perspective, the main reasons for these failures are bias in statistical approaches and overfitting resulting from batch effects and confounding factors. The keys to reproducible biomarker discovery are proper study design, unbiased data preprocessing and quality control analyses, and a knowledgeable application of statistics and machine learning algorithms. In this review, we discuss study design and analysis considerations and suggest standards from an expert point of view to promote unbiased decision-making in biomarker discovery in precision medicine.
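
One common way that bias in a statistical workflow produces overfitting can be illustrated with a small simulation. The sketch below is not taken from the review. It assumes pure-noise "omics" data with no real biomarker, a simple mean-difference feature filter, and a nearest-centroid classifier, all chosen only for illustration. The function names (`make_noise_data`, `select_features`, `cv_accuracy`) are hypothetical. It compares cross-validation in which candidate features are selected inside each training fold against a biased variant in which features are selected once using all samples, including those later used for testing.

```python
import random
from statistics import fmean


def make_noise_data(n_samples=40, n_features=1000, seed=0):
    """Simulate data with NO true biomarker: features are pure Gaussian noise."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # alternating class labels (assumption)
    return X, y


def select_features(X, y, rows, k):
    """Return the k features with the largest absolute class-mean difference,
    computed using only the samples listed in `rows`."""
    scores = []
    for j in range(len(X[0])):
        m0 = fmean(X[i][j] for i in rows if y[i] == 0)
        m1 = fmean(X[i][j] for i in rows if y[i] == 1)
        scores.append((abs(m1 - m0), j))
    return [j for _, j in sorted(scores, reverse=True)[:k]]


def predict(X, y, train, test_row, feats):
    """Nearest-centroid prediction on the chosen features (illustrative classifier)."""
    c0 = [fmean(X[i][j] for i in train if y[i] == 0) for j in feats]
    c1 = [fmean(X[i][j] for i in train if y[i] == 1) for j in feats]
    x = [X[test_row][j] for j in feats]
    d0 = sum((a - b) ** 2 for a, b in zip(x, c0))
    d1 = sum((a - b) ** 2 for a, b in zip(x, c1))
    return 0 if d0 <= d1 else 1


def cv_accuracy(X, y, k_feats=10, n_folds=5, select_inside_fold=True):
    """Cross-validated accuracy with feature selection done either inside each
    training fold (unbiased) or once on all samples (biased by information leak)."""
    rows = list(range(len(X)))
    leaked_feats = None
    if not select_inside_fold:
        leaked_feats = select_features(X, y, rows, k_feats)  # sees test samples too
    correct = 0
    for fold in range(n_folds):
        test = [i for i in rows if i % n_folds == fold]
        train = [i for i in rows if i % n_folds != fold]
        feats = select_features(X, y, train, k_feats) if select_inside_fold else leaked_feats
        correct += sum(predict(X, y, train, i, feats) == y[i] for i in test)
    return correct / len(rows)


if __name__ == "__main__":
    X, y = make_noise_data()
    print("selection inside folds:", cv_accuracy(X, y, select_inside_fold=True))
    print("selection on all data: ", cv_accuracy(X, y, select_inside_fold=False))
```

Because the simulated data contain no signal, an unbiased estimate should stay near chance level, whereas selecting features on all samples before cross-validation would be expected to report an inflated accuracy. A "biomarker" found that way would likely fail to reproduce in an independent cohort.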