Abstract
Machine learning (ML) is an analytical approach of increasing importance in this field. In this chapter, we highlight the use of ML for disease risk prediction and prognosis and identify the scope of successful applications to date. Despite the enthusiasm, we believe that the evaluation of ML methods on real data sets has been limited thus far. We also believe that ML approaches can serve as methods of choice for integrating the ever more complex data sets being generated in the era of next-generation sequencing.
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Collins, A., Yao, Y. (2018). Machine Learning Approaches: Data Integration for Disease Prediction and Prognosis. In: Yao, Y. (eds) Applied Computational Genomics. Translational Bioinformatics, vol 13. Springer, Singapore. https://doi.org/10.1007/978-981-13-1071-3_10
DOI: https://doi.org/10.1007/978-981-13-1071-3_10
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-1070-6
Online ISBN: 978-981-13-1071-3
eBook Packages: Biomedical and Life Sciences (R0)