Machine Learning Approaches: Data Integration for Disease Prediction and Prognosis

Applied Computational Genomics

Part of the book series: Translational Bioinformatics ((TRBIO,volume 13))

Abstract

Machine learning (ML) is an analytical approach that has been of increasing importance in this field. In this chapter, we highlight the use of ML for disease risk prediction and prognosis and identify the scope of successful applications to date. Despite the enthusiasm, we feel that the evaluation of ML methods on real data sets has been limited thus far. We also feel that ML approaches can serve as methods of choice for integrating the ever more complex data sets being generated in the era of next-generation sequencing.

Author information

Corresponding author

Correspondence to Yin Yao.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Collins, A., Yao, Y. (2018). Machine Learning Approaches: Data Integration for Disease Prediction and Prognosis. In: Yao, Y. (eds) Applied Computational Genomics. Translational Bioinformatics, vol 13. Springer, Singapore. https://doi.org/10.1007/978-981-13-1071-3_10
