Hybrid Methodology Based on Bayesian Optimization and GA-PARSIMONY for Searching Parsimony Models by Combining Hyperparameter Optimization and Feature Selection

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2017)

Abstract

This paper presents a hybrid methodology that combines Bayesian Optimization (BO) with a constrained version of the GA-PARSIMONY method to obtain parsimonious models. The proposal is designed to reduce the computational effort associated with using GA-PARSIMONY alone. The method is initialized with BO to obtain favorable initial model parameters. With these parameters, a constrained GA-PARSIMONY is run to generate accurate parsimonious models through feature reduction, data transformation, and parsimonious model selection. Finally, a second BO is run with the selected features. Experiments with Extreme Gradient Boosting machines (XGBoost) and six UCI databases demonstrate that the hybrid methodology obtains models analogous to those of GA-PARSIMONY, with a significant reduction in execution time on five of the six datasets.
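The three-stage flow described in the abstract can be sketched in miniature. The following is an illustrative toy, not the authors' implementation: random search stands in for the two BO stages, closed-form ridge regression stands in for XGBoost, and the parsimony penalty, population sizes, and bounds are all assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 features, only the first 3 are informative.
n, p = 200, 8
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + 0.1 * rng.normal(size=n)
X_tr, X_va, y_tr, y_va = X[:150], X[150:], y[:150], y[150:]

def ridge_rmse(lam, mask):
    """Closed-form ridge fit on the masked features; validation RMSE."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return np.sqrt(np.mean(y_va ** 2))
    A = X_tr[:, cols]
    w = np.linalg.solve(A.T @ A + lam * np.eye(cols.size), A.T @ y_tr)
    return np.sqrt(np.mean((X_va[:, cols] @ w - y_va) ** 2))

def fitness(lam, mask, alpha=0.01):
    # Parsimony term: penalize the number of selected features.
    return ridge_rmse(lam, mask) + alpha * mask.sum()

# Stage 1: hyperparameter search with all features on
# (random search here as a stand-in for BO).
full = np.ones(p, dtype=bool)
lams = 10 ** rng.uniform(-4, 2, size=30)
lam0 = lams[np.argmin([ridge_rmse(l, full) for l in lams])]

# Stage 2: constrained GA — evolve feature masks while keeping the
# hyperparameter constrained to a narrow band around the Stage-1 result.
pop = [(lam0 * 10 ** rng.uniform(-0.5, 0.5), rng.random(p) < 0.7)
       for _ in range(20)]
for _ in range(15):
    pop.sort(key=lambda ind: fitness(*ind))
    elite = pop[:10]
    children = []
    for _ in range(10):
        i, j = rng.integers(0, len(elite), size=2)
        (l1, m1), (l2, m2) = elite[i], elite[j]
        mask = np.where(rng.random(p) < 0.5, m1, m2)   # uniform crossover
        mask = mask ^ (rng.random(p) < 0.05)           # bit-flip mutation
        lam = np.sqrt(l1 * l2) * 10 ** rng.uniform(-0.1, 0.1)
        children.append((lam, mask))
    pop = elite + children
best_lam, best_mask = min(pop, key=lambda ind: fitness(*ind))

# Stage 3: second hyperparameter search with the selected features fixed.
lams2 = best_lam * 10 ** rng.uniform(-1, 1, size=30)
final_lam = lams2[np.argmin([ridge_rmse(l, best_mask) for l in lams2])]
print(best_mask.astype(int), round(float(ridge_rmse(final_lam, best_mask)), 3))
```

The point of the structure is that the GA only has to explore feature masks around an already-good hyperparameter region, which is where the paper's reported runtime savings come from.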



Acknowledgements

We are greatly indebted to Banco Santander for the APPI16/05 fellowship and to the University of La Rioja for the EGI16/19 fellowship. This work used the Beronia cluster (Universidad de La Rioja), which is supported by FEDER-MINECO grant number UNLR-094E-2C-225.

Author information

Corresponding author

Correspondence to Francisco Javier Martinez-de-Pison.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Martinez-de-Pison, F.J., Gonzalez-Sendino, R., Aldama, A., Ferreiro, J., Fraile, E. (2017). Hybrid Methodology Based on Bayesian Optimization and GA-PARSIMONY for Searching Parsimony Models by Combining Hyperparameter Optimization and Feature Selection. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_5

  • DOI: https://doi.org/10.1007/978-3-319-59650-1_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59649-5

  • Online ISBN: 978-3-319-59650-1

  • eBook Packages: Computer Science (R0)
