Hybrid Methodology Based on Bayesian Optimization and GA-PARSIMONY for Searching Parsimony Models by Combining Hyperparameter Optimization and Feature Selection

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2017)

Abstract

This paper presents a hybrid methodology that combines Bayesian Optimization (BO) with a constrained version of the GA-PARSIMONY method to obtain parsimonious models. The proposal is designed to reduce the computational effort associated with using GA-PARSIMONY alone. The method is initialized with BO to obtain favorable initial model parameters. With these parameters, a constrained GA-PARSIMONY is run to generate accurate parsimonious models through feature reduction, data transformation, and parsimonious model selection. Finally, a second BO is run with the selected features. Experiments with Extreme Gradient Boosting machines (XGBoost) and six UCI databases demonstrate that the hybrid methodology obtains models analogous to those of GA-PARSIMONY, with a significant reduction in execution time on five of the six datasets.
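The three-stage flow described in the abstract can be sketched in miniature. The following is an illustrative toy, not the authors' implementation: random search stands in for the two BO stages, closed-form ridge regression stands in for XGBoost, and the parsimony penalty, population sizes, and bounds are all assumed values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 features, only the first 3 are informative.
n, p = 200, 8
X = rng.normal(size=(n, p))
y = X[:, 0] + 0.5 * X[:, 1] - 0.3 * X[:, 2] + 0.1 * rng.normal(size=n)
X_tr, X_va, y_tr, y_va = X[:150], X[150:], y[:150], y[150:]

def ridge_rmse(lam, mask):
    """Closed-form ridge fit on the masked features; validation RMSE."""
    cols = np.flatnonzero(mask)
    if cols.size == 0:
        return np.sqrt(np.mean(y_va ** 2))
    A = X_tr[:, cols]
    w = np.linalg.solve(A.T @ A + lam * np.eye(cols.size), A.T @ y_tr)
    return np.sqrt(np.mean((X_va[:, cols] @ w - y_va) ** 2))

def fitness(lam, mask, alpha=0.01):
    # Parsimony term: penalize the number of selected features.
    return ridge_rmse(lam, mask) + alpha * mask.sum()

# Stage 1: hyperparameter search with all features on
# (random search here as a stand-in for BO).
full = np.ones(p, dtype=bool)
lams = 10 ** rng.uniform(-4, 2, size=30)
lam0 = lams[np.argmin([ridge_rmse(l, full) for l in lams])]

# Stage 2: constrained GA — evolve feature masks while keeping the
# hyperparameter constrained to a narrow band around the Stage-1 result.
pop = [(lam0 * 10 ** rng.uniform(-0.5, 0.5), rng.random(p) < 0.7)
       for _ in range(20)]
for _ in range(15):
    pop.sort(key=lambda ind: fitness(*ind))
    elite = pop[:10]
    children = []
    for _ in range(10):
        i, j = rng.integers(0, len(elite), size=2)
        (l1, m1), (l2, m2) = elite[i], elite[j]
        mask = np.where(rng.random(p) < 0.5, m1, m2)   # uniform crossover
        mask = mask ^ (rng.random(p) < 0.05)           # bit-flip mutation
        lam = np.sqrt(l1 * l2) * 10 ** rng.uniform(-0.1, 0.1)
        children.append((lam, mask))
    pop = elite + children
best_lam, best_mask = min(pop, key=lambda ind: fitness(*ind))

# Stage 3: second hyperparameter search with the selected features fixed.
lams2 = best_lam * 10 ** rng.uniform(-1, 1, size=30)
final_lam = lams2[np.argmin([ridge_rmse(l, best_mask) for l in lams2])]
print(best_mask.astype(int), round(float(ridge_rmse(final_lam, best_mask)), 3))
```

The point of the structure is that the GA only has to explore feature masks around an already-good hyperparameter region, which is where the paper's reported runtime savings come from.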



Acknowledgements

We are greatly indebted to Banco Santander for the APPI16/05 fellowship and to the University of La Rioja for the EGI16/19 fellowship. This work used the Beronia cluster (Universidad de La Rioja), which is supported by FEDER-MINECO grant number UNLR-094E-2C-225.

Author information

Corresponding author

Correspondence to Francisco Javier Martinez-de-Pison.


Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Martinez-de-Pison, F.J., Gonzalez-Sendino, R., Aldama, A., Ferreiro, J., Fraile, E. (2017). Hybrid Methodology Based on Bayesian Optimization and GA-PARSIMONY for Searching Parsimony Models by Combining Hyperparameter Optimization and Feature Selection. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_5

  • DOI: https://doi.org/10.1007/978-3-319-59650-1_5

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-59649-5

  • Online ISBN: 978-3-319-59650-1

  • eBook Packages: Computer Science (R0)
