Abstract
This paper presents a hybrid methodology that combines Bayesian Optimization (BO) with a constrained version of the GA-PARSIMONY method to obtain parsimonious models. The proposal is designed to reduce the computational effort associated with using GA-PARSIMONY alone. The method starts with a BO run to obtain favorable initial model parameters. With these parameters, a constrained GA-PARSIMONY is run to generate accurate parsimonious models through feature reduction, data transformation and parsimonious model selection. Finally, BO is run a second time with the selected features. Experiments with Extreme Gradient Boosting Machines (XGBoost) and six UCI databases show that the hybrid methodology obtains models comparable to those of GA-PARSIMONY alone, with a significant reduction in execution time on five of the six datasets.
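The three stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several labeled assumptions: plain random search stands in for BO, a hypothetical `toy_error` function stands in for the cross-validated XGBoost error, "constrained" is taken to mean that the GA searches feature masks only while the stage-1 parameter stays fixed, and the parsimony ranking is simplified to bucketing errors by a tolerance and preferring fewer features within a bucket. All function and parameter names are illustrative.

```python
import random

def toy_error(params, mask):
    """Hypothetical validation error (stand-in for cross-validated XGBoost error).
    Features 0 and 1 are informative; every other selected feature adds a small penalty."""
    informative = (0, 1)
    missing = sum(1 for i in informative if not mask[i])
    extra = sum(1 for i, on in enumerate(mask) if on and i not in informative)
    return (params["lr"] - 0.3) ** 2 + 0.5 * missing + 0.01 * extra

def random_search(mask, n_iter, rng):
    """Stand-in for Bayesian Optimization: random search over a single parameter."""
    best_err, best_params = None, None
    for _ in range(n_iter):
        params = {"lr": rng.uniform(0.01, 1.0)}
        err = toy_error(params, mask)
        if best_err is None or err < best_err:
            best_err, best_params = err, params
    return best_params

def parsimony_key(params, mask, tol=0.02):
    """Simplified parsimony ranking (assumption): errors within the same tolerance
    bucket are treated as equivalent, and the mask with fewer features ranks first."""
    err = toy_error(params, mask)
    return (int(err / tol), sum(mask), err)

def ga_feature_search(params, n_features, rng, pop_size=20, generations=15):
    """Constrained GA sketch: parameters stay fixed, only feature masks evolve."""
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    pop[0] = [True] * n_features  # keep the full feature set in the initial population
    for _ in range(generations):
        pop.sort(key=lambda m: parsimony_key(params, m))
        elite = pop[: pop_size // 2]  # elitism: the best half survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            # Uniform crossover followed by an occasional single-bit mutation.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            if rng.random() < 0.2:
                j = rng.randrange(n_features)
                child[j] = not child[j]
            children.append(child)
        pop = elite + children
    pop.sort(key=lambda m: parsimony_key(params, m))
    return pop[0]

def hybrid_search(n_features=8, seed=0):
    rng = random.Random(seed)
    full = [True] * n_features
    params = random_search(full, 30, rng)               # stage 1: tune with all features
    mask = ga_feature_search(params, n_features, rng)   # stage 2: parsimonious feature search
    params = random_search(mask, 30, rng)               # stage 3: re-tune on selected features
    return params, mask

if __name__ == "__main__":
    params, mask = hybrid_search()
    print("selected features:", [i for i, on in enumerate(mask) if on])
    print("tuned parameter:", params)
```

In this sketch the GA evaluates only feature masks, so each generation costs one model evaluation per individual without an inner parameter search. That is the intuition behind the reduced execution time, although the toy setup says nothing about the actual savings reported in the paper.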
Acknowledgements
We are greatly indebted to Banco Santander for the APPI16/05 fellowship and to the University of La Rioja for the EGI16/19 fellowship. This work used the Beronia cluster (Universidad de La Rioja), which is supported by FEDER-MINECO grant number UNLR-094E-2C-225.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Martinez-de-Pison, F.J., Gonzalez-Sendino, R., Aldama, A., Ferreiro, J., Fraile, E. (2017). Hybrid Methodology Based on Bayesian Optimization and GA-PARSIMONY for Searching Parsimony Models by Combining Hyperparameter Optimization and Feature Selection. In: Martínez de Pisón, F., Urraca, R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2017. Lecture Notes in Computer Science, vol 10334. Springer, Cham. https://doi.org/10.1007/978-3-319-59650-1_5
DOI: https://doi.org/10.1007/978-3-319-59650-1_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59649-5
Online ISBN: 978-3-319-59650-1
eBook Packages: Computer Science, Computer Science (R0)