
A Shallow Approach to Gradient Boosting (XGBoosts) for Prediction of the Box Office Revenue of a Movie

  • Conference paper
Proceedings of International Conference on Innovations in Software Architecture and Computational Systems

Abstract

In the recent past, machine learning paradigms such as ensemble approaches have been used effectively to predict revenue from large volumes of sales data, supporting the decision-making process in many businesses. This paper proposes a modified ensemble approach to predict the box office revenue of upcoming movies. A shallow version of gradient boosting (XGBoosts) is proposed to predict box office revenue from several primary and derived features of each movie. Further study found that features such as budget, runtime, and budget-year ratio are among the more important estimators of box office revenue. These features, along with some others, are used as input to the proposed model to predict the box office collection of a movie. The results are reported by testing and forecasting based on simulation on a standard data set. The accuracy of the model is tested using popular metrics such as R² and MSLE. The reported results demonstrate the efficacy of the proposed approach, which can be further applied to other business models.
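The two metrics named in the abstract can be illustrated with a minimal sketch. It assumes the standard textbook definitions (R² as one minus the ratio of residual to total sum of squares, MSLE as the mean squared difference of log(1 + value)); the abstract does not state which variant the authors used. The function names and the revenue figures are hypothetical and are not taken from the paper or its data set.

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed standard definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def msle(y_true, y_pred):
    """Mean squared logarithmic error; all values must be non-negative."""
    squared_log_errors = [
        (math.log1p(t) - math.log1p(p)) ** 2 for t, p in zip(y_true, y_pred)
    ]
    return sum(squared_log_errors) / len(squared_log_errors)


if __name__ == "__main__":
    # Hypothetical revenues in USD, for illustration only.
    actual = [1_000_000.0, 25_000_000.0, 300_000_000.0]
    predicted = [1_200_000.0, 20_000_000.0, 280_000_000.0]
    print("R2:  ", r2_score(actual, predicted))
    print("MSLE:", msle(actual, predicted))
```

Because MSLE compares logarithms, it penalises relative rather than absolute error, which is one plausible reason to prefer it when revenues span several orders of magnitude.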




Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper


Cite this paper

Dutta, S., Dasgupta, K. (2021). A Shallow Approach to Gradient Boosting (XGBoosts) for Prediction of the Box Office Revenue of a Movie. In: Mandal, J.K., Mukhopadhyay, S., Unal, A., Sen, S.K. (eds) Proceedings of International Conference on Innovations in Software Architecture and Computational Systems. Studies in Autonomic, Data-driven and Industrial Computing. Springer, Singapore. https://doi.org/10.1007/978-981-16-4301-9_16


  • DOI: https://doi.org/10.1007/978-981-16-4301-9_16


  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-16-4300-2

  • Online ISBN: 978-981-16-4301-9

  • eBook Packages: Computer Science, Computer Science (R0)
