Bengali Accent Classification from Speech Using Different Machine Learning and Deep Learning Techniques

  • Conference paper
Soft Computing Techniques and Applications

Abstract

This work starts from a question: “Do human vocal folds produce different wavelengths when people speak the same language in different accents?” A human listener can usually identify a speaker's accent and region just by hearing the language; the challenge is giving a machine the same capability. By computing the discrete Fourier transform, applying a Mel-spaced filter bank and taking the log filter-bank energies, we obtained the Mel-frequency cepstral coefficients (MFCCs) of each voice recording, which serve as a numeric representation of the analog signal. We then applied several machine learning and deep learning algorithms to find the best achievable accuracy. Detecting a speaker's region from voice can support security agencies and e-commerce marketing. Working with natural human language is part of Natural Language Processing (NLP), a branch of artificial intelligence. For feature extraction we used MFCCs, and for classification we used linear regression, decision tree, gradient boosting, random forest and a neural network. We obtained a maximum accuracy of 86% on 9303 samples. The data were collected from eight regions of Bangladesh (Dhaka, Khulna, Barisal, Rajshahi, Sylhet, Chittagong, Mymensingh and Noakhali). We follow a simple workflow to reach the final result.
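The feature-extraction chain named in the abstract (discrete Fourier transform, Mel-spaced filter bank, log filter-bank energies, MFCCs) can be illustrated with a minimal sketch for a single frame. This is not the authors' implementation: the frame length, sample rate, Hamming window, 26 filters, 13 coefficients and the final DCT step are common defaults assumed here for illustration, and all function names are hypothetical. It uses only the Python standard library so each step stays visible; a real pipeline would more likely rely on an audio library.

```python
import cmath
import math


def hz_to_mel(hz):
    # Common mel-scale formula (assumed; other variants exist).
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    # Naive O(N^2) discrete Fourier transform, bins 0..N/2.
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    step = (high - low) / (num_filters + 1)
    mel_points = [low + i * step for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # peak and falling edge
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def dct2(values, num_coeffs):
    # Unnormalised DCT-II; keeps the first num_coeffs coefficients.
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(num_coeffs)]


def mfcc_frame(frame, sample_rate, num_filters=26, num_coeffs=13):
    n = len(frame)
    # Hamming window to reduce spectral leakage (an assumed choice).
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = power_spectrum(windowed)
    bank = mel_filterbank(num_filters, n, sample_rate)
    # Log filter-bank energies, floored to avoid log(0).
    log_energies = [
        math.log(max(sum(w * p for w, p in zip(row, spectrum)), 1e-10))
        for row in bank
    ]
    return dct2(log_energies, num_coeffs)


# Synthetic 440 Hz tone standing in for one frame of recorded speech.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs), [round(c, 2) for c in coeffs[:3]])
```

In a full system, one such coefficient vector would be computed per overlapping frame and then summarised (for example, averaged over time) before being passed to a classifier; how the paper aggregates frames is not stated in the abstract.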

Author information

Corresponding author

Correspondence to Sheikh Abujar.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Badhon, S.M.S.I., Rahaman, H., Rupon, F.R., Abujar, S. (2021). Bengali Accent Classification from Speech Using Different Machine Learning and Deep Learning Techniques. In: Borah, S., Pradhan, R., Dey, N., Gupta, P. (eds) Soft Computing Techniques and Applications. Advances in Intelligent Systems and Computing, vol 1248. Springer, Singapore. https://doi.org/10.1007/978-981-15-7394-1_46
