Bengali Accent Classification from Speech Using Different Machine Learning and Deep Learning Techniques

Badhon, S. M. Saiful Islam; Rahaman, Habibur; Rupon, Farea Rehnuma; Abujar, Sheikh

doi:10.1007/978-981-15-7394-1_46

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1248)

Abstract

The work starts with a question: “Do human vocal folds produce different wavelengths when people speak in different accents of the same language?” When humans hear a language, they can generally identify the speaker's accent and region with ease. The challenge is how to give this capability to a machine. By calculating the discrete Fourier transform, applying a Mel-spaced filter bank and taking the log filter-bank energies, we obtained the Mel-frequency cepstral coefficients (MFCCs) of a voice, which are a numeric representation of an analog signal. We then used different machine learning and deep learning algorithms to find the best possible accuracy. Detecting a speaker's region from voice can help security agencies and e-commerce marketing. Working with human natural language is part of Natural Language Processing (NLP), which is a branch of artificial intelligence. For feature extraction, we used MFCCs, and for classification, we used linear regression, decision tree, gradient boosting, random forest and neural network. We achieved a maximum accuracy of 86% on 9303 data samples. The data was collected from eight different regions of Bangladesh (Dhaka, Khulna, Barisal, Rajshahi, Sylhet, Chittagong, Mymensingh and Noakhali). We follow a simple workflow to reach the final result.
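The feature-extraction steps named in the abstract (discrete Fourier transform, Mel-spaced filter bank, log filter-bank energies) can be illustrated for a single audio frame with a minimal sketch. It is not the authors' implementation: the Hamming window, the final DCT-II step (the usual way to turn log energies into cepstral coefficients), the 8 kHz sample rate, the 256-sample frame, the 20 filters and the 13 coefficients are all assumptions chosen for illustration, and the function names are hypothetical. Only the Python standard library is used so that each step stays visible.

```python
import cmath
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to the Mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window (assumed) and return the one-sided DFT power spectrum."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(n^2) DFT: fine for one short frame, too slow for real audio.
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Build triangular filters whose centres are evenly spaced on the Mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (num_filters + 1)
                  for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, centre):      # rising edge of the triangle
            filt[k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            filt[k] = (right - k) / (right - centre)
        bank.append(filt)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=20, num_coeffs=13):
    """Return MFCCs for one frame: DFT -> Mel filter bank -> log -> DCT-II."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # Floor the energies so an empty filter cannot produce log(0).
    log_energies = [math.log(max(sum(w * p for w, p in zip(f, spec)), 1e-10))
                    for f in bank]
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


# Synthetic stand-ins for speech: two pure tones (assumed test input, not real data).
SAMPLE_RATE = 8000
FRAME_LEN = 256
tone_a = [math.sin(2.0 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
tone_b = [math.sin(2.0 * math.pi * 2000.0 * t / SAMPLE_RATE) for t in range(FRAME_LEN)]
coeffs_a = mfcc_frame(tone_a, SAMPLE_RATE)
coeffs_b = mfcc_frame(tone_b, SAMPLE_RATE)
print(len(coeffs_a), len(coeffs_b))
```

In a full pipeline, a recording would be split into overlapping frames, each frame would yield one coefficient vector like `coeffs_a`, and the per-recording vectors (for example, averaged over frames) would be passed to the classifiers listed in the abstract. An FFT-based library routine would normally replace the naive DFT used here.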

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Daffodil International University, Dhaka, Bangladesh
S. M. Saiful Islam Badhon, Habibur Rahaman, Farea Rehnuma Rupon & Sheikh Abujar

Corresponding author

Correspondence to Sheikh Abujar.

Editor information

Editors and Affiliations

Sikkim Manipal Institute of Technology, Majhitar, Sikkim, India
Samarjeet Borah
Sikkim Manipal Institute of Technology, Majhitar, Sikkim, India
Ratika Pradhan
Department of Computer Science and Engineering, JIS University, Kolkata, West Bengal, India
Nilanjan Dey
GLA University, Mathura, Uttar Pradesh, India
Phalguni Gupta

About this paper

Cite this paper

Badhon, S.M.S.I., Rahaman, H., Rupon, F.R., Abujar, S. (2021). Bengali Accent Classification from Speech Using Different Machine Learning and Deep Learning Techniques. In: Borah, S., Pradhan, R., Dey, N., Gupta, P. (eds) Soft Computing Techniques and Applications. Advances in Intelligent Systems and Computing, vol 1248. Springer, Singapore. https://doi.org/10.1007/978-981-15-7394-1_46

DOI: https://doi.org/10.1007/978-981-15-7394-1_46
Published: 28 November 2020
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7393-4
Online ISBN: 978-981-15-7394-1
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
