
Supervised classifiers with TF-IDF features for sentiment analysis of Marathi tweets

  • Original Article
Social Network Analysis and Mining

Abstract

In today’s digital era, the proliferation of content in vernacular languages such as Hindi, Marathi, Bengali, Tamil, and Malayalam cannot be overlooked. Social media sites like Facebook and Twitter are rich sources of opinionated content in these languages. Work on analyzing public opinion has concentrated on English, with very few Sentiment Analysis studies of minor or morphologically rich languages like Marathi. Moreover, interpreting Sentiment Analysis results in their local context remains a challenge for under-resourced languages. This paper presents sentiment prediction on Twitter for the Marathi language using supervised machine learning techniques. It is the first attempt to experiment on a dataset of Marathi political tweets. A benchmark dataset of 4248 tweets about the four major political parties of Maharashtra (India) is created. Multinomial Naïve Bayes, Support Vector Machines with both linear and RBF kernels, Logistic Regression, and Random Forest classifiers are trained on Term Frequency-Inverse Document Frequency (TF-IDF) features to classify the tweets as positive or negative. The performance of the Sentiment Analysis model is evaluated using the standard measures of accuracy, precision, recall, and F1-score. The experimental results show that Multinomial Naïve Bayes outperforms all the other classifiers, with a maximum accuracy of 87.29% in the prediction of the Indian State Assembly Election 2019. The proposed model ranks first among the Naïve Bayes classifiers employed in current state-of-the-art sentiment analysis of Indian text.
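
The pipeline named in the abstract (TF-IDF features, a Multinomial Naïve Bayes classifier, and evaluation by accuracy, precision, recall, and F1-score) can be illustrated with a minimal pure-Python sketch. The abstract does not state the tokenization, the TF-IDF variant, or the hyperparameters used, so everything below is an assumption: the smoothed IDF formula is one common variant, `alpha=1.0` is ordinary Laplace smoothing, the function names are hypothetical, and the English tokens are placeholders rather than data from the Marathi benchmark.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Learn a smoothed IDF per term: log((1 + n) / (1 + df)) + 1 (assumed variant)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf(doc, idf):
    """Map a tokenized document to {term: tf * idf}; terms unseen in training are dropped."""
    return {t: tf * idf[t] for t, tf in Counter(doc).items() if t in idf}

def train_mnb(vectors, labels, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights, treated as fractional counts."""
    vocab = {t for v in vectors for t in v}
    model = {}
    for y in set(labels):
        weight = Counter()
        for v, label in zip(vectors, labels):
            if label == y:
                for t, w in v.items():
                    weight[t] += w
        total = sum(weight.values()) + alpha * len(vocab)
        log_prior = math.log(labels.count(y) / len(labels))
        log_like = {t: math.log((weight[t] + alpha) / total) for t in vocab}
        model[y] = (log_prior, log_like)
    return model

def predict(model, vector):
    """Return the class with the highest log-posterior score."""
    def score(y):
        log_prior, log_like = model[y]
        return log_prior + sum(w * log_like[t] for t, w in vector.items())
    return max(model, key=score)

def binary_metrics(y_true, y_pred, positive="pos"):
    """Return (accuracy, precision, recall, f1) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

# Placeholder training data (hypothetical, NOT from the paper's dataset).
train_docs = [
    ["good", "great", "rally"],
    ["great", "support", "good"],
    ["bad", "poor", "policy"],
    ["poor", "bad", "failure"],
]
train_labels = ["pos", "pos", "neg", "neg"]

idf = fit_idf(train_docs)
model = train_mnb([tfidf(d, idf) for d in train_docs], train_labels)

test_docs = [["good", "rally"], ["bad", "policy"]]
test_labels = ["pos", "neg"]
predictions = [predict(model, tfidf(d, idf)) for d in test_docs]
print(predictions, binary_metrics(test_labels, predictions))
```

The IDF table is fitted on the training documents only and then reused for the test documents, which keeps information from the test set out of the features. In practice a library implementation and a proper train/test split over the full dataset would replace this toy setup.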


Notes

  1. Census of India, Government of India, 2001.

  2. https://en.wikipedia.org/wiki/Languages_with_official_status_in_India.

  3. https://assets.kpmg/content/dam/kpmg/in/pdf/2017/04/Indian-languages-Defining-Indias-Internet.pdf.

  4. https://en.wikipedia.org/w/index.php?title=Marathi_language&oldid=1011547010.

  5. Twitter Archiver Tool. https://gsuite.google.com/marketplace/app/tweet_archiver/976886281542.

  6. https://en.wikipedia.org/w/index.php?title=Bharatiya_Janata_Party&oldid=1010698258.

  7. https://en.wikipedia.org/wiki/Indian_National_Congress.

  8. https://en.wikipedia.org/wiki/Nationalist_Congress_Party.

  9. https://en.wikipedia.org/wiki/Shiv_Sena.

  10. https://en.wikipedia.org/wiki/2019_Maharashtra_Legislative_Assembly_election.


Author information


Corresponding author

Correspondence to Rupali S. Patil.



About this article


Cite this article

Patil, R.S., Kolhe, S.R. Supervised classifiers with TF-IDF features for sentiment analysis of Marathi tweets. Soc. Netw. Anal. Min. 12, 51 (2022). https://doi.org/10.1007/s13278-022-00877-w


  • DOI: https://doi.org/10.1007/s13278-022-00877-w
