SQL or NoSQL? Contrasting Approaches to the Storage, Manipulation and Analysis of Spatio-temporal Online Social Network Data

  • Conference paper
Computational Science and Its Applications – ICCSA 2014 (ICCSA 2014)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8579)

Abstract

Researchers can now access millions of Online Social Network (OSN) interactions. These are available at little or no cost through Application Programming Interfaces (APIs) or from data custodians including DataSift and GNIP. Records held in Extensible Markup Language (XML) or JavaScript Object Notation (JSON) are well structured but often inconveniently formatted for use in popular Relational Database Management Systems (RDBMS) or Geographic Information Systems (GIS) software. In contrast, emerging NoSQL (Not-only Structured Query Language) technologies are specially designed to ‘ingest’ unstructured data. This paper examines Extract/Transform/Load (ETL) procedures for the storage and subsequent analysis of two OSN datasets in SQL and NoSQL databases. The fixed data model of the relational approach may prove problematic when loading the unpredictable document-based structures that arise from extended periods of data collection. Although relational databases are far from obsolete, the spatial analysis community seems likely to benefit from experimentation with new software explicitly designed for handling spatio-temporal Big Data.
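The contrast between a fixed relational model and document storage can be illustrated with a minimal sketch. It is not taken from the paper: the two sample records, the `post` and `post_doc` tables, and the `flatten` and `load` helpers are all hypothetical, and SQLite with a raw-JSON text column merely stands in for a document store. The sketch shows how a field that appears later in a collection period (here `place`) falls outside a schema fixed in advance, while the stored document retains it.

```python
import json
import sqlite3

# Hypothetical OSN records (not from the paper): the second record carries a
# "place" field that the first lacks, as can happen over a long collection period.
RECORDS = [
    '{"id": 1, "text": "hello", "geo": {"lat": 51.5, "lon": -0.1}}',
    '{"id": 2, "text": "hi", "geo": null, "place": {"name": "Leeds"}}',
]

# Top-level fields the assumed relational schema knows how to handle.
KNOWN_FIELDS = {"id", "text", "geo"}


def flatten(record):
    """Map a parsed record onto the fixed columns and list the fields left over."""
    geo = record.get("geo") or {}
    row = (record.get("id"), record.get("text"), geo.get("lat"), geo.get("lon"))
    leftover = sorted(set(record) - KNOWN_FIELDS)
    return row, leftover


def load(records):
    """Load each record twice: into fixed columns and as an untouched document."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post (id INTEGER PRIMARY KEY, text TEXT, lat REAL, lon REAL)"
    )
    # Stand-in for a document store: the raw JSON is kept verbatim.
    conn.execute("CREATE TABLE post_doc (id INTEGER PRIMARY KEY, body TEXT)")
    dropped = {}
    for raw in records:
        record = json.loads(raw)
        row, leftover = flatten(record)
        conn.execute("INSERT INTO post VALUES (?, ?, ?, ?)", row)
        conn.execute("INSERT INTO post_doc VALUES (?, ?)", (record["id"], raw))
        if leftover:
            dropped[record["id"]] = leftover
    return conn, dropped


conn, dropped = load(RECORDS)
print(dropped)  # fields the fixed schema could not hold, keyed by record id
```

In this sketch the relational load succeeds but silently loses `place` for the second record unless the schema is altered, whereas the document copy keeps every field at the cost of deferring structure to query time.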

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Tear, A. (2014). SQL or NoSQL? Contrasting Approaches to the Storage, Manipulation and Analysis of Spatio-temporal Online Social Network Data. In: Murgante, B., et al. Computational Science and Its Applications – ICCSA 2014. ICCSA 2014. Lecture Notes in Computer Science, vol 8579. Springer, Cham. https://doi.org/10.1007/978-3-319-09144-0_16

  • DOI: https://doi.org/10.1007/978-3-319-09144-0_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09143-3

  • Online ISBN: 978-3-319-09144-0

  • eBook Packages: Computer Science, Computer Science (R0)
