Abstract
Computer science has experienced dramatic growth and diversification over the last twenty years. Towards a current understanding of the structure of the discipline, we analyze a large sample of the computer science literature from the DBLP database. For insight into the features of this cohort and the relationships among its components, we constructed article-level clusters based on either direct citations or co-citations and reconciled them with major and minor subject categories in the All Science Journal Classification. Clustering by direct citation and by co-citation yields complementary insights, and both approaches document growth in the number and scope of computer science publications. Our analysis also reveals cross-category clusters, some of which interact with external fields, such as the biological sciences, while others remain inward looking.
References
Almeida, H., Guedes, D., Meira, W., Jr., & Zaki, M. (2012). Towards a better quality metric for graph cluster evaluation. Journal of Information and Data Management (JIDM), 3, 378–393.
Archambault, E., Campbell, D., Gingras, Y., & Lariviere, V. (2009). Comparing bibliometric statistics obtained from the Web of Science and Scopus. Journal of the American Society for Information Science and Technology. https://doi.org/10.1002/asi.21062.
Association for Computing Machinery: Computing Classification System (2012). https://dl.acm.org/ccs/ccs.cfm. Accessed June 2019.
Boyack, K., & Klavans, R. (2010). Co-citation analysis, bibliographic coupling, and direct citation: Which citation approach represents the research front most accurately? Journal of the American Society for Information Science and Technology, 61(12), 2389–2404. https://doi.org/10.1002/asi.21419.
Boyack, K. W. (2017). Investigating the effect of global data on topic detection. Scientometrics, 111(2), 999–1015. https://doi.org/10.1007/s11192-017-2297-y.
Boyack, K. W., Newman, D., Duhon, R. J., Klavans, R., Patek, M., Biberstine, J. R., et al. (2011). Clustering more than two million biomedical publications: Comparing the accuracies of nine text-based similarity approaches. PLOS ONE, 6(3), e18029. https://doi.org/10.1371/journal.pone.0018029.
Boyack, K. W., Patek, M., Ungar, L. H., Yoon, P., & Klavans, R. (2014). Classification of individual articles from all of science by research level. Journal of Informetrics, 8(1), 1–12. https://doi.org/10.1016/j.joi.2013.10.005.
Boyack, K. W., Small, H., & Klavans, R. (2013). Improving the accuracy of co-citation clustering using full text. Journal of the American Society for Information Science and Technology, 64(9), 1759–1767. https://doi.org/10.1002/asi.22896.
Chakraborty, T. (2018). Role of interdisciplinarity in computer sciences: Quantification, impact and life trajectory. Scientometrics, 114(3), 1011–1029. https://doi.org/10.1007/s11192-017-2628-z.
Clarivate Analytics: Web of Science (2019). https://clarivate.com/webofsciencegroup/solutions/web-of-science/. Accessed Dec 2019.
Dhillon, I., Guan, Y., & Kulis, B. (2007). Weighted graph cuts without eigenvectors: A multilevel approach. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 29(11), 1944–1957.
Elsevier: Scopus (2019). https://www.scopus.com/home.uri. Accessed Dec 2019.
Emmons, S., Kobourov, S., Gallant, M., & Börner, K. (2016). Analysis of network clustering algorithms and cluster quality metrics at scale. PLOS ONE, 11(7), e0159161.
Glänzel, W., & Thijs, B. (2017). Using hybrid methods and ‘core documents’ for the representation of clusters and topics: the astronomy dataset. Scientometrics, 111(2), 1071–1087. https://doi.org/10.1007/s11192-017-2301-6.
Kessler, M. M. (1965). Comparison of the results of bibliographic coupling and analytic subject indexing. American Documentation, 16(3), 223–233. https://doi.org/10.1002/asi.5090160309.
Klavans, R., & Boyack, K. W. (2017). Which type of citation analysis generates the most accurate taxonomy of scientific and technical knowledge? Journal of the Association for Information Science and Technology, 68(4), 984–998. https://doi.org/10.1002/asi.23734.
Korobskiy, D., Davey, A., Liu, S., Devarakonda, S., & Chacko, G. (2019). Enhanced Research Network Informatics Environment (ERNIE). Github repository, NET ESolutions Corporation. https://github.com/NETESOLUTIONS/ERNIE. Accessed Dec 2019.
Marshakova-Shaikevich, I. (1973). System of document connections based on references. Nauchno–Tekhnicheskaya Informatsiya Seriya 2-Informatsionnye Protsessy I Sistemy, 6(4), 3–8.
National Academies of Sciences, Engineering, and Medicine, et al. (2018). Assessing and Responding to the Growth of Computer Science Undergraduate Enrollments. The National Academies Press, Washington, DC. https://doi.org/10.17226/24926
National Science Foundation: Classification of Fields of Study (2012). https://www.nsf.gov/statistics/nsf13327/pdf/tabb1.pdf. Accessed June 2019.
Perianes-Rodriguez, A., & Ruiz-Castillo, J. (2017). A comparison of the Web of Science and publication-level classification systems of science. Journal of Informetrics, 11, 32–45. https://doi.org/10.1016/j.joi.2016.10.007.
Pham, M. C., & Klamma, R. (2010). The structure of the computer science knowledge network. In 2010 International Conference on Advances in Social Networks Analysis and Mining (ASONAM). https://doi.org/10.1109/ASONAM.2010.58
Salton, G., & Bergmark, D. (1979). A citation study of computer science literature. IEEE Transactions on Professional Communication, PC–22(3), 146–158. https://doi.org/10.1109/TPC.1979.6501740.
Shu, F., Julien, C. A., Zhang, L., Qiu, J., Zhang, J., & Larivière, V. (2019). Comparing journal and paper level classifications of science. Journal of Informetrics, 13(1), 202–225. https://doi.org/10.1016/j.joi.2018.12.005.
Shun, J., Roosta-Khorasani, F., Fountoulakis, K., & Mahoney, M. W. (2016). Parallel local graph clustering. Proceedings of the VLDB Endowment, 9(12), 1041–1052. https://doi.org/10.14778/2994509.2994522.
Siebel, T. (2019). Digital transformation: survive and thrive in an era of mass extinction. New York: RosettaBooks.
Sjögårde, P., & Ahlgren, P. (2019). Granularity of algorithmically constructed publication-level classifications of research publications: Identification of specialties. Quantitative Science Studies (pp 1–32). https://doi.org/10.1162/qss_a_00004.
Small, H. (1973). Co-citation in the scientific literature: A new measure of the relationship between two documents. Journal of the American Society for Information Science, 24(4), 265–269. https://doi.org/10.1002/asi.4630240406.
Small, H., & Griffith, B. C. (1974). The structure of scientific literatures I: Identifying and graphing specialties. Science Studies, 4(1), 17–40. https://doi.org/10.1177/030631277400400102.
Small, H., & Sweeney, E. (1985). Clustering the science citation index using co-citations. Scientometrics, 7(3), 391–409. https://doi.org/10.1007/BF02017157.
The dblp Team: dblp Computer Science Bibliography (2018). https://dblp.org/xml/release/dblp-2018-08-01.xml.gz. Accessed June 2019.
Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain to Leiden: Guaranteeing well-connected communities. Scientific Reports, 9(1), 1–12. https://doi.org/10.1038/s41598-019-41695-z.
Šubelj, L., van Eck, N. J., & Waltman, L. (2016). Clustering scientific publications based on citation relations: A systematic comparison of different methods. PLOS ONE, 11(4), e0154404. https://doi.org/10.1371/journal.pone.0154404.
Waltman, L., & van Eck, N. J. (2012). A new methodology for constructing a publication-level classification system of science. Journal of the American Society for Information Science and Technology, 63(12), 2378–2392. https://doi.org/10.1002/asi.22748.
Wang, Q., & Waltman, L. (2016). Large-scale analysis of the accuracy of the journal classification systems of Web of Science and Scopus. Journal of Informetrics, 10(2), 347–364. https://doi.org/10.1016/j.joi.2016.02.003.
Acknowledgements
The authors thank Henry Small for very helpful discussions. Research and development reported in this publication was partially supported by funds from the National Institute on Drug Abuse, National Institutes of Health, US Department of Health and Human Services, under Contract No. HHSN271201800040C (N44DA-18-1216). TW is supported by the Grainger Foundation. Citation data used in this paper were drawn from Scopus as implemented in the ERNIE project (Korobskiy et al., 2019), a collaboration between NET ESolutions Corporation and Elsevier Inc. We thank our Elsevier colleagues for their support of the ERNIE project.
Ethics declarations
Conflict of interest
The authors declare that they have no conflicts of interest. Elsevier personnel played no role in conceptualization, experimental design, review of results, or conclusions presented. The content of this publication is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health, NET ESolutions Corporation, or Elsevier Inc.
Cite this article
Devarakonda, S., Korobskiy, D., Warnow, T. et al. Viewing computer science through citation analysis: Salton and Bergmark Redux. Scientometrics 125, 271–287 (2020). https://doi.org/10.1007/s11192-020-03624-0