Open Citation Content Data

  • Conference paper

Part of the book series: Communications in Computer and Information Science (CCIS, volume 846)

Abstract

Several projects in the research community aim to make citation data extracted from research papers more reusable. This paper presents results from the CyrCitEc project, which aims to create a publicly available source of open citation content data extracted from PDF papers available in a research information system. To this end, the project team has created four outputs: (1) open source software to parse papers’ metadata and full-text PDFs; (2) an open service that processes papers’ PDFs to extract citation data; (3) a dataset of citation data, including citation contexts (currently mostly for papers in Cyrillic); and (4) a visualization tool that gives users insight into the citation data extraction process and some control over the quality of citation data parsing.

Notes

  1. https://github.com/citeccyr/CyrCitEc_method

  2. https://github.com/citeccyr

  3. https://en.wikipedia.org/wiki/Web_ARChive

Corresponding author

Correspondence to Sergey Parinov.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Kogalovsky, M., Krichel, T., Lyapunov, V., Medvedeva, O., Parinov, S., Sergeeva, V. (2019). Open Citation Content Data. In: Garoufallou, E., Sartori, F., Siatri, R., Zervas, M. (eds) Metadata and Semantic Research. MTSR 2018. Communications in Computer and Information Science, vol 846. Springer, Cham. https://doi.org/10.1007/978-3-030-14401-2_34

  • DOI: https://doi.org/10.1007/978-3-030-14401-2_34

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-14400-5

  • Online ISBN: 978-3-030-14401-2

  • eBook Packages: Computer Science, Computer Science (R0)
