Comparison of the citation distribution and h-index between groups of different sizes
Highlights
► We propose a method for comparing citation distributions.
► An estimate of the size-independent reduced citation distribution is proposed.
► An estimate of the size-independent reduced Hirsch index is proposed.
► Tolerance intervals are calculated.
► Several examples illustrate the usage.
Introduction
The evaluation of individuals, teams, institutions and even countries is an important part of scientometric studies. Recognizing and comparing science performance is particularly important for governments, funding agencies and managers of research institutions. Peer reviews, the number of published documents, the quality of journals in which the documents are published and the number of citations received by these documents are standard indicators in the evaluation of the productivity and visibility of research work. Although many studies found only a weak positive correlation between peer reviews and bibliometric indicators (Aksnes & Taxt, 2004; Južnič et al., 2010), and although the use of citations as a performance measure is controversial and seriously debated (Coryn, 2005), citation analysis is typically still the starting point of research evaluations.
In search of financial support for their work, researchers adapt their publication practice to funding agencies' policies, and so citation distributions may reflect the funding policies: if productivity is stimulated more than quality, authors tend to spread their research results over more documents, which are prone to lower citation rates (Butler, 2003). This presents an important motivation for the comparison of citation distributions. However, due to the skewness of citation distributions, it is difficult to compare individual, team, institution or country performance if their research was backed by different resources. The distribution strongly depends on the number of articles; therefore, any comparison is masked by the difference in the total number of articles.
There are methods for evaluating citation distributions by dividing them into subgroups of articles with different numbers of citations, such as uncited papers, poorly cited papers, fairly cited papers, remarkably cited papers and outstandingly cited papers (Schubert & Braun, 1986). Another method measures low and high impact in citation distribution (Albarrán, Ortuño, & Ruiz-Castillo, 2011). For comparisons of sets of different sizes with one another, the percentile-ranks approach has been proposed (Leydesdorff, Bornmann, Mutz, & Opthof, 2011). Recently, a new method for comparisons between research units of different sizes and fields was proposed (Crespo, Ortuño, & Ruiz-Castillo, 2011). It assesses the merit of any set of scientific papers in a given field with the probability that a randomly drawn sample of articles from a reference set would have a lower citation index.
Because of the highly skewed citation distributions (Tijssen, Visser, & Van Leeuwen, 2002), the simplest measures – average number of citations per document and total number of citations – have poor statistical properties. Consequently, a large number of other citation-based bibliometric indicators have been developed.
A widely accepted indicator of scientific performance is the h-index (Hirsch, 2005), which offers a simple measure of quantity and visibility and is easy to calculate, although it also has many disadvantages. For instance, large research institutions producing a large number of documents tend to have a larger h-index than smaller institutions, since the h-index also depends on the number of documents considered. Many properties of the h-index have been studied (Burrell, 2007; Egghe, 2008a; Egghe, 2008b; Egghe & Rousseau, 2006; Glänzel, 2006). To overcome the drawbacks of the h-index, many variants have been proposed (Alonso et al., 2009; Egghe, 2010). Among them, the g-index is the most popular, as it gives more weight to highly cited papers than the h-index does and so has greater discriminatory power (Egghe, 2006a; Egghe, 2006b; Egghe, 2006c). Egghe and Rousseau (2006) studied different h-indices for groups of authors. Bornmann, Mutz, Hug, and Daniel (2011) conducted a multilevel meta-analysis of studies reporting correlations between the h-index and 37 different variants of it. Despite this, such h-index variants are rarely used. A single measure cannot capture the complete information on the citation distribution over documents (Bornmann, Mutz, & Daniel, 2010).
For comparing the productivity and visibility of institutions of disparate size, a size-corrected index has been proposed based on the decomposition of the h-index into the product of an impact index and a factor that depends on the population size (Molinari & Molinari, 2008). Performance evaluations would greatly benefit from a general method of citation distribution reduction without empirical parameters and not limited to a specific citation distribution shape.
The goal of our research was to develop a method that will make it possible to compare citation distributions of disparately sized groups of authors. We calculated the expected values of a reduced set of citations and compared them with the original citation distribution of a smaller group. Using similar methodology, we obtained the expected value of a reduced (diminished) hr-index for a certain reducing factor and compared it with the original h value of a smaller group.
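The idea of a reduced citation distribution can be illustrated with a minimal Python sketch. It assumes that "reducing" a group means drawing a uniform random sample of m articles without replacement from the larger group. This sampling model, and the function names `expected_reduced_curve` and `expected_reduced_h`, are assumptions made for illustration and are not taken from the text above. Under that assumption, the expected citation count at each rank and the expected h-index of the sample follow exactly from hypergeometric counting.

```python
from math import comb


def expected_reduced_curve(citations, m):
    """Expected citation count at each rank 1..m of a random sample.

    Assumption (not from the source): the sample is m articles drawn
    uniformly at random, without replacement, from the full group.
    """
    ranked = sorted(citations, reverse=True)
    n = len(ranked)
    total = comb(n, m)  # number of equally likely samples
    curve = []
    for i in range(1, m + 1):
        # The j-th ranked article of the full group takes rank i in the
        # sample when i-1 of the j-1 higher-ranked articles and m-i of
        # the n-j lower-ranked articles are drawn together with it.
        weighted = sum(
            ranked[j - 1] * comb(j - 1, i - 1) * comb(n - j, m - i)
            for j in range(i, n - m + i + 1)
        )
        curve.append(weighted / total)
    return curve


def expected_reduced_h(citations, m):
    """Expected h-index of the same random sample of m articles."""
    n = len(citations)
    total = comb(n, m)
    expected = 0.0
    for k in range(1, m + 1):
        # Articles in the full group with at least k citations.
        big = sum(1 for c in citations if c >= k)
        # The sample has h >= k exactly when it contains at least k
        # articles with at least k citations (hypergeometric tail).
        favourable = sum(
            comb(big, x) * comb(n - big, m - x)
            for x in range(k, min(big, m) + 1)
        )
        expected += favourable / total
    return expected
```

For a group of 18 articles that are each cited 10 times, reducing to m = 9 gives a flat expected curve at 10 citations and an expected h-index of 9, because every possible sample consists of nine articles with 10 citations each.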
Section snippets
A motivating example
Suppose a group has published n articles. Let c_i denote the number of citations of the i-th article, i = 1, …, n, and let the articles be ordered with respect to the number of citations, so that c_1 ≥ c_2 ≥ ⋯ ≥ c_n. Fig. 1 presents the citation curve joining the points (i, c_i) for a group A with n = 18 articles, all cited 10 times (c_i = 10 for each i): for each article rank i, the number of citations c_i is plotted and the values are joined in a curve. The h-index of this group is 10. Say we want to compare
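The h-index of group A can be checked with a short Python sketch of the standard definition (the largest rank h such that the article at rank h has at least h citations). The function name `h_index` is chosen here for illustration.

```python
def h_index(citations):
    """Largest h such that h articles have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the article at this rank still has enough citations
        else:
            break  # counts only decrease from here, so h cannot grow
    return h


# Group A from the example: 18 articles, each cited 10 times.
group_a = [10] * 18
print(h_index(group_a))  # prints 10
```

Only the first ten ranks satisfy the condition, since the article at rank 11 has 10 citations, which is fewer than 11.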
Examples
Slovenian science evaluators often compare their country to neighboring Austria, whose population is approximately four times larger.
As an example, we compare Slovenia's physics articles that were published between 2002 and 2006 and cited up to mid-2008 to a quarter and an eighth of their Austrian counterparts in the same period – Fig. 3 presents
Discussion
Calculation of a reduced (diminished) citation distribution for any diminishing factor has many advantages. A graphical presentation of two or more citation curves offers the most straightforward way for readers, even those unfamiliar with statistical methods, to see the relationship between them. From the perspective of national research authorities, it makes it easy to compare citation distributions for different parameters, such as population, gross domestic product, funding,
Conclusions
Any performance comparison of different groups using the citation curve is obscured by its strong dependency on the size of the groups, and the same is true for measures based on this curve, such as the popular h-index. By calculating the expected values of the reduced distribution and hr-index with our approach, this disadvantage is eliminated. A graphical comparison of the citation distributions for groups of different sizes represents a very straightforward procedure and allows for wide
References (28)
- et al., The measurement of low- and high-impact in citation distributions: Technical results, Journal of Informetrics (2011)
- et al., h-Index: A review focused in its variants, computation and standardization for different scientific fields, Journal of Informetrics (2009)
- et al., The h index research output measurement: Two approaches to enhance its accuracy, Journal of Informetrics (2010)
- et al., A multilevel meta-analysis of studies reporting correlation between the h index and 37 different h index variants, Journal of Informetrics (2011)
- Hirsch's h-index: A stochastic model, Journal of Informetrics (2007)
- Explaining Australia's increased share of ISI publications-the effects of a funding formula based on publication counts, Research Policy (2003)
- Examples of simple transformations of the h-index: Qualitative and quantitative conclusions and consequences for other indices, Journal of Informetrics (2008)
- An axiomatic characterization of the Hirsch-index, Mathematical Social Sciences (2008)
- et al., Peer reviews and bibliometric indicators: A comparative study at a Norwegian university, Research Evaluation (2004)
- The use and abuse of citations as indicators of research quality, Journal of Multidisciplinary Evaluation (2005)
- The citation merit of scientific publications, Working Paper, Economic Series
- An improvement of the h-index: The g-index, ISSI Newsletter
- How to improve the h-index, The Scientist
- Study of different h-indices for groups of authors, Journal of the American Society for Information Science and Technology