This review surveys the metrics used for the quantitative evaluation of scholarly journals. The impact factor and the related metrics provided by the Journal Citation Reports, including the immediacy index and the aggregate impact factor, are explained in detail. The Eigenfactor score and the article influence score are also reviewed. In addition, journal metrics such as CiteScore, Source Normalized Impact per Paper, SCImago Journal Rank, h-index, and g-index are discussed. The limitations and problems of these metrics are pointed out. We should be cautious about relying too heavily on such quantitative measures when we evaluate journals or researchers.

A variety of metrics are used to indicate the level and the influence of scholarly journals. Most of these metrics are obtained by analyzing the citation data of journal articles. Among them, the impact factor is the best-known and most influential index. It is calculated by a very simple method, but it also has several problems. A number of other metrics have been proposed to correct these problems and to provide more reliable estimates. In the present review, we introduce the definitions of several journal metrics and the methods used to calculate them, and briefly explain their characteristics and defects.

The idea of impact factor was proposed by Eugene Garfield in 1955 [

In a given year, the impact factor of a journal is defined as the average number of citations per paper received by the items published in the journal in the two preceding years. More specifically, its definition is given by

Impact factor of the journal J in the year X = A/B,

where A is the total number of citations received in the year X by all items published in the journal J in the years (X-1) and (X-2), and B is the total number of citable items published in the journal J in the years (X-1) and (X-2). Citable items include only papers and reviews and do not include errata, editorials, and abstracts. In the counting of A, however, citations to all items published in J, whether citable or not, are included.

The 5-year impact factor in the year X is similar to the ordinary (2-year) impact factor, except that it is calculated using the citations received in the year X by the items published during the 5 years from (X-5) to (X-1). This index is useful in academic disciplines where the number of citations is small or where it takes some time for published results to be accepted by many researchers. The immediacy index, on the other hand, is calculated similarly to the impact factor, but using the citations received in the year X by the items published in the same year X. A large immediacy index means that the papers published in that journal are cited rather quickly.
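The three ratios described so far differ only in the publication years they cover. As a minimal sketch, assuming the citations received in the census year are already tallied by the publication year of the cited item (the function names and all numbers below are hypothetical and used only for illustration):

```python
def citation_rate(citations_in_year, citable_items, pub_years):
    """Citations received in the census year by items from `pub_years`,
    divided by the number of citable items published in those years."""
    a = sum(citations_in_year.get(y, 0) for y in pub_years)
    b = sum(citable_items.get(y, 0) for y in pub_years)
    if b == 0:
        raise ValueError("no citable items in the chosen window")
    return a / b


def impact_factor(citations_in_year, citable_items, year, window=2):
    """Impact factor for `year` over the years (year - window) .. (year - 1)."""
    return citation_rate(citations_in_year, citable_items,
                         range(year - window, year))


def immediacy_index(citations_in_year, citable_items, year):
    """Same ratio, restricted to items published in the census year itself."""
    return citation_rate(citations_in_year, citable_items, [year])


# Hypothetical journal: citations received in 2017, keyed by the
# publication year of the cited item, and citable items per year.
citations_2017 = {2017: 20, 2016: 120, 2015: 180,
                  2014: 150, 2013: 130, 2012: 120}
citable = {2017: 100, 2016: 100, 2015: 100,
           2014: 100, 2013: 100, 2012: 100}

if2 = impact_factor(citations_2017, citable, 2017)            # (120 + 180) / 200
if5 = impact_factor(citations_2017, citable, 2017, window=5)  # 700 / 500
imm = immediacy_index(citations_2017, citable, 2017)          # 20 / 100
print(if2, if5, imm)
```

In this sketch the numerator dictionary would count citations to all items, while the denominator dictionary would count citable items only, mirroring the asymmetry between A and B.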

Journal self-citation refers to the case where a paper published in the journal J is cited in the same journal. The Journal Citation Reports (JCR) also reports the impact factor without self cites, which is obtained after excluding journal self-citations. If the difference between the impact factor and the impact factor without self cites is significantly large for a certain journal, that journal is sometimes excluded from the JCR list.

The cited half-life is calculated using the number of citations received in the year X by all items published in a certain journal in all years. For example, let us suppose that the journal J received 1,285 citations in 2017. In

A problem with the impact factor is that it varies widely among academic disciplines. For that reason, the JCR classifies journals by subject category and provides several metrics representing each category. The median impact factor is that of the journal placed precisely in the middle when the journals in a certain category are arranged in the order of their impact factors. When the total number of journals in the category, N, is an odd number, it is the impact factor of the [1+(N-1)/2]-th journal. When N is even, it is the average of the impact factors of the (N/2)-th and [1+N/2]-th journals.
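The odd and even cases of the median impact factor can be written out directly. The following is a small sketch with hypothetical impact factors; the function name is an assumption made for illustration:

```python
def median_impact_factor(impact_factors):
    """Median impact factor of a category, following the odd/even rule."""
    ranked = sorted(impact_factors, reverse=True)
    n = len(ranked)
    if n == 0:
        raise ValueError("the category contains no journals")
    if n % 2 == 1:
        # The [1 + (N - 1) / 2]-th journal in 1-based counting.
        return ranked[(n - 1) // 2]
    # Average of the (N / 2)-th and [1 + N / 2]-th journals (1-based).
    return (ranked[n // 2 - 1] + ranked[n // 2]) / 2


odd_category = [3.0, 1.0, 2.0]        # hypothetical values
even_category = [4.0, 1.0, 2.0, 3.0]  # hypothetical values
print(median_impact_factor(odd_category), median_impact_factor(even_category))
```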

The aggregate impact factor is obtained by dividing the total number of citations received in the year X by all items published in all journals of a certain category in the years (X-1) and (X-2) by the total number of citable items published in all journals of that category in the same two years. Since the distribution of impact factors is not symmetric but highly skewed, the aggregate impact factor tends to be substantially larger than the median impact factor, as can be seen in

As we mentioned already, there is a problem with the impact factor in that it shows large variations among academic disciplines. In

The impact factor is the arithmetic mean of the number of citations received by the items published in a certain journal. However, it is well known that the distribution of the number of citations within a given journal is highly skewed. As a result, the impact factor tends to overestimate the importance of individual papers: most papers are cited substantially less than the journal impact factor indicates. Therefore, it is not accurate to judge the quality of an individual paper or researcher based on the journal impact factor.
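The effect of a skewed citation distribution can be illustrated with a toy example. The citation counts below are invented for illustration and do not describe any real journal:

```python
from statistics import mean, median

# Hypothetical citation counts for ten papers in one journal;
# a single highly cited paper dominates the total.
citations_per_paper = [0, 0, 1, 1, 2, 2, 3, 4, 7, 80]

journal_mean = mean(citations_per_paper)       # what an impact-factor-like average reports
typical_paper = median(citations_per_paper)    # what a typical paper actually receives
share_below_mean = (
    sum(c < journal_mean for c in citations_per_paper) / len(citations_per_paper)
)

print(journal_mean, typical_paper, share_below_mean)
```

In this made-up case nine of the ten papers are cited less often than the journal average, which is the pattern described above.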

As competition among scholarly journals becomes stronger, some journal editors adopt policies that deliberately manipulate the journal impact factor. One ethically troubling practice is to induce authors to add journal self-citations. Publishing more review papers than necessary and deliberately publishing papers that are likely to be highly cited at the beginning of a year are similar practices. Such behavior arises because too much importance is attached to the impact factor, and it distorts the metric unfairly.

The Eigenfactor score and the article influence score were developed by Bergstrom et al. [

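Let $Z_{ij}$ denote the number of citations made in the year X by the journal $j$ to items published in the journal $i$ during the five preceding years, with journal self-citations excluded by setting $Z_{ii}=0$. Normalizing each column gives the matrix

$$H_{ij} = \frac{Z_{ij}}{\sum_{k} Z_{kj}},$$

where $H_{ij}$ is the fraction of the citations made by the journal $j$ that go to the journal $i$. The article vector $a_{i}$ is the number of items published in the journal $i$ during the five-year window divided by the total number of items published in all journals during the same period. Journals that cite no other journals, for which $\sum_{k} Z_{kj}=0$, are called dangling nodes. We then define the matrix $H^{*}$, which is obtained by replacing all columns corresponding to the dangling nodes by the article vector $a_{i}$. The matrix $P$ is defined by

$$P_{ij} = \alpha H^{*}_{ij} + (1-\alpha)\, a_{i},$$

where α is an appropriate constant and is usually selected to be 0.85. The journal influence vector, $\pi^{*}_{i}$, is the eigenvector of $P$ with the largest eigenvalue, normalized so that $\sum_{i} \pi^{*}_{i} = 1$. The Eigenfactor score of the journal $i$ is defined by

$$\mathrm{EF}_{i} = 100\, \frac{\sum_{j} H_{ij}\, \pi^{*}_{j}}{\sum_{k} \sum_{j} H_{kj}\, \pi^{*}_{j}}.$$

According to this definition, the sum of all Eigenfactor scores for all journals in the database is equal to 100. Since this quantity is not normalized by the total number of papers published in a given journal, it tends to be larger for journals publishing a larger number of papers, if all other conditions are the same. A useful characteristic of the Eigenfactor is that it makes it possible to compare journals belonging to different academic disciplines directly because those differences are adjusted for in this metric. The article influence score $\mathrm{AI}_{i}$ is obtained by dividing the Eigenfactor score by the corresponding component of the article vector:

$$\mathrm{AI}_{i} = 0.01\, \frac{\mathrm{EF}_{i}}{a_{i}}.$$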

This quantity can be used as an alternative to the impact factor. The mean article in the entire JCR database has an article influence of 1.
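The calculation can be sketched with a plain power iteration. The sketch below is not the production algorithm; it assumes the commonly described steps: a cross-citation matrix `Z` with `Z[i][j]` counting citations from journal j to journal i, self-citations set to zero, column normalization, dangling columns replaced by the article vector, and a damping constant of 0.85. The function name, the three-journal matrix, and the article counts are hypothetical.

```python
def eigenfactor_scores(Z, articles, alpha=0.85, tol=1e-12, max_iter=1000):
    """Return (Eigenfactor scores, article influence scores) for a small
    cross-citation matrix Z, where Z[i][j] = citations from journal j to i."""
    n = len(Z)
    # Exclude journal self-citations by zeroing the diagonal.
    Z = [[0.0 if i == j else float(Z[i][j]) for j in range(n)] for i in range(n)]
    col_sums = [sum(Z[i][j] for i in range(n)) for j in range(n)]

    # Article vector: each journal's share of all published items.
    total_articles = sum(articles)
    a = [x / total_articles for x in articles]

    # Column-normalized matrix H; columns of dangling nodes stay zero.
    H = [[Z[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # H*: dangling columns are replaced by the article vector.
    H_star = [[H[i][j] if col_sums[j] else a[i] for j in range(n)]
              for i in range(n)]

    # Power iteration for the leading eigenvector of
    # P = alpha * H* + (1 - alpha) * a e^T, keeping sum(pi) = 1.
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new_pi = [alpha * sum(H_star[i][j] * pi[j] for j in range(n))
                  + (1 - alpha) * a[i] for i in range(n)]
        delta = sum(abs(x - y) for x, y in zip(new_pi, pi))
        pi = new_pi
        if delta < tol:
            break

    # Eigenfactor: H applied to the influence vector, scaled to sum to 100.
    h_pi = [sum(H[i][j] * pi[j] for j in range(n)) for i in range(n)]
    norm = sum(h_pi)
    ef = [100 * x / norm for x in h_pi]
    # Article influence: Eigenfactor per share of published items.
    ai = [0.01 * e / share for e, share in zip(ef, a)]
    return ef, ai


# Hypothetical three-journal database; the third journal cites nobody,
# so its column is a dangling node.
Z = [[5, 10, 0],
     [20, 3, 0],
     [10, 5, 0]]
articles = [100, 200, 50]

ef, ai = eigenfactor_scores(Z, articles)
print(ef, ai)
```

Under these assumptions the Eigenfactor scores sum to 100 and the article-weighted mean of the article influence scores is 1, in line with the properties stated above.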

In this section, we review three journal metrics provided by the Scopus database, which are the CiteScore, the Source Normalized Impact per Paper (SNIP), and the SCImago Journal Rank (SJR).

The CiteScore is very similar to the impact factor. It is calculated using the Scopus data and is defined as the average number of citations per item received by the items published in the journal in the three preceding years, rather than in the two preceding years as in the case of the impact factor. Another difference from the impact factor is that both the numerator and the denominator include all document types.

The SNIP was proposed by Moed [

SNIP = RIP/RDCP,

where the acronyms RIP and RDCP stand for “raw impact per paper” and “relative database citation potential,” respectively. The RIP is the number of citations received in the year X by the papers published in a certain journal in the three preceding years, (X-1), (X-2), and (X-3), divided by the total number of those papers. It is similar to the impact factor, except that a 3-year citation window is used and only citations of papers are included, while those of errata and editorials are excluded. In order to define the RDCP, one first needs to define the DCP, the database citation potential. Consider the papers that, in the year X, cited papers published in a certain journal in the three preceding years, (X-1), (X-2), and (X-3). Among the references of these citing papers, only those published during the same 3-year period are counted. The DCP is obtained by dividing the total number of those references by the number of citing papers. In this calculation, only references to journals belonging to the database are included and other journals are ignored. The RDCP is obtained by normalizing the DCP by the median DCP of the database.
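The chain from RIP and DCP to SNIP can be followed with a few lines of arithmetic. This is a sketch of the description above only; the function names and every number are hypothetical.

```python
from statistics import median


def raw_impact_per_paper(citations_to_window, papers_in_window):
    """RIP: citations in year X to papers of the 3-year window, per paper."""
    return citations_to_window / papers_in_window


def database_citation_potential(in_window_reference_counts):
    """DCP: average number of in-window, in-database references per citing
    paper. Each list entry is the count for one citing paper."""
    return sum(in_window_reference_counts) / len(in_window_reference_counts)


def snip(rip, dcp, all_journal_dcps):
    """SNIP = RIP / RDCP, where RDCP is the DCP divided by the database median."""
    rdcp = dcp / median(all_journal_dcps)
    return rip / rdcp


# Hypothetical journal: 300 citations to 150 papers in the window.
rip = raw_impact_per_paper(300, 150)
# Hypothetical citing papers with 4, 6, 8 and 6 qualifying references each.
dcp = database_citation_potential([4, 6, 8, 6])
# Hypothetical DCP values of all journals in the database.
journal_snip = snip(rip, dcp, [2.0, 3.0, 6.0])
print(rip, dcp, journal_snip)
```

Here the journal sits in a field that cites twice as densely as the database median, so its raw impact of 2.0 is scaled down to a SNIP of 1.0.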

The SJR is provided by the Scopus together with the SNIP [_{i}_{i}_{i}

where the constants _{i}

where _{i}

The h-index was proposed by Hirsch in 2005 [

Since the h-index is obtained by using the total number of citations of each paper, it never decreases with time. It has the shortcoming that researchers with a small number of very influential papers have low indices. In order to correct this shortcoming, Leo Egghe proposed a modified index named the g-index. This index is defined as the largest number g such that the g most cited papers in a certain group of papers together received at least g² citations. The g-index is never smaller than the h-index. In addition to the h-index, Google Scholar provides a metric named the i10-index, which is the total number of papers authored by a certain researcher that have been cited at least 10 times.
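The three author-level indices can be computed from a list of per-paper citation counts. The sketch below uses the standard definition of the h-index (the largest h such that h papers have at least h citations each) and takes the g-index as the largest g for which the g most cited papers have at least g² citations in total; the citation counts are hypothetical.

```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def g_index(citations):
    """Largest g such that the g most cited papers together have at least
    g * g citations (g is capped at the number of papers)."""
    ranked = sorted(citations, reverse=True)
    total = 0
    g = 0
    for rank, count in enumerate(ranked, start=1):
        total += count
        if total >= rank * rank:
            g = rank
    return g


def i10_index(citations):
    """Number of papers cited at least 10 times."""
    return sum(count >= 10 for count in citations)


# Hypothetical researcher with two highly cited papers.
paper_citations = [50, 20, 6, 5, 3, 1, 0]
print(h_index(paper_citations), g_index(paper_citations), i10_index(paper_citations))
```

The two highly cited papers barely affect the h-index but raise the g-index, which is the behavior the g-index was designed to capture.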

In this review, we have surveyed the definitions and the characteristics of various kinds of metrics used for the quantitative evaluation of scholarly journals. All of these metrics are obtained from the analysis of citation data. In addition to the metrics surveyed here, new kinds of metrics continue to be devised. More recently, interest in alternative metrics, or ‘altmetrics,’ which go beyond conventional citation analysis, has been growing rapidly. We emphasize, however, that no metric is perfect and that all metrics have limits and problems. Therefore, one should not rely too heavily on quantitative measures when evaluating journals, papers, researchers, and institutions.

No potential conflict of interest relevant to this article was reported.

Number of citations received in 2017 and its cumulative percentage, classified by the publication year of the cited items

Publication year of cited items | 2017 | 2016 | 2015 | 2014 | 2013 | 2012 | 2011 | 2010 | 2009 | 2008 | 2007–all |
---|---|---|---|---|---|---|---|---|---|---|---|
Citations in 2017 | 23 | 65 | 147 | 138 | 58 | 44 | 51 | 45 | 68 | 62 | 584 |
Cumulative percentage | 1.79 | 6.85 | 18.29 | 29.03 | 33.54 | 36.97 | 40.93 | 44.44 | 49.73 | 54.55 | 100 |
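The cited half-life for the data in this table can be estimated as follows. This is a minimal sketch that assumes linear interpolation within the publication year in which the cumulative share of citations crosses 50%; the function name is hypothetical, and the exact rounding rule used in the JCR is not taken from the source.

```python
def cited_half_life(citations_by_age):
    """Estimate the cited half-life in years.

    citations_by_age[0] holds citations to items published in the census
    year, citations_by_age[1] to items one year older, and so on.
    """
    total = sum(citations_by_age)
    if total == 0:
        raise ValueError("the journal received no citations")
    cumulative = 0
    for age, count in enumerate(citations_by_age):
        previous = cumulative
        cumulative += count
        if 2 * cumulative >= total:
            # Whole years below 50%, plus the interpolated fraction of
            # the year in which the 50% mark is crossed.
            return age + (total / 2 - previous) / count
    raise ValueError("unreachable for a non-empty citation list")


# Citations received in 2017, from the 2017 column to the "2007-all" column.
citations_2017_by_age = [23, 65, 147, 138, 58, 44, 51, 45, 68, 62, 584]
half_life = cited_half_life(citations_2017_by_age)
print(round(half_life, 1))
```

The last entry aggregates all older years, so the estimate is meaningful only when the 50% mark is reached before that entry, as it is here.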

The aggregate impact factor, the median impact factor, the aggregate cited half-life, and the average number of citations per paper for several subject categories listed in the Journal Citation Reports in 2011 and 2013

Subject category | Aggregate impact factor, 2011 | Aggregate impact factor, 2013 | Median impact factor, 2011 | Median impact factor, 2013 | Aggregate cited half-life (years), 2011 | Aggregate cited half-life (years), 2013 | Average number of citations per paper, 2011 | Average number of citations per paper, 2013 |
---|---|---|---|---|---|---|---|---|
Cell biology | 5.760 | 5.816 | 3.263 | 3.333 | 6.9 | 7.2 | 53.4 | 55.0 |
Chemistry, multidisciplinary | 4.738 | 5.222 | 1.316 | 1.401 | 5.9 | 5.6 | 40.9 | 44.6 |
Nanoscience & nanotechnology | 4.698 | 4.902 | 1.918 | 1.768 | 3.8 | 4.1 | 35.5 | 39.1 |
Astronomy & astrophysics | 4.242 | 4.462 | 1.683 | 1.676 | 6.8 | 7.0 | 49.3 | 53.2 |
Materials science, multidisciplinary | 3.107 | 3.535 | 1.132 | 1.380 | 5.2 | 5.4 | 32.2 | 34.8 |
Physics, multidisciplinary | 2.680 | 2.953 | 0.983 | 1.300 | 7.7 | 8.0 | 30.4 | 33.8 |
Engineering, mechanical | 1.232 | 1.573 | 0.743 | 0.889 | 7.6 | 8.0 | 25.2 | 28.1 |
Mathematics | 0.709 | 0.729 | 0.561 | 0.582 | > 10.0 | > 10.0 | 19.8 | 21.0 |