
Modeling Scientific Citation

December 16, 2011

After publishing my first few papers, I was always pleased when I saw that they were cited by other authors. Most scientists don't work just for the money. They can't say that they work for the fame, because fame is elusive for the majority of us who practice what Thomas Kuhn called "normal science." They work to add to the corpus of scientific knowledge and thereby improve the human condition. Citations prove that what they did was valuable.

It was no surprise to me that citations to my papers would appear in the first few years after publication, and then essentially vanish. Science advances at a rapid pace, so much so that most universities have limits on how long a student can work towards a Ph.D. degree. It's usually about five to seven years, although the time is often counted just from completion of qualifying exams. Five-year-old research is generally old news.

A physicist would expect that the citation frequency for a particular paper would follow something like an exponential decay. Not even Einstein's citations are immune to this principle. His special relativity paper, "Zur Elektrodynamik bewegter Körper,"[1] was published on September 26, 1905, a year known as his Annus Mirabilis ("year of wonders"). Aside from a citation blip around his award of a Nobel Prize in 1921, the citation decay with time is clearly evident.

Citations for Einstein's relativity paper as a function of time.

A form of citation analysis using the Google Labs Ngram Viewer. This shows the frequency of occurrence of the phrase, "Elektrodynamik bewegter Körper" in the German literary corpus. The corpus includes all books, journals, newspapers and magazines. There's a peak at the publication of Einstein's 1905 paper, and a blip at his award of the Nobel Prize. Otherwise, the exponential decay can be seen. These data are smoothed over a three year interval, which accounts for the slight phase-shifting of dates. Note the post-World War II baseline shift with the advent of Big Science. (Trend via Google Ngram Viewer).
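For readers who want to put a number on such a decay, here is a minimal sketch of how a decay time might be estimated from yearly citation counts. It assumes a pure exponential, c(t) = c0·exp(-t/τ), and fits a straight line to the logarithm of the counts. The function name and the data are made up for illustration; they are not the Ngram data plotted above.

```python
import math

def fit_exponential_decay(years, counts):
    """Fit counts ~ c0 * exp(-t / tau) by least squares on log(counts).

    Assumes every count is strictly positive (log of zero is undefined).
    Returns the pair (c0, tau).
    """
    xs = list(years)
    ys = [math.log(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -1 / tau for a pure decay
    intercept = mean_y - slope * mean_x  # equals log(c0)
    return math.exp(intercept), -1.0 / slope

# Synthetic example: 40 citations in year zero, decay time of 3 years.
years = list(range(10))
counts = [40.0 * math.exp(-t / 3.0) for t in years]
c0, tau = fit_exponential_decay(years, counts)
half_life = tau * math.log(2)  # years for the citation rate to halve
print(f"c0 = {c0:.1f}, tau = {tau:.2f} years, half-life = {half_life:.2f} years")
```

Real citation records are noisy, contain years with zero citations, and show blips like the Nobel Prize bump, so a fit of this kind is only a rough guide.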


Three physicists from the University of Fribourg (Fribourg, Switzerland) have just published a study that investigates the hypothesis that highly cited papers have a higher probability of being cited in other papers. They further attempt to model scientific paper citations by incorporating this aging effect.[2-5] They limited the scope of their study to 450,000 papers published from 1893 to 2009 in American Physical Society journals.[4]

The simple hypothesis without aging would yield a power-law distribution for the number of citations among papers, but the actual distribution is different. The reason is that old science, just like old news, is not considered relevant. The same aging effect appears on the Internet, where old web pages attract fewer links as a function of time.[4]

The Swiss physicists found that the frequency of new citations that a paper receives declines dramatically after just a few years. Putting an aging correction factor into the general citation model gave a generic model that quite closely predicted the citation distribution of the APS journal papers.[4] As can be seen in the figure, the citation frequency appears to be an exponential decay at longer times.

Citation distribution

Time decay of paper relevance, which is a function of its citations in 91-day intervals, as a function of time. The data are grouped into three sets, as defined in ref. 3. This is a simplified version of fig. 1 of ref. 3.

(Via the arXiv Preprint Server)
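The idea of a "rich get richer" citation rule with an aging correction can be tried out in a few lines of code. The following is only a toy sketch, not the model of refs. 2-3. I assume that each new paper cites earlier papers with a probability proportional to (citations + 1), multiplied by an exponential aging factor exp(-age/τ). The function name, the parameter values and the exponential form of the aging factor are all my own choices for illustration.

```python
import math
import random

def simulate_citations(n_papers, refs_per_paper=3, tau=None, seed=0):
    """Grow a toy citation network, one paper per time step.

    Each new paper makes `refs_per_paper` citations to earlier papers,
    chosen with weight (citations + 1). If `tau` is given, the weight is
    also multiplied by exp(-age / tau), an assumed aging factor.
    Repeat citations of the same paper are allowed, for simplicity.
    Returns the list of citation counts, indexed by publication order.
    """
    rng = random.Random(seed)
    citations = [0]  # paper 0 starts the network
    for t in range(1, n_papers):
        weights = []
        for i, k in enumerate(citations):
            w = k + 1.0
            if tau is not None:
                w *= math.exp(-(t - i) / tau)  # older papers count for less
            weights.append(w)
        for target in rng.choices(range(t), weights=weights, k=refs_per_paper):
            citations[target] += 1
        citations.append(0)  # the new paper itself, not yet cited
    return citations

plain = simulate_citations(1000)          # preferential attachment only
aged = simulate_citations(1000, tau=20)   # with the assumed aging factor

for label, result in (("no aging", plain), ("aging", aged)):
    uncited = sum(1 for k in result if k == 0)
    print(f"{label}: most-cited paper has {max(result)} citations, "
          f"{uncited} of {len(result)} papers are uncited")
```

Without aging, the earliest papers keep collecting citations for as long as the simulation runs. With aging, a paper's window of attention closes after a few multiples of τ, which caps how far the most-cited papers can run away from the rest.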


One of the more interesting pieces of data in the paper is the revelation that 60,000 of the 450,000 papers (13%) weren't cited at all, at least not in the APS journals that comprised the dataset.[2-3] So, you spend many hundreds of man hours on your research, many tens of hours preparing the manuscript and jousting with the referees, and no one really cares.

References:

  1. Albert Einstein, "Zur Elektrodynamik bewegter Körper," Annalen der Physik, vol. 17, no. 10 (September 26, 1905), pp. 891-921. Digitized version at Wikilivres; English translation: Megh Nad Saha, Translator, "On the Electrodynamics of Moving Bodies," in The Principle of Relativity: Original Papers by A. Einstein and H. Minkowski, University of Calcutta, 1920, pp. 1–34; Digitized version at Wikisource.
  2. Matúš Medo, Giulio Cimini, and Stanislao Gualdi, "Temporal Effects in the Growth of Networks," Physical Review Letters, vol. 107, no. 23 (December 2, 2011), Document No. 238701 (4 pages).
  3. Matúš Medo, Giulio Cimini, and Stanislao Gualdi, "Temporal Effects in the Growth of Networks," arXiv Preprint Server, September 26, 2011.
  4. Michael Schirber, "You Don't Cite Me Anymore," American Physical Society, December 1, 2011.


Linked Keywords: Citation; scientist; Thomas Kuhn; normal science; corpus; human condition; university; Ph.D. degree; qualifying exam; physicist; exponential decay; Einstein; special relativity; Annus Mirabilis; Nobel Prize; Google Labs Ngram Viewer; World War II; Big Science; University of Fribourg (Fribourg, Switzerland); hypothesis; American Physical Society journals; power-law distribution; Internet; Swiss; arXiv Preprint Server; referee.