Tikalon Blog is now in archive mode.
Modeling Scientific Citation
December 16, 2011
After publishing my first few papers, I was always pleased when I saw that they were cited by other authors. Most scientists don't work just for the money. They can't say that they work for the fame, because fame is elusive for the majority of us who practice what Thomas Kuhn called "normal science." They work to add to the corpus of scientific knowledge and thereby improve the human condition. Citations prove that what they did was valuable.
It was no surprise to me that citations to my papers would appear in the first few years after publication, and then essentially vanish. Science advances at a rapid pace, so much so that most universities have limits on how long a student can work towards a Ph.D. degree. It's usually about five to seven years, although the time is often counted just from completion of qualifying exams. Five-year-old research is generally old news.
A physicist would expect that the citation frequency for a particular paper would follow something like an exponential decay. Not even Einstein's citations are immune to this principle. His special relativity paper, "Zur Elektrodynamik bewegter Körper,"[1] was published on September 26, 1905, a year known as his Annus Mirabilis ("year of wonders"). Aside from a citation blip around his award of a Nobel Prize in 1921, the citation decay with time is clearly evident.
A form of citation analysis using the Google Labs Ngram Viewer. This shows the frequency of occurrence of the phrase, "Elektrodynamik bewegter Körper," in the German literary corpus. The corpus includes all books, journals, newspapers and magazines. There's a peak at the publication of Einstein's 1905 paper, and a blip at his award of the Nobel Prize. Otherwise, the exponential decay can be seen. These data are smoothed over a three-year interval, which accounts for the slight phase-shifting of dates. Note the post-World War II baseline shift with the advent of Big Science. (Trend via Google Ngram Viewer).
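The exponential decay expected above can be made concrete with a small sketch. It assumes citations per year follow c(t) = c0·exp(-t/τ) and recovers τ by a straight-line fit to the logarithm of the counts. The function name `fit_exponential_decay` and the numbers are made up for illustration; they are not data from Einstein's paper or from any of the references.

```python
import math


def fit_exponential_decay(times, counts):
    """Least-squares fit of ln(count) = ln(c0) - t / tau; returns (c0, tau)."""
    logs = [math.log(c) for c in counts]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic, made-up data: 40 citations/year at t = 0, decaying with tau = 3 years.
years = list(range(10))
synthetic = [40.0 * math.exp(-t / 3.0) for t in years]

c0, tau = fit_exponential_decay(years, synthetic)
half_life = tau * math.log(2)  # time for the citation rate to halve
print(f"c0 = {c0:.1f} per year, tau = {tau:.2f} years, half-life = {half_life:.2f} years")
```

Real citation counts are noisy and include zeros, which a logarithmic fit cannot handle directly, so this is only a starting point for looking at actual data.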
Three physicists from the University of Fribourg (Fribourg, Switzerland) have just published a study that investigates the hypothesis that highly cited papers have a higher probability of being cited in other papers. They further attempt to model scientific paper citations by incorporating the aging effect described above.[2-5] They limited the scope of their study to 450,000 papers published from 1893 to 2009 in American Physical Society journals.[4]
The simple hypothesis without aging would yield a power-law distribution for the number of citations among papers, but the actual distribution is different. The reason is that old science, just like old news, is not considered relevant. The same aging effect appears on the Internet, where old web pages attract fewer links as a function of time.[4]
The Swiss physicists found that the frequency of new citations that a paper receives declines dramatically after just a few years. Putting an aging correction factor into the general citation model gave a generic model that quite closely predicted the citation distribution of the APS journal papers.[4] As can be seen in the figure, the citation frequency appears to decay exponentially at longer times.
Time decay of paper relevance, which is a function of its citations in 91-day intervals, as a function of time. The data are grouped into three sets, as defined in ref. 3. This is a simplified version of fig. 1 of ref. 3. (Via the arXiv Preprint Server)
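The idea of a "rich get richer" citation model with an aging correction can be illustrated with a toy simulation. This is a minimal sketch under my own assumptions, not the model of ref. 2: each new paper picks earlier papers with probability proportional to (citations + 1) multiplied by an exponential aging factor, and the names `simulate_citations`, `refs_per_paper` and `decay_time` are hypothetical. The functional form of the aging factor used in the published model may differ.

```python
import math
import random


def simulate_citations(n_papers, refs_per_paper=3, decay_time=None, seed=0):
    """Grow a toy citation network one paper at a time.

    decay_time=None disables aging (plain preferential attachment).
    Returns a list where entry i is the citation count of paper i.
    """
    rng = random.Random(seed)
    citations = []
    for t in range(n_papers):
        if citations:
            weights = []
            for i, count in enumerate(citations):
                age = t - i  # time steps since paper i appeared
                aging = 1.0 if decay_time is None else math.exp(-age / decay_time)
                # "+ 1" lets uncited papers be picked at all (assumption).
                weights.append((count + 1) * aging)
            n_refs = min(refs_per_paper, len(citations))
            # Sampling with replacement: a simplification, so a paper may
            # occasionally cite the same target twice.
            for target in rng.choices(range(len(citations)), weights=weights, k=n_refs):
                citations[target] += 1
        citations.append(0)
    return citations


no_aging = simulate_citations(500)
with_aging = simulate_citations(500, decay_time=20.0)
print("max citations, no aging:  ", max(no_aging))
print("max citations, with aging:", max(with_aging))
print("uncited fraction, with aging:", with_aging.count(0) / len(with_aging))
```

Comparing the two runs gives a feel for how an aging factor changes the distribution of citation counts relative to preferential attachment alone. The numbers it prints say nothing about the APS dataset itself.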
One of the more interesting pieces of data in the paper is the revelation that 60,000 of the 450,000 papers (13%) weren't cited at all, at least in the APS journals that comprised the dataset.[2-3] So, you spend many hundreds of man-hours on your research, many tens of hours preparing the manuscript and jousting with the referees, and no one really cares.
References:
1. Albert Einstein, "Zur Elektrodynamik bewegter Körper," Annalen der Physik, vol. 17, no. 10 (September 26, 1905), pp. 891-921. Digitized version at Wikilivres; English translation: Megh Nad Saha, Translator, "On the Electrodynamics of Moving Bodies," in The Principle of Relativity: Original Papers by A. Einstein and H. Minkowski, University of Calcutta, 1920, pp. 1-34; Digitized version at Wikisource.
2. Matúš Medo, Giulio Cimini, and Stanislao Gualdi, "Temporal Effects in the Growth of Networks," Physical Review Letters, vol. 107, no. 23 (December 2, 2011), Document No. 238701 (4 pages).
3. Matúš Medo, Giulio Cimini, and Stanislao Gualdi, "Temporal Effects in the Growth of Networks," arXiv Preprint Server, September 26, 2011.
4. Michael Schirber, "You Don't Cite Me Anymore," American Physical Society, December 1, 2011.