Tikalon Blog is now in archive mode.
A Gigabunch of T-Rex
May 31, 2021
Everyone has heard the statement that "There are three kinds of lies: lies, damned lies, and statistics." The origin of this saying is unknown, but it was
popularized by
Mark Twain (1835-1910) and often falsely
attributed to him. It does sound like something that Twain would say.
Scientists would object to having
statistics conflated with
falsehood, since statistics are so important in the practice of
science.
Scientific knowledge comes from
experiment, but experiments are subject to
observational errors,
instrument calibration errors, and occasionally a scientist's
selection bias. However,
data analysis using
statistics and
regression analysis can coax a reasonable result from noisy experimental
data.
Statistical analysis benefits from many observations, since the
deviation of the measured mean from the true value scales as the inverse
square root of the number of measured values. An unfortunate consequence of this law is that you get just about three times more
precision in a hundred trials over ten. Most importantly, statistics are not
robust against bad data, so scientists need to design experiments to produce data that are relatively free from interfering
variables.
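The square-root scaling is easy to check numerically. The following is a minimal Python sketch, not part of any analysis cited here; the Gaussian noise model, the number of simulated experiments, and the function name `spread_of_mean` are all illustrative assumptions.

```python
import random
import statistics

def spread_of_mean(n_measurements, n_experiments=2000, seed=1):
    """Standard deviation of the sample mean over many simulated experiments.

    Each simulated experiment averages n_measurements draws from a
    unit-variance Gaussian (an assumed noise model, for illustration only).
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean([rng.gauss(0.0, 1.0) for _ in range(n_measurements)])
        for _ in range(n_experiments)
    ]
    return statistics.stdev(means)

spread_10 = spread_of_mean(10)
spread_100 = spread_of_mean(100)
improvement = spread_10 / spread_100

print(f"precision gain from 10 to 100 measurements: {improvement:.2f}")
print(f"square root of 100/10: {(100 / 10) ** 0.5:.2f}")
```

The simulated gain should land close to the square root of ten, a little more than three, which is the "three times more precision" mentioned above.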
Victorian polymath, Francis Galton (1822-1911).
This photograph shows that Galton was clearly an egghead, a term for an intellectual with a high forehead, although he would more likely be called by the less pejorative term, boffin, in his native country, England.
Adlai Stevenson II (1900-1965), who had the unfortunate experience of twice being defeated in a landslide in US presidential elections by Dwight Eisenhower (1890-1969), was called an egghead by Richard Nixon (1913-1994), who ran as Eisenhower's vice-presidential candidate.
(Wikimedia Commons image, modified for artistic effect.)
As I wrote in a
recent article (Educated Guessing, December 7, 2020), you can use statistics to extract
knowledge from
opinion, a
phenomenon called the "Wisdom of the Crowd." This was demonstrated in 1906 by the
Victorian polymath,
Francis Galton (1822-1911).[1] Galton
observed a contest at a
livestock fair in which 787 visitors guessed the
weight of a particular
ox.[2] Although there was a wide
range of
estimates, the
mean of the 787 guesses was within a
pound of the actual weight, 0.1% of the actual value.[2]
Ornithologists have
cataloged bird species for
centuries, and the question arises as to how many bird species exist. Statistics can easily address this problem, as shown in the figure below in which I've
graphed the
fraction of total discovered species as a
function of year. While
Charles Darwin (1809-1882) discovered
many new species of finches on the
second voyage of the HMS Beagle (1831-1836), it's rare to find a new species today. The red line is a
fit to the statistical
error function, and selecting a
future date will tell us how difficult it would be to find a new bird species.
Fraction of total discovered bird species as a function of year and an error function fit.
The fitted function is (1+erf((year-1845)/60))/2, and using 2021 as a date shows that 99.998% of bird species have been cataloged. Any species found now is a rare bird indeed.
(My data was from ref. 3, which now has link rot of its figures, but I saved an image of this important figure. Graphed using Gnumeric.)
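The fitted function in the caption can be evaluated directly for any year. This is a small Python sketch using the fit parameters quoted there (a midpoint of 1845 and a width of 60 years); the function name `fraction_cataloged` is just an illustrative choice.

```python
import math

def fraction_cataloged(year, midpoint=1845.0, width=60.0):
    """Fitted fraction of bird species cataloged by a given year.

    Implements (1 + erf((year - midpoint) / width)) / 2 with the fit
    parameters quoted in the figure caption.
    """
    return (1.0 + math.erf((year - midpoint) / width)) / 2.0

for year in (1800, 1845, 1900, 2021):
    print(year, f"{100 * fraction_cataloged(year):.3f}%")
```

By construction the fit passes through 50% at the midpoint year, 1845, and it flattens toward 100% as the date moves into the present.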
One of the most interesting statistical arguments about species is an
estimate of
human extinction. This so-called
Doomsday argument is not based on the likelihood of bad things happening; it merely looks at the
numbers. The estimate by
Princeton University astrophysicist,
John Richard Gott III (b. 1947), is that
humanity has a 95%
probability of lasting another 5,100 to 7.8 million years.[4]
The argument is based on an extension of the
Copernican principle, that man does not have a favored position in the
universe, to the
idea that man doesn't have a favored position in
time. This means that our present
population is at a
random position on a
bell curve of time. This estimate has a very large range, and the low estimate is quite a bit shorter than the quarter of a million years that humans have already existed. The argument is
controversial.[5]
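The arithmetic behind such an interval is short enough to sketch. If we assume that we are observing humanity at a uniformly random moment of its total lifetime, then with confidence c the elapsed fraction lies between (1-c)/2 and (1+c)/2, and the remaining time lies between past×(1-c)/(1+c) and past×(1+c)/(1-c). In the Python sketch below, the past age of 200,000 years is an assumption chosen because it reproduces the quoted range, and the function name `gott_interval` is hypothetical.

```python
def gott_interval(past_duration, confidence=0.95):
    """Bounds on the remaining duration under the random-observation assumption.

    With probability `confidence`, the elapsed fraction of the total lifetime
    lies in the central band ((1 - c) / 2, (1 + c) / 2), which translates
    into the two bounds returned here.
    """
    c = confidence
    low = past_duration * (1.0 - c) / (1.0 + c)
    high = past_duration * (1.0 + c) / (1.0 - c)
    return low, high

# Assumed past age of our species, in years (not stated in the text above).
PAST_YEARS = 200_000

low, high = gott_interval(PAST_YEARS)
print(f"remaining time: {low:,.0f} to {high:,.0f} years at 95% confidence")
```

At 95% confidence the two factors are 1/39 and 39, which is why the interval spans more than three orders of magnitude.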
One interesting statistical law in
population ecology is the relationship between an
organism's mass and the
number of its individuals known as
Damuth's law. This law
quantifies the observation that the
population of small
animals is larger than that of larger animals; viz., there are many more
mice than
elephants.[6] Damuth's law is simply expressed as
d = aW^(-3/4),
in which d is the
average density of the population, a is a
constant, and W is the average body mass of the organism. As the figure below shows, there is considerable
variance in the data. As an example,
jaguars and
hyenas are about the same weight, but there are fifty hyenas for every jaguar. The
regression r-value is -0.86, a strong negative correlation, which means that the overall trend holds despite the scatter.
Schematic representation of the data from Damuth's paper.[6]
The shaded area is the approximate range of data values. This is quite a large range, since these are logarithmic scales.
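Damuth's law is usually written as a power law with an exponent close to -3/4, so the constant a cancels when two species are compared. The Python sketch below takes that exponent as given; the body masses are rough, illustrative values of my own and not data from Damuth's paper, and the function name `relative_density` is hypothetical.

```python
def relative_density(mass_a, mass_b, exponent=-0.75):
    """Ratio of expected population densities d_a / d_b under d = a * W**exponent.

    The constant a cancels in the ratio, so only the body masses matter.
    """
    return (mass_a / mass_b) ** exponent

# Rough, illustrative body masses in kilograms (assumptions, not data from ref. 6).
MOUSE_KG = 0.02
ELEPHANT_KG = 4000.0

ratio = relative_density(MOUSE_KG, ELEPHANT_KG)
print(f"expected mice per elephant, per unit area: about {ratio:,.0f}")
```

Two species of equal mass give a ratio of one, so the fifty-to-one hyena and jaguar example is a measure of the scatter around the trend, not of the trend itself.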
Estimating the abundance of bird species is one thing, but estimating abundance for
extinct species is much harder to do. A team of
biologists from the
University of California, Berkeley (Berkeley, California) decided to do this for one of the most
iconic dinosaur species, the
Tyrannosaurus rex, and they used Damuth's law in their analysis.[7-9] The research team was led by
Charles Marshall,
director of the
University of California Museum of Paleontology and a
professor at Berkeley.
This is an image by William D. Matthew of the first restoration of a Tyrannosaurus.
This historical image is presently not considered to be an accurate restoration, since the skull shape is wrong, the Tyrannosaurus had two fingers (not three as pictured), and the tail was held level with the body.
(Modified Wikimedia Commons image.)
Fewer than a hundred T. rex fossils have been found since it was first described in 1905.[9] The present study found that about 2.5 billion T. rex individuals existed over the 2-1/2 million year span of their species before extinction, and there were likely about 20,000 T. rex adults living at any one time.[7-8] This
calculates to an average population density of about one T. rex per 40
square miles (100
square kilometers).[9]
The present study considers that an average adult T. rex had a weight of 5.2
tons, an average
lifespan of 28 years, a
generation period of 19 years, and a total number of generations for the entire species' existence of about 125,000.[7,9] The famous
T. rex specimen named Sue, on display at the
Field Museum of Natural History in
Chicago, measures 40-1/2 feet (12.3 meters) in length, with an estimated weight of 9 tons and a lifespan of about 33 years.[9] The fossil recovery rate is only about 1 per 80 million individuals overall, and 1 per 16,000 individuals where its fossils are most abundant, which is the
Hell Creek Formation in
Montana.[7-8]
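The headline numbers quoted above hang together as simple products and quotients. The following Python lines are a back-of-the-envelope check using only the rounded figures in this article; the implied range and the implied specimen count are derived here and are not quoted from the study.

```python
STANDING_ADULTS = 20_000        # adults alive at any one time
GENERATIONS = 125_000           # generations over the species' existence
AREA_PER_ADULT_KM2 = 100        # one adult per 100 square kilometers
INDIVIDUALS_PER_FOSSIL = 80e6   # one fossil recovered per ~80 million individuals

# Total individuals that ever lived: standing population times generations.
total_individuals = STANDING_ADULTS * GENERATIONS

# Area implied by the standing population and its average density.
implied_range_km2 = STANDING_ADULTS * AREA_PER_ADULT_KM2

# Number of recovered specimens implied by the overall recovery rate.
implied_fossils = total_individuals / INDIVIDUALS_PER_FOSSIL

print(f"total individuals: {total_individuals:,}")
print(f"implied range: {implied_range_km2:,} square kilometers")
print(f"implied specimens behind the recovery rate: about {implied_fossils:.0f}")
```

The product of 20,000 adults and 125,000 generations recovers the 2.5 billion total, and the recovery rate corresponds to a few dozen specimens, consistent with fewer than a hundred fossils having been found.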
The study team used
Monte Carlo computer simulations to determine how the uncertainties in their data affected the uncertainties in their results.[8] Among the unknowns was how
warm-blooded T. rex was.[8] Their approach was to place T. rex
halfway between a
lion and a
Komodo dragon, the largest
lizard.[8] It's the uncertainty in the density–body mass relationship of Damuth's law, rather than variance in the
paleobiological reference data, that leads to the large overall variance.[7] The
95% confidence interval of the T. rex population is from 1,300 to 328,000 individuals; so, the total number of individuals that existed over the lifetime of the species can range from 140 million to 42 billion.[8] Says Marshall,
"Our calculations depend on this relationship for living animals between their body mass and their population density, but the uncertainty in the relationship spans about two orders of magnitude... Surprisingly, then, the uncertainty in our estimates is dominated by this ecological variability and not from the uncertainty in the paleontological data we used."[8]
Although the uncertainties in the estimates are large, the procedure might be a
framework for estimating populations of other fossilized creatures.[8] It's also a way to estimate the number of missing fossil species.[8] As Marshall explains, "With these numbers, we can start to estimate how many short-lived,
geographically specialized species we might be missing in the fossil record... This may be a way of beginning to quantify what we don’t know."[8]
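The Monte Carlo idea mentioned above, drawing uncertain inputs at random many times and looking at the spread of the outputs, can be illustrated with a toy example. This Python sketch is not the study's model: it assumes a single uncertain input, a log-normal standing population with a median of 20,000 and a 95% spread of about two orders of magnitude, and it treats the 125,000 generations as exact. All names are hypothetical, and the resulting interval is not expected to match the published one.

```python
import math
import random
import statistics

MEDIAN_STANDING_ADULTS = 20_000   # central estimate quoted in the article
GENERATIONS = 125_000             # quoted number of generations, treated as exact
# Assumed spread: 95% of draws fall within a factor of 10 on either side of
# the median, which is roughly two orders of magnitude overall.
SIGMA_LOG = math.log(10) / 1.96

def simulate_totals(n_draws=20_000, seed=42):
    """Sorted Monte Carlo draws of the total number of individuals."""
    rng = random.Random(seed)
    mu = math.log(MEDIAN_STANDING_ADULTS)
    return sorted(
        rng.lognormvariate(mu, SIGMA_LOG) * GENERATIONS for _ in range(n_draws)
    )

totals = simulate_totals()
low = totals[int(0.025 * len(totals))]    # 2.5th percentile
median = statistics.median(totals)
high = totals[int(0.975 * len(totals))]   # 97.5th percentile

print(f"median total: {median:.2e}")
print(f"95% interval: {low:.2e} to {high:.2e}")
```

Even in this toy version, a two-orders-of-magnitude uncertainty in one input carries straight through to the total, which is the point Marshall makes about the ecological relationship dominating the result.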
References:
1. Francis Galton, "Vox Populi," Nature, vol. 75, no. 1949 (March 1, 1907), pp. 450-451, https://doi.org/10.1038/075450a0.
2. Graham Kendall, "How to unleash the wisdom of crowds," The Conversation, February 9, 2016.
3. Bird species discovery odds and ends, Slybird Blog, December 14, 2010.
4. J. Richard Gott III, "Implications of the Copernican principle for our future prospects," Nature, vol. 363 (May 27, 1993), pp. 315-319.
5. Carlton M. Caves, "Predicting future duration from present age: Revisiting a critical assessment of Gott's rule," arXiv, June 22, 2008.
6. John Damuth, "Population density and body size in mammals," Nature, vol. 290 (April 23, 1981), pp. 699-700.
7. Charles R. Marshall, Daniel V. Latorre, Connor J. Wilson, Tanner M. Frank, Katherine M. Magoulick, Joshua B. Zimmt, and Ashley W. Poust, "Absolute abundance and preservation rate of Tyrannosaurus rex," Science, vol. 372, no. 6539 (April 16, 2021), pp. 284-287, DOI: 10.1126/science.abc8300. This is an open access publication.
8. Robert Sanders, "How many T. rexes were there? Billions," University of California Berkeley Press Release, April 15, 2021.
9. Will Dunham, "Like Godzilla, but actually real: study shows T. rex numbered 2.5 billion," Reuters, April 16, 2021.