A simple way to improve multivariate analyses of paleoecological data sets

John Alroy

doi:10.1017/pab.2014.21

Published online by Cambridge University Press: 24 February 2015

John Alroy. Affiliation: Department of Biological Sciences, Macquarie University, New South Wales 2109, Australia. E-mail: [email protected]

Abstract

Multivariate methods such as cluster analysis and ordination are basic to paleoecology, but the messy nature of fossil occurrence data often makes it difficult to recover clear patterns. A recently described faunal similarity index based on the Forbes coefficient improves results when its complement is employed as a distance metric. This index involves adding terms to the Forbes equation and ignoring one of the counts it employs (that of species found in neither of the samples under consideration). Analyses of simulated data matrices demonstrate its advantages. These matrices include large and small samples from two partially overlapping species pools. In a cluster analysis, the widely used Dice coefficient and the Euclidean distance metric both create groupings that reflect sample size, the Simpson index suggests large differences that do not exist, and the corrected Forbes index creates groupings based strictly on true faunal overlap. In a principal coordinates analysis (PCoA) the Forbes index almost removes the sample-size signal but other approaches create a second axis strongly dominated by sample size. Meanwhile, species lists of late Pleistocene mammals from the United States capture biogeographic signals that standard ordination methods do recover, but the adjusted Forbes coefficient spaces the points out more sensibly. Finally, when biome-scale lists for living mammals are added to the data set and extinct species are removed, correspondence analysis misleadingly separates out the biome lists, and PCoA based on the Dice coefficient places them to the edge of the cloud of fossil assemblage data points. PCoA based on the Forbes index places them in more reasonable positions. Thus, only the adjusted Forbes index is able to recover true biological patterns. These results suggest that the index may be useful in analyzing not only paleontological data sets but any data set that includes species lists having highly variable lengths.
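The sample-size effect described above can be illustrated with the standard textbook forms of three of the coefficients named in the abstract. The sketch below is only an illustration under stated assumptions: the function names, the 100-species pool, and the two species lists are hypothetical, and the corrected Forbes index itself is not reproduced here because its added terms are not given in this abstract. Here a is the number of shared species, b and c are the species unique to each sample, and d is the number of pool species found in neither sample.

```python
def counts(x, y, pool):
    """Return (a, b, c, d): shared, only in x, only in y, in neither sample."""
    x, y, pool = set(x), set(y), set(pool)
    a = len(x & y)
    b = len(x - y)
    c = len(y - x)
    d = len(pool - (x | y))
    return a, b, c, d


def dice(a, b, c):
    """Dice coefficient: 2a / (2a + b + c)."""
    return 2 * a / (2 * a + b + c)


def simpson(a, b, c):
    """Simpson index: shared species over the size of the smaller list."""
    return a / min(a + b, a + c)


def forbes(a, b, c, d):
    """Uncorrected Forbes coefficient: observed overlap over chance overlap."""
    n = a + b + c + d
    return a * n / ((a + b) * (a + c))


# Hypothetical data: a 10-species sample drawn entirely from the same
# fauna as a 50-species sample, within a 100-species pool.
pool = range(100)
large = range(0, 50)
small = range(0, 10)

a, b, c, d = counts(large, small, pool)
print("counts:", (a, b, c, d))
print("Dice:", dice(a, b, c))
print("Simpson:", simpson(a, b, c))
print("Forbes:", forbes(a, b, c, d))
```

In this toy case the small list is a strict subset of the large one, yet Dice reports low similarity purely because the lists differ in length, while Simpson reports complete similarity. The uncorrected Forbes value depends on d, the count that the abstract says the corrected index ignores. The functions assume non-empty samples and do not guard against division by zero.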

Type: Featured Article
Information: Paleobiology, Volume 41, Issue 3, June 2015, pp. 377–386

DOI: https://doi.org/10.1017/pab.2014.21
Copyright: Copyright © 2015 The Paleontological Society. All rights reserved.
