A non-parametric method for the measurement of size diversity, with emphasis on data standardisation

Xavier D. Quintana, Sandra Brucet, Dani Boix, Rocío López-Flores, Stéphanie Gascón, Anna Badosa, Jordi Sala, Ramon Moreno-Amich,Juan J. Egozcue

Limnol. Oceanogr. Methods 6:75-86 (2008) | DOI: 10.4319/lom.2008.6.75

ABSTRACT: The most suitable method for estimation of size diversity is investigated. Size diversity is computed on the basis of the Shannon diversity expression adapted for continuous variables, such as size. It takes the form of an integral involving the probability density function (pdf) of the size of the individuals. Different approaches for the estimation of pdf are compared: parametric methods, assuming that data come from a determinate family of pdfs, and nonparametric methods, where pdf is estimated using some kind of local evaluation. Exponential, generalized Pareto, normal, and log-normal distributions have been used to generate simulated samples using estimated parameters from real samples. Nonparametric methods include discrete computation of data histograms based on size intervals and continuous kernel estimation of pdf. Kernel approach gives accurate estimation of size diversity, whilst parametric methods are only useful when the reference distribution have similar shape to the real one. Special attention is given for data standardization. The division of data by the sample geometric mean is proposed as the most suitable standardization method, which shows additional advantages: the same size diversity value is obtained when using original size or log-transformed data, and size measurements with different dimensionality (longitudes, areas, volumes or biomasses) may be immediately compared with the simple addition of ln k where k is the dimensionality (1, 2, or 3, respectively). Thus, the kernel estimation, after data standardization by division of sample geometric mean, arises as the most reliable and generalizable method of size diversity evaluation.