NPMANOVA

The main idea...

 Null hypothesis: There are no differences in the presence/absence or relative magnitude of a set of variables among objects from different groups or treatments.
 For example: There are no differences in the composition and/or relative abundances of organisms of different species (variables) in samples from different groups or treatments (adapted from Anderson, 2001).

As noted by Anderson (2001), ecological data sets rarely conform to the assumptions of MANOVA-like procedures (see MANOVA). For example, rare species inflate the data set with zeros while species with low abundances are unlikely to be normally distributed (the "bell-shaped" curve will be 'cut' at zero, resembling a Poisson distribution with λ ~ 1).  Nonetheless, the power of MANOVA-like procedures, especially in partitioning variation between multiple factors and their interactions, is in much demand.

She thus proposed a non-parametric multivariate analysis of variance (NPMANOVA or PERMANOVA) method that addresses the limits of these assumptions, allows the use of any dissimilarity measure between objects (rather than only Euclidean distances), and can partition variation between the various terms included in the NPMANOVA model (i.e. it supports the analysis of multi-factorial designs). Further, NPMANOVA is tolerant towards non-independent variables. In a sites × OTUs table, for example, it is unlikely that the presence or abundance of one OTU at a given site is independent of that of the others.

NPMANOVA is analogous to a distance-based redundancy analysis (db-RDA) wherein the 'grouping variables' may be represented as dummy variables in the explanatory variable matrix. However, NPMANOVA, in addition to being simpler, has been found to have a more reliable Type I error rate (McArdle and Anderson, 2001).


Figure 1: A grouped Bray-Curtis dissimilarity matrix. Note that the matrix is symmetrical about its diagonal. NPMANOVA will compare the within-group dissimilarities (blue triangles) to the between-group dissimilarities (orange square) through a pseudo F-ratio (Equation 1).
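A matrix like the one described in Figure 1 can be built and split into its within-group and between-group entries along the following lines. This is only a minimal sketch using NumPy: the abundance table, the group labels, and the function name bray_curtis_matrix are hypothetical and are not taken from Anderson (2001).

```python
import numpy as np

def bray_curtis_matrix(abund):
    """Pairwise Bray-Curtis dissimilarities between the rows (objects) of an abundance table."""
    abund = np.asarray(abund, dtype=float)
    # Sum of absolute differences over variables, for every pair of rows
    num = np.abs(abund[:, None, :] - abund[None, :, :]).sum(axis=2)
    # Sum of abundances over variables, for every pair of rows
    den = (abund[:, None, :] + abund[None, :, :]).sum(axis=2)
    # Two all-zero rows give 0/0; this sketch (an assumption) treats them as identical
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)

# Hypothetical abundance table: 4 sites (rows) x 3 species (columns)
sites = np.array([[10, 0, 5],
                  [8, 1, 6],
                  [0, 12, 1],
                  [1, 10, 0]])
groups = np.array(["A", "A", "B", "B"])

D = bray_curtis_matrix(sites)

# The matrix is symmetrical, so only the upper triangle is needed
upper = np.triu(np.ones_like(D, dtype=bool), k=1)
same_group = groups[:, None] == groups[None, :]
within = D[upper & same_group]     # the "triangles" in Figure 1
between = D[upper & ~same_group]   # the "square" in Figure 1
```

With two groups of two sites there are two within-group and four between-group dissimilarities; in this made-up table the within-group values are clearly the smaller ones.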


The statistic

The test statistic used is a pseudo F-ratio, similar to the F-ratio in ANOVA. It compares the total sum of squared dissimilarities (or ranked dissimilarities) among objects belonging to different groups to that of objects belonging to the same group (Equation 1). Larger F-ratios indicate more pronounced group separation; however, the significance of this ratio is usually of more interest than its magnitude.

Equation 1: F = [SSA / (a - 1)] / [SSW / (N - a)]. The pseudo F-ratio used in standard NPMANOVA is similar to the traditional F-ratio used in ANOVA, but does not share the same distribution. SSW is the sum of squared dissimilarities within groups, SSA is the sum of squared dissimilarities among (between) groups, a is the number of groups, and N is the total number of objects. The terms (a - 1) and (N - a) are the degrees of freedom associated with the explanatory factor (the grouping variable) and the residuals, respectively. See Anderson (2001) for discussion and formulae for SSW and SSA for simple and more complex designs.
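For a one-factor design, the pseudo F-ratio can be computed directly from a dissimilarity matrix. The sketch below assumes the one-way partitioning in which the total sum of squares is the sum of squared dissimilarities over all pairs divided by N, the within-group sum of squares is the same quantity computed inside each group and divided by that group's size, and SSA is their difference. The function name pseudo_f and the example data are hypothetical; see Anderson (2001) for the authoritative formulae.

```python
import numpy as np

def pseudo_f(D, groups):
    """Pseudo F-ratio for a one-factor design, from a square dissimilarity matrix."""
    sq = np.asarray(D, dtype=float) ** 2
    groups = np.asarray(groups)
    labels = np.unique(groups)
    N, a = len(groups), len(labels)

    # Total sum of squares: squared dissimilarities over all pairs, divided by N
    ss_total = np.triu(sq, k=1).sum() / N

    # Within-group sum of squares: the same calculation inside each group
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        ss_within += np.triu(sq[np.ix_(idx, idx)], k=1).sum() / len(idx)

    # Among-group sum of squares by subtraction
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (N - a))

# Hypothetical one-dimensional data; Euclidean distances between four objects
x = np.array([0.0, 1.0, 10.0, 11.0])
D = np.abs(x[:, None] - x[None, :])
groups = np.array(["A", "A", "B", "B"])

f_ratio = pseudo_f(D, groups)
```

With Euclidean distances on a single variable, as in this example, the result coincides with the classical ANOVA F-ratio (here 200), which is a convenient sanity check. With other dissimilarity measures the same arithmetic applies, but the statistic no longer follows the F-distribution.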


Significance

NPMANOVA uses permutation to assess the significance of the pseudo F-statistic described above.

In a one-tailed test (where the interest is in whether a statistic is greater than, or alternatively less than, what can be expected by chance), the P-value reports the proportion of permuted pseudo F-statistics which are greater than or equal to the observed statistic, i.e. the proportion of permuted data sets that yield a group separation at least as good as that of the actual data set following an NPMANOVA. It is generally accepted that any separation between groups is not significant if more than ~ 5% of the permuted F-statistics have values greater than or equal to that of the observed statistic (i.e. a P-value > 0.05).
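The permutation procedure described above can be sketched for a simple one-factor design with unrestricted permutation of group labels. The names pseudo_f and permutation_p_value, the data, the number of permutations, and the seed are all hypothetical. Counting the observed statistic as one member of the permutation distribution (the "+ 1" terms) is a common convention adopted here as an assumption.

```python
import numpy as np

def pseudo_f(D, groups):
    """Pseudo F-ratio for a one-factor design, from a square dissimilarity matrix."""
    sq = np.asarray(D, dtype=float) ** 2
    groups = np.asarray(groups)
    labels = np.unique(groups)
    N, a = len(groups), len(labels)
    ss_total = np.triu(sq, k=1).sum() / N
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        ss_within += np.triu(sq[np.ix_(idx, idx)], k=1).sum() / len(idx)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (N - a))

def permutation_p_value(D, groups, n_perm=999, seed=0):
    """Proportion of permuted pseudo F-ratios >= the observed one."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    f_obs = pseudo_f(D, groups)
    # Shuffle the group labels among objects; the dissimilarities stay fixed
    f_perm = np.array([pseudo_f(D, rng.permutation(groups)) for _ in range(n_perm)])
    # The observed value is counted as one of the possible outcomes (assumption)
    p = (np.sum(f_perm >= f_obs) + 1) / (n_perm + 1)
    return f_obs, p

# Hypothetical data: two well-separated groups of four objects each
x = np.array([0, 1, 2, 3, 10, 11, 12, 13], dtype=float)
D = np.abs(x[:, None] - x[None, :])
groups = np.array(["A"] * 4 + ["B"] * 4)

f_obs, p = permutation_p_value(D, groups)
```

Note that small data sets limit the smallest attainable P-value: with eight objects in two groups of four there are only 35 distinct ways of splitting the objects, so the P-value cannot fall much below 1/35 however strong the separation.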

It is vital that the correct permutational scheme is defined and only exchangeable units are permuted. In nested studies, this would mean restricting permutations to an appropriate subgroup of the data set. At times, exact permutation tests either cannot be done, or are restricted to so few objects, that they are not useful. See Anderson (2001, 2005) for examples of permutational schemes involving complex experimental or sampling designs. 
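Restricting permutations to exchangeable units can be illustrated by shuffling labels only within predefined strata. This is a minimal sketch, not the scheme of any particular package: the function name permute_within_strata, the treatment labels, and the block assignments are hypothetical, and which units are truly exchangeable depends on the design (see Anderson, 2001, 2005).

```python
import numpy as np

def permute_within_strata(labels, strata, rng):
    """Shuffle labels among objects of the same stratum only."""
    labels = np.asarray(labels)
    strata = np.asarray(strata)
    permuted = labels.copy()
    for s in np.unique(strata):
        idx = np.flatnonzero(strata == s)
        # Labels never cross stratum boundaries
        permuted[idx] = rng.permutation(labels[idx])
    return permuted

# Hypothetical design: two treatments applied within each of three blocks
treatment = np.array(["ctrl", "trt", "ctrl", "trt", "ctrl", "trt"])
block = np.array([1, 1, 2, 2, 3, 3])

rng = np.random.default_rng(42)
shuffled = permute_within_strata(treatment, block, rng)
```

Each block still contains exactly one "ctrl" and one "trt" after shuffling. This example also shows the limitation mentioned above: with three blocks of two objects there are only 2 × 2 × 2 = 8 possible arrangements, far too few for a useful test.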

Post-analysis: a posteriori testing

As in ANOVA, a significant result indicates that there is a significant difference between the groups defined; however, the overall test does not reveal which groups are significantly separated. To determine this, a posteriori testing of each pair of groups, using NPMANOVA, can be performed after a significant result. As these are pairwise comparisons, the test statistic involved is the non-parametric, multivariate analogue of the t-statistic, with significance determined by permutation, as above.

As this involves multiple testing, an appropriate correction should be applied.
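One possible correction is Holm's step-down procedure, sketched below; the source does not prescribe a particular method, so this choice is an assumption (a Bonferroni correction would simply multiply every P-value by the number of tests). The pairwise P-values are hypothetical placeholders, not results from any analysis.

```python
import numpy as np

def holm_adjust(p_values):
    """Holm step-down adjustment of a set of P-values."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)
    # Multiply the smallest P-value by m, the next by m - 1, and so on
    scaled = p[order] * (m - np.arange(m))
    # Adjusted values must not decrease along the ordering, and are capped at 1
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted

# Hypothetical P-values from three pairwise comparisons
raw = {"A vs B": 0.01, "A vs C": 0.04, "B vs C": 0.03}
adjusted = dict(zip(raw, holm_adjust(list(raw.values()))))
```

For these placeholder values the adjusted P-values are 0.03 (A vs B), 0.06 (A vs C) and 0.06 (B vs C), so only the first comparison would remain below 0.05.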

Key assumptions
  • According to Anderson (2001), the only assumption of NPMANOVA is that the objects in the data set are exchangeable under the null hypothesis. That further implies:
    • exchangeable objects (sites, samples, observations, etc.) are independent
    • exchangeable objects have similar multivariate dispersion (i.e. each group has a similar degree of multivariate scatter. See Anderson, 2001 and 2006)
Warnings
  • NPMANOVA takes no account of correlations between variables and any hypothesis that depends on detecting such relationships will not be addressed.
  • Nested or hierarchical designs require an appropriate permutational scheme and a careful understanding of which objects are truly exchangeable under the null hypothesis. Most importantly, the analyst must define "strata" within which to restrict permutations. See the permutation page for more.
  • This method generally assumes balanced designs; however, unbalanced designs can be handled (see McArdle and Anderson, 2001).
  • Anderson (2001) warns that groups of objects with different dispersions, yet no significant differences in centres (centres are similar to means, but may be non-Euclidean), may result in misleadingly low P-values. It is thus recommended that the dispersion be evaluated and considered when interpreting the results of NPMANOVA. See Anderson (2006) for a discussion on tests of multivariate dispersion.
  • Criticisms of this and other (dis)similarity-based methods should be taken into account (e.g. Warton et al. 2012).
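As a first, purely descriptive look at the dispersion issue raised in the warnings above, the mean within-group dissimilarity can be compared across groups. This is a crude sketch and an assumption on our part: it is not the test of multivariate dispersion discussed by Anderson (2006), which should be preferred for inference. The function name and the data are hypothetical.

```python
import numpy as np

def mean_within_group_dissimilarity(D, groups):
    """Average pairwise dissimilarity among the objects of each group."""
    D = np.asarray(D, dtype=float)
    groups = np.asarray(groups)
    summary = {}
    for g in np.unique(groups):
        idx = np.flatnonzero(groups == g)
        block = D[np.ix_(idx, idx)]
        # Use each pair once (upper triangle, excluding the diagonal)
        pairs = block[np.triu_indices(len(idx), k=1)]
        summary[str(g)] = float(pairs.mean())
    return summary

# Hypothetical data: two groups with the same centre but different scatter
x = np.array([4.0, 5.0, 6.0, 0.0, 5.0, 10.0])
D = np.abs(x[:, None] - x[None, :])
groups = np.array(["tight"] * 3 + ["loose"] * 3)

summary = mean_within_group_dissimilarity(D, groups)
```

Here both groups are centred on 5, yet the "loose" group has a much larger mean within-group dissimilarity. A pattern like this is a cue to interpret a low NPMANOVA P-value with caution.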


Implementations
MASAME PERMANOVA app

References