Procrustes ANOVA

Procrustes ANOVA is a method for quantifying relative amounts of variation at different levels (Klingenberg and McIntyre 1998; Klingenberg et al. 2002; Klingenberg 2015). It has mostly been used in studies of left-right asymmetry to assess the amount of measurement error in relation to biological sources of variation, because measurement error has been considered an important factor in these studies (e.g. Palmer and Strobeck 1986).

Background

The background for Procrustes ANOVA was described in detail by Klingenberg and McIntyre (1998) and by Klingenberg et al. (2002). Whereas those papers focused particularly on the study of asymmetry, for which similar types of ANOVA had been developed earlier for distance measurements (Leamy 1984, Palmer and Strobeck 1986), similar considerations apply to a wide variety of morphometric studies. Accordingly, the implementation of Procrustes ANOVA in MorphoJ can accommodate a broader range of morphometric studies. For instance, it can be used to assess the relative magnitudes of measurement error from repeat measurements even if structures for only one body side are measured.

Note that the implementation of Procrustes ANOVA in MorphoJ is limited to the purposes for which it has been used primarily in the literature: quantifying measurement error, asymmetry and variation among individuals. It is therefore not a replacement for general implementations of MANOVAs of various designs (e.g. crossed factorial designs, nested designs). To carry out those analyses, export a dataset of Procrustes coordinates or PC scores and conduct the analyses in your favourite statistics program that is designed for such analyses of general linear models (e.g. SAS, SPSS, R). For doing morphometric analyses on the results, you can import them back into MorphoJ. For instance, matrices of covariance components or MSCP matrices can be imported as covariance matrices (see Import Covariance Matrix) and used in further analyses (PCA, comparisons by matrix correlation, PLS, analyses of modularity etc.).

Goodall's F versus MANOVA statistics

The implementation of Procrustes ANOVA in MorphoJ offers two approaches for the statistical interpretation of the ANOVA results: an interpretation based on Procrustes distances that is an extension of Goodall's F test (Goodall 1991, who considered one-way ANOVA models; extended to the two-way ANOVA customary in asymmetry studies by Klingenberg & McIntyre 1998) and an alternative approach based on MANOVA statistics (Klingenberg et al. 2002).

The advantage of the approach based on Procrustes distances is that it provides an easy and intuitive way to quantify the relative magnitudes of the various effects in the ANOVA model. Another advantage is that the model has many fewer parameters than the MANOVA model, and is therefore better able to cope with small sample sizes. This comes at the price of assuming that the variation is isotropic, that is, that there is the same amount of variation at each landmark and in every direction (circular or spherical scatter of landmarks around the mean position) and that the variation of different landmarks is independent. This assumption is clearly unrealistic for most biological data (e.g. Klingenberg and Monteiro 2005).
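To make the idea of Procrustes sums of squares concrete, the following is a minimal Python sketch of Goodall's F for a one-way design (the case considered by Goodall 1991). It is an illustration only and not MorphoJ's code: the function name goodall_f_one_way, the toy coordinates and the group labels are hypothetical, and the input is assumed to be already Procrustes-aligned coordinates flattened to one vector per specimen. The shape dimensionality used for the degrees of freedom is 2k - 4 for k landmarks in 2D and 3k - 7 in 3D.

```python
def goodall_f_one_way(shapes, groups, n_dims=2):
    """Procrustes SS, df and Goodall's F for a one-way design.

    shapes: list of flattened coordinate vectors, assumed to be
            Procrustes-aligned already (this sketch does no superimposition).
    groups: list of group labels, one per specimen.
    """
    n = len(shapes)
    p = len(shapes[0])
    k = p // n_dims  # number of landmarks
    # Shape dimensionality: 2k - 4 in 2D, 3k - 7 in 3D.
    shape_dim = 2 * k - 4 if n_dims == 2 else 3 * k - 7

    grand_mean = [sum(s[j] for s in shapes) / n for j in range(p)]
    members_by_group = {}
    for shape, group in zip(shapes, groups):
        members_by_group.setdefault(group, []).append(shape)

    ss_between = 0.0
    ss_within = 0.0
    for members in members_by_group.values():
        group_mean = [sum(s[j] for s in members) / len(members) for j in range(p)]
        # Squared distance of the group mean from the grand mean, weighted by group size.
        ss_between += len(members) * sum(
            (g - m) ** 2 for g, m in zip(group_mean, grand_mean)
        )
        # Squared distances of specimens from their own group mean.
        for s in members:
            ss_within += sum((x - g) ** 2 for x, g in zip(s, group_mean))

    # Under the isotropic model, each ordinary df is multiplied by the shape dimension.
    df_between = (len(members_by_group) - 1) * shape_dim
    df_within = (n - len(members_by_group)) * shape_dim
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return {"SS": (ss_between, ss_within), "df": (df_between, df_within), "F": f_stat}


# Toy triangles (k = 3, 2D); the numbers are made up for illustration.
shapes = [
    [0, 0, 1, 0, 0.0, 1.0],
    [0, 0, 1, 0, 0.0, 1.2],
    [0, 0, 1, 0, 0.2, 1.0],
    [0, 0, 1, 0, 0.2, 1.2],
]
groups = ["A", "A", "B", "B"]
result = goodall_f_one_way(shapes, groups)
print(result)
```

Because every coordinate contributes equally to the sums of squares, the sketch has a single variance parameter per effect. That is the source of both the small-sample advantage and the isotropy assumption described above.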

A possible solution to this problem for many practical applications is to use the approach based on Procrustes distances to assess the relative magnitudes of effects, but to use the MANOVA approach for making statistical inferences. Caution is necessary in situations where the sample size does not substantially exceed the dimensionality of the data, i.e. with large numbers of landmarks.

Object symmetry: the different shape components

For configurations with object symmetry, the shape space consists of two orthogonal subspaces that contain the components of symmetric shape variation and asymmetry, respectively (e.g. Klingenberg et al. 2002).

For the Procrustes ANOVAs using the isotropic model, MorphoJ uses only the symmetric component for computing the sums of squares for individuals and extra effects. The choice of using only the symmetric component for computing Procrustes SSs and MSs, as well as for testing the extra effects, is based on the assumption that these effects simultaneously affect both sides and that effects on directional asymmetry are of a lesser magnitude. This is usually the case if the extra factor is membership in a population or species or an ecological factor.

For the effects of both levels of error and for the residuals, Procrustes SS are computed from both the symmetric and asymmetry components (as applicable to the respective effect).

For all MANOVA tests, MorphoJ considers the two subspaces separately. This means that the test of individual variation uses the symmetric component of the Error 1 effect as the error term (if present; otherwise, the symmetric component of the residual variation). Likewise, the test of Individual-by-Side effect uses the asymmetry component of the Error 1 effect as the error term (if present; otherwise, the asymmetry component of the residual variation).

For computing matrices of covariance components (see Output in the Project Tree, below), the two subspaces are considered as appropriate for the effect under consideration. For the covariance component due to Individuals, only the symmetric component of the Error 1 or residual effect is used to subtract error effects (i.e., those effects that concern the symmetric component of variation). For the component of Individual-by-Side interaction (fluctuating asymmetry), the asymmetry component of the Error 1 or residual effect is used. For the Error 1, Error 2 and residual components, both the symmetric and asymmetry components are included together.

Requesting a Procrustes ANOVA

To request a Procrustes ANOVA, select Procrustes ANOVA from the Variation menu. If there is at least one dataset in the project, a dialog box like the following will appear:

The first item in the dialog box is a text field for specifying a name for the Procrustes ANOVA. Below it is a drop-down menu for selecting the dataset for which the analysis is to be performed.

The dataset must have classifier variables that specify the effects in the Procrustes ANOVA. At the very least, there must be a classifier for the individual effect. To generate or edit classifiers, use Extract New Classifier From ID String or Edit Classifiers, both in the Preliminaries menu, or import the information from a text file by using Import Classifier Variables in the File menu.

The classifiers standing for the effects in the Procrustes ANOVA can be selected in the drop-down menus labeled Individual, Side, Error 1 and Error 2.

A classifier for the Individual effect, indicating the identities of the individuals (specimens) that were measured, is required in every Procrustes ANOVA. For structures with matching symmetry (where there are separate configurations of landmarks for the left and right sides), it is essential that the values for Individual for the left and right sides are identical. All other effects are optional.

The classifier for the Individual effect indicates the identity of each individual, in the same way as the name or social security number do for people (that is assuming that no two people in a group have exactly the same name). Typically, this classifier will have values such as "Jane", "Tony", "Lorna", "Jim" (for people) or "skull 1", "skull 2", "skull 3", etc. (for skulls...). Note that this classifier must denote the individual and not some category such as sex or other groups (there has occasionally been confusion about this).
It is not the format that counts (it could be things like "Joe" or "v569_f 3"), but the classifier must have exactly the same value for all replicates (e.g. images) of each individual and the value for each particular individual must be unique, that is, different from those of all other individuals.

If the data have object symmetry, the effect of Side is specified automatically, and no classifier is required (in this case, the drop-down menu for the Side effect is blocked). If the data have matching symmetry, a classifier for side is needed for analyses of asymmetry. This classifier needs to have exactly two values (e.g. "left" and "right"). Recall that the values of classifiers in MorphoJ are case-sensitive (i.e. "left" and "Left" are considered to be different).

Note: for analyses with matching symmetry (i.e. if a Side effect is specified but the data don't have object symmetry), the Procrustes ANOVA will exclude individuals for which only one side is available. Those individuals also won't appear in the output dataset.
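Before running the analysis, it can help to check the classifiers outside MorphoJ, for example in the text file that will be imported. The following Python sketch shows one way to do this. It is a hypothetical helper and not part of MorphoJ: the function name check_matching_symmetry and the record layout (one (individual, side) pair per landmark configuration) are assumptions made for illustration. It lists the distinct Side values, compared case-sensitively, and the individuals for which only one side is present.

```python
from collections import defaultdict


def check_matching_symmetry(records):
    """records: list of (individual, side) pairs, one per landmark configuration.

    Returns the distinct side values (case-sensitive) and the individuals
    for which only a single side value occurs.
    """
    side_values = sorted({side for _, side in records})
    sides_seen = defaultdict(set)
    for individual, side in records:
        sides_seen[individual].add(side)
    one_sided = sorted(ind for ind, seen in sides_seen.items() if len(seen) == 1)
    return side_values, one_sided


# Made-up classifier values; "Left" is a deliberate typo for "left".
records = [
    ("skull 1", "left"),
    ("skull 1", "right"),
    ("skull 2", "left"),
    ("skull 2", "Left"),
    ("skull 3", "right"),
]
side_values, one_sided = check_matching_symmetry(records)
print(side_values)  # more than two values signals a problem with the Side classifier
print(one_sided)    # individuals that would be excluded from the analysis
```

In this toy example the Side classifier has three distinct values because of the capitalisation typo, and "skull 3" has only one side.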

The Error 1 and Error 2 effects are for assessing two nested levels of measurement error. For instance, an investigator may take two images of each object, and then digitize each image twice. In this case, Error 1 may be associated with the imaging error and Error 2 with the digitizing error.

Finally, there is a residual effect, which is for the variation 'left over' after the specified effects have been included. If there are no degrees of freedom for the residual effect, it is omitted.

In addition to these effects, MorphoJ can also include additional main effects. These are to take into account additional effects (e.g. sexes or if the individuals were reared in different batches or sampled from different locations). If there is more than one of these additional main effects, they are treated as crossed effects (not nested effects). Note that the analysis assumes that variation and asymmetry are the same within all the groups corresponding to these extra effects. An example of such an effect is maternal age in the analysis of tsetse fly wings by Klingenberg & McIntyre (1998).

The example shown in the screen shot above concerns a data set of fly wings. A classifier named "ind" is used to designate to which individual each of the landmark configurations belongs. The left and right sides are indicated by "side". Because two images of each wing were digitized, a further classifier "image" is used to indicate whether a configuration is from the first or second image of each wing. Finally, because the flies were from different crosses, the classifier "cross" is used as an extra main effect.

Clicking the Execute button starts the Procrustes ANOVA. To stop the procedure, click Cancel instead.

Graphical output

If a classifier for Side is included as a factor in the analysis or if the data have object symmetry, the Procrustes ANOVA provides a diagram of the difference between the shape averages of the two sides, corresponding to directional asymmetry.

Text output

The main output is presented in text form in the Results panel.

The output indicates the classifiers used to designate the different effects in the ANOVA and then presents separate ANOVA tables for centroid size and for shape.

The ANOVA table for centroid size follows the traditional model established particularly for studies of asymmetry (Palmer and Strobeck 1986). The table contains sums of squares (SS), mean squares (MS), degrees of freedom (df), F statistics and parametric P-values for each of the effects in the ANOVA. In the example shown in the screen shot, all effects appear. For data sets with object symmetry, no effects of Side or Individual-by-Side interaction are shown for centroid size, because centroid size is computed for the entire configuration (including both sides).
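As an illustration of how such a table is built, here is a minimal Python sketch of the two-way mixed-model ANOVA for centroid size with a single level of measurement error. It is not MorphoJ's implementation. It assumes a balanced design (both sides for every individual, the same number of replicates in every cell), the function name centroid_size_anova and the toy values are hypothetical, and the F ratios use the denominators customary for this design: Individual and Side are tested against the Individual-by-Side interaction, and the interaction is tested against the error. P-values are omitted because they require an F distribution function.

```python
def mean(values):
    return sum(values) / len(values)


def centroid_size_anova(data):
    """data maps (individual, side) -> list of replicate centroid sizes.

    Assumes a balanced design with one level of measurement error.
    Returns dicts of SS, df, MS and F keyed by effect name.
    """
    individuals = sorted({ind for ind, _ in data})
    sides = sorted({side for _, side in data})
    n = len(individuals)
    r = len(next(iter(data.values())))  # replicates per cell

    cell = {key: mean(vals) for key, vals in data.items()}
    ind_mean = {i: mean([cell[(i, s)] for s in sides]) for i in individuals}
    side_mean = {s: mean([cell[(i, s)] for i in individuals]) for s in sides}
    grand = mean(list(cell.values()))

    ss = {
        "Individual": len(sides) * r * sum((ind_mean[i] - grand) ** 2 for i in individuals),
        "Side": n * r * sum((side_mean[s] - grand) ** 2 for s in sides),
        "Individual x Side": r * sum(
            (cell[(i, s)] - ind_mean[i] - side_mean[s] + grand) ** 2
            for i in individuals
            for s in sides
        ),
        "Error": sum((y - cell[key]) ** 2 for key, vals in data.items() for y in vals),
    }
    df = {
        "Individual": n - 1,
        "Side": len(sides) - 1,
        "Individual x Side": (n - 1) * (len(sides) - 1),
        "Error": n * len(sides) * (r - 1),
    }
    ms = {effect: ss[effect] / df[effect] for effect in ss}
    f = {
        "Individual": ms["Individual"] / ms["Individual x Side"],
        "Side": ms["Side"] / ms["Individual x Side"],
        "Individual x Side": ms["Individual x Side"] / ms["Error"],
    }
    return ss, df, ms, f


# Made-up centroid sizes: two individuals, two sides, two replicates each.
data = {
    ("A", "left"): [10.0, 12.0],
    ("A", "right"): [14.0, 16.0],
    ("B", "left"): [20.0, 22.0],
    ("B", "right"): [20.0, 22.0],
}
ss, df, ms, f = centroid_size_anova(data)
for effect in ss:
    print(effect, ss[effect], df[effect], ms[effect], f.get(effect))
```

In this design the Side effect corresponds to directional asymmetry and the Individual-by-Side interaction to fluctuating asymmetry, which is why the interaction is tested against measurement error.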

The ANOVA table for shape contains the Procrustes sums of squares (SS), Procrustes mean squares (MS), degrees of freedom (df), Goodall's F statistic (F) and the associated parametric P-value, as well as Pillai's trace and the associated parametric P-value.

If a dataset has object symmetry, the results of MANOVA tests are not presented as part of the main ANOVA table, but separate tables with MANOVA results are produced for the symmetric component and for the asymmetry component of shape variation.

If the dataset has object symmetry or if an effect for Side was included in the model, the output also presents the directional asymmetry vector (the vector of mean asymmetry of the landmark coordinates).

Output in the Project Tree

After a Procrustes ANOVA has run, several new items are attached to the dataset on which the analysis is based:

In addition to the icon for the Procrustes ANOVA itself, there is a new dataset containing individual values and several covariance matrices with the variance and covariance components for some of the effects included in the analysis.

The new dataset contains individual averages of shape. If the data have object symmetry or if a Side factor is included, both the symmetric component of shape variation (averages of left and right sides) and shape asymmetry (shape differences between sides) are provided. If the data have object symmetry, the individual means of centroid size of the whole configuration of landmarks (including both sides) are also given; if the data have matching symmetry (no object symmetry, but with a Side effect included), the individual averages of centroid size over the two sides and the asymmetries of centroid size (differences between sides) are provided.

Note: if the data have matching symmetry, the output dataset will only contain those individuals for which landmark configurations of both sides have been measured. For producing left-right averages if only a single side is available for some individuals, it might therefore be advantageous to use Average Observations By in the Preliminaries menu.

If the data have object symmetry or if a Side effect was included in the analysis, the output dataset also contains individual scores that quantify the amount of fluctuating asymmetry of shape for each individual (Klingenberg and Monteiro 2005). These scores quantify the individual asymmetries of shape (as deviations from the mean asymmetry) either in units of Procrustes distance (absolute shape differences) or by using Mahalanobis distances (scaled relative to the variation of asymmetry in the sample).
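The scores in units of Procrustes distance can be illustrated with a short Python sketch. It is a simplified illustration and not MorphoJ's code: the function name fa_scores and the toy vectors are hypothetical, and the input is assumed to be one shape asymmetry vector per individual (for example, the left-minus-right difference of aligned coordinates, averaged over replicates). Each score is the length of the individual's asymmetry vector after the mean asymmetry (directional asymmetry) has been subtracted. The Mahalanobis version, which additionally scales by the covariance matrix of asymmetry, is not shown.

```python
import math


def fa_scores(asymmetries):
    """asymmetries maps individual -> shape asymmetry vector (flattened coordinates).

    Returns, for each individual, the Euclidean length of its deviation
    from the mean asymmetry.
    """
    vectors = list(asymmetries.values())
    n = len(vectors)
    p = len(vectors[0])
    # Mean asymmetry across individuals (directional asymmetry).
    directional = [sum(v[j] for v in vectors) / n for j in range(p)]
    return {
        ind: math.sqrt(sum((a - d) ** 2 for a, d in zip(vec, directional)))
        for ind, vec in asymmetries.items()
    }


# Made-up asymmetry vectors for two individuals (two landmarks in 2D).
asymmetries = {
    "ind 1": [0.06, 0.00, 0.0, 0.0],
    "ind 2": [0.00, 0.08, 0.0, 0.0],
}
scores = fa_scores(asymmetries)
print(scores)
```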

For the random effects in the Procrustes ANOVA, covariance matrices with the variance and covariance components of the corresponding effects are provided (in the screen shot above, these are for the effects of Individuals, Individual-by-Side interaction, and for Error 1). The variance and covariance components are corrected for 'lower-level' effects (e.g. those for fluctuating asymmetry are corrected for measurement error). These covariance matrices can be used, for instance, in matrix comparisons. The procedure used to estimate variance and covariance components assumes that the data are (nearly) balanced. Accordingly, users should try to ensure that the data are as balanced as possible (e.g. the same number of replicate measurements for all individuals).
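The correction for lower-level effects can be sketched as follows. This is an assumption-laden illustration of the usual estimator for balanced designs, not a description of MorphoJ's internal procedure: a covariance component is taken as the difference between the mean-square matrix of an effect and that of the next lower level, divided by the number of observations per level of the effect. The function name covariance_component, the toy matrices and the coefficient are hypothetical. For fluctuating asymmetry with r replicate measurements per side, for instance, the coefficient under this scheme would be r.

```python
def covariance_component(ms_effect, ms_lower, coefficient):
    """Element-wise (ms_effect - ms_lower) / coefficient for square matrices.

    ms_effect:   mean-square matrix of the effect of interest
    ms_lower:    mean-square matrix of the next lower level (e.g. measurement error)
    coefficient: observations per level of the effect (balanced design assumed)
    """
    size = len(ms_effect)
    return [
        [(ms_effect[i][j] - ms_lower[i][j]) / coefficient for j in range(size)]
        for i in range(size)
    ]


# Made-up 2 x 2 mean-square matrices, with r = 2 replicates per side.
ms_interaction = [[8.0, 2.0], [2.0, 6.0]]
ms_error = [[2.0, 0.0], [0.0, 2.0]]
fa_component = covariance_component(ms_interaction, ms_error, coefficient=2)
print(fa_component)
```

With unbalanced data the coefficient is no longer a single number, which is one reason for keeping the design as balanced as possible.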

References

Goodall, C. R. 1991. Procrustes methods in the statistical analysis of shape. Journal of the Royal Statistical Society B 53:285–339.

Klingenberg, C. P. 2015. Analyzing fluctuating asymmetry with geometric morphometrics: concepts, methods, and applications. Symmetry 7:843–934.

Klingenberg, C. P., and G. S. McIntyre. 1998. Geometric morphometrics of developmental instability: analyzing patterns of fluctuating asymmetry with Procrustes methods. Evolution 52:1363–1375.

Klingenberg, C. P., and L. R. Monteiro. 2005. Distances and directions in multidimensional shape spaces: implications for morphometric applications. Systematic Biology 54:678–688.

Klingenberg, C. P., M. Barluenga, and A. Meyer. 2002. Shape analysis of symmetric structures: quantifying variation among individuals and asymmetry. Evolution 56:1909–1920.

Leamy, L. 1984. Morphometric studies in inbred and hybrid house mice. V. Directional and fluctuating asymmetry. American Naturalist 123:579–593.

Palmer, A. R., and C. Strobeck. 1986. Fluctuating asymmetry: measurement, analysis, patterns. Annual Review of Ecology and Systematics 17:391–421.