Two-block partial least squares (PLS) analysis is a method to examine the covariation between two sets of variables (e.g. Rohlf and Corti 2000). In the context of geometric morphometrics, one or both sets consist of shape variables.
MorphoJ offers an implementation of two-block PLS for separate sets of variables. The blocks can contain shape variation for two different configurations of landmarks or different data matrices in the same dataset (e.g. shape versus centroid size and/or one or more covariates).
For analyzing covariation between parts of a single configuration, users may either subdivide it into separate configurations (using Select Landmarks in the Preliminaries menu) and then run a two-block PLS analysis, or use PLS for subsets of landmarks within a configuration. The difference between the two procedures is whether there are two separate Procrustes fits or a common one for both blocks (for more details, see here).
The implementation of two-block PLS in MorphoJ uses the singular value decomposition of the matrix of covariances between the two blocks of variables (e.g. Rohlf and Corti 2000). The use of covariances assumes that the variables within each block are on a consistent scale. This is the case if they are Procrustes coordinates from a single configuration of landmarks. Especially for covariates, however, users may consider suitable transformations (e.g. standardization to unit variance).
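The decomposition described above can be illustrated with a minimal NumPy sketch. This is not MorphoJ's code: the function name `two_block_pls`, the simulated data and the division by n - 1 are assumptions made for illustration only.

```python
import numpy as np

def two_block_pls(X, Y):
    """Two-block PLS via SVD of the between-block covariance matrix.

    X is (n, p) and Y is (n, q), with rows as specimens.
    Hypothetical helper for illustration, not part of MorphoJ.
    """
    Xc = X - X.mean(axis=0)            # center each block on its mean
    Yc = Y - Y.mean(axis=0)
    n = X.shape[0]
    S12 = Xc.T @ Yc / (n - 1)          # covariances between the two blocks
    U, s, Vt = np.linalg.svd(S12, full_matrices=False)
    scores_x = Xc @ U                  # PLS scores for block 1
    scores_y = Yc @ Vt.T               # PLS scores for block 2
    return U, s, Vt.T, scores_x, scores_y

# Simulated example: block 2 depends on the first two variables of block 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 4))
B = np.array([[1.0, 0.5, 0.0],
              [0.0, 1.0, 0.5]])
Y = X[:, :2] @ B + 0.1 * rng.normal(size=(30, 3))

U, s, V, sx, sy = two_block_pls(X, Y)
print(s)  # singular values, largest first
```

The columns of `U` and `V` hold the PLS coefficients for the two blocks, and the covariance between the scores of each pair of PLS axes equals the corresponding singular value.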
The permutation tests offered in the procedure all concern the null hypothesis of complete independence between the two blocks of variables. Achieved significance levels are indicated for the RV coefficient, a global measure of association (equivalent to a permutation test using the sum of squared covariances or the sum of squared singular values as the test statistic), for the singular value associated with each pair of PLS axes, and for the correlation between the scores for each pair of PLS axes. With the null hypothesis of complete independence between blocks, the tests using the singular value and the correlation as the test statistic are equivalent and therefore both provide the same P-value.
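The logic of such a permutation test can be sketched as follows. The sketch assumes that specimens of block 2 are randomly reassigned to specimens of block 1, that the observed value is counted as one of the permutations, and that the k-th singular value of each permuted dataset is compared with the k-th observed one. These are common conventions and are not confirmed details of MorphoJ; the function names are hypothetical.

```python
import numpy as np

def cross_cov(X, Y):
    """Matrix of covariances between the columns of X and those of Y."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    return Xc.T @ Yc / (X.shape[0] - 1)

def permutation_test(X, Y, n_perm=999, seed=0):
    """Permutation test against complete independence of the two blocks.

    Returns a P-value for the sum of squared covariances and one P-value
    per singular value (illustrative conventions, see lead-in).
    """
    rng = np.random.default_rng(seed)
    S = cross_cov(X, Y)
    obs_total = np.sum(S ** 2)
    obs_sv = np.linalg.svd(S, compute_uv=False)
    count_total = 1                      # the observed data count once
    count_sv = np.ones_like(obs_sv)
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        Sp = cross_cov(X, Y[perm])       # break the pairing between blocks
        count_total += np.sum(Sp ** 2) >= obs_total
        count_sv += np.linalg.svd(Sp, compute_uv=False) >= obs_sv
    return count_total / (n_perm + 1), count_sv / (n_perm + 1)

# Simulated example with a strong association between the blocks.
rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
B = np.array([[1.0, 0.0], [0.5, 1.0], [0.0, 0.5]])
Y = X @ B + 0.2 * rng.normal(size=(40, 2))

p_total, p_sv = permutation_test(X, Y, n_perm=199)
print(p_total, p_sv)
```

Because permuting specimens leaves the variation within each block unchanged, the denominator of the RV coefficient is constant across permutations, which is why the sum of squared covariances gives the same P-value as the RV coefficient.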
If there are multiple groups in the data, such as different species, it may be of interest to run a pooled within-group PLS analysis. This analysis focuses on the covariation between the deviations from the group averages in the two blocks of variables. Accordingly, the analysis will first remove the differences in the group means, and then run the PLS analysis. The PLS scores for this type of analysis are computed in two ways: group-centered scores, where the group means are removed, and the standard PLS scores computed from the raw Procrustes coordinates, which therefore do include differences in group means. The use of pooled within-group PLS implies the assumption that the covariation between the blocks of variables is the same in the different groups. If this assumption is not met, the pooled within-group PLS can still be interpreted as a compromise between the patterns of covariation in the different groups, but the results need to be interpreted with caution (e.g. groups with larger sample sizes and groups with greater covariances among blocks will have a disproportionate influence on the PLS analysis).
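A minimal sketch of the pooled within-group procedure is given below. The helper names, the divisor n minus the number of groups, and the centering of the standard scores on the grand mean are assumptions for illustration and may differ from what MorphoJ does internally.

```python
import numpy as np

def center_within_groups(data, groups):
    """Subtract each group's mean from the rows belonging to that group."""
    data = np.asarray(data, dtype=float).copy()
    groups = np.asarray(groups)
    for g in np.unique(groups):
        rows = groups == g
        data[rows] -= data[rows].mean(axis=0)
    return data

def pooled_within_group_pls(X, Y, groups):
    """PLS on the pooled within-group covariances between two blocks."""
    groups = np.asarray(groups)
    Xw = center_within_groups(X, groups)
    Yw = center_within_groups(Y, groups)
    dof = X.shape[0] - np.unique(groups).size   # assumed divisor: n - groups
    S12 = Xw.T @ Yw / dof
    U, s, Vt = np.linalg.svd(S12, full_matrices=False)
    # Group-centered scores: differences in group means removed.
    centered = (Xw @ U, Yw @ Vt.T)
    # Standard scores: centered on the grand mean only, so they still
    # include the differences between group means.
    standard = ((X - X.mean(axis=0)) @ U, (Y - Y.mean(axis=0)) @ Vt.T)
    return s, centered, standard

# Simulated example: three groups that differ in their means.
rng = np.random.default_rng(2)
groups = np.repeat(["a", "b", "c"], 10)
shift = np.repeat([0.0, 2.0, -2.0], 10)[:, None]
X = rng.normal(size=(30, 4)) + shift
Y = 0.5 * X[:, :3] + rng.normal(size=(30, 3))

s, (cx, cy), (sx, sy) = pooled_within_group_pls(X, Y, groups)
```

In this sketch, the two kinds of scores differ only by a constant offset within each group, namely the projection of the group mean onto the PLS axes.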
In the Covariation menu, select Partial Least Squares and then Two Separate Blocks. A dialog box like the following will appear:
At the top of the dialog box, there is a text field for entering a name for the PLS analysis that will be visible in the Project Tree.
Most of the dialog box is taken up by the interfaces for selecting the datasets, data matrices and variables for the two blocks of the PLS analysis. Start by selecting the dataset for block 1 (in the list on the left side). This will affect the datasets that can be selected for block 2: the list will only display datasets linked to the one selected for block 1. If the user wants to use a different dataset instead, the two datasets must first be linked to each other (Link Datasets in the Preliminaries menu). For each block, at least one dataset, one data matrix (data type) and one variable (or set of variables) must be selected. For variables such as Procrustes coordinates, all coordinates of an entire landmark configuration are automatically selected together (as in the screenshot above).
The check box Perform permutation test can be activated to request a permutation test for the null hypothesis of complete independence between the two blocks. Note that this null hypothesis is used for the overall test of association between the two blocks, for the tests of singular values, as well as for the correlations between pairs of PLS axes. If the check box is selected, the text field for entering the number of permutation rounds is activated.
To request a pooled within-group PLS analysis, select the check box Pooled analysis within subgroups. If the check box is selected, the list below is activated, which should be used for choosing one or more classifier variables that are to be used as the criterion for forming groups.
To start the computations, click on the Execute button. To abort the procedure, click on Cancel.
The graphical output is presented as a series of tabs. First, there are graphs with visualizations of the PLS coefficients or the associated shape changes. Second, there is a bar chart showing the contribution of the PLS axes to the total amount of covariation. Finally, there is a series of plots of the scores for the PLS axes.
The visualizations of PLS coefficients use either the graph types for illustrating shape changes or biplots. Biplots are used if the variables in a block are not shape variables (e.g. centroid size or covariates).
The PLS scores are plotted in two different ways. First, there is a panel with scatter plots of Block 1 versus Block 2 for each pair of PLS axes. These graphs can be used to inspect the association between the two blocks. Moreover, there are two further panels providing scatter plots of pairs of PLS axes within the two blocks. These plots allow the user to exploit the fact that plots of corresponding PLS axes provide a pair of 'optimally corresponding' ordinations of the data points in the two blocks of variables (e.g. ordinations of diet and morphological variation; Klingenberg and Ekau 1996, Fig. 7).
For a pooled within-group PLS analysis, there is an additional scatter plot with the group-centered PLS scores of Block 1 versus Block 2. These scores differ from the standard scores by having average scores of 0.0 in each of the groups.
The text output contains information about the variables in the two blocks and the sample size (number of specimens with complete information). Then there are two tables with the PLS coefficients for the two blocks of variables.
As an overall measure of association between the two blocks, the RV coefficient is provided. The RV coefficient is a multivariate analogue of the squared correlation (Escoufier 1973). It takes values from 0 (completely uncorrelated data) to 1.0 (one set of variables can be obtained from the other set by a rigid rotation and/or reflection).
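For reference, the RV coefficient of Escoufier (1973) can be written as RV = tr(S12 S21) / sqrt(tr(S11 S11) tr(S22 S22)), where S11 and S22 are the covariance matrices within the two blocks and S12 is the matrix of covariances between them. The following NumPy sketch (the function name `rv_coefficient` is hypothetical) computes it and checks the upper limit for a block that is an orthogonal transformation of the other.

```python
import numpy as np

def rv_coefficient(X, Y):
    """Escoufier's RV coefficient between two blocks of variables."""
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    S11 = Xc.T @ Xc                    # the 1/(n - 1) factors cancel in RV
    S22 = Yc.T @ Yc
    S12 = Xc.T @ Yc
    # tr(S12 S21) is the sum of squared entries of S12; the same identity
    # holds for the symmetric matrices S11 and S22.
    return np.sum(S12 ** 2) / np.sqrt(np.sum(S11 ** 2) * np.sum(S22 ** 2))

rng = np.random.default_rng(3)
X = rng.normal(size=(25, 3))

# Block 2 as an orthogonal transformation (rotation/reflection) of block 1.
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
rv_rotated = rv_coefficient(X, X @ Q)

# Block 2 simulated independently of block 1.
rv_random = rv_coefficient(X, rng.normal(size=(25, 3)))
print(rv_rotated, rv_random)
```

With finite samples, independently simulated blocks give a small positive RV rather than exactly 0, which is one reason the permutation test is useful.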
If the option for the permutation test is selected, the output provides the number of permutation iterations and a P-value of the test against the null hypothesis of complete independence. This global test of covariation uses the RV coefficient as the test statistic.
The final block of information contains statistics for each pair of PLS axes. These include the singular values and the P-value of the associated permutation test (if the permutation test was selected), the proportion of total covariation for which the pair of PLS axes account (the squared singular value as a percentage of the sum of squared covariances between blocks), the correlation between the PLS scores for each pair of axes, and the associated permutation P-value (if a permutation test was selected).
If the results of permutation tests are printed, note that all tests (even those for individual singular values and correlations between PLS axes) use the null hypothesis of complete independence between the blocks.
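As a rough illustration of these per-axis statistics, the sketch below computes the percentage of total covariation and the correlation between scores for each pair of PLS axes from simulated data. Variable names and the data are hypothetical, and the output layout of MorphoJ is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(35, 4))
Y = 0.8 * X[:, :3] + rng.normal(size=(35, 3))

Xc = X - X.mean(axis=0)
Yc = Y - Y.mean(axis=0)
S12 = Xc.T @ Yc / (X.shape[0] - 1)     # covariances between blocks
U, s, Vt = np.linalg.svd(S12, full_matrices=False)

# Squared singular value as a percentage of the sum of squared covariances.
percent_covariation = 100 * s ** 2 / np.sum(s ** 2)

# Correlation between block 1 and block 2 scores for each pair of axes.
score_correlation = np.array([
    np.corrcoef(Xc @ U[:, k], Yc @ Vt[k])[0, 1] for k in range(s.size)
])

for k in range(s.size):
    print(f"PLS{k + 1}: singular value {s[k]:.4f}, "
          f"{percent_covariation[k]:.2f}% of covariation, "
          f"correlation {score_correlation[k]:.3f}")
```

The sum of the squared singular values equals the sum of all squared covariances between the blocks, so the percentages add up to 100.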
The output dataset contains the PLS scores and, if a pooled within-group PLS analysis was run, the group-centered PLS scores.
Escoufier, Y. 1973. Le traitement des variables vectorielles. Biometrics 29:751–760.
Klingenberg, C. P., and W. Ekau. 1996. A combined morphometric and phylogenetic analysis of an ecomorphological trend: pelagization in Antarctic fishes (Perciformes: Nototheniidae). Biol. J. Linn. Soc. 59:143–177.
Rohlf, F. J., and M. Corti. 2000. The use of two-block partial least-squares to study covariation in shape. Syst. Biol. 49:740–753.