Combine Datasets

The observations from multiple datasets can be combined into a single dataset. This is the reverse of the operation performed by Subdivide Datasets. Used together, the two procedures give fairly detailed control over the contents of a dataset (similar to Include or Exclude Observations, but with irreversible effect).

Combining datasets only makes sense if the datasets contain observations with compatible landmark data. This means that the datasets must contain corresponding landmarks in the same order, have the same dimensionality and the same type of symmetry. Moreover, if the configurations have object symmetry, the pairing of landmarks must be the same. MorphoJ checks for these conditions. However, this does not make the procedure foolproof! For instance, it might still be possible to combine data from fly wings and mouse mandibles, both with 15 landmarks in 2D and with matching symmetry, into a single dataset. The user is responsible for ensuring that only truly compatible datasets are combined.
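The formal conditions listed above can be illustrated with a minimal Python sketch. This is not MorphoJ's actual code: the class LandmarkLayout, its fields, and the symmetry labels are hypothetical names chosen for illustration. The sketch also shows why the check cannot be foolproof: two biologically unrelated datasets can have identical formal properties.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class LandmarkLayout:
    """Hypothetical summary of a dataset's landmark structure (not a MorphoJ class)."""
    n_landmarks: int
    n_dimensions: int  # 2 or 3
    symmetry: str      # assumed labels: "none", "matching", "object"
    # For object symmetry: pairs of landmark indices; None otherwise.
    landmark_pairs: Optional[Tuple[Tuple[int, int], ...]] = None


def are_compatible(a: LandmarkLayout, b: LandmarkLayout) -> bool:
    """Return True if the formal conditions for combining both hold."""
    if a.n_landmarks != b.n_landmarks:
        return False
    if a.n_dimensions != b.n_dimensions:
        return False
    if a.symmetry != b.symmetry:
        return False
    # With object symmetry, the pairing of landmarks must also agree.
    if a.symmetry == "object" and a.landmark_pairs != b.landmark_pairs:
        return False
    return True


# Formally compatible, yet combining them would be meaningless.
fly_wing = LandmarkLayout(15, 2, "matching")
mouse_mandible = LandmarkLayout(15, 2, "matching")
print(are_compatible(fly_wing, mouse_mandible))  # True
```

A check of this kind can only compare counts and labels; whether landmark 7 marks the same anatomical point in both datasets remains the user's responsibility.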

To combine datasets in MorphoJ, they must contain the corresponding raw data. Moreover, MorphoJ checks for the occurrence of duplicates (records derived from the same observations) in the different datasets, and will stop the procedure if duplicates are found.
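As a rough illustration of the duplicate check, the following Python sketch reports observations that occur in more than one dataset. It assumes that each observation carries a unique identifier; how MorphoJ actually recognizes records derived from the same observation is not described here, and the function name find_duplicates and the example dataset names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


def find_duplicates(datasets: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Map each observation id found in more than one dataset
    to the names of the datasets that contain it."""
    seen: Dict[str, Set[str]] = defaultdict(set)
    for dataset_name, observation_ids in datasets.items():
        for obs_id in observation_ids:
            seen[obs_id].add(dataset_name)
    return {obs_id: names for obs_id, names in seen.items() if len(names) > 1}


# Hypothetical example: "fish03" occurs in both datasets.
datasets = {
    "North shore": ["fish01", "fish02", "fish03"],
    "South shore": ["fish03", "fish04"],
}
duplicates = find_duplicates(datasets)
for obs_id, names in duplicates.items():
    print(obs_id, sorted(names))  # fish03 ['North shore', 'South shore']
```

If the result is not empty, the combination should not proceed, which corresponds to MorphoJ stopping the procedure.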

To start the procedure, click on the dataset in the Project Tree window and then select Combine Datasets from the Preliminaries menu. A dialog box like the following will appear:

The first item in the dialog box is a text field for entering a name for the new dataset that will contain the combined data. In the screenshot above, the name "Lake Thun" has already been entered in place of the default string "Combined dataset ...".

The next item in the interface is a drop-down menu for selecting the start dataset. This dataset is important because various pieces of information will be copied from it into the new dataset (e.g., information about the alignment of the data after Procrustes fit, about wireframe or outline graphs, etc.). The drop-down menu contains those datasets in the current project that contain raw data.

The list of other datasets shows those datasets that are compatible with the start dataset. Select one or more of the datasets to include them in the combined dataset.

Click Execute to combine the datasets. Alternatively, click Cancel to stop the procedure.

If there are observations that appear in more than one dataset, the following warning message will appear:

In this case, reconsider the choices for the start dataset and the other datasets to avoid duplication.

The combined dataset contains those classifiers and covariates that appear in all the datasets that are combined. To establish the correspondence of classifiers and covariates, MorphoJ compares their names; this comparison is case-sensitive. If more than one classifier or covariate has the same name within any one dataset, that classifier or covariate is excluded from the combined dataset. Use Edit Classifiers or Edit Covariates to rename classifiers and covariates so that their names are unique within each dataset and match among datasets.
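The name-matching rule can be sketched in Python as follows. This is an illustration of the rule as described, not MorphoJ's implementation; the function shared_names and the example classifier names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional, Set


def shared_names(names_by_dataset: Dict[str, List[str]]) -> Set[str]:
    """Return the names that occur exactly once in every dataset.

    The comparison is case-sensitive, and a name used more than once
    within any one dataset is dropped entirely.
    """
    shared: Optional[Set[str]] = None
    for names in names_by_dataset.values():
        counts = Counter(names)
        unique_here = {name for name, count in counts.items() if count == 1}
        shared = unique_here if shared is None else shared & unique_here
    return shared if shared is not None else set()


# "Species" and "species" differ in case; "Site" is duplicated in dataset B.
classifiers = {
    "A": ["Sex", "Species", "Site"],
    "B": ["Sex", "species", "Site", "Site"],
}
print(shared_names(classifiers))  # {'Sex'}
```

In this example, renaming "species" to "Species" and removing the ambiguity in the two "Site" classifiers of dataset B would carry all three classifiers into the combined dataset.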

The combined dataset contains a new classifier named "From dataset", which indicates from which of the datasets the respective observation came.

Observations that were excluded by using either Find Outliers or Include or Exclude Observations retain the status they had in the original dataset: the raw data for these observations is transferred into the new dataset, but the observations are excluded from analysis. These observations can be included in further analyses by invoking Find Outliers or Include or Exclude Observations for the new dataset.
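The handling of the "From dataset" classifier and of the excluded status can be summarized in a short Python sketch. The Observation class, the combine function, and the dictionary layout are hypothetical; only the classifier name "From dataset" comes from MorphoJ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Observation:
    """Hypothetical record: an identifier plus its excluded status."""
    obs_id: str
    excluded: bool = False


def combine(datasets: Dict[str, List[Observation]]) -> List[dict]:
    """Merge all observations, tagging each with its source dataset
    and carrying over the excluded status unchanged."""
    combined = []
    for dataset_name, observations in datasets.items():
        for obs in observations:
            combined.append({
                "id": obs.obs_id,
                "From dataset": dataset_name,
                "excluded": obs.excluded,  # status is retained, data are kept
            })
    return combined


result = combine({
    "North shore": [Observation("fish01"), Observation("fish02", excluded=True)],
    "South shore": [Observation("fish04")],
})
for row in result:
    print(row["id"], row["From dataset"], row["excluded"])
```

Note that the excluded observation is still present in the result; it is only flagged, so it can be included again later.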