Consider four objects, O, A, B and C, that can be characterized by one character with two possible values: 0 and 1. The character matrix is:
O | 0 |
A | 1 |
B | 1 |
C | 0 |
The idea of cladistics is to consider all possible arrangements of the objects on a tree and, for each arrangement, to find the minimum number of changes of character value (called steps) needed to go from one object to another along the branches. For 4 taxa, there are three possibilities:
The tick marks indicate where the character has to change. On the tree to the left, the character is “0” on the two top branches and “1” on the two bottom branches. Two steps are necessarily required, as on the middle tree. The tick marks can be placed on either the bottom or the top branches; this simply changes the putative value at the node, which is “0” on the left tree and “1” on the middle one.
The tree to the right requires only one step, since going from O to C does not require any change of the character value, and going from O to A or B requires just one step. The tree to the right is thus the most parsimonious tree, with only 1 step. It is the preferred one under the parsimony criterion.
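The step counts above can be checked with a few lines of Python. This is only a minimal sketch: the names `STATES`, `TREES` and `steps` are made up for the illustration, each unrooted tree is written as the two pairs of objects sitting at either end of its internal branch, and the counting rule is the usual Fitch rule for unordered characters (one step whenever two sets of possible values have nothing in common).

```python
# Character values from the matrix above.
STATES = {"O": 0, "A": 1, "B": 1, "C": 0}

# The three unrooted trees for four objects, each described by the two
# pairs found at either end of the internal branch (hypothetical labels).
TREES = {
    "(O,A)(B,C)": (("O", "A"), ("B", "C")),
    "(O,B)(A,C)": (("O", "B"), ("A", "C")),
    "(O,C)(A,B)": (("O", "C"), ("A", "B")),
}


def steps(tree, states):
    """Minimum number of changes of one character on a four-object tree."""
    total = 0
    node_values = []
    for first, second in tree:
        a, b = {states[first]}, {states[second]}
        if a & b:
            # Both objects agree: the node takes their common value.
            node_values.append(a & b)
        else:
            # They disagree: one change is needed on one of the two branches.
            node_values.append(a | b)
            total += 1
    # One more change if the two nodes cannot share a value.
    if not (node_values[0] & node_values[1]):
        total += 1
    return total


for name, tree in TREES.items():
    print(name, steps(tree, STATES))
```

The tree pairing O with C comes out with 1 step and the other two with 2 steps, as found by hand.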
The trees above are unrooted: there is apparently no starting point. In order to understand the relationships in an evolutionary context, it is preferable to root the tree by providing an outgroup.
Let’s choose O as the outgroup. The rooted trees are now:
The information is the same as above, with the additional reading that the more diversified objects are the ones farthest from the outgroup O.
The number of arrangements depends only on the number of taxa, and it increases very rapidly with it. You can practice with 5 or 6 taxa, but it becomes very cumbersome beyond that. Even with computers, there is a (relatively small) limit on the number of taxa for which an exhaustive search of the tree space is feasible. Tricks are used for more than a few tens of objects, but you are never entirely sure of finding the most parsimonious trees. This is another story.
For fun, consider now the same four objects as above but with 5 characters:
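To get a feeling for how fast the number of arrangements grows, here is a small sketch that counts the fully resolved unrooted trees for n taxa, using the standard result that this number is 3 × 5 × ... × (2n − 5). The function name `count_unrooted_trees` is made up for the illustration.

```python
def count_unrooted_trees(n_taxa):
    """Number of fully resolved unrooted trees: 3 * 5 * ... * (2n - 5)."""
    if n_taxa < 3:
        raise ValueError("at least three taxa are needed")
    count = 1
    # Odd factors 3, 5, ..., 2n - 5 (no factor at all when n = 3).
    for factor in range(3, 2 * n_taxa - 4, 2):
        count *= factor
    return count


for n in (4, 5, 6, 10, 20):
    print(n, count_unrooted_trees(n))
```

For 4 taxa this gives the 3 trees shown above, then 15 for 5 taxa, 105 for 6 taxa and already more than two million for 10 taxa.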
  | c1 | c2 | c3 | c4 | c5 |
O | 0 | 0 | 0 | 0 | 0 |
A | 0 | 0 | 1 | 1 | 1 |
B | 0 | 1 | 0 | 0 | 0 |
C | 0 | 1 | 1 | 0 | 1 |
The rooted trees then become (each tick mark indicates a change from 0 to 1 or 1 to 0 for the corresponding character):
As you can see, the most parsimonious tree is now the one in the middle. This illustrates the fact that adding information on the same set of objects can modify the evolutionary scenario and the classification.
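The same count can be sketched in Python for the five characters: the length of a tree is simply the sum of the steps needed by each character. As before, this is an illustration under assumptions, not a reference implementation: the names `MATRIX`, `TREES`, `steps_for_character` and `tree_length` are made up, each tree is described by its two pairs of neighbouring objects, and characters are treated as unordered (Fitch rule). Rooting with O does not change the number of steps, so the unrooted description is enough.

```python
# Character matrix from the table above (columns c1 to c5).
MATRIX = {
    "O": [0, 0, 0, 0, 0],
    "A": [0, 0, 1, 1, 1],
    "B": [0, 1, 0, 0, 0],
    "C": [0, 1, 1, 0, 1],
}

# The three possible trees, each given by its two pairs of neighbours.
TREES = {
    "(O,A)(B,C)": (("O", "A"), ("B", "C")),
    "(O,B)(A,C)": (("O", "B"), ("A", "C")),
    "(O,C)(A,B)": (("O", "C"), ("A", "B")),
}


def steps_for_character(tree, states):
    """Minimum number of changes of one character on a four-object tree."""
    total = 0
    node_values = []
    for first, second in tree:
        a, b = {states[first]}, {states[second]}
        if a & b:
            node_values.append(a & b)
        else:
            node_values.append(a | b)
            total += 1  # one change inside this pair
    if not (node_values[0] & node_values[1]):
        total += 1  # one change along the internal branch
    return total


def tree_length(tree, matrix):
    """Total number of steps over all characters of the matrix."""
    n_characters = len(next(iter(matrix.values())))
    total = 0
    for i in range(n_characters):
        states = {taxon: row[i] for taxon, row in matrix.items()}
        total += steps_for_character(tree, states)
    return total


LENGTHS = {name: tree_length(tree, MATRIX) for name, tree in TREES.items()}
for name, length in LENGTHS.items():
    print(name, length)
```

The tree grouping A with C is the shortest one with 5 steps, against 6 steps for the tree grouping B with C and 7 steps for the tree grouping A with B.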
Once you have practiced and understood these simple examples, you can consider characters with more than two states. For instance, try an additional character with four states: 0, 1, 2 and 3. You will have to consider all possible changes. This is not very complicated; it simply requires more care and more time.
Of course you could use software to build the trees, but if you do not practice some simple examples by hand, then you will not be able to use the resulting tree.
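A multistate character can be checked in the same way once you have done it by hand. The sketch below makes two assumptions that are not in the text above: the character values in `HYPOTHETICAL` are invented for the illustration, and the states are treated as unordered, meaning that any change from one state to another costs one step (the Fitch rule). Rooted trees are written as nested pairs with O as the outgroup; the names `ROOTED_TREES` and `fitch` are also made up.

```python
# Invented values of a four-state character (illustration only).
HYPOTHETICAL = {"O": 0, "A": 2, "B": 3, "C": 2}

# The three rooted trees with O as the outgroup, written as nested pairs.
ROOTED_TREES = {
    "(O,(A,(B,C)))": ("O", ("A", ("B", "C"))),
    "(O,(B,(A,C)))": ("O", ("B", ("A", "C"))),
    "(O,(C,(A,B)))": ("O", ("C", ("A", "B"))),
}


def fitch(node, states):
    """Return (possible values at this node, steps needed below it)."""
    if isinstance(node, str):
        # A terminal object: its value is known, no step needed.
        return {states[node]}, 0
    left_values, left_steps = fitch(node[0], states)
    right_values, right_steps = fitch(node[1], states)
    common = left_values & right_values
    if common:
        return common, left_steps + right_steps
    # No shared value: one change is needed on one of the two branches.
    return left_values | right_values, left_steps + right_steps + 1


for name, tree in ROOTED_TREES.items():
    print(name, fitch(tree, HYPOTHETICAL)[1])
```

With these made-up values every tree needs two steps, so this particular character would not help to choose between them. Changing the values in `HYPOTHETICAL` is a quick way to check your own hand counts.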