Monday, February 4, 2013

Ranunculus, part 6: Guess the fallacy

This is part of a series on a paper by Hörandl & Emadzade (subsequently H&E) suggesting an "evolutionary", i.e. pre-cladistic, classification of Ranunculus. See the previous installments here: part 1, part 2, part 3, part 4, part 5.

After the previous part, one could arguably already move on. I was curious if the paper would contain a suggestion for a scientific criterion to accept paraphyletic taxa but as always it reduces to the personal feeling that X morphological differences are sufficient dissimilarity to replace, in that particular case, classification by descent with classification by, if we are honest about it, phenetics. Because it mixes two incompatible criteria - phenetic and cladistic - such a system is incoherent. Because it is arbitrary to demand X morphological differences instead of X+1 or X-1, the system is subjective instead of objective. Finally, a system like this will always be less natural and less accurate a description of biological reality than a classification of nested clades simply because said biological reality has a nested structure shaped by common descent.

Nonetheless, I have two more parts to this series in me because there is another aspect in this paper that I would like to discuss, and there are more arguments for paraphyletic taxa in the last part of the discussion that it would be interesting to examine.

To move on, let us go through the results section quickly: the authors describe their molecular phylogeny very concisely. In the network analysis of one clade, they find its internal structure to be very spiderwebby or, in their words, "the internal structure of this cluster remains highly reticulate with well-supported alternative splits". Again, I am unsure how to interpret this quantitatively, or what exactly it demonstrates.

H&E's parsimony analysis with only morphological data results in a tree with very little branch support. That is probably unsurprising to anybody who has worked with morphological data at this taxonomic level before, especially considering that there were only 46 characters for 220 species. Lastly, they map the morphological characters onto the molecular tree and describe the results of their phylogenetic analysis with combined morphological and molecular data. The details are only relevant if one wants to write a blow-by-blow account of the resulting classification, which I don't; I am more interested in the principles and theoretical considerations behind it.

Now follows the discussion section of the paper. In this post I want to deal with a logical fallacy that underlies much of evolutionary systematics. Read through the following sentences from the discussion and try to find the problem they all have in common. (Apart from the visible ellipses, I have taken the liberty of deleting the references to improve readability.)
Our results suggest a highly complex evolutionary history with frequent stages of paraphyly within the genus. Molecular data suggested the existence of a paraphyletic stem group (A) within Ranunculus, comprising five small clades (I-V). These five small clades probably represent a basal group in the genus, from which the holophyletic clade B arose. Such patterns of alternating paraphyly/holophyly appear regularly within land plants and also within the phylogeny of angiosperms. Alternating stages of holophyly and paraphyly are a result of incomplete speciation/extinction processes. [...] Within Ranunculus s.s., the presence of swollen achenes is ancestral and a shared, probably originally synapomorphic morphological feature of the paraphyletic stem group A. [...] Paraphyly is also expected in cases of hybridization and/or polyploidy, as parental species do not necessarily go extinct after such cases of saltational speciation, but may survive as remaining stem group. In flowering plants, paraphyly is a common phenomenon, likely as a consequence of frequent whole genome duplications and/or reticulate evolution. Hybridization and several episodes of genome duplications have occurred multiple times within the phylogeny of flowering plants, and may explain the high incidences of paraphyly in deeper nodes of angiosperms. [...] The addition of morphological information, however, supports a hypothesis of a paraphyletic stem group of yellow-flowered wetland species and a higher specialized holophyletic white-flowered group of aquatic species.
Got it? The solution is circular reasoning. Stripped to its bones, the authors spend much of the first half of the discussion advancing the following argument:

Premise: Taxa are sometimes paraphyletic.
Conclusion: Therefore, taxa should sometimes be circumscribed to be paraphyletic.

The problem is, the premise is not even wrong. Being merely wrong would be a considerable improvement for this premise. Taxa are groups of organisms accepted by taxonomists; if taxonomists circumscribe supraspecific taxa to be paraphyletic, then they are paraphyletic, and if taxonomists circumscribe them to be monophyletic, then they are monophyletic. It is as simple as that. The conclusion the authors want to arrive at is that we should circumscribe them to be paraphyletic, but skipping the step where they should have provided an argument for that conclusion and merely stating that taxa are circumscribed to be paraphyletic is begging the question*. Why are they paraphyletic? Did that information float down from the sky on some golden tablets? No, it was a taxonomist's decision to circumscribe them in the way they are.

In slightly different words, the evolutionary systematists' reasoning could also be described as follows:
  1. Taxa are phenetic clusters (i.e. defined by morphological similarity)
  2. We map the taxa onto the phylogeny and discover that they are paraphyletic
  3. See, these taxa simply are paraphyletic, why don't you accept them as such?
One cannot start the discussion by assuming that taxa are phenetic clusters, because the nature of taxa is really what the entire discussion is about. A cladist would argue that taxa should be groups defined by common ancestry and the possession of synapomorphies, which has the advantage that there is an objective biological reality behind such taxa, the plain and reproducibly discoverable fact that they include all descendants of one ancestral species which had acquired the synapomorphy.
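To make concrete what "reproducibly discoverable" means here, the following is a minimal sketch of the cladistic test: a group is monophyletic if it contains every terminal descended from its most recent common ancestor. The tree topology, the node names n1 to n5 and the helper functions are my own assumptions for illustration only; this is not the tree published by H&E, merely a toy in which clades I to V form a grade leading to clade B.

```python
# Toy rooted tree as child -> parent links. The topology is invented for
# illustration only; it is NOT the phylogeny published by H&E.
PARENT = {
    "I": "n1", "n2": "n1",
    "II": "n2", "n3": "n2",
    "III": "n3", "n4": "n3",
    "IV": "n4", "n5": "n4",
    "V": "n5", "B": "n5",
}


def ancestors(node):
    """Path from node up to the root, starting with node itself."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def mrca(tips):
    """Most recent common ancestor of a collection of terminals."""
    tips = list(tips)
    common = set(ancestors(tips[0]))
    for tip in tips[1:]:
        common &= set(ancestors(tip))
    # Walking root-ward from any tip, the first shared node is the most recent.
    return next(node for node in ancestors(tips[0]) if node in common)


def descendant_tips(node):
    """All terminals descended from node (or node itself if it is a terminal)."""
    internal = set(PARENT.values())
    return {t for t in PARENT if t not in internal and node in ancestors(t)}


def excluded_descendants(group):
    """Terminals descended from the group's common ancestor but left out of it."""
    return descendant_tips(mrca(group)) - set(group)


def is_monophyletic(group):
    """True if the group includes every descendant of its common ancestor."""
    return not excluded_descendants(group)


stem_group_a = {"I", "II", "III", "IV", "V"}
print(is_monophyletic(stem_group_a), excluded_descendants(stem_group_a))
print(is_monophyletic({"V", "B"}))
```

On this toy tree the "stem group" fails the test because B descends from the same ancestor but is left out, whereas V plus B passes. The point of the sketch is that the test has no adjustable threshold: anyone given the same tree and the same group arrives at the same answer.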

Paraphyletic taxa do not have such an objective, discoverable biological reality. If naturally paraphyletic taxa were really out there to be discovered, then it should be possible to provide clear and universal criteria for their discovery instead of something along the lines of "to circumscribe subgroups in our specific study group, take five, not four or three, morphological, not molecular, symplesiomorphies, in the specific way we have delimited and coded morphological characters in this study; perhaps something else in other groups of organisms, and perhaps something else again once somebody else repeats this study with their own character matrix for the same study group". And remember that this arbitrary muddling around is supposed to be not only better but also more useful and more "evolutionary" than the simple idea that a natural group has to include all descendant species of its common ancestral species.

A few phrases are hidden in the lengthy quotes above that merit special attention for unrelated reasons:
Alternating stages of holophyly and paraphyly are a result of incomplete speciation/extinction processes.
Yes, there is again the circular reasoning, because in reality paraphyly is not a "stage" in evolution but merely the shape of a badly circumscribed taxon on a phylogeny, but disregard that for a moment. What are incomplete speciation and extinction processes? The most likely interpretation of this sentence is that the authors are talking about incomplete lineage sorting (the sequence copies of one gene found in a species can be paraphyletic). However, that would mean that they confuse tokogeny and phylogeny, that they confuse gene trees and species trees, and that they conflate the paraphyly of groups of sequence copies with the paraphyly of groups of species, which are two very different things indeed, meaning basically that they try to build a classification of sequence copies instead of one of species**. One hopes that there is some other possible interpretation for the sentence, but I am not sure what it would be.
Paraphyly is also expected in cases of hybridization and/or polyploidy, as parental species do not necessarily go extinct after such cases of saltational speciation, but may survive as remaining stem group. In flowering plants, paraphyly is a common phenomenon, likely as a consequence of frequent whole genome duplications and/or reticulate evolution. Hybridization and several episodes of genome duplications have occurred multiple times within the phylogeny of flowering plants, and may explain the high incidences of paraphyly in deeper nodes of angiosperms.
And this is another passage that I have a hard time with, although this time the introduction of the same paper provides pointers as to what this is about. Taken literally, very little of this makes sense because neither hybridization nor polyploidy will lead to paraphyly. First, nothing ever leads to paraphyly except a taxonomist actively circumscribing a taxon to be paraphyletic, but well, that is the circular reasoning again. Second, hybridization leads to the formation of hybrids and polyploidy means a genome duplication, and that is it, no paraphyly there. So what the authors must mean here in all cases, although they do not mention those terms and only imply it with "and/or reticulate evolution", is either introgression or allopolyploid speciation.

Third, introgression is, as discussed previously, only of concern insofar as it may make it harder to infer the species phylogeny, but it has precisely zero relevance for the validity of paraphyletic supraspecific taxa on the species phylogeny once it is inferred; this is another case where it would be nice if we could all consider the difference between tokogeny and phylogeny and the difference between gene trees and species trees. So that leaves allopolyploidy which, fourth, and again as previously discussed, is only a problem if it is more rampant and/or happens across more distant lineages than can be made plausible, and which, fifth, and also as previously discussed, if rampant would not lead to paraphyly but to absence of phylogenetic structure, i.e. neither para- nor monophyly. Saying that a group of items in a tokogenic system is paraphyletic is another claim that is not even wrong; it makes about as much sense as saying that a temperature of 27°C is tall.

Continue reading here.

Footnotes

*) In all fairness, additional arguments for the acceptance of paraphyletic taxa follow in the very last subsection of the discussion, to be dealt with in the next and last post of this series. But that structure of the discussion is itself merely a symptom of the problem: H&E start with the assumption that paraphyletic taxa are valid and argue from there.

**) Not saying that there aren't cladists out there who try to make groups of sequence copies monophyletic, but that is no excuse. It is one thing to have a regrettably simplistic understanding of the work you are trying to do but another to publicly campaign against something because you consider your simplistic caricature of it to be the real thing.
