PhyloBotanist: Genotypic cluster species (and similar)

This post continues a small series on species that started here and continued with this post.

When I wrote about the Biological Species Concept and its relatives, I noted that it is what most non-scientists would spontaneously suggest when asked for a definition of species, and that it is also very popular, especially with zoologists. At least in theory - it is not very practical to conduct crossing experiments every time you write a monograph or describe a novelty you found as a single specimen on a field trip. It is, however, clearly the concept that evolutionary biologists and everybody studying speciation have in mind. In contrast, I would argue that the concept to be discussed today is the one that most competent taxonomists use in actual practice, often intuitively but increasingly explicitly and supported by quantitative analysis.

The Genotypic Cluster Concept of species (hereafter GCC) was formalized by Mallet (1995). It sees species as groups of individuals that form genetic or morphological clusters with few or no intermediates to other such clusters.

The figure above shows the principle. It does not matter whether this is based on genetic or morphological data, nor what analysis this actually is (although it is in fact a real-life example from a publication I am preparing). It is generally some variant of ordination like a PCA, PCO, or any number of related methods that summarize the variability of the characters in the dataset in two or three dimensions and plot two samples close together when they are very similar and far apart when they are very different. As one can see, the green species is nicely distinct from the rest. It forms a tight cluster with no intermediates between it and the other two. In contrast, one would be forgiven for wondering whether the red and blue samples really form two distinct groups here, because there seems to be considerable overlap. To test that, one should now repeat the analysis without the green samples.
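For readers who want to see what such an ordination does mechanically, here is a minimal sketch in Python. It is not the analysis behind the figure: the data are made up, the function name `first_principal_axis` is invented for this illustration, and only the first principal axis of a PCA on the covariance matrix is computed, using plain power iteration. A real study would use an established statistics package and would usually standardize the characters first.

```python
import math
import random


def first_principal_axis(rows, iterations=200):
    """Return (axis, scores): the first principal axis and each sample's score on it."""
    n = len(rows)
    dims = len(rows[0])
    # Centre every character on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(dims)]
    centered = [[r[j] - means[j] for j in range(dims)] for r in rows]
    # Covariance matrix of the characters.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(dims)]
           for a in range(dims)]
    # Power iteration converges on the eigenvector with the largest eigenvalue.
    # Assumption: the start vector is not exactly orthogonal to that eigenvector
    # and the data are not all identical.
    axis = [1.0] * dims
    for _ in range(iterations):
        nxt = [sum(cov[a][b] * axis[b] for b in range(dims)) for a in range(dims)]
        norm = math.sqrt(sum(v * v for v in nxt))
        axis = [v / norm for v in nxt]
    # Project every centred sample onto the axis.
    scores = [sum(c[j] * axis[j] for j in range(dims)) for c in centered]
    return axis, scores


# Hypothetical data: ten samples each of two made-up groups, three characters per sample.
rng = random.Random(1)
group_a = [[rng.gauss(10, 0.5), rng.gauss(2, 0.5), rng.gauss(5, 0.5)] for _ in range(10)]
group_b = [[rng.gauss(20, 0.5), rng.gauss(6, 0.5), rng.gauss(5, 0.5)] for _ in range(10)]
samples = group_a + group_b
labels = ["a"] * 10 + ["b"] * 10

axis, scores = first_principal_axis(samples)
for label, score in zip(labels, scores):
    print(label, round(score, 2))
```

With groups this different, the scores of the two groups fall on opposite sides of zero with a wide empty stretch between them, which is the one-dimensional equivalent of the gap around the green cluster.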

As implied at the beginning of the post, the same is done intuitively by many taxonomists even if they do not use a formal analysis such as the above. When preparing the treatment of a plant group, for example, they reshuffle herbarium specimens to sort them into stacks defined by overall similarity and separated by gaps in morphological variation. The number of intermediates that is acceptable is, unfortunately, a matter of taste. It is unrealistic to allow none, not least because they could simply be sterile hybrids, but even if there is a bit of gene flow between two groups one could still argue that it is negligible in the grand scheme of things.
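The "matter of taste" can at least be made explicit. The sketch below is purely illustrative: the definition of an intermediate (a sample lying in the middle 20% of the distance between two cluster centres along one ordination axis), the function names, the scores and the tolerance values are all assumptions made for this example, not an established criterion.

```python
def intermediate_fraction(scores, center_a, center_b, zone=(0.4, 0.6)):
    """Fraction of samples lying in the middle zone between two cluster centres.

    Positions are rescaled so that center_a is 0 and center_b is 1.
    Assumes the two centres are not identical.
    """
    span = center_b - center_a
    positions = [(s - center_a) / span for s in scores]
    middle = [p for p in positions if zone[0] <= p <= zone[1]]
    return len(middle) / len(scores)


def looks_distinct(scores, center_a, center_b, max_intermediates=0.05):
    """True if the share of intermediates does not exceed the chosen tolerance."""
    return intermediate_fraction(scores, center_a, center_b) <= max_intermediates


# Hypothetical scores on one ordination axis: two tight groups and one sample in between.
scores = [0.1, 0.2, 0.0, -0.1, 0.3, 9.8, 10.1, 10.0, 9.9, 5.0]

print(intermediate_fraction(scores, 0.0, 10.0))                   # 0.1
print(looks_distinct(scores, 0.0, 10.0, max_intermediates=0.05))  # False
print(looks_distinct(scores, 0.0, 10.0, max_intermediates=0.15))  # True
```

The same ten samples count as one species or two depending only on the tolerance, which is exactly the subjectivity described above.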

My own Ph.D. supervisor opined that the average taxonomic study would usually have around 5% of its specimens left over at the end that do not fall into any of the nice clusters one would like to recognize as species. They could be hybrids, but they could also be their own separate species that we do not recognize as distinct yet because we have seen so little of them, or we could be dealing with very aberrant specimens, perhaps due to having grown under unusual conditions (the latter is probably less of a problem in most of zoology, where the organisms are less notoriously modular and plastic than plants). The point is that some taxonomists are horrible "lumpers" and would unite everything that shows the slightest hint of intermediate forms into one big muddy species, while others are extreme "splitters" who will always accept the smallest possible clusters even when the characters that define them are obviously taxonomically useless.

That brings us to an important issue, the selection of characters. If you choose the right characters, you can easily do a cluster analysis to separate humanity into two reasonably well defined clusters, males and females, but obviously our two sexes aren't species. This is such an obvious issue that no competent taxonomist would deliberately use characters influenced by sexual dimorphism in an analysis like this, but there have been cases in which the males and the females of the same species were first described as separate species simply because the taxonomist did not have the relevant field experience, for example in the Restionaceae plant family. In the same vein, one should avoid using characters that are strongly influenced by the environment (see the plasticity of plants again) and those that change over the course of ontogeny, or at least one should use only individuals from homologous developmental stages for the analysis.

I have to admit that it is not entirely clear to me how the phenospecies concept differs from the GCC in practice. Yes, the definition asks for individuals sharing "most of a set of characters" and, yes, there is no theoretical consideration behind it that the overall similarity will be due to interbreeding or some other evolutionary process, but in practice one would likely use similar analyses to group the samples into species and thus likely arrive at the same results.

According to many definitions I read, the morphological species concept or morphospecies concept also seems to be basically the same, although in that case limited to morphological data (duh) and with the implication of there not being any formal analysis, only intuitive judgements by biologists. It is what an ecologist or vegetation scientist would apply when they make, for example, an inventory of a rainforest plot to find out how many species of trees or beetles it contains. They would not be experts on the taxonomy of all the relevant genera and families, so the best they could usually do is to use their judgement, informed by intuition and experience, to decide if several samples are similar enough to make it a reasonable assumption that they are the same species.

Summary (Genotypic Cluster Species)

Grouping criterion

Species are groups of individuals that are overall significantly more similar to each other than to members of other species. The concept demands the existence of discontinuous variation, of gaps in morphological or genetic variation between putative species.

Ranking criterion

Potentially not well-defined; clusters and gaps in variation exist at all levels of biological diversity. The solution is to ask for the smallest clusters that can be shown to exist and call them species, i.e. one would iteratively apply the analysis again to the sub-clusters found in higher level analyses until arriving at a sub-cluster that cannot be broken up any more. See below for one problem with that, however.
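The iterative procedure can be sketched as a small recursive function. This is a deliberate simplification under stated assumptions: the samples are reduced to scores on a single axis, a "gap" is simply a distance between neighbouring scores larger than a hypothetical threshold `min_gap`, and the values are invented. In real work one would re-run the full ordination on each sub-cluster rather than reuse one axis.

```python
def split_at_gaps(values, min_gap):
    """Recursively split values at the widest gap until no gap exceeds min_gap."""
    ordered = sorted(values)
    if len(ordered) < 2:
        return [ordered]
    # Distances between neighbouring samples along the axis.
    gaps = [ordered[i + 1] - ordered[i] for i in range(len(ordered) - 1)]
    widest = max(range(len(gaps)), key=gaps.__getitem__)
    if gaps[widest] <= min_gap:
        # No discontinuity left: this cluster cannot be broken up any more.
        return [ordered]
    left, right = ordered[:widest + 1], ordered[widest + 1:]
    return split_at_gaps(left, min_gap) + split_at_gaps(right, min_gap)


# Hypothetical scores: one well-separated group and two groups that sit closer together.
values = [1.0, 1.2, 1.1, 5.0, 5.3, 5.1, 6.4, 6.5, 6.6]

print(split_at_gaps(values, min_gap=1.0))
# [[1.0, 1.1, 1.2], [5.0, 5.1, 5.3], [6.4, 6.5, 6.6]]
print(split_at_gaps(values, min_gap=2.0))
# [[1.0, 1.1, 1.2], [5.0, 5.1, 5.3, 6.4, 6.5, 6.6]]
```

How many "smallest clusters" come out depends on what is accepted as a gap, so the ranking question is moved into the threshold rather than solved.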

What is it good for?

A clear strength of this concept is that it is very useful for empirical research: when in doubt about the distinctness of two groups of samples, generate some kind of morphological or genotypic data and see if they form two discrete clusters in a formal ordination or reclassification analysis. It also has the advantage of working for asexually reproducing organisms just as well as for sexually reproducing ones. In practice, it will probably arrive at species circumscriptions similar to those of the less strict variants of the Biological Species Concept. The only drawback of the GCC for practical taxonomic work is the lack of a clear cut-off for how many intermediates are acceptable. From a diachronic perspective, that is, when following lineages through time, the concept is obviously useless because, due to the gradual nature of evolutionary change, there are no clear gaps between species along the branches of a phylogenetic tree. The utility of the GCC is thus limited to one time-slice. On the other hand, in contrast to the Biological Species Concept, it works fine for defining species of fossil specimens from the same time-slice (using morphological data, of course).
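As a rough illustration of the reclassification idea mentioned in this section, the sketch below uses a leave-one-out nearest-centroid check. That is a simple stand-in chosen for this example, not necessarily the method a given study would use (discriminant analysis is a common alternative), and the function names and data are hypothetical. Each sample is removed in turn, the group centroids are recomputed without it, and the sample is assigned to the nearest centroid. If the proposed groups are real clusters, nearly every sample should be assigned back to its own group.

```python
def centroid(rows):
    """Mean of each character across a list of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def leave_one_out_accuracy(samples, labels):
    """Share of samples assigned back to their own group by the nearest centroid."""
    correct = 0
    for i, (sample, label) in enumerate(zip(samples, labels)):
        # Rebuild the groups without the sample being tested.
        groups = {}
        for j, (other, other_label) in enumerate(zip(samples, labels)):
            if j != i:
                groups.setdefault(other_label, []).append(other)
        centroids = {name: centroid(rows) for name, rows in groups.items()}
        predicted = min(centroids, key=lambda name: squared_distance(sample, centroids[name]))
        if predicted == label:
            correct += 1
    return correct / len(samples)


# Hypothetical data with a single character per sample.
labels = ["red"] * 3 + ["blue"] * 3
distinct = [[1.0], [2.0], [3.0], [11.0], [12.0], [13.0]]
overlapping = [[1.0], [2.0], [6.0], [5.0], [7.0], [8.0]]

print(leave_one_out_accuracy(distinct, labels))     # 1.0
print(leave_one_out_accuracy(overlapping, labels))  # 4 of 6 samples reclassified correctly
```

A low reclassification rate, as in the second data set, is the quantitative counterpart of the overlap between the red and blue samples discussed earlier.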

Are species real?

In a sense obviously yes - the clusters are observable reality. But do they mean something biologically? That depends. Phenetics can actually mislead us about higher level relationships (which is why we moved to cladistics), but if applied to the very specific question of whether a group of populations already known to be most closely related to each other form only one or two distinct groups, searching for a gap in variation between them will reliably tell us something about the biological reality of their distinctness.

Are species a special rank?

Potentially not - genetic and morphological variation are clearly nested. Throwing all species of a genus into one analysis may produce three clear clusters, and then repeating the analysis on one of those clusters alone may show it to fall into two clear sub-clusters, and so on. One could however say that species are the smallest groups of individuals that do not fall into separate clusters; because it can be assumed that this homogeneity would in many cases be due to interbreeding between those individuals, this would again approximate a reproductive community concept of species and thus make species a special rank, but there are some possible alternative explanations. In particular, it is possible that different populations of what is still very clearly a reproductive community experience disruptive selection, and overly enthusiastic application of the GCC would lead to the recognition of mere locally adapted variants as separate species.

Monday, April 8, 2013
