Warning: This is a long and fairly technical diary. I have tried to keep it as simple as I could while still including what I think is really interesting, but it is unavoidably rich in nerdy detail. Read at your own peril. You have been warned!
Way back in March 2011 I posted a diary outlining the findings of a massive effort (Hackett et al. 2008) to work out the phylogeny (tree of evolutionary relatedness) of modern birds.
Recently another estimate of the avian phylogeny, by Jarvis et al. (2014), was published in Science (the same journal that published the 2008 study). This paper was part of a massive study (published as a set of several papers in the same issue of Science) that used entire genome sequences to study bird evolution. The Hackett study had sequenced 19 genes from each of 169 bird species in constructing their phylogeny. The current study (Jarvis et al. 2014) sequenced entire genomes for each species (tens of thousands of genes). The downside to this is that they used fewer bird species than Hackett (48 rather than 169). The upside (and it is a huge upside) is that the 48 different bird genomes provide a lot of information about the evolution of bird characteristics such as learned vocalization, flight, and so on. The 48 species were chosen to address certain evolutionary problems as explained in a second paper, Zhang et al. 2014. We'll discuss that after the phylogeny paper.
I hope at least a few people get something out of the genomic analyses. They are pretty tough going and I have chosen ones that I could understand and explain to some extent. These analyses are part of what developmental biologist and science popularizer Sean Carroll has called the second great era of natural history. These papers are basically giving a taste of what can be accomplished in terms of understanding evolution and diversity by studying genomes. It really is a "let's see what's out there" type of affair but in this case 'out there' means deep inside the nucleus of a cell. These studies are just skimming the surface of each topic - pointing the way for much more detailed studies to follow.
All the images are figures from relevant papers. Both the papers and the figures are freely available on the Science magazine website. All the images are clickable to make them bigger as I know not all the text is going to be easily read within the diary.
Part I: Genomes and Phylogeny
Take Home Message: The genomic data largely support the proposed phylogeny of Hackett et al. (2008) with a couple of exceptions and are also able to resolve the positions of some groups.
But let's start with a comparison of what's new in the current phylogeny estimate compared to the Hackett study. It turns out that they agree quite a bit. In order to keep this diary reasonably short I'm not going to repeat a lot of the detail in my previous diary. Please refer back to it for more information on each group.
First, both of them find that all modern (i.e. currently existing) birds fall into three groups:
The Paleognathae are the ratites (ostriches and relatives) and tinamous (small grouse-like birds from Central and South America).
The Galloanseres are the ducks, geese, and swans as well as the grouse, quail, pheasants, turkey, and other 'chicken-like' birds.
The Neoaves are all other birds.
The current study only included two species of the Paleognathae and three of the Galloanseres so it doesn't really provide information on relationships within either of these groups. However this study, not surprisingly as it is not a new finding and has been reported many times, supports the idea that these three groups split from one another well back in the Mesozoic, early in bird evolution and that they are very genetically distinct.
The current study also finds that the various groups within the Neoaves diverged right around the time of the last mass extinction at the end of the Mesozoic. The Hackett study found five major groups of Neoaves and a number of birds whose positions in the phylogeny weren't resolved, including things as unusual as the enigmatic Hoatzin and as prosaic as pigeons and doves.
One of the criteria used in selecting species for the study was to include representatives of many of these groups with uncertain placement with the hopes that the more detailed genomic data would resolve many of the uncertainties. Below is the best phylogeny as shown by the genomic data.
The current study supported the existence of three of the five groups from Hackett et al. (2008) as well as the relationships within each of these groups. These groups are the massive 'land bird' group, the 'water birds', and the group I dubbed the super fliers, comprising the swifts, hummingbirds, and nightjars. So this study supports the idea that parrots and falcons are closely related to songbirds.
The other two groups found in the Hackett study were the shorebirds and their relatives and the cuckoos and rails and relatives. The former consists of plovers, sandpipers, gulls, terns, and related species. It corresponds to the traditional bird order Charadriiformes and Hackett et al. found it to be the group most closely related to the land birds. The current study only included one species from this group (killdeer) and so nothing can be said about relationships within the group. However the current study finds that the 'water bird' group is the closest relative of the 'land birds' and that the killdeer is more distantly related.
More strikingly the cuckoo/rail group is broken apart. Again, there are a lot fewer species sampled in this study (only three: a crane, a bustard, and a cuckoo). The crane is grouped with the killdeer while the bustard and the cuckoo are shown to be more closely related to the swift/hummingbird/nightjar group.
Still this represents a striking level of agreement between the two studies and an indication that we may be narrowing in on the true evolutionary relationships of major bird groups.
The current study also had good success at placing some of the mystery birds into definite positions in the larger phylogeny. The tropicbirds and sunbittern (again see my previous diary for more detail) were found to be closely related to one another and to be jointly, closely related to water birds. The Hoatzin is closely related to the crane and killdeer. Finally, as was suggested by Hackett, flamingos and grebes are closely related and jointly related to pigeons and some obscure ground living birds. Below is a simplified version of the phylogeny with the names of the taxonomic groups included.
Part II: Structure of Avian Genomes
Take Home Message: Birds have very compact genomes. The small genome size has been accomplished in several ways. Bird genomes are also very conservative, changing very slowly over time.
What I (revealing my personal bias here) think is even more interesting is the information on how bird genomes have evolved. Having 48 genomes to look at can tell us a lot about the evolution of bird characteristics.
Birds have small genomes, and those genomes are fairly similar to one another compared to what is seen in other vertebrate groups. Quite a lot of genes that occur in non-avian reptiles appear to have been lost very early in bird evolution. Many of these genes have important functions and birds appear to have used other copies of these genes to replace those functions.
Also birds have very compact genomes, with a lot less DNA devoted to space between genes and to introns (portions of DNA within genes that are cut out of the RNA before it is used to make proteins). They also have fewer repeated elements - short sections of DNA that are present in genomes in many copies. Bird genomes are thus streamlined, smaller than any known mammal or non-avian reptile genomes.
Another striking finding is that chromosomal evolution in birds has been very slow. In other words the degree of synteny in birds is very high. Synteny refers to the order of genes on chromosomes. This can be a remarkably changeable thing. For example mice and humans have essentially the same set of genes and the structure of these genes is mostly quite similar. However the organization of the genes onto chromosomes is very different. Rats and mice also have very different chromosomal organization despite being even more similar to one another.
While bird species are certainly not all identical in their chromosomal structure, chromosomes seem to have changed much less in birds than in other groups. Some of this information is not new. The number of chromosomes in different types of birds is information that could have been gathered a century ago (I don't know when such studies were actually done). And the initial three bird genome sequences (zebra finch, chicken, and ostrich) would have given some idea of the structure of bird genomes relative to other genomes. But the addition of 45 more genomes allows for an unprecedented level of detail.
Part A of the figure above compares the average sizes of parts of the genome in mammals, non-avian reptiles, and birds. As you can see birds have very small introns and spaces between genes relative to the others, particularly mammals. You can skip B or explain it to me as I don't understand it. Part C shows how similar the structure of chromosomes is (syntenic percentage) in pairs of species based on how long ago in evolutionary time they diverged. As you might expect, the further in the past they diverged the more different they are. However mammals diverge much faster than birds. Part D compares the hemoglobin genes of various birds and mammals. You can see that the birds generally have fewer genes and are more similar to one another than the mammals. The order of birds and mammals relative to one another is arbitrary (i.e. there is no reason to compare a bird to the mammal on the same line).
Part III: Evolution in Birds Relative to other Groups
Take Home Message: Birds and their close relatives all have very slow rates of molecular evolution compared to mammals and more distantly related reptiles.
This is going to get pretty technical but I'll try and make it understandable to almost anyone. You can learn a lot about the rate of evolution and how selection may have operated by comparing DNA sequences across species. In order to understand how this works you need to understand a bit about the structure of DNA and about the Neutral Theory of Evolution. I'm going to put a summary of the background material in an appendix so you can choose your preferred level of detail.
One of the other more specialized papers in the set is on the evolution of crocodilians by Green et al. (2014). In addition to the bird genomes there were a number of other genomes used for comparative purposes. Some of these were previously sequenced and others were sequenced as part of this comparative program. Among the organisms were three species of crocodilians. Crocodilians are the closest living relatives of birds so they represent an interesting basis of comparison.
The figure below shows that crocodilians have an extremely low rate of molecular evolution (meaning rates of changes of DNA sequences) compared to birds, mammals, and other reptiles. Crocodilians are 'living fossils' in that they have existed for a very long time in essentially the same form. Turtles and birds also have fairly slow rates of evolution at the molecular level. In contrast mammals and the snakes and lizards (Lepidosaurians) have much more rapid rates of evolution. This intriguingly suggests that the conservative genomic evolution of birds may be a characteristic of this entire lineage of vertebrates.
Part IV: Adaptive Evolution Within Birds
Take Home Message: Comparisons made across avian genomes allow biologists to study how genes have evolved to produce adaptations over the course of bird evolutionary history. Some examples are given.
If you look back up at the second phylogeny they have color coded some characteristics in the names of the groups. Blue names indicate water birds and red indicates birds of prey (these are admittedly somewhat vague). Probably the most interesting are the vocal learners. These are birds whose neural structure is affected by hearing sounds in some way. There are three groups - the oscine songbirds (most things you would think of as songbirds and some that you probably wouldn't, like crows), parrots, and hummingbirds. I have to admit that until this paper I had no idea hummingbirds had vocal learning - the only vocalizations I think of them having are the noises they make when they are angry.
Anyway the phylogeny indicates that vocal learning has evolved in birds either 3 times (if it evolved separately in parrots and songbirds) or 2 times (if it evolved in the common ancestor of parrots and songbirds and was lost by the suboscines). Looking at genes associated with learning reveals that the evolution of learning in these groups came about through modification of the same genes.
The figure below looks at bird adaptation in a variety of ways. Part A shows the number of copies of genes for two different types of keratins found in different vertebrates. Keratins are fibrous proteins that make up structures like skin, scales, and feathers. Mammals have alpha keratins. Birds and their close relatives (turtles and crocodilians) have fewer alpha keratins but also have a unique group, beta keratins. Beta keratins that produce feathers are unique to birds. Land birds have many more copies of these genes than water birds, and domesticated birds have huge numbers of copies of these genes. The significance of this is not yet known. Part B shows when genes related to diet were turned into non-functional pseudogenes. AGT is a gene that codes for an enzyme involved in digestion, and GULO is a gene whose product is involved in vitamin C synthesis. The loss of the function of these genes is presumably associated with changes in diet.
Parts C and D of the figure above both use dN/dS ratios (explained in the appendix) to look at the nature of selection over the long term on genes related to vision and color. A very low dN/dS ratio indicates strong selection to keep the gene from changing (stabilizing selection). A very high dN/dS ratio indicates that the gene is under selection to change its product. Part C compares dN/dS ratios in birds and mammals for a gene associated with vision. Most of the time this vision gene is under stronger stabilizing selection (keeping it the same) in birds than in mammals, although there is a small subset of examples in which the function of the gene seems to have evolved really rapidly. Part D looks at the importance of visual discrimination (they don't define this properly) on selection on plumage color genes. Birds with higher visual discrimination have stronger stabilizing selection on the plumage color genes (lower dN/dS ratios). This probably indicates the importance of social interactions, particularly mate choice, in the evolution of these genes.
Appendix: DNA Structure, Mutation, and Molecular Evolution.
Caveat - what follows is a considerable simplification for the purposes of brevity. DNA is a long chain of smaller molecules called nucleotides. The crucial part of each nucleotide is a base (base just refers to a type of molecule). There are four different bases in DNA (A, G, T, and C) and each nucleotide has one of these bases. The sequence of bases in a DNA strand is the information content of the molecule. For example a partial sequence of a DNA strand might be AATCTTTGCCATCCAGGA. Each set of three bases is referred to as a codon because it codes for a particular amino acid. A string of amino acids makes up a protein. Thus DNA provides the code to build a protein. For example the codon AAT at the start of the strand above codes for the amino acid Asparagine. The CTT codon following it codes for the amino acid Leucine and so on.
The crucial aspect of this from our perspective is that the coding system has considerable redundancy. There are 64 triplet codons and they only code for 20 different amino acids (plus signals to stop and start making proteins). So multiple codons will produce the same amino acid. For example, the codons TTA, TTG, CTT, CTC, CTA, and CTG all code for Leucine.
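For anyone who likes to see this sort of thing in code, here is a minimal Python sketch of the codon lookup described above. It builds the standard genetic code table from a compact string (the function and variable names are my own, chosen just for illustration), translates the example strand, and lists every codon that codes for Leucine.

```python
from itertools import product

# Standard genetic code, written with DNA bases in the order T, C, A, G.
# One-letter amino acid codes; "*" marks a stop signal.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA strand codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def codons_for(amino_acid):
    """Return every codon that codes for the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


protein = translate("AATCTTTGCCATCCAGGA")   # N = Asparagine, L = Leucine, ...
leucine_codons = codons_for("L")
n_amino_acids = len(set(CODON_TABLE.values()) - {"*"})

print(protein)
print(leucine_codons)
print(len(CODON_TABLE), "codons for", n_amino_acids, "amino acids")
```

The table has 64 entries but only 20 distinct amino acids, which is exactly the redundancy that makes synonymous mutations possible.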
One common type of mutation is a substitution in which an accident in DNA copying replaces one base with another. For example the sequence I gave above could get accidentally changed to AATCTCTGCCATCCAGGA where the sixth base was changed from a T to a C. This example is what is known as a synonymous mutation because the new sequence will produce the same amino acids as the old sequence. In terms of function the gene is exactly the same as before.
In contrast the sequence AATTTTTGCCATCCAGGA has a non-synonymous mutation where the fourth base is changed from a C to a T. This would result in a change in the amino acid sequence where the Leucine would be replaced by a Phenylalanine.
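The two example mutations can be checked with a short Python sketch. This is just an illustration under simple assumptions: both strands are the same length, they start at the beginning of a codon, and each changed base is judged by comparing the whole codon before and after. The helper name classify_substitutions is hypothetical.

```python
from itertools import product

# Standard genetic code (DNA bases ordered T, C, A, G; "*" is a stop signal).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitutions(original, mutated):
    """List (position, old amino acid, new amino acid, kind) for each changed base.

    Positions are 1-based to match the way they are counted in the text.
    """
    if len(original) != len(mutated):
        raise ValueError("strands must be the same length")
    results = []
    for i, (old_base, new_base) in enumerate(zip(original, mutated)):
        if old_base == new_base:
            continue
        start = i - i % 3                      # first base of the affected codon
        old_aa = CODON_TABLE[original[start:start + 3]]
        new_aa = CODON_TABLE[mutated[start:start + 3]]
        kind = "synonymous" if old_aa == new_aa else "non-synonymous"
        results.append((i + 1, old_aa, new_aa, kind))
    return results


original = "AATCTTTGCCATCCAGGA"
silent = classify_substitutions(original, "AATCTCTGCCATCCAGGA")
changed = classify_substitutions(original, "AATTTTTGCCATCCAGGA")

print(silent)    # sixth base: Leucine (L) stays Leucine
print(changed)   # fourth base: Leucine (L) becomes Phenylalanine (F)
```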
So it is fairly straightforward when comparing the same DNA strand in different species to count up the number of synonymous and non-synonymous changes. Synonymous changes happen at a fairly constant rate as they are simply the result of mutation followed by chance changes in the frequency of the new mutation. Non-synonymous mutations should occur at the same rate (once you correct for the number of possible mutations of each type) but they are subject to natural selection and are much more likely (relative to synonymous mutations) to be lost if they have bad effects or to become really common if they have good effects.
It turns out that the rate of synonymous changes over time is fairly constant within groups of organisms, producing a kind of molecular clock. This clock can be calibrated to the fossil record and used to estimate the times of divergence in the past for cases in which there is no fossil evidence. As the crocodilian example shows, this rate can vary quite a bit across different groups for reasons that are not entirely clear.
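The arithmetic behind a molecular clock can be sketched in a few lines of Python. This assumes the simplest possible clock: a single constant rate, with the two lineages each accumulating changes independently since they split, so the distance between them is 2 x rate x time. The numbers below are made up purely for illustration and do not come from any of the papers.

```python
def calibrate_rate(distance, divergence_time):
    """Rate per site per million years per lineage, from a fossil-dated split.

    distance is the synonymous changes per site separating two species;
    the factor of 2 is there because both lineages have been changing.
    """
    return distance / (2 * divergence_time)


def estimate_divergence_time(distance, rate):
    """Estimated time since the split (million years), assuming a constant rate."""
    return distance / (2 * rate)


# Hypothetical calibration pair: 0.10 synonymous changes per site,
# with fossils dating the split to 50 million years ago.
rate = calibrate_rate(distance=0.10, divergence_time=50.0)

# Hypothetical pair with no fossil record: 0.06 synonymous changes per site.
estimated_time = estimate_divergence_time(distance=0.06, rate=rate)

print(f"rate: {rate:.4f} changes per site per million years")
print(f"estimated divergence: {estimated_time:.1f} million years ago")
```

Real analyses let the rate vary across branches of the tree, which is one reason groups like the crocodilians need separate treatment.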
Appendix Part 2 - The dN/dS Ratio.
Once you correct for differences in the number of possible mutations and some other things we can ignore for our purposes, you can estimate how common non-synonymous changes are relative to synonymous changes when comparing two strands (the dN/dS ratio). A small dN/dS ratio indicates that very few non-synonymous changes have been successful and that selection is keeping the gene 'the same' in terms of the nature of what it produces. A high dN/dS ratio would indicate that there has been selection to change the nature of the gene product.
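Here is a rough Python sketch of how such a ratio could be computed. It is a deliberately stripped-down counting approach and not the method used in the papers. It counts how many of the possible single-base changes in each codon would be synonymous, counts the observed differences of each type, and divides. It skips the statistical corrections real software applies, treats a change to a stop signal as non-synonymous, and refuses codons that differ at more than one base. The second strand is a hypothetical example I made up.

```python
from itertools import product

# Standard genetic code (DNA bases ordered T, C, A, G; "*" is a stop signal).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_sites(codon):
    """Number of 'synonymous sites' in a codon, between 0 and 3.

    Each position contributes the fraction of its three possible
    base changes that leave the amino acid unchanged.
    """
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                total += 1 / 3
    return total


def dn_ds(seq_a, seq_b):
    """Return (pN, pS, pN/pS) for two aligned coding strands of equal length."""
    syn_sites = nonsyn_sites = 0.0
    syn_diffs = nonsyn_diffs = 0
    for i in range(0, len(seq_a), 3):
        codon_a, codon_b = seq_a[i:i + 3], seq_b[i:i + 3]
        s = (synonymous_sites(codon_a) + synonymous_sites(codon_b)) / 2
        syn_sites += s
        nonsyn_sites += 3 - s
        n_changed = sum(x != y for x, y in zip(codon_a, codon_b))
        if n_changed > 1:
            raise ValueError("this sketch only handles one change per codon")
        if n_changed == 1:
            if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    p_n = nonsyn_diffs / nonsyn_sites   # non-synonymous changes per possible site
    p_s = syn_diffs / syn_sites         # synonymous changes per possible site
    ratio = p_n / p_s if p_s > 0 else None
    return p_n, p_s, ratio


# The example strand, and a hypothetical relative with one synonymous
# change (CTT -> CTC) and one non-synonymous change (CAT -> CAA).
p_n, p_s, ratio = dn_ds("AATCTTTGCCATCCAGGA", "AATCTCTGCCAACCAGGA")
print(f"pN = {p_n:.3f}, pS = {p_s:.3f}, ratio = {ratio:.2f}")
```

Even though this toy pair has one change of each type, the ratio comes out to about 0.29, well below 1, because there are many more ways to make a non-synonymous change than a synonymous one. That is the 'correction for the number of possible mutations' at work.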