Tuesday, 22 December 2015

10th genome of Christmas: The laboratory mouse

After human, the most studied animal, by a long margin, is mouse. Or, more strictly, the laboratory mouse, which is a rather curious creation of the last 200 years of breeding and science. 

Laboratory mice originate mainly from circus mice and pet “fancy” mice kept by wealthy American and European ladies in the 18th century. Many of these mice had their roots in Japan and China, where their ancestors would have been kept by rich households. Unsurprisingly, the selection of which mice to breed over the centuries came down to habituation to humans and coat colour rather than scientific principles. 

The founding genetic material for the lab mouse was not just one species, the European house mouse (Mus musculus domesticus), but three: Mus musculus domesticus, Mus musculus musculus (mainly Asian) and Mus musculus castaneus. Because mice have been following humans around for thousands of years, the history of these three species or strains (everything gets a bit murky here, as mice mate if they meet - but Asia to Europe is quite a distance if you are a mouse) is complex, to say the least.

Mice got their start in the genetics laboratory in a rather eccentric collaboration between a Harvard geneticist (W. E. Castle) and a fancy-mouse breeder (Abbie Lathrop), who provided a series of mice with specific traits, such as Japanese Waltzing mice. Abbie arguably ran the world’s first-ever mouse house on her farm in Massachusetts. A student of Castle, C.C. Little, got involved in studying mice and transformed a small hamlet on the coast of Maine, Bar Harbor, into a research laboratory, later named the “Jackson Laboratory” after a generous donor. The Jackson Laboratory (shortened to “Jax”) is still one of the world’s premier mouse research sites.

Mice are excellent mammalian models: they really do have all the cell types, tissues and organs that human has, and so many features (though not all) of human biology, from cellular to physiological, can be replicated and studied in this animal. But it is the detailed control we have over the mouse genome that makes it an exceptional species for helping us understand biology. This control is thanks to two key developments. First, because mouse embryonic stem cells can be produced so easily, there are mouse cells (which you can keep in a petri dish) that can be coaxed into making viable embryos. These embryos can be implanted in pseudopregnant mice, and become full grown individuals. Second, one can swap pieces of DNA in and out in these stem cell lines at will - almost as easily as in yeast (and certainly more easily than in fly or worm). 

The ability to swap, not just insert, DNA segments (“homologous recombination”) is key. This unique-in-animals genomic control of genetics means there are elegant, precise experiments that are only feasible in mouse. For example, one can 'humanise' specific genes (i.e. swap the human copy in for the mouse copy), or trigger the deletion of a gene at a particular developmental time-point by using a variety of control elements, ending up with molecular 'cutters' that will turn on only when you want them to. Mice are far more than just a 'good' model for human - they are arguably the multi-cellular organism over which we have the most experimental control.

Given its importance to a massive community of researchers, mouse was clearly going to be the most important genome to sequence, after human.

The Black6 strain (full name: C57BL/6) from the original breeding of C.C. Little was chosen as the strain to sequence, because it was the most inbred and the one most often used in experiments. Indeed, in the public/private race to the human genome (more on this in a later post), the company Celera switched to sequencing mouse when it was clear that the public human genome project was matching the Celera production rate.

Both the Celera mouse data and the public mouse genome data were based on a whole-genome shotgun sequencing approach. This was standard fare for Celera, but signalled the start of whole-genome shotgun sequencing for 'big' genomes academically (at least for 'reasonable' draft genomes). The inbred nature of mice, Black6 in particular, simplifies the assembly problem for whole-genome shotgun. It’s bad enough trying to put together a 3 billion-letter-long genome from 500-letter fragments - it’s even worse when you have two near-but-not-quite-identical 3 billion-letter-long genomes to reconstruct.
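To make the inbred-versus-outbred point concrete, here is a minimal toy sketch. It uses a de Bruijn-style k-mer graph, which is swapped in purely for illustration and is not the assembly method used by either mouse project. The two short "haplotypes", the value of k and the function name `count_branch_points` are all made up for the example. The idea: if both copies of the genome are identical, every k-mer has one obvious successor; a single differing letter creates a fork that an assembler then has to resolve.

```python
from collections import defaultdict


def count_branch_points(sequences, k):
    """Count (k-1)-mers that are followed by more than one distinct (k-1)-mer.

    Each k-mer in the input contributes an edge from its first k-1 letters
    to its last k-1 letters. A node with several distinct successors is a
    point where an assembler cannot simply walk forward.
    """
    successors = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            successors[kmer[:-1]].add(kmer[1:])
    return sum(1 for nxt in successors.values() if len(nxt) > 1)


# Hypothetical toy haplotypes (not real mouse sequence).
haplotype_a = "ACGTTGCAAGCTTAGGCATC"
haplotype_b = "ACGTTGCAAGATTAGGCATC"  # differs from haplotype_a at one letter

# Inbred: both chromosome copies are the same sequence.
inbred_branches = count_branch_points([haplotype_a, haplotype_a], k=5)

# Outbred: the two copies differ, so the graph forks at the variant.
outbred_branches = count_branch_points([haplotype_a, haplotype_b], k=5)

print(inbred_branches, outbred_branches)
```

In this toy case the inbred input gives a graph with no forks, while the single-letter difference introduces one. Scale that up to millions of differences across 3 billion letters and the appeal of a highly inbred strain is clear.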

But in many ways, the mouse genome brought us into a new era of genome sequencing: one of routine, 'pretty good' drafts from whole-genome shotgun, with fairly routine automated annotation. This was in stark contrast to the step-by-step approach taken with previous genomes, coupled with a more involved, manual annotation. 

Given the importance of mouse to researchers, both the genome and the annotation have been regularly upgraded. Though the first draft had broken the back of the big-genome quandary, the last 10% of the work, sorting things out, turned out (as with many problems) to be as annoying and involved as the first 90% of the job. After the first draft mouse genome, the next five years were about nailing down the frustrating ~10% of the genome that wasn't easy to assemble from shotgun, and attending to all the details.

Mouse is also likely to lead us in future to a more graph-based view of reference genomes. As there are inbred lines of mice, one can really talk about "individual" genomes in a solid way, knowing that others can 'order up' the same strain and work on them. Thomas Keane and colleagues have been building out the set of mouse strains beyond Black6, and doing increasingly independent assemblies, strain by strain. The resulting set of individual sequences absolutely shows the complex origin of laboratory mice; at any point, some mouse strains are as divergent as two species, and some are more like two individuals from a population. This complex web is best represented as a graph of sequences, rather than a set of edits from one reference, which is the current mode. 
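As a rough sketch of what "a graph of sequences" means, here is a tiny hand-built example. Everything in it is hypothetical: the node contents, the strain names (`reference_strain`, `strain_x`, `strain_y`) and the helper functions are invented for illustration and do not come from any real mouse assembly or graph-genome tool. Each node holds a stretch of sequence, and each strain is simply a path through the nodes, so no single strain has to play the role of "the" reference.

```python
# Toy sequence graph: nodes hold sequence segments (all made up).
nodes = {
    1: "ACGT",
    2: "G",       # allele carried by the hypothetical reference strain
    3: "T",       # alternative allele at the same position
    4: "CCA",
    5: "TTGACA",  # segment absent from the hypothetical reference strain
    6: "GG",
}

# Each strain is a walk through the graph (hypothetical strain names).
paths = {
    "reference_strain": [1, 2, 4, 6],
    "strain_x": [1, 3, 4, 6],
    "strain_y": [1, 3, 4, 5, 6],
}


def spell(strain):
    """Reconstruct a strain's full sequence by concatenating its path."""
    return "".join(nodes[node_id] for node_id in paths[strain])


def shared_nodes(strain_a, strain_b):
    """Return the set of graph nodes that two strains have in common."""
    return set(paths[strain_a]) & set(paths[strain_b])


for strain in paths:
    print(strain, spell(strain))

# strain_x and strain_y share the alternative allele (node 3), a
# relationship that is visible directly in the graph rather than being
# inferred from two separate lists of edits against one reference.
print(shared_nodes("strain_x", "strain_y"))
```

In the "set of edits" view, `strain_x` and `strain_y` would each be described only by how they differ from the reference strain. In the graph view, the sequence they share with each other is a first-class object, which is the property that matters when some strains are as divergent as two species.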

In 1787 Chobei Zenya (from Kyoto) wrote a book, "The Breeding of Curious Varieties of the Mouse", which apparently had "recipes", in the form of breeding strategies, for producing particular coat colours. There are far earlier documents from China on mouse strains, including the "waltzing" mouse (whose behaviour we now know stems from a neurological condition). In some sense this is both the rootstock of this laboratory species and part of the motivation for and discovery of evolution and genetics (though Darwin spent more time looking at pigeons than mice).

Given the laboratory mouse's flexible genetic manipulation, we will be studying this species for at least another 200 years.