The first technological innovation to radically change human society was agriculture. The ability to cultivate – rather than hunt or pick – food had a profound effect on everything from our immune system to our societal structures. It encouraged specialisation, favoured robust, complex inter-generational knowledge transmission and enabled the explosive growth of this bipedal ape.
Arguably, the centrepiece of agricultural innovation is wheat. If you look at the ancestral grass from which it was bred, wheat looks just like… grass. With tiny seeds sticking out of its head at harvest time. Some 10,000 years ago in Anatolia, enterprising farmers bred the biggest, most consistent of these grasses year after year. By selecting for the size of the wheat ears, they brought about changes in the genome that gave rise to larger and larger wheat. One type of change was duplication, by which two individuals (often different subspecies) were bred together without first splitting their genomes in half. In ancient times, single duplications like this gave rise to varieties like emmer, or durum wheat, and this first duplication looks like a wild, commonplace process. A second merger, this time with a third subspecies, was introduced in bread wheat in more "modern" times (around 5,000 years ago), making its genome three times the size of the basic grass genome. (In case you’re new to genomics, the genome comes in pairs, one maternal and one paternal, so this three-way increase is described as hexaploid: three times the normal diploid.)
The basic Anatolian grass genome is about twice the size of the human genome: around 6 billion bases (6 gigabases, or Gb), and the three-fold hexaploid wheat is around 16 Gb. Annoyingly, every bit of DNA in wheat has three pretty similar copies, even when the strain is completely inbred (for outbred wheat plants, one expects six copies, two from each of the triplicated loci). In technical terms, this is described as a “nightmare” for genome assembly and analysis. For a long time, the wheat genome was a seemingly unobtainable goal for agricultural genomics research.
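To give a feel for why three similar copies are such a headache, here is a toy sketch in Python. It is not how any real wheat assembler works: the sequences are random, the 2% per-copy divergence and the k-mer length of 21 are arbitrary assumptions, and the names `copy_1`, `copy_2` and `copy_3` are made up for illustration. It simply measures how many short words (k-mers) two diverged copies of the same ancestral region still share, compared with an unrelated sequence. Shared k-mers are exactly what makes it hard to decide which copy a read came from.

```python
import random


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def mutate(seq, rate, rng):
    """Return a copy of seq in which each base is substituted with probability rate."""
    out = []
    for base in seq:
        if rng.random() < rate:
            # Always pick a different base, so every hit is a real change.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def shared_kmer_fraction(a, b, k):
    """Fraction of the k-mers of a that also occur somewhere in b."""
    kmers_a, kmers_b = kmers(a, k), kmers(b, k)
    return len(kmers_a & kmers_b) / len(kmers_a)


rng = random.Random(0)  # fixed seed so the toy example is repeatable

# A made-up ancestral region, and three copies that each drifted by about 2%
# (an assumed figure, chosen only to make the point).
ancestor = "".join(rng.choice("ACGT") for _ in range(5000))
copy_1, copy_2, copy_3 = (mutate(ancestor, 0.02, rng) for _ in range(3))

# A completely unrelated random sequence, for comparison.
unrelated = "".join(rng.choice("ACGT") for _ in range(5000))

k = 21  # assumed k-mer length
print("copy_1 vs copy_2:", shared_kmer_fraction(copy_1, copy_2, k))
print("copy_1 vs copy_3:", shared_kmer_fraction(copy_1, copy_3, k))
print("copy_1 vs unrelated:", shared_kmer_fraction(copy_1, unrelated, k))
```

Under these assumptions a large share of the k-mers is common to any two copies, while the unrelated sequence shares essentially none. In an assembly, each of those shared words is a place where the three copies can be tangled together.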
It’s not uncommon for plants to duplicate their genome in the wild (indeed, this seems to be the starting point of the ancient wheat), and it’s a regular practice in agriculture when selecting for larger fruits/seeds. Strawberries are octoploid (four duplicates). Brassicas (cabbages, broccoli, cauliflower and friends) are all tetraploid strains from different mixtures and tweaks of three different baseline genomes. It makes my head hurt just thinking about the genetics. Commercial sugar cane is duodecaploid (six duplicates) and, as we propagate it using cuttings, its cells have completely lost the desire to even keep track of their chromosomes.
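If the ploidy names blur together, the arithmetic behind them is simple enough to write down. The sketch below is only a bookkeeping aid under a simplifying assumption: that each diploid-sized chunk of the genome comes from a distinct ancestor (as in bread wheat), so an inbred plant carries one version per chunk and an outbred plant up to two. The function names are invented for this example.

```python
# Number of chromosome sets implied by each ploidy name used in the text.
PLOIDY_SETS = {
    "diploid": 2,
    "tetraploid": 4,
    "hexaploid": 6,
    "octoploid": 8,
    "duodecaploid": 12,
}


def fold_over_diploid(ploidy_name):
    """How many diploid genomes' worth of chromosomes this ploidy level holds."""
    return PLOIDY_SETS[ploidy_name] // 2


def copies_per_locus(ploidy_name, inbred):
    """Distinct versions expected for one ancestral stretch of DNA.

    Simplifying assumption: every diploid-sized sub-genome is distinct.
    Inbred plants have identical maternal and paternal chromosomes, so they
    carry one version per sub-genome; outbred plants can carry two.
    """
    sub_genomes = fold_over_diploid(ploidy_name)
    return sub_genomes if inbred else 2 * sub_genomes


for name in PLOIDY_SETS:
    print(
        name,
        "sets:", PLOIDY_SETS[name],
        "fold over diploid:", fold_over_diploid(name),
        "inbred copies:", copies_per_locus(name, inbred=True),
        "outbred copies:", copies_per_locus(name, inbred=False),
    )
```

For hexaploid bread wheat this gives three copies when inbred and six when outbred, matching the counts above; real polyploids such as sugar cane are messier than this tidy model suggests.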
Despite its fiendish complexity, the community has finally, slowly and steadily, tamed the big, bad bread-wheat genome. First-off-the-mark survey skims of the wheat genome were generated, then compared against the smaller Brachypodium grass genome. The barley genome (far saner but still annoyingly big) followed. Heroically slogging through chromosomal sorting, people started to tease apart the specific components of the genome. And just recently, excellent work by Matt Clarke and colleagues at TGAC brought us a solid draft assembly using a clean sequencing protocol, an assembly algorithm custom-tweaked for wheat and a very large computer.
I am delighted that this has happened, for many reasons. First, the work is a tour-de-force by Matt and his team. Second, I know a good draft genome will unleash a whole series of experiments – from diversity panels to ChIP-seq. Third, it allows wheat to come into the fold of species-we-have-a-reasonable-genome-for, so we no longer need to treat it as a special case with tricky, bespoke systems (though there is still a need for these, given wheat’s endless annoyances: for example, it is very important to know the relationship between the three copies).