Friday, 14 October 2016

GA4GH: What? Why? How?

In August 2016, I was offered the position of Chair of the Global Alliance for Genomics and Health (GA4GH), and am delighted to accept. To give a little more background to the announcement going out today, and to provide some answers to personal questions, I’ve written a bit of Q&A.

In this post I answer the three questions I am most frequently asked about the GA4GH, namely: What on Earth is it? Why am I becoming Chair? And how on Earth do I find the time for these things?

What is the Global Alliance for Genomics and Health?


The GA4GH is producing solutions for sharing genomic and clinical data responsibly. Low-cost, high-throughput sequencing has changed – and is still changing – the way we understand living things, from the basic science of life to human disease. Healthcare is a big focal point for this change and, now that there is a critical mass of knowledge about the path from personal sequence to treatment decisions, is able to embrace routine sequencing of patient genomes and other molecular measurements.

But if we want a future in which all people can benefit from this change, we need to solve a number of technical, structural, security and ethical problems. The Global Alliance for Genomics and Health was set up to do just that.

At present, my overall impression of GA4GH is of a professional orchestra warming up: intense, but disjointed, activity and passion. The different sections are poised to work together, producing an ecosystem of harmonised technical and ethical standards.

How we measure


Over the next decade, healthcare will begin to change the way it collects molecular measurements from patients. ‘Genomics entering the clinic’ means that it will soon become a matter of routine to gather DNA, RNA, protein and metabolite data from patients, along with traditional information. (Note: I like to use ‘Genomics’ in a broad sense, encompassing DNA, RNA, protein and metabolite measurement, partly because terms like ‘omics’ and ‘multiomics’ are clumsy and don’t translate well beyond the field.)

Incorporating new measurements is not new to healthcare: for example, blood biochemistry has been a mainstay in medical practice for over 50 years, and clinical genetics has been used to successfully diagnose millions of people in recent decades. Oncologists routinely use the presence or absence of specific genetic loci in certain tumours to guide treatment.

I think it is generally accepted that these narrower, field-specific uses of genomics will give way to a more comprehensive, routine collection of many molecular measurements at the same time, applicable to many areas of healthcare.

Genomes, whole-blood transcriptomes, tumour RNA-seq and large-scale metabolomics can all provide relevant information that is useful in assessing individuals, and still more useful when analysed collectively.

A game-changing opportunity


This is all a bit hairy in terms of skills transfer and capacity, but it’s much more exciting in terms of opportunity.

The consequence of this is that healthcare, a massive chunk of the world economy (between 8% and 20% of GDP in developed-world economies), is going to be conducting high-throughput molecular phenotyping on humans – a wonderful, outbred mammal.

From the perspective of research, which has traditionally looked to other organisms to understand how they work before translating that knowledge to humans, this is an amazing opportunity. Being able to use human data directly – particularly by gathering the huge datasets generated in routine healthcare – will be transformative for science.

The sheer ‘firepower’ of healthcare means humans will be the most studied organism on the planet. No other animal will come close in terms of scale, detail and longitudinal sampling. What an opportunity for research – both basic and applied!

Turbulent waters


Repurposing healthcare data for research, at scale, will not be smooth sailing. There are real cross-currents around data, with different levels of access and rules of engagement buffeting one another.

Fundamentally, much of molecular biology research data is fully open, globally aggregated (e.g. ENA/GenBank, PDB, the Human Genome) or, in the case of human research subjects, distributed in accordance with different consents that patients have signed.

Healthcare data is completely different. There is a thicket of legislation, each piece rooted deeply in national law, language and societal norms, and the primary remit of each healthcare system is to keep its citizens healthy – not to create resources for research.

As generating molecular data becomes more a matter of routine, the constraints for access will doubtless change, perhaps without reference to research and its potential to create better long-term solutions.

Another interesting driver for change is patient engagement. Increasingly, clinical research has become more of a two-way relationship, with patients empowered to be owners of their personal measurement data, in addition to being donors.

Even assuming that all goes well and access issues are resolved, there is the matter of handling data on massive scales, and being equipped to analyse it. Engineering around large-scale genomic data is no trivial matter. One can’t simply slurp up spreadsheets or STATA frames of genomes, transcriptomes and metabolomes – you need proper computational muscle.

There is also the opposite 'knowledge flow': for healthcare to leverage genomics well, there are many practical problems for which research holds the solutions. We need these solutions and skills to flow from basic research into healthcare.

If we want a future in which we can all benefit from our investment in studying humans on the molecular level, we need to solve these problems.

The major challenges, in a nutshell


Technical problems require solutions for working with data on different scales in sensible, portable ways.

Structural problems can be resolved when we agree on how to represent reference data. That includes, for example, genomes and variants, but also the way we describe things. The meta-data must be aligned to allow the transmission of key clinical data, and to allow data sharing more broadly.

(The devil is in the detail here. Consider the many ways one could represent ‘nested variation’ – a single nucleotide polymorphism on a structural insertion of DNA in the context of an inversion – something we gloss over in both research and clinical practice.)
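
As a purely illustrative sketch of one such representation (not a GA4GH schema, and not how any existing tool stores variants), the Python below models that example as a tree in which each variant is positioned relative to its parent. The class name `Variant`, its fields and the `flatten` helper are all hypothetical, as are the coordinates and bases.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variant:
    """Hypothetical model of a variant that may contain other variants."""
    kind: str                 # e.g. "inversion", "insertion", "snp"
    offset: int               # position relative to the parent (or the reference at top level)
    allele: str = ""          # inserted or substituted bases, if any
    children: List["Variant"] = field(default_factory=list)


def flatten(variant, path=()):
    """Yield (path, variant) pairs; path lists the kinds of the enclosing variants."""
    here = path + (variant.kind,)
    yield here, variant
    for child in variant.children:
        yield from flatten(child, here)


# A SNP on an insertion that itself sits inside an inversion (made-up values).
snp = Variant(kind="snp", offset=3, allele="T")
insertion = Variant(kind="insertion", offset=120, allele="ACGTACGT", children=[snp])
inversion = Variant(kind="inversion", offset=50_000, children=[insertion])

paths = [" > ".join(p) for p, _ in flatten(inversion)]
for line in paths:
    print(line)
```

The SNP here sits on inserted sequence, which has no coordinate on the reference, so the tree records its position relative to the insertion instead; a flat list of reference positions would have to capture that containment some other way.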

Ethical and regulatory problems are perhaps the most discussed across the board, and we must find a way for bona fide researchers to access data within an appropriate framework, globally.

Security problems require tight coordination. The GA4GH aims to establish a federation in which datasets are appropriately accessible. That means we need access tools like APIs and virtualisation schemes that can work smoothly, with predominantly secure electronic methods, and absolute clarity and constant forward thinking about security.

But as with all complex issues, many of the challenges are some kind of combination of problems, or hide in the spaces in between.

GA4GH: an ambitious endeavour


Resolving so many challenges in a relatively short time is certainly ambitious, but it can be achieved. It isn’t easy technically or socially, because it will only be effective if it is global. But we know from experience that it isn’t impossible. 

The extensive work done already shows that it is tractable: for example, we already share data from large patient cohorts (mainly by data transfer) for joint analysis, which has delivered many new insights.

We have established appropriate ethical access to these schemes. We have also demonstrated, in specific areas, that federation can work (e.g. MatchMaker Exchange for rare disease patient discovery) and that virtualisation is an effective approach (e.g. PanCancer Analysis). 

Many academic and commercial groups in the GA4GH already provide practical solutions, but they are not as well coordinated as they should be.

The goal of the GA4GH is to enable a future in which secondary use of healthcare-generated genomics data is routine and practical, and we already have a strong start. We need to make existing ad-hoc schemes better by coming together more.

Why am I Chairing the Global Alliance?


GA4GH has been in operation for three years, led first by David Altshuler (now at Vertex Pharmaceuticals), then by Tom Hudson (now at AbbVie). David and Tom oversaw the establishment of GA4GH and grew it from 90 to 433 partner organisations. Under their leadership the GA4GH set up a series of technical, meta-data, ethical, regulatory and security work streams, many of which have been very successful, if isolated. There have also been a number of exploratory projects set up, though many seem to be driven by curiosity and personal interest.

My goal for the GA4GH over the next three years is to rebalance delivery and structure, building on the partnership’s existing strength of exploratory work. Many people in GA4GH, and some outside the Alliance, are eager to see more alignment, and there is an incredible pool of talent in engineering, genomics, clinical practice and ethics, all ready to come together around this.

We may not be able to solve every challenge, as many of these eventually merge with healthcare informatics generally. But I am confident that we will make substantial progress and achieve a far better world for both healthcare and research.

Where do I find time for these things?

(No, I do not have a Time-Turner.)


When people who know me heard that I had taken on another leadership role, they rolled their eyes and either berated me for not saying no, or simply asked how on Earth I would balance this with my other responsibilities.

I am stretched a bit thin between my leadership roles at EMBL-EBI, ELIXIR and Genomics England, my consultancy for Oxford Nanopore and GSK, my advisory roles for other organisations and my other professional responsibilities. Importantly, I also have a life outside of science: I am a dad with two children and a wonderful wife.

How could I take on being Chair of the GA4GH, with everything else going on? How could I … not?

I am an endlessly curious, optimistic person and love bringing people together to make collaborations work, even if having such diverse commitments requires time slicing, and results in my being distracted. In fact, for the past three years I have been quite active in GA4GH, but at a very technical level. This role is more than just guiding one or two working groups.

Teamwork

Working in teams – tight or loose – and in close partnership is my default strategy, in both work and family life. My wife and I are very much equitable partners, with demanding careers and full-time jobs. We are both responsible for making sure the logistics work (and that we have backup plans), and for setting aside quality time with our children. That said, one of the drawbacks of being spread thin is that sometimes I will be at home, but completely distracted by work – something that drives the whole family a bit nuts. I know I am not alone in struggling with this. Like many people, I feel that I short-change my family, even as they support me completely.

I could not function as Director of EMBL-EBI without Rolf Apweiler as joint Director, and the high level of trust we share. Although we are chalk and cheese (focused, organised German and messy, problem-orientated Brit), our complementarity is a real strength. I also lead EMBL-EBI research in partnership with Nick Goldman, and as a group leader I’ve partnered previously with Ian Dunham and now with Tom Fitzgerald to lead my research projects. I see my roles with Genomics England, Oxford Nanopore and GSK as providing help, support and constructive criticism, but as a consultant my interactions are limited.

Chairing the GA4GH will be a partnership role I share with the Alliance’s strong Executive Director, Peter Goodhand of the Ontario Institute for Cancer Research, who keeps many of the processes working smoothly. We both plan to recruit an active set of Vice-Chairs who will provide a high level of strategic oversight. 

I know there is enough talent in the GA4GH to deliver this.

Enabling talented people to deliver


High-level leadership is mainly about providing the right space and conditions for knowledgeable, talented people to step up and deliver. Being clear about the vision and direction is incredibly important, but setting out a vision often isn’t the most challenging aspect. The hard thing is to identify the people who have the right mindset and skills, and enable them to drive part of the work all the way through to delivery.

I am speaking from experience when I say that this is true for leadership generally, both in formal organisations like EMBL-EBI and more loosely coupled organisations such as GA4GH. The problems we are grappling with cannot be resolved single-handed; rather, we will be able to deliver practical solutions by aligning individuals and groups, and ensuring they have the right balance of skills, enthusiasm, resources and motivation.

No human is an island.


The adventure of understanding things is deeply exciting for me, whether it’s a well-known problem or an unexplored area of science, so the GA4GH is a project after my own heart. As with any ambitious endeavour, there are bound to be arguments and hard decisions of all shapes and sizes in the GA4GH. But the motivation of people to participate, the rewards of collaboration and the potential benefits to society are great.

I am very lucky to be surrounded by supportive, excellent colleagues on every level: the people who manage me, my peers around the world and those I manage. I am also lucky to be working in science, which thrives on collaboration, information exchange and support, and where just being reciprocally nice is an excellent strategy.


Working together, we are going to make the next few years of GA4GH amazing. I cannot wait.