Writing Our Genome
A near outsider’s thoughts on Human Genome Project-Write
Thirteen years ago, the largest ever collaborative project in biology was declared complete. Scientists from the US, the UK, Germany, France, Japan and China had deciphered a majority of the human genome — the DNA blueprint present in every human cell.
To most people my age this is old news, stuff we learned about in middle school. By the time my peers and I had started on our careers in science, the genomes of most economically or scientifically important species had been deciphered and were in near complete states of annotation.
Genome Annotation: determining the functions of long stretches of DNA in the genome, such as genes.
You see, in the years since the completion of the Human Genome Project, the price of DNA sequencing has fallen dramatically. This includes both the cost of sequencing short fragments of DNA, which followed a Moore’s Law-equivalent reduction in costs over time, and next-generation sequencing (higher-throughput methods useful for sequencing larger DNA samples, such as genomes), which even outpaced Moore’s Law.
Moore’s Law: A prediction made by Gordon Moore, a co-founder of Intel, that the number of transistors on an affordable chip will double roughly every two years.
So to us newcomers to biology, the next big thing was not, and is not, ‘reading’ DNA but ‘writing’ it.
Synthesising short, single-stranded fragments of DNA, called oligonucleotides (oligos for short), has been automated and affordable for a number of years now, and almost every biology laboratory in the world uses these short fragments (usually 18–25 bases long; compare this to the human genome, which is 3 billion base pairs long) for applications ranging from disease diagnostics to making genetically modified plants. What has really been a game-changer in DNA synthesis is the ability to synthesise longer pieces of DNA and to join these together efficiently into synthetic gene-length fragments.
This ability to synthesise DNA quickly and cheaply has excited a group of biologists who want to use it to create entire genomes from scratch: to study life by building it, gene by gene. Going further, scientists want to create a “minimal genome” to identify exactly those genes that are absolutely necessary to support the most basic form of life. Genome synthesis could also enable us to create optimised strains of biological workhorses like yeast, to produce pharmaceuticals and fuels more efficiently. These ambitions have fuelled the field of synthetic genomics, beginning with the synthesis of a Mycoplasma genitalium genome by Craig Venter’s institute in 2008.
Currently the most ambitious such project is Sc 2.0. Teams from the US, UK, China and Australia are building a synthetic yeast genome, chromosome by chromosome. This represents a huge leap in complexity, since the yeast genome is organised into 16 pairs of chromosomes (we humans have 23 pairs of chromosomes and bacteria usually have just one) and totals 12 million base pairs of DNA (the human genome is 3 billion base pairs and the synthetic Mycoplasma genome had only about half a million).
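To make the jump in scale concrete, here is a small sketch that simply divides the genome sizes quoted above by one another. The numbers are the rounded figures from this piece, and the dictionary and function names are my own, made up for illustration.

```python
# Approximate genome sizes in base pairs, using the rounded figures quoted in the text.
GENOME_SIZES_BP = {
    "synthetic Mycoplasma": 500_000,   # "about half a million"
    "yeast": 12_000_000,               # 12 million
    "human": 3_000_000_000,            # 3 billion
}


def fold_difference(larger: str, smaller: str) -> float:
    """Return how many times bigger the first genome is than the second."""
    return GENOME_SIZES_BP[larger] / GENOME_SIZES_BP[smaller]


if __name__ == "__main__":
    pairs = [
        ("yeast", "synthetic Mycoplasma"),
        ("human", "yeast"),
        ("human", "synthetic Mycoplasma"),
    ]
    for larger, smaller in pairs:
        print(f"{larger} vs {smaller}: {fold_difference(larger, smaller):,.0f}x larger")
```

On these rounded figures, yeast is 24 times the size of the synthetic Mycoplasma genome, and the human genome is a further 250 times the size of yeast.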
And this brings us to the title of this piece: writing a human genome.
This week an article authored by some of the leading researchers in synthetic biology and genetics appeared in the academic journal Science and, for the first time, publicly explored the idea of synthesising an entire human genome. Christened HGP-Write (while rechristening the original Human Genome Project as HGP-Read), the project’s main goal is to “reduce the costs of engineering and testing large (0.1 to 100 billion base pairs) genomes in cell lines by over 1000-fold within 10 years.” The very next sentence goes on to expand this goal to encompass “whole genome engineering of …other organisms of agricultural and public health significance”. The paper then outlines a project structure composed of loosely defined pilot projects and milestones. You can read it in full, for free, here and check out the project’s webpage at http://engineeringbiologycenter.org/. Interestingly, and unlike many scientific manuscripts, the article begins by talking about responsible innovation. The authors acknowledge the huge ELSI (ethical, legal and social implications) associated with the project and promise to enable public dialogue prior to and around the project’s implementation. More on this in a bit.
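For a rough sense of how steep that cost target is, the sketch below works out what a 1000-fold reduction over 10 years implies per year, and compares it with a Moore’s Law pace of halving every two years. It assumes a smooth exponential decline, which is my simplification and not something the paper states; the function names are hypothetical.

```python
import math


def annual_factor(total_fold: float, years: float) -> float:
    """Per-year cost reduction factor needed to reach total_fold over the given years."""
    return total_fold ** (1.0 / years)


def halving_time_years(total_fold: float, years: float) -> float:
    """How often costs must halve to reach total_fold over the given years."""
    return years / math.log2(total_fold)


def fold_after(years: float, halving_time: float) -> float:
    """Total cost reduction after the given years if costs halve every halving_time years."""
    return 2 ** (years / halving_time)


if __name__ == "__main__":
    # HGP-Write target: 1000-fold cheaper within 10 years.
    print(f"Required yearly reduction: {annual_factor(1000, 10):.2f}x")
    print(f"Required halving time: {halving_time_years(1000, 10):.2f} years")
    # Moore's Law pace: halving every two years.
    print(f"Moore's Law pace over 10 years: {fold_after(10, 2):.0f}-fold")
```

In other words, the target needs costs to halve roughly every year for a decade, whereas a Moore’s Law pace would deliver only about a 32-fold drop over the same period.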
Of course, as you may have realised by now, this project is challenging, more so even than the Human Genome Project, and I would argue that it’s probably the most ambitious scientific proposal ever made, spaceflight included. The authors place a launch price tag of $100 million on the project, though I reckon that the project will ultimately cost much more, given that HGP-Read cost $2.7 billion in 1991 dollars.
You might wonder now, why write a human genome at all? The project lists several engineering goals on its website, ranging from growing transplantable human organs to engineering virus immunity. To me, however, the more exciting, and perhaps more immediate, results will be the scientific ones. By building a complex mammalian genome from the ground up, we will start filling in holes in our understanding of genetics, from the roles of so-called junk DNA to how epigenetic factors affect cellular function. Further, the project could make available a lot of technologies and generate huge amounts of data that would benefit more distant fields of biological research.
However, in spite of the potential gains, I for one have concerns about HGP-Write.
A. The Vision
The project explicitly states that its goal is to synthesise the human genome and to reduce the cost of “engineering and testing” large genomes. It envisions catalysing a steep drop in genome-scale synthesis costs and enabling large-scale genome engineering at more affordable rates. I (like others) question the relevance of this goal to the wider biological and biotechnology industry. Most scientists and biological engineers will never synthesise a eukaryotic genome. It simply is not a technology that is universally applicable or even necessary in most biotechnological solutions.
We synthetic biologists think of ourselves as engineers: we tweak and modify self-replicating systems (aka biological chassis), and we usually make smart, targeted interventions to generate value for industry, for agriculture, for healthcare. Some of the most impressive solutions the field has come up with to date (Humulin, golden rice, artemisinin, genome editing) have come from careful, logical modifications to existing systems. And the tools developed to achieve these goals (faster cloning, metabolic models, model organisms, etc.) are in constant need of improvement. Perhaps the current technology most relevant to this discussion is molecular cloning.
Cloning: Unlike what Hollywood would have you believe, in molecular biology, cloning is not the making of identical twins who inevitably turn into murderous, soulless monsters. Cloning refers to the manual assembly of plasmids, which are circular pieces of DNA that can be propagated in bacteria. These plasmids are the basic ingredients, the ‘apps’ if you will, that allow us to customise microbes to produce insulin, make insect-resistant plants, and so on.
Cloning used to be extremely laborious and buggy, and despite recent advances it still is, to an extent. To put it in perspective, a few decades ago, cloning a single gene would constitute an entire PhD. These days it’s routine and largely performed by miserable undergraduates, and in some labs, by even more miserable PhD students and postdocs. Cloning is also a huge market: $3 billion by one estimate. To us regular biotechnologists, the single biggest promise of cheaper DNA synthesis is to eliminate cloning. As DNA synthesis costs have fallen, we have increasingly adopted the technology into our cloning workflows. We no longer need to laboriously isolate genes and DNA fragments from existing organisms (some of which are hard to grow in labs); we can simply order the DNA from synthesis companies. However, the technology has still not become affordable enough that we can do away with cloning entirely. In my line of research, plant synthetic biology, we work with plasmids that measure up to tens of thousands of base pairs. No synthesis company at the moment can synthesise these plasmids for us, and if one could, it would certainly not be at a rate I could justify to my supervisors.
Would HGP-Write address this need by accelerating DNA synthesis technologies? Possibly. But how long would the price reductions take to reach the average bioengineer making 15–20 plasmids rather than entire genomes? Would HGP-Write lead to synthesis companies focusing on a low cost per base for high-volume orders, while ignoring demand for low-volume ones? I haven’t seen any answers to these questions. Would the project tie up research funding in DNA synthesis? I certainly can’t imagine grant agencies funding multiple large-scale DNA synthesis projects, although I hope they would.
B. Stakeholder engagement
HGP-Write positions itself as a successor to the original Human Genome Project, and perhaps seeks to capture the public enthusiasm that large blue-sky scientific projects sometimes enjoy. The proposal also recognises the huge ethical, legal and social implications of synthesising a human genome and, to the authors’ credit, addresses them more openly than most such research has in the past. I think, however, that the proposal almost inevitably fails to take stakeholder engagement seriously, or seriously enough. The article talks at length about enabling public dialogue, biosafety standards, regulations and intellectual property, but misses a crucial point.
It does not, at least explicitly, allow for stakeholders to accept, reject or modify its primary goal — making a human genome.
This is, of course, a consequence of it being named HGP-Write, with human genome synthesis as its main raison d’être. The proposal tries to encompass alternate goals by hinting at genome synthesis of other model organisms (thale cress, mouse, fruit fly etc.) and even “enabling research on crop plants and infectious agents and vectors in developing nations.” (This is a statement I take issue with, over concerns about the consolidation of distinct research areas.) Overall though, the article appears to assume that public engagement will revolve around its pilot projects, biosafety assurances and other such concerns, while implicitly allowing for the construction of a human genome.
Stakeholder engagement of this sort, where the goal is set beforehand and the means, timelines, methods, etc. are open for discussion, is of course perfectly legitimate. Most public projects, scientific and not, follow this approach. I question whether this is enough for a project where every one of us holds a stake and where the goal is both so radical and so encompassing. Wouldn’t a more flexible proposal to build a eukaryotic genome hold more appeal in this regard? A project where public discussion could include how far we agree to proceed with genome synthesis, for what organism and at what investment? This would perhaps be a more limited project, at a smaller scale and without the Human Genome Project moniker and associated marketing pull, but I think it would ultimately get off the ground faster and with fewer ruffled feathers.
Synthetic biology is often defined as a subset or extension of biotechnology and genetic engineering, and the technologies it uses often represent a significant step-change over past approaches. The one area where synthetic biology differs most from traditional biotechnology is in its handling of human practices (inculcated from very early on), openness, transparency and public dialogue. Synthetic biology centres around the world engage regularly with artists, ethicists and social scientists to an extent that would have been unimaginable in the early days of recombinant DNA technology. Here in Europe, much synthetic biology research has to struggle with the memories of the GMO debates and the mistakes made in past public engagement. In this context, HGP-Write, like other large-scale research proposals, makes big, bold promises that are still a long way off, and begins with less than perfect openness (I refer to how much controversy the private meeting last month has already attracted). Big promises carry large risks of under-delivery, and I’m wary of the negative fallout from unmet expectations. There is little doubt that HGP-Write can achieve its primary goal; the issues here lie with the outcomes described in Box 1 of the article: universal(?) virus resistance, improved genome stability, and cancer (it’s always cancer).
I realise upon scrolling up through this piece that it reads a little too much like strident criticism, and that too from a barely published PhD student who has never worked with genome synthesis! I originally started out writing with the intent to make the topic, and the issues surrounding it, easier to parse for non-specialists — family members, for instance. Researching further, I realised this article could perhaps become part of the public discourse invited by the authors of HGP-Write, written from the perspective of a student who still (even with the enormous help that MoClo provides) wishes he didn’t have to do any more cloning. On the other hand, as a Science & Policy student working with crop biotechnology (and genome editing) for developing nations, I couldn’t help but chip in with my take on the public engagement side of things.
If HGP-Write goes ahead to achieve even a fraction of all that it promises, I will still number among its many fans; I just think it could be done with a little less fanfare and a little more openness. Overall, for all my concerns, I remain enamoured with genome-scale technologies and the promise that whole genome synthesis holds. After all, my statement of purpose letter for grad school began with that oft-cited Richard Feynman quote:
“What I cannot create, I do not understand”
This image is for illustrative purposes only; it is not an accurate representation of the genomes of any of the species shown. The estimate of 1 billion bases of gene synthesis sold in 2015 is borrowed from here. The chromosome counts for yeast and humans are for diploid cells. There is no definition of genome complexity; however, I have used the percentage of non-coding DNA as a rough guide to complexity, or mystery. The values are obtained from here and are pre-ENCODE.