The publication of the human genome, more than a decade ago, marked the dawn of a new era in genetics (Lander et al., 2001; Venter et al., 2001). It enabled scientists to examine the genetic information from beginning to end, as a whole. However, given that it had taken many years and huge sums of money to complete the reading of one genome, it was far from feasible to use this technology to evaluate more than a handful of additional individuals.
Professor Frederick Sanger, the British biochemist, received the Nobel Prize in Chemistry in 1980, together with Walter Gilbert, for what was defined as ‘their contributions concerning the determination of base sequences in nucleic acids’. At first, they showed how they could sequence up to 80 nucleotides per run. This was a tedious process. Nevertheless, it enabled the sequencing of more than 5000 nucleotides of the single-stranded bacteriophage ϕX174, the first fully sequenced genome (Sanger et al., 1977). To their amazement, this one complete DNA read revealed a novel genomic feature: multiple overlapping genes in one locus, a feature that is still being explored today (Sorek & Cossart, 2010). Subsequently, Sanger et al. introduced the chain-termination technique for sequencing DNA molecules, which later became known as the ‘Sanger sequencing method’ (Sanger et al., 1980). This major leap forward in sequencing approach allowed long stretches of DNA to be systematically and accurately recorded, laying the foundations of DNA sequencing. In 1984, scientists of the Medical Research Council (MRC) in the UK were able to decode the entire DNA sequence of the Epstein–Barr virus (170 kb). Two years later, the laboratory of Leroy Hood at the California Institute of Technology (CA, USA) announced the first semi-automated DNA sequencing machine. In 1987, Applied Biosystems marketed the first automated sequencing machine, which boosted sequencing efforts such as that of human expressed sequence tags (ESTs) by Craig Venter. Ironically, the title of one of Sanger's first sequencing papers was ‘Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing’ (Sanger et al., 1980).
Little did Sanger know that during his lifetime ‘rapid’ – the adjective he chose to describe ‘DNA sequencing’– would take on a supersonic form.
A solution for sequencing multiple human genomes turned up about half a decade after the first human draft sequence was published. It came in the form of second generation sequencing machines, a technology also known as ‘Next Generation Sequencing’ (NGS), ‘Massively Parallel Sequencing’ (MPS), ‘High Throughput Sequencing’ (HTS), or ‘Deep Sequencing’. This revolutionary technology enabled reading an individual human genome in a matter of days (Shendure & Lieberman Aiden, 2012). This capability made grand projects realistic, such as the sequencing of: (i) 1000 genomes from 13 different populations (1000 Genomes Project Consortium, 2012); (ii) thousands of cancer genomes (Boehm & Hahn, 2011); and (iii) the entire microbial community of the human gut (Human Microbiome Project Consortium, 2012).
The technology behind the first and second generation sequencers is conceptually similar. Fragments of unknown DNA are flanked by linkers, amplified (in most cases) and read by consecutive light emission (or change of chemical state) once a complementary nucleotide is incorporated (A-T and G-C). However, in the second generation apparatuses, this process occurs around 10–100 million times per experiment, representing a 1–10 million-fold increase in read depth in just over three decades. For comparison, and in order to grasp this meteoric advance, think about the magnification of a light microscope compared with an electron microscope: about ×400 versus ×200 000, a 500-fold increase. Similarly, in the transportation world, the progress from one of the first cars (more than 100 years ago), which cruised at 4 km per hour, to a space shuttle accelerating to leave planet Earth at about 40 000 km per hour represents a mere 10 000-fold increase. Thus, these exhilarating abilities will no doubt advance current research and deepen our understanding of the human genome.
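The fold increases in the microscope and transportation comparisons are simple ratios of the newer figure to the older one. As a minimal sketch, using only the magnifications and speeds quoted in the text (the function name `fold_increase` is an illustrative assumption, not part of any sequencing toolkit):

```python
def fold_increase(old: float, new: float) -> float:
    """Return how many times larger `new` is than `old`."""
    if old <= 0:
        raise ValueError("old must be positive")
    return new / old


# Figures quoted in the text: magnification (x400 vs x200 000)
# and speed in km per hour (4 vs 40 000).
microscope = fold_increase(400, 200_000)
transport = fold_increase(4, 40_000)

print(f"microscope: {microscope:,.0f}-fold")
print(f"transport: {transport:,.0f}-fold")
```

The microscope comparison works out to 500-fold and the transportation comparison to 10 000-fold, both several orders of magnitude short of the million-fold scale quoted for sequencing.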
Currently, numerous scientific laboratories and companies utilize massive sequencing for the study of human genetics on a regular basis. These projects either record the base composition of all 23 human chromosomes or focus on reading all protein-coding regions in the human genome, also known as the ‘Exome’. In the near future, it is safe to assume that every individual will carry their own genetic makeup on a digital media device.
In order to interpret the information stored in the DNA, powerful bioinformatics analysis must be implemented by the sequencing team. By means of computational investigation, scientists attempt to link particular genetic composition, or changes thereof, to functional outcomes or phenotypes (Isakov & Shomron, 2011). The advent of genome sequencing allows for: (i) the identification of genetic diseases (Walsh et al., 2010; Fuchs-Telem et al., 2012); (ii) the mapping of cancerous tissue (Ley et al., 2008); and (iii) the profiling of pathogen infections (Isakov et al., 2011), to name a few. A decade ago, fewer than 100 disease-causing genes had been identified. Today, nearly 3000 Mendelian disease genes have been revealed, and the list is rapidly growing with the increase in the number of genetic and physical maps created by every genome sequenced.
For scientific researchers, receiving the complete DNA sequence of an organism amounts to being supplied with a substrate to work on. It is similar to allowing a mechanic to look at the car's blueprints before attempting to fix it. For physicians, the comprehensive view of the DNA allows an unbiased examination of genomic information, which serves as a possible link to clinical evaluation and treatment management. Some physicians describe the interpretation of a patient's genetic makeup as a ‘gift’ that enables them to look ‘outside the box’ and explore genetic causes of symptoms that they would never have considered otherwise. For the general public, access to one's own genetic profile currently opens a Pandora's box with a myriad of questions and very few answers. This will soon change owing to the intensive research this new technology enables.
With the rapidly growing amount of genetic information, and the importance of understanding what it all means, there is a need to create an interdisciplinary hub that will connect researchers, both experimentalists and bioinformaticians, with physicians and community representatives in order to come up with a common genomic language. This should lead to an accessible, readable and interpretable human genome with a short list of personal actionable items. We will then be able to declare that we are moving ever closer to the point at which one's own genome will affect one's personal life at a scope beyond our current comprehension.
All of the above sums up, in essence, what Genetics Research hopes to achieve going forward: to become the forum where these new and exciting challenges are highlighted, debated and disseminated. I look forward to welcoming this groundbreaking research to the journal!
Acknowledgements
I thank the Shomron laboratory for their valuable discussions and comments on the manuscript. The Shomron laboratory is supported by the Wolfson family Charitable Fund; Claire and Amedee Maratier Institute for the Study of Blindness and Visual Disorders; Israel, Frida and Haya Hamer Fellowship; Levine Katan Leukemia Research Fellowship; Kurz-Lion Foundation; I-CORE Program of the Planning and Budgeting Committee and the Israel Science Foundation [Grant No. 41/11].