This review is of the newly published paperback edition of the book, which was originally published in hardback in 2012. Miguel García-Sancho traces the history of molecular sequencing, primarily by following the path of Frederick Sanger’s work on protein, RNA and DNA sequencing and its subsequent use and alteration in DNA sequencing machines. He follows the work on linear sequences through the historical contexts of biochemistry, molecular biology, the human genome project and biocomputing.
García-Sancho utilises – and occasionally critiques – previous work on the history of molecular biology, e.g., by Horace Judson and Michel Morange; the specifically British work of Soraya de Chadarevian; work on the human genome project by Robert Cook-Deegan; work on sequences and databases, e.g., by Bruno Strasser and Joel Hagen; and finally, work on biocomputing, e.g., by Joseph November. García-Sancho’s 2012 book was in production at the same time as Hallam Stevens’s Life Out of Sequence (University of Chicago Press, 2013); neither refers to the other, but the two books nicely complement each other. They make some of the same points about the change in the nature of work in biology with the advent of computerised databases to store sequences and algorithms to analyse them. Stevens is US-centred. García-Sancho concentrates on sequencing work in the UK: first – where Sanger was located – the Biochemistry Department of the University of Cambridge, the Laboratory of Molecular Biology (LMB) in Cambridge and, then, the formation of the European Molecular Biology Laboratory (EMBL) to store and analyse sequences.
Part I, ‘Emergence: Frederick Sanger’s Pioneering Techniques’, begins with the early twentieth-century history of protein chemistry and the context within which Sanger began his career. In the 1950s, Sanger developed techniques for sequencing proteins and used them to sequence insulin, for which he received his first Nobel Prize in 1958. Sanger was, according to García-Sancho, one of those who ‘created...the concept of sequence’ (p. 34). With the advent of molecular biology and the influence of Francis Crick and Sidney Brenner, Sanger shifted to RNA and DNA sequencing, joining the newly formed LMB in 1962. Nonetheless, Sanger always viewed his work as falling within the discipline of biochemistry. Hence García-Sancho emphasises that the ‘phenomenon of molecularization’ cannot be explained solely by the development of molecular biology.
Part II, ‘Mechanisation-1: Computing and the Automation of Sequence Reconstruction’ follows the changing use of computers from structure analysis in crystallography to sequence storage and analysis in biochemistry and molecular biology. The ‘form of work’ (a historiographical category explicitly utilised by García-Sancho) shifted with the introduction of databases for storing sequences, algorithms for analysing them, the personal computer in the laboratory and more centralised working groups.
In Part III, ‘Mechanisation-2: The Sequencer and the Automation of Sequence Construction’, García-Sancho compares work on the development and commercialisation of sequencing machines in the UK and the US. Different values and attitudes toward the ‘academic-industrial complexes’ account for the eventual success of the American Leroy Hood’s sequencing machine. Its use ‘triggered a shift in sequencing from a human-led to a mechanised form of work’ (p. 117).
Readers interested in the history of bioinformatics will also want to consult a recent open source series on the ‘roots of bioinformatics’, edited by David Searls (‘The Roots of Bioinformatics’, PLoS Computational Biology 6, 6 (2010): e1000809. doi:10.1371/journal.pcbi.1000809).
García-Sancho might have compared sequencing methods in the US. He mentions the Maxam-Gilbert sequencing method only in passing, with no comparison or analysis as to why Sanger’s method became more widely used, both by researchers sequencing by hand and by those who adapted it when building sequencing machines. Anecdotally, a molecular biologist once told me that she had run a sample using both techniques; the Sanger method took less time.
García-Sancho uses the term ‘information’ in several different ways, without distinguishing them. When he mentions ‘qualifications to the central dogma of molecular biology’ (p. 192, n. 14), he might have quoted Francis Crick’s definition of information when Crick first stated the Central Dogma: ‘Information means here the precise determination of sequence, either of bases in the nucleic acid or of amino acid residues in the protein’ (Crick 1958, ‘On Protein Synthesis’, in Symposia of the Society for Experimental Biology 12, p. 153). In this sense, information involves a mechanism operating to produce a linear sequence. A different, non-semantic sense of ‘information’ is that used in assessing the accuracy of the transmission of electrical signals (García-Sancho, p. 65), which Crick claims did not influence him. A final sense of ‘information’ is that used in the later chapters to refer to any item entered into and processed by a computer program.
In sum, readers seeking an intellectual and institutional biography of Sanger’s work on sequencing or the early UK perspective on databases, algorithms and machines for sequencing will find it in García-Sancho’s book.