The history of technological change in computing has been the subject of intensive research over the last five decades. However, little attention has been paid to comparing the performance of modern computers to pre–World War II technologies or even pencil-and-pad calculations. The present study investigates the progress of computing over the last century and a half, including estimates of the progress relative to manual calculations.1
The data used in this study are provided in a background spreadsheet available at http://www.econ.yale.edu/~nordhaus/Computers/Appendix.xls. That spreadsheet contains 14 pages with details of the calculations used in this article. The contents of the spreadsheet are in the page labeled “Contents.” The relevant page of the spreadsheet is cited here as “Nordhaus, Online Appendix, page x,” when referring to page x of this source as documentation.
The usual way to examine technological progress in computing is either through estimating the rate of total or partial factor productivity or through examining trends in quality-adjusted prices. For such measures, it is critical to use constant-quality prices so that improvements in the capabilities of computers are adequately captured. The earliest studies, dating from around 1953, examined the price declines of mainframe computers and used computers. Early studies found price declines of 15 to 30 percent per year, and recent estimates find declines of 25 to 45 percent per year.2
Table 10 provides some documentation. See Landefeld and Grimm, “Note,” pp. 17–22, for a discussion and a compilation of studies.
Although many analysts are today examining the impact of the “new economy” and especially the impact of computers on real output, inflation, and productivity, one might naturally wonder how new the new economy really is. Mainframe computers were crunching numbers long before the new economy appeared on the radar screen, and mechanical calculators produced improvements in computational capabilities even before that. How does the progress of computing in recent years compare with that of earlier epochs of the computer and calculator age? This is the question addressed in the current study.
A SHORT HISTORY OF COMPUTING
We begin with some fundamentals of computational theory. Abacuses, manual arithmetic, calculators, and computers are information-processing systems. They involve taking an array of data (binary digits, words, and so on) and transforming them according to some functional relationship or algorithm (such as adding, simulating, solving a differential equation, or producing a PET image) into a new array. Technological change in computing is the development of new software, hardware, communications, and systems that can expand the range of problems that can be solved using computational techniques and the speed at which they are solved.
Computers are such a pervasive feature of modern life that we can easily forget how much of human history existed with only the most rudimentary aids to addition, data storage, printing, copying, rapid communications, or graphics. The earliest recorded computational device was the abacus, but its origins are not known. The Darius Vase in Naples (dated around 450 BC) shows a Greek treasurer using a table abacus, or counting board, on which counters were moved to record numbers and perform addition and subtraction. The earliest extant “calculator” is the Babylonian Salamis tablet (300 BC), a huge piece of marble, which used the Greek number system and probably deployed stone counters. Analog devices developed during the first century BC, such as the Antikythera Mechanism, may have been used to calculate astronomical dates and cycles.3
Freeth, Bitsakis, Moussas, et al., “Decoding the Ancient Greek Astronomical Calculator,” pp. 587–91.
The design of the modern abacus appears to have its roots in the Roman hand-abacus, which introduced grooves for moving the counters and of which a few examples survive. Counting boards looking much like the modern abacus were widely used as mechanical aids in Europe from Roman times until the Napoleonic era, after which most reckoning was done manually using the Hindu-Arabic number system. The earliest records of the modern rod abacus date from the thirteenth century in China (the suan-pan), and the Japanese variant (the modern soroban) came into widespread use in Japan in the nineteenth century.
Improving the technology for calculations naturally appealed to mathematically inclined inventors. Around 1502 Leonardo sketched a mechanical adding machine; it was never built and probably would not have worked. The first surviving machine was built by Pascal in 1642, using interlocking wheels. I estimate that fewer than 100 operable calculating machines were built before 1800.4
Nordhaus, Online Appendix, page “Quant_History.”
Early calculators were “dumb” machines that essentially relied on incrementation of digits. An important step in the development of modern computers was mechanical representation of logical steps. The first commercially practical information-processing machine was the Jacquard loom, developed in 1804. This machine used interchangeable punched cards that controlled the weaving and allowed a large variety of patterns to be produced automatically. This invention was part of the inspiration of Charles Babbage, who developed one of the great precursor inventions in computation. He designed two major conceptual breakthroughs, the “Difference Engine” and the “Analytical Engine.” The latter sketched the first programmable digital computer. Neither of the Babbage machines was constructed during his lifetime. An attempt in the 1990s by the British Museum to build the simpler Difference Engine using early-nineteenth-century technologies failed to perform its designed tasks.5
See Swade, Difference Engine.
The first calculator to enjoy large sales was the “arithmometer,” designed and built by Thomas de Colmar, patented in 1820. This device used levers rather than keys to enter numbers, slowing data entry. It could perform all four arithmetic operations, although the techniques are today somewhat mysterious.6
An excellent short biography of this device is available in Johnston, “Making the Arithmometer,” pp. 12–21.
The present author attempted to use a variant of the arithmometer but gave up after an hour when failing to perform a single addition.
Nordhaus, Online Appendix, page “Quant_History.”
Table 1 shows an estimate of the cumulative production of computational devices (excluding abacuses and counting boards) through 1920. This tabulation indicates that fewer than 1,000 mechanical calculators were extant at the time of the rise of the calculator industry in the 1870s, so most calculations at that time were clearly done manually.9
A comprehensive economic history of calculation before the electronic age is presented in Cortada, Before the Computer.
Two distinct designs emerged: the circular machine and the keyboard machine. The circular calculator was designed by Frank Baldwin in the United States and T. Odhner in Russia, both first built in the 1872–1874 period. The second, and ultimately most successful, early design was invented by Dorr E. Felt (1884) and William S. Burroughs (1885). These machines used the now-familiar matrix array of keys, and were produced by firms such as Felt Comptometer, American Arithmometer, Monroe, and Burroughs. Production and sales of calculators began to ramp up sharply in the 1890s, as Table 1 indicates.
It is difficult to imagine the tedium of office work in the late nineteenth century. According to John Coleman, president of Burroughs, “Bookkeeping, before the advent of the adding machine, was not an occupation for the flagging spirit or the wandering mind …. It required in extraordinary degree a capacity for sustained concentration, attention to detail, and a passion for accuracy.”10
Quoted in Cortada, Before the Computer, p. 26.
Calculator manufacturers recognized that sales would depend upon the new machines being both quicker and more accurate than early devices or humans, but comparative studies of different devices are rare. A 1909 report from Burroughs compared the speed of trained clerks adding up long columns of numbers by hand with that of a Burroughs calculator, as shown in Figure 1. The comparison showed that the calculator had an advantage of about a factor of six, as reported:
Ex-President Eliot of Harvard hit the nail squarely on the head when he said, “A man ought not to be employed at a task which a machine can perform.”
Put an eight dollar a week clerk at listing and adding figures, and the left hand column (see Figure 1) is a fair example of what he would produce in nine minutes if he was earning his money.
The column on the right shows what the same clerk could do in one-sixth the time, or one and a half minutes.11
Burroughs Adding Machine Company, Better Day's Work, pp. 153–54.
The early calculators were not well designed for mass data input and output. This problem was solved with the introduction of punched-card technology, adapted circuitously from the Jacquard power loom. The Electrical Tabulating System, designed by Herman Hollerith in the late 1880s, saw limited use in hospitals and the War Department, but its first serious deployment was for the 1890 census. The Tabulator was unable to subtract, multiply, or divide, and its addition was limited to simple incrementation. Its only function was to count the number of individuals in specified categories, but for this sole function, it was far speedier than all other available methods. During a government test in 1889, the tabulator processed 10,491 cards in five and a half hours, averaging 0.53 cards per second.
Over the next half-century, several approaches were taken to improving the speed and accuracy of computation, and the tales of mechanical and electrical engineering have been retold many times. The major technologies underlying the computers examined here are shown in Table 2. Some of the major technological milestones were the development of the principles of computer architecture and software by John von Neumann (1945), the first electronic automatic computer (the ENIAC in 1946), the invention of the transistor (1947) and its introduction into computers (1953–1956), the first high-level programming language (Fortran, 1954), the development of the first microprocessor (1971), personal computers (dated variously from the Simon in 1950 to the Apple I in 1976 or the IBM PC in 1981), the first edition of Microsoft Windows (1983), and the introduction of the world wide web (1989).
Although the engineering of calculators and computers is a much-told tale, virtually nothing has been written on the economics of early calculating devices. The economics of the computer begins with a study by Gregory Chow.12
Chow, “Technological Change,” pp. 1117–30.
Triplett, “Performance Measures,” pp. 97–140.
Nordhaus, Online Appendix, page “Data.”
MEASURING COMPUTER PERFORMANCE
The Scope and Definition of Computer Power
The present study focuses on the long-term trend in the prices and productivity of “computer power.” It will be useful to begin with a definition of this term and an explanation of its scope and limitations. The central measure of computer power is the rate at which calculators or computers could execute certain standard tasks, measured in computations per second. This measure has an important advantage in comparison to most other measures of “real output” because it can be measured directly rather than by taking dollar values and applying price deflators. Moreover, because scientists and engineers are particularly interested in computer performance, there have been careful measurements of these data for over half a century.
In constructing the measures of computer power, I rely upon data on the costs and performance of different machines over the last two centuries, with particular focus on the last hundred years. These data involve costs and inputs of capital and labor, which are relatively straightforward to obtain; additionally, I estimate performance in terms of time to perform standardized computational tasks, which turns out to be much more difficult to measure.
The bundle of computations performed by different systems evolves greatly over time. For the earliest calculators, the tasks involved primarily addition (say for accounting ledgers). To these early tasks were soon added scientific and military applications (such as calculating ballistic trajectories, design of atomic weapons, and weather forecasts). In the modern era, computers are virtually everywhere, making complex calculations in science and industry, helping consumers e-mail or surf the web, operating drones on the battlefield, producing images from medical scans, and combating electronic diseases. In all cases, I measure “computer power” as the number of times that a given bundle of computations can be performed in a given time; and the cost of computation as the cost of performing the benchmark tasks.
An ideal measure of computer performance would follow the principles of standard index number theory. For example, it would take an evolving mix of tasks {X_1(t), …, X_n(t)} along with the prices or costs of these tasks {P_1(t), …, P_n(t)}. The tasks might be {addition, subtraction, multiplication, …, flight simulation, Internet access, playing chess, …, DNA sequencing, solving problems in quantum chromodynamics, …}. The prices would be the constant-quality prices of each of these activities (using the reservation price when the activity level is zero). In principle, we could use Törnqvist or other superlative indexes to construct chained cost indexes.
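The chained Törnqvist construction mentioned above can be sketched in a few lines. This is a minimal illustration and not the procedure used in this study: the function names, the two-task bundle, and all prices and quantities are hypothetical, and the sketch assumes strictly positive prices and quantities (so it does not handle the reservation-price case).

```python
import math

def tornqvist_link(p0, p1, q0, q1):
    """Log change of a Tornqvist price index between two adjacent periods.

    Each task's log price change is weighted by the average of its
    expenditure shares in the two periods.
    """
    v0 = [p * q for p, q in zip(p0, q0)]
    v1 = [p * q for p, q in zip(p1, q1)]
    s0 = [v / sum(v0) for v in v0]
    s1 = [v / sum(v1) for v in v1]
    return sum(0.5 * (a + b) * math.log(pb / pa)
               for a, b, pa, pb in zip(s0, s1, p0, p1))

def chained_index(prices, quantities):
    """Chain the period-to-period links into an index equal to 1.0 in period 0."""
    index = [1.0]
    for t in range(1, len(prices)):
        link = tornqvist_link(prices[t - 1], prices[t],
                              quantities[t - 1], quantities[t])
        index.append(index[-1] * math.exp(link))
    return index

# Hypothetical data: two tasks over three periods, every price halving each period.
prices = [[1.0, 10.0], [0.5, 5.0], [0.25, 2.5]]
quantities = [[100.0, 10.0], [150.0, 20.0], [300.0, 50.0]]
index = chained_index(prices, quantities)
```

Because every price in this made-up example halves each period, the chained index halves as well, whatever the expenditure shares are; with prices that move at different rates, the shares determine how much each task counts.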
In practice, construction of an ideal measure is far beyond what is feasible with existing data. The first shortcoming is that there is virtually no information on either the mix or relative importance of applications over time or on the market or implicit prices of different applications. Measuring computer power has bedeviled analysts because computer characteristics are multidimensional and evolve rapidly over time. The absence of reliable data on performance has forced economic studies of computer prices (called “hedonic” pricing studies) to draw instead on the prices of the input components of computers. The hedonic approach is not taken in this study but will be discussed in a later section.
As a substitute for the ideal measure, the present study has linked together price measures using changing bundles of computational tasks. The tasks examined here have evolved over time as the capabilities of computers grew. Table 3 gives an overview of the different measures of performance that are applied to the different computers. This approach means that the indexes largely involve addition and multiplication speed for the early years but involve the speed for complex procedures for the later years.
A second major shortcoming is that the present study does not account adequately for the contribution of complementary inputs into computational technologies. In early years, where the devices were simple adding machines, the major complementary factor was a roll of paper, and the omission is probably minor. In the modern era, software, high-level languages, and input-output technologies were complementary components of the production process. More recently, high-speed data transmission, video capabilities, multitasking, and Internet connectivity have been essential parts of computational capacity. The present study does not include either the costs or the productivity of these complementary technologies, but they have clearly contributed importantly to both the speed and the breadth of productivity growth.
Details on Measures of Computer Performance
This section describes the measures of computer performance that have been used to construct the time series. I begin with an overview and then describe the procedures in detail. The purpose of these measures is to develop a time series of computer performance from the earliest days to the present. I designate CPS or “computations per second” as the index of computer power, and MCPS as “millions of computations per second.”
For ease of understanding, I have set this index so that the speed of manual computations equals one. As a rough guide, if you can add two five-digit numbers in seven seconds and multiply two five-digit numbers in 80 seconds, you have one unit of computer power. The earliest devices, such as counting boards, the abacus, and adding machines, were primarily designed for addition; these could sometimes parlay addition into other arithmetic functions (multiplication as repeated addition). The earliest metric of computer performance therefore is simply addition time. This is converted into a measure of performance that can be compared with later computers using alternative benchmark tests. For computers from around World War II until around 1975, we use a measure of performance developed by Kenneth Knight that incorporates additional attributes. For the modern period, we use computer benchmarks that have been devised by computer scientists to measure performance on today's demanding tasks.
ADDITION TIME
The earliest devices were adding machines. Figure 1 shows the results of a typical task as described in 1909. In fact, until World War II, virtually all commercial machines were devoted solely to addition. We can compare the addition time of different machines quite easily as long as we are careful to ensure that the length of the word is kept constant for different machines.
MORAVEC'S INFORMATION-THEORETIC MEASURE OF PERFORMANCE
A measure of performance that relies primarily on arithmetic operations but has a stronger conceptual basis is the information-theoretic measure devised by Hans Moravec. To compare different machines, Moravec defined computing power as the amount of information delivered per second by the machine.15
See Moravec, Mind Children, especially appendix A2 and p. 63f.
This can then be put on a standardized basis by considering words with a standard length of 32 bits (equivalent to a nine-digit integer), and instructions with a length of one word. Moravec assumed that there were 32 instructions, and included measures on addition and multiplication time, which were weighted seven to one in the operation mix. Using this definition, the information-theoretic definition of performance is
The attractiveness of this approach is that each of these parameters is available for virtually all computers back to 1940, and can be estimated or inferred for manual calculations, abacuses, and many early calculators. The disadvantages are that it omits many of the important operations of modern computers, it considers only machine-level operations, and it cannot incorporate the advantages of modern software, higher-level languages, and operating systems.
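The seven-to-one weighting of addition and multiplication time can be illustrated with a short sketch. It covers only the time-weighting step, not the complete information-theoretic measure. The function name is hypothetical, and the inputs (seven seconds per addition, 80 seconds per multiplication) are borrowed from the rough guide for manual calculation purely for illustration.

```python
def mean_operation_time(add_time, mult_time, add_weight=7, mult_weight=1):
    """Average seconds per operation for a mix weighted add_weight : mult_weight."""
    total_weight = add_weight + mult_weight
    return (add_weight * add_time + mult_weight * mult_time) / total_weight

# Illustrative manual-calculation inputs (an assumption, not a calibrated value).
manual_time = mean_operation_time(7.0, 80.0)   # (7*7 + 80) / 8 seconds per operation
manual_rate = 1.0 / manual_time                # operations per second
```

With these inputs the weighted time is 16.125 seconds per operation, which shows how heavily the mix leans on addition speed.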
KNIGHT'S MEASURE
One of the earliest studies of computer performance was by Kenneth Knight of RAND in 1966.
Knight, “Changes,” Datamation, pp. 40–54, and “Evolving Computer Performance,” pp. 31–35.
Knight's formula is quite similar to Moravec's except that he includes a larger number of variables and particularly because he calibrates the parameters to the actual performance of different machines.
MIPS
One of the earliest benchmarks used was MIPS, or millions of instructions per second. In simple terms, instructions per second measures the number of machine instructions that a computer can execute in one second. This measure was developed to compare the performance of mainframe computers. The most careful studies used weighted instruction mixes, where the weights were drawn from the records of computer centers on the frequency of different instructions. These benchmarks were probably the only time something approaching the ideal measure described previously was constructed.
A simplified description of MIPS is the following. For a single instruction, MIPS = clock rate / (cycles per instruction × 10^6).
To understand the logic of this measure, recall that computers that use the von Neumann architecture contain an internal clock that regulates the rate at which instructions are executed and synchronizes all the various computer components. The speed at which the microprocessor executes instructions is its “clock speed.” For most personal computers up to around 2000, operations were performed sequentially, once per clock tick.17
Many of the major topics in computer architecture can be found in books on computer science. For example, see Schneider and Gersting, Invitation.
Instructions differ in terms of the size of the “word” that is addressed. In the earliest computers (such as the Whirlwind I), words were as short as 16 binary digits or five decimal digits. Most personal computers today use 32-bit words, while mainframes generally employ 64-bit words.
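The simplified MIPS logic described above can be sketched as follows, using the conventional definition of MIPS as the clock rate divided by the number of clock cycles per instruction, expressed in millions. The function name and the two example machines are hypothetical.

```python
def mips(clock_hz, cycles_per_instruction=1.0):
    """Millions of instructions per second for a given clock rate.

    With one instruction per clock tick (the default), MIPS is simply
    the clock rate expressed in MHz.
    """
    return clock_hz / (cycles_per_instruction * 1e6)

# Hypothetical machines, for illustration only.
one_per_tick = mips(100e6)        # 100 MHz, one instruction per tick
multi_cycle = mips(5e6, 5.0)      # 5 MHz, five cycles per instruction
```

The sketch ignores instruction mix and word length, which is why the more careful studies weighted instructions by their observed frequency.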
MODERN BENCHMARK TESTS
Measures such as additions or instructions per second or more complete indexes such as those of Knight or Moravec clearly cannot capture today's complex computational environment. Computers today do much more than bookkeeping, and a performance benchmark must reflect today's mix of activities rather than that of a century ago. For this purpose, we turn to modern benchmark tests.
A benchmark test is an index that measures the performance of a system or subsystem on a well-defined set of tasks. Widely used benchmarks for personal computers today are those designed by SPEC, or the Standard Performance Evaluation Corporation. As of mid-2006, the version used for personal computers was SPEC CPU2000.18
SPEC CPU2000 is made up of two components that focus on different types of compute-intensive performance: SPECint2000 for measuring and comparing compute-intensive integer computation and SPECfp2000 for measuring compute-intensive floating-point computation. Table 4 shows the suite of activities that SPEC2000 tests. These are obviously not routine chores. The benchmark fails to follow the elementary rule of ideal indexes in that the performance on different benchmarks is clearly not weighted by the economic importance of different applications. I discuss below the relationship between the SPEC and other benchmarks. To make current tests comparable with early ones, ratings have been set by comparing the rating of a machine with the rating of a benchmark machine.
MEASURE OF COMPUTER PERFORMANCE
This study is an attempt to link together computational performance of different machines from the nineteenth century to the present. A unit of computer performance is indexed so that manual computations are equal to one. A standard modern convention is that the VAX 11-780 is designated as a one MIPS machine. In our units, the VAX 11-780 is approximately 150 million times as powerful as manual computations. Different modern benchmarks yield different numbers, but they are essentially scalar multiples of one another.
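Because the benchmarks are treated as scalar multiples of one another, converting a rating expressed relative to the VAX 11-780 into manual-equivalent units is a single multiplication. The sketch below assumes the approximate factor of 150 million quoted above; the function name and the example rating are hypothetical.

```python
# Approximate factor from the text: one VAX 11-780 (one MIPS) is about
# 150 million times as powerful as manual computation.
VAX_IN_MANUAL_UNITS = 150e6

def manual_equivalents(rating_relative_to_vax):
    """Convert a rating in VAX 11-780 units into manual-computation units."""
    return rating_relative_to_vax * VAX_IN_MANUAL_UNITS

# Hypothetical machine rated at 1,000 times the VAX 11-780.
example_units = manual_equivalents(1000.0)
```

The conversion is only as good as the scalar-multiple assumption; benchmarks that weight tasks differently will not line up exactly.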
Constructing metrics of performance is difficult both because the tasks and machines differ enormously over this period and because measures of performance are very sketchy before 1945. The data since 1945 have been the subject of many studies. Data for this study for computers from 1945 to 1961 were largely drawn from technical manuals of the Army Research Laboratory, which contain an exhaustive study of the performance characteristics of systems from ENIAC through IBM-702.19
See particularly Weik, Survey. This was updated in Weik, Third Survey.
See http://www.jcmit.com/cpu-performance.htm; as well as Nordhaus, Online Appendix, page “MacCallum.”
Nordhaus, Online Appendix, page “Abacus.”
Reliable data for the earliest calculators and computers (for the period before 1945) were not available in published studies. With the help of Eric Weese of Yale University, data from historical sources on the performance of 32 technologies from before 1940 were obtained, of which 12 have performance and price data that I consider reasonably reliable. I will discuss the data on the early technologies because these are the major original data for the present study.
The data on manual calculations were taken from a Burroughs monograph, from estimates of Moravec, and from tests by the author.22
Nordhaus, Online Appendix, page “Manual.”
The contest and its results are described in Kojima, Japanese Abacus.
This comparison suggests that, in the hands of a champion, the abacus had a computer power approximately four and a half times that of manual calculation. Given the complexity of using an abacus, however, it is unlikely that this large an advantage would be found among average users. We have reviewed requirements for Japanese licensing examinations for different grades of abacus users from the 1950s. These estimates suggest that the lowest license level (third grade) has a speed approximately 10 percent faster than manual computations.24
Kojima, Japanese Abacus.
We have estimated the capabilities of early machines based on then-current procedures. For example, many of the early machines were unable to multiply. We therefore assume that multiplication was achieved by repeated addition. Additionally, the meaning of memory size in early machines is not obvious. For machines that operate by incrementation, we assume that the memory is one word. There are major discrepancies between different estimates of the performance of early machines, with estimates varying by as much as a factor of three. Given the difficulties of collecting data on the earliest machines, along with the problems of making the measures compatible, we regard the estimates for the period before 1945 as subject to large errors.
The only other important assumptions involve constructing the cost per operation. These calculations include primarily the cost of capital. The data on prices and wage rates were prepared by the author and are from standard sources, particularly the U.S. Bureau of Labor Statistics and the U.S. Bureau of Economic Analysis. We have also included estimates of operating costs as these appear to have been a substantial fraction of costs for many computers and may be important for recent computers. For the capital cost, we use the standard user cost of capital formula with a constant real interest rate of 10 percent per year, an exponential depreciation rate of 10 percent per year, a utilization factor of 2,000 hours per year, and no adjustment for taxes. These assumptions are likely to be oversimplified for some technologies, but given the pace of improvement in performance, even errors of 10 or 20 percent for particular technologies will have little effect on the overall results.
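The capital-cost assumptions above can be turned into a small worked sketch. It uses the simplest form of the user cost of capital, price × (real interest rate + depreciation rate), with no tax adjustment; leaving out any asset-revaluation term is an assumption of this sketch. The function names, the machine price, and the speed are hypothetical, as is the choice to express cost per million computations.

```python
def hourly_capital_cost(price, real_rate=0.10, depreciation=0.10,
                        hours_per_year=2000):
    """Capital cost per hour of use: price * (r + delta) spread over annual hours."""
    annual_cost = price * (real_rate + depreciation)
    return annual_cost / hours_per_year

def cost_per_million_computations(price, cps, operating_cost_per_hour=0.0):
    """Cost of one million computations on a machine running at `cps` per second."""
    hourly = hourly_capital_cost(price) + operating_cost_per_hour
    computations_per_hour = cps * 3600.0
    return hourly / computations_per_hour * 1e6

# Hypothetical machine: $10,000 purchase price, one million computations per second.
example_hourly = hourly_capital_cost(10_000.0)                    # dollars per hour
example_cost = cost_per_million_computations(10_000.0, 1e6)       # dollars
```

With a 10 percent interest rate and 10 percent depreciation, the annual charge is one-fifth of the purchase price, so an error of 10 or 20 percent in either rate moves the result by far less than the orders-of-magnitude changes in performance.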
RESULTS
Overall Trends
I now discuss the major results of the study. Table 5 shows a summary of the overall improvement in computing relative to manual calculations and the growth rates in performance. The quantitative measures are computer power, the cost per unit computer power in terms of the overall price level, and the cost of computation in terms of the price of labor. The overall improvements relative to manual computing range between 2 trillion and 73 trillion depending upon the measure used. For the period 1850 (which I take as the birth of modern computing) to 2006, the compound logarithmic growth rate is around 20 percent per year.
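The link between the overall improvement factors and the growth rate of around 20 percent per year can be checked with a short calculation. The sketch assumes the improvement range is read as 2 trillion to 73 trillion and that the period runs from 1850 to 2006; the function name is hypothetical.

```python
import math

def log_growth_rate(ratio, years):
    """Compound logarithmic growth rate per year implied by an overall ratio."""
    return math.log(ratio) / years

years = 2006 - 1850                      # 156 years
low = log_growth_rate(2e12, years)       # lower end of the improvement range
high = log_growth_rate(73e12, years)     # upper end of the improvement range
```

Both ends of the range give rates between roughly 18 and 21 percent per year, because the logarithm compresses even a large difference in the overall ratio.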
We now discuss the results in detail. Start with Figure 2, which shows the results in terms of pure performance—computing power in terms of computations per second. Recall that the index is normalized so that manual computation is one. Before World War II, the computation speeds of the best machines were between ten and 100 times the speed of manual calculations. There was improvement, but it was relatively slow. Figure 3 shows the trend in the cost of computing over the last century and a half. The prices of computation begin at around $500 per MCPS for manual computations and decline to around $6 × 10^-11 per MCPS by 2006 (all in 2006 prices), which is a decline of a factor of seven trillion.
Table 6 shows five different measures of computational performance, starting with manual computations through 2006. The five measures are computer power, cost per unit calculation, labor cost per unit calculation, cycles per second, and rapid memory. The general trends are similar, but different measures can differ substantially. One important index is the relative cost of computation to labor cost. This is the inverse of total labor productivity in computation, and the units are therefore CPS per hour of work.25
The advantages of using the wage rate as a deflator are twofold. First, it provides a measure of the relative price of two important inputs (that is, the relative costs of labor and computation). Second, the convention of using a price index as a deflator is defective because the numerator is also partially contained in the denominator.
Trends for Different Periods
We next examine the progress of computing for different subperiods. The major surprise, clearly shown in Figures 2 and 3, is the discontinuity that took place around World War II. Table 6 shows data on performance of machines in different periods, while Table 7 shows the logarithmic annual growth rates between periods (defining manual calculations as the first period). Table 7 indicates modest growth in performance from manual computation until the 1940s. The average increase in computer productivity shown in the first three columns of the first row of Table 7—showing gains of around 3 percent per year—was probably close to the average for the economy as a whole during this period.
Statistical estimates of the decadal improvements are constructed using a log-linear spline regression analysis. Table 8 shows a regression of the logarithm of the constant-dollar price of computer power with decadal trend variables. The coefficient is the logarithmic growth rate, so to get the growth rate for a period we can sum the coefficients up to that period. The last column of Table 8 shows the annual rates of improvement of computer performance. All measures of growth rates are logarithmic growth rates.26
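The rule that period growth rates are obtained by summing coefficients follows from the form of a log-linear spline, which can be sketched as below. The knots, intercept, and coefficients are hypothetical and are not the estimates in Table 8; the sketch evaluates a given spline and does not perform the regression itself.

```python
from itertools import accumulate

def log_price(year, start_year, intercept, base_slope, knots, coefficients):
    """Log price from a linear spline: each knot adds a change in slope."""
    value = intercept + base_slope * (year - start_year)
    for knot, coef in zip(knots, coefficients):
        value += coef * max(0.0, year - knot)
    return value

def period_slopes(base_slope, coefficients):
    """Growth rate in each period: running sum of the slope coefficients."""
    return list(accumulate([base_slope] + list(coefficients)))

# Hypothetical spline with trend breaks in 1945 and 1985.
start_year, intercept, base_slope = 1850, 0.0, -0.01
knots = [1945, 1985]
coefficients = [-0.30, -0.20]

slopes = period_slopes(base_slope, coefficients)
# One-year change in log price within the 1945-1985 segment.
change_1950 = (log_price(1950, start_year, intercept, base_slope, knots, coefficients)
               - log_price(1949, start_year, intercept, base_slope, knots, coefficients))
```

In this made-up example the slope after 1945 is the base slope plus the first break coefficient, which is the summing rule described above.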
The growth rates are instantaneous or logarithmic growth rates, which are equivalent to the derivatives of the logarithms of series with respect to time for smooth variables. This convention is used to avoid the numerical problems that arise for high growth rates.
A word is in order for those not accustomed to logarithmic growth rates: These will be close to the conventional arithmetic growth rate for small numbers (2 or 3 percent per year) but will diverge significantly for high growth rates. For example, an arithmetic growth rate of 100 percent per year is equivalent to a logarithmic rate of 0.693. That is, e^0.693 = 2. A further warning should be given on negative growth rates. There is no difficulty in converting negative to positive rates as long as logarithmic growth rates are used. However, in using arithmetic growth rates, decline rates may look significantly smaller than the corresponding growth rate. For example, a logarithmic growth rate of −0.693 represents a decline rate of 50 percent per year; that is, e^−0.693 = 0.5.
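The conversion between the two conventions is a one-line formula in each direction, sketched here with hypothetical function names. A logarithmic rate g corresponds to an arithmetic rate of exp(g) − 1, and an arithmetic rate r corresponds to a logarithmic rate of ln(1 + r).

```python
import math

def to_log_rate(arithmetic_rate):
    """Logarithmic growth rate equivalent to an arithmetic rate (0.03 = 3 percent)."""
    return math.log(1.0 + arithmetic_rate)

def to_arithmetic_rate(log_rate):
    """Arithmetic growth rate equivalent to a logarithmic rate."""
    return math.exp(log_rate) - 1.0

doubling = to_log_rate(1.0)            # 100 percent per year
halving = to_arithmetic_rate(-0.693)   # logarithmic decline of 0.693
small = to_log_rate(0.03)              # 3 percent per year
```

The small-rate case stays close to 3 percent, while the doubling and halving cases reproduce the asymmetry described above: equal and opposite logarithmic rates correspond to +100 percent and about −50 percent in arithmetic terms.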
The regression analysis shows that the explosion in computer power, performance, and productivity growth began in the mid-1940s. A Chow test for stability of coefficients finds the maximum-likelihood year for the break in trend was 1944, but the data cannot distinguish that year from neighboring years with a high degree of statistical significance. Tables 7 and 8 provide slightly different estimates of the subperiod growth rates, but it is clear that productivity growth was extremely rapid during virtually the entire period since 1945. Using decadal trend-break variables, as shown in Table 8, we find highly significant positive coefficients for the dummy variables beginning in 1945 and in 1985 (both indicating acceleration of progress). The only period when progress was slow (only 22 percent per year!) was during the 1970s. Table 7 uses a different methodology for examining subperiods. It shows a slowing in the 1960–1969 and 1970–1979 periods. We were unable to resolve the timing and cause of the slowdown in the 1960–1979 subperiod, and this is left as an open question.
The rapid improvement in computer power is often linked with “Moore's Law.” This derives from Gordon Moore, co-founder of Intel, who observed in 1965 that the number of transistors per square inch on integrated circuits had doubled every year since the integrated circuit was invented. Moore predicted that this trend would continue for the foreseeable future. When he revisited this question a decade later, he thought that the growth rate had slowed somewhat and forecast that doubling every 18 months was a likely rate for the future (46 percent logarithmic growth). Two remarks arise here. First, it is clear that rapid improvements in computational speed and cost predated Moore's forecast. For the period 1945–1980, cost per computation declined at 37 percent per year, as compared to 64 percent per year after 1980. Second, computational power actually grows more rapidly than Moore's Law would predict because computer performance is not identical to chip density. From 1982 to 2001, the rate of performance as measured by computer power grew 18 percent per year faster than Moore's Law would indicate.
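The link between a doubling time and a logarithmic growth rate used in the discussion of Moore's Law is simply ln 2 divided by the doubling time. The following Python sketch, with illustrative function names, shows the arithmetic. The last line adds the 18 percent differential reported above to the 18-month rate as a rough indication of the implied doubling time for computer power; that combination is an assumption made for illustration, not a calculation taken from the tables.

```python
import math

def log_rate_from_doubling_time(years):
    """Logarithmic growth rate per year for a series that doubles every `years` years."""
    return math.log(2.0) / years

def doubling_time_from_log_rate(rate):
    """Years needed to double at a given logarithmic growth rate per year."""
    return math.log(2.0) / rate

annual_doubling = log_rate_from_doubling_time(1.0)      # Moore's 1965 observation
eighteen_months = log_rate_from_doubling_time(1.5)      # the revised forecast
print(f"doubling every 12 months: {annual_doubling:.3f}")
print(f"doubling every 18 months: {eighteen_months:.3f}")

# Illustrative assumption: performance growing 0.18 faster than the 18-month rate.
faster = eighteen_months + 0.18
print(f"implied doubling time: {doubling_time_from_log_rate(faster):.2f} years")
```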
One of the concerns with the approach taken in this study is that our measures might be poor indexes of performance. We have compared MCPS with both addition time and cycle time (the latter comparison is shown in Tables 6 and 7). Both simple proxies show a very high correlation with our synthetic measure of MCPS over the entire period. Computer power grows at very close to the speed of addition time (for observations from 1900 to 1978) but 10 percent per year more rapidly than cycle speed (1938–2006).
In this regard, it is natural to ask whether the changing character of computers is likely to bias the estimates of the price of computer power. The earliest calculators had very low capability relative to modern computers, being limited to addition and multiplication. Modern computers perform a vast array of activities that were unimaginable a century ago (see Table 4). In terms of the ideal measure described above, it is likely that standard measures of performance are biased downward. If we take an early output mix—addition only—then the price index changes very little, as discussed in the last paragraph. On the other hand, today's output bundle was infeasible a century ago, so a price index using today's bundle of output would have fallen even faster than the index reported here. Put differently, a particular benchmark only includes what is feasible, that is, tasks that can be performed in a straightforward way by that year's computers and operating systems. Quantum chromodynamics is included in SPEC 2000, but it would not have been dreamt of by Kenneth Knight in his 1966 study. This changing bundle of tasks suggests that, if anything, the price of computation has fallen even faster than the figures reported here.
ALTERNATIVE APPROACHES
Comparison of Alternative Modern Benchmark Tests
Using direct measures of computer performance raises two major problems. First, a properly constructed benchmark should weight the different tasks according to their economic importance, but this property is satisfied by none of the current benchmark tests. For example, as shown in Table 4, the SPEC benchmark that is widely used to test PCs contains several exotic tasks, such as quantum chromodynamics, which are probably not part of the family computer hour. Most benchmarks simply apply equal geometric weights to the different tasks. Second, the rapid evolution of computer performance leads to rapid changes in the tasks that the benchmarks actually measure. For example, the SPEC performance benchmark has been revised every two or three years. In one sense, such changes represent a kind of chain index in tasks; however, because tasks are not appropriately weighted, it is impossible to know whether the chaining improves the accuracy of the indexes.
Eric Weese and I investigated the results of using different benchmark tests over the last decade. For this purpose, we examined the SPEC benchmarks; a series of tests known as WorldBench, which have been published by PC World; and SYSmark98, a measure that evaluates performance for 14 applications-based tasks. To illustrate how PC benchmarks work, the SYSmark98 test of office productivity is the harmonic mean of the time to open and perform set tasks on the following programs: CorelDRAW 8, Microsoft Excel 97, Dragon Systems NaturallySpeaking 2.02, Netscape Communicator 4.05 Standard Edition, Caere OmniPage Pro 8.0, Corel Paradox 8, Microsoft PowerPoint 97, and Microsoft Word 97.
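To make the aggregation concrete, the following Python sketch computes a harmonic mean of task times. The task names and timings are hypothetical, invented purely for illustration, and are not SYSmark98 results; the exact way the published score is derived from the harmonic mean is not reproduced here.

```python
from statistics import harmonic_mean, mean

# Hypothetical completion times in seconds for four application tasks.
# These figures are invented for illustration only.
task_times = {
    "spreadsheet": 40.0,
    "word processing": 25.0,
    "presentation": 60.0,
    "database": 100.0,
}

def harmonic_mean_time(times):
    """Harmonic mean of a collection of positive task times."""
    return harmonic_mean(list(times))

hm = harmonic_mean_time(task_times.values())
am = mean(task_times.values())
print(f"harmonic mean time:   {hm:.1f} seconds")
print(f"arithmetic mean time: {am:.1f} seconds")
```

The harmonic mean never exceeds the arithmetic mean, so a single very slow task raises the aggregate less than it would under simple averaging.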
We first examined 30 machines for which both PC benchmarks had results over the period December 1998 to November 1999. The two benchmarks were reasonably consistent, with a correlation of 0.962 in the logarithm of the benchmark scores over the 30 machines. However, as shown in comparison one of Table 9, the rate of improvement of the two indexes differed markedly, with the SYSmark98 showing a 38 percent per year improvement over the 11-month period, while the WorldBench tests showed a 24 percent per year improvement for the same machines. This difference was found even in the individual benchmarks (e.g., the results for Excel 97), and queries to the sources produced no reasons for the discrepancies.27
For example, we compared the raw scores for the two benchmarks for six identical machines and three identical programs. The harmonic means differed by as much as 17 percent between the two sets of tests.
A second comparison was between the SPEC benchmark results and the WorldBench tests. For this comparison, we were able to find seven machines that were tested for both benchmarks over a period of two years, using the 1995 SPEC test and three different WorldBench tests. For these machines, as shown in comparison two of Table 9, the rates of improvement of the SPEC and WorldBench tests were virtually identical at 67 and 66 percent per year, respectively.
The final test involved a comparison of the WorldBench score with the improvement in computations per second calculated for this article. For this purpose, we gathered different WorldBench tests for the period from 1992 to 2002 and spliced them together to obtain a single index for this period. We then calculated the growth of WorldBench performance per constant dollar and compared this to the growth of CPS per constant dollar from the current study. As shown in comparison three, the WorldBench performance per real dollar over 1992–2002 showed a 52 percent per year increase. This compares with a 62 percent per year increase for the computers in our data set over the same period.
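A gap of 10 percentage points in logarithmic growth rates compounds substantially over a decade. The Python sketch below, offered only as an illustration with a function name chosen here, converts the two rates just cited into cumulative improvement factors over 1992–2002.

```python
import math

def cumulative_factor(log_rate, years):
    """Total improvement factor from a constant logarithmic growth rate."""
    return math.exp(log_rate * years)

worldbench = cumulative_factor(0.52, 10)   # WorldBench per real dollar
cps = cumulative_factor(0.62, 10)          # CPS per constant dollar, this study

print(f"WorldBench factor over the decade: {worldbench:.0f}")
print(f"CPS factor over the decade:        {cps:.0f}")
print(f"ratio of the two factors:          {cps / worldbench:.2f}")
```

Because the rates differ by 0.10 per year, the ratio of the cumulative factors after ten years is $e^{1.0}$, or a factor of about 2.7.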
To summarize, we have investigated the results of alternative benchmark tests. None of the benchmarks is well constructed because the weights on the different tasks do not reflect the relative economic importance of tasks. We found some discrepancies among the different benchmarks, even those that purport to measure the same tasks. The WorldBench test, which is oriented toward PCs, showed slower improvements in constant-dollar performance over the 1992–2002 period than the CPS measure constructed for this study. However, an alternative test, SYSmark, showed more rapid growth than the WorldBench and was more consistent with the CPS measure in this article and with the SPEC benchmark. In any case, the improvement in constant-dollar performance was extremely rapid, with the lower number being a 52 percent per year logarithmic increase for the WorldBench and the higher number being a 62 percent per year increase over the last decade for the CPS.
Comparison with a Direct Measure of Performance
The long-term performance measures used in this study are constructed by chaining together several different benchmark tests. Because they use different weights and underlying benchmarks, we might be concerned that the actual performance diverges substantially over time from our chained index, particularly when the changes are by a factor of a trillion or more.
To examine the potential long-term bias, I undertook a simple set of multiplication and addition experiments with present-day personal computers (the actual calculations were performed by Ray Fair). Using a small Fortran program, we performed 100 billion multiplications of three three-digit numbers and 1 trillion additions of two five-digit numbers. These calculations took 60 and 315 seconds, respectively, on a 1.7 GHz Dell machine. Putting these numbers into the Moravec formula gives a computer power for this machine that is $4.7 \times 10^{11}$ times larger than manual computations. The same machine is rated as $4.1 \times 10^{11}$ times more powerful than manual computations in our index of computer power. The estimated computer power of the chained index is therefore extremely close to the actual performance for addition and multiplication. This calculation suggests that the constructed index is reasonably consistent with actual computer performance and that no obvious drift has arisen from chaining the different measures together.
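The figures in this experiment can be restated with a few lines of Python. The sketch below is illustrative only: it computes the raw operation rates from the reported timings and the gap between the two power ratios, and it does not attempt to reproduce the Moravec formula, which is not restated in this section. The function names are chosen here for convenience.

```python
import math

def ops_per_second(n_ops, seconds):
    """Average number of operations completed per second."""
    return n_ops / seconds

def log_gap(direct, chained):
    """Logarithmic difference between two index values."""
    return math.log(direct / chained)

mult_rate = ops_per_second(100e9, 60)    # 100 billion multiplications in 60 seconds
add_rate = ops_per_second(1e12, 315)     # 1 trillion additions in 315 seconds
print(f"multiplications per second: {mult_rate:.3g}")
print(f"additions per second:       {add_rate:.3g}")

# Direct measurement versus the chained index, both relative to manual computation.
gap = log_gap(4.7e11, 4.1e11)
print(f"ratio of direct to chained: {4.7e11 / 4.1e11:.2f}")
print(f"logarithmic gap:            {gap:.3f}")
```

A logarithmic gap of roughly 0.14 is small relative to a cumulative change whose logarithm is about 27, which is the sense in which the two measures are extremely close.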
Comparison with Alternative Indexes of Computer Prices
Economists today tend to favor the use of hedonic or constant-quality price indexes to measure improvements. The hedonic approaches measure the change in the “quantity” of goods by examining the change in characteristics along with measures of the importance of the different characteristics.28
There are many excellent surveys of hedonic methods. A recent National Academy of Sciences report has a clear explanation of different approaches. See Schultze and Mackie, At What Price.
A pioneering study that investigated hedonic prices of performance was undertaken by Paul Chwelos. He investigated the characteristics of computers that were important for users and information scientists in 1999 and found the top six characteristics were performance, compatibility, RAM, network connectivity, industrial standard components, and operating system.29
See Chwelos, Hedonic Approaches, p. 43. Performance was defined as a “characteristic of the a number of components: CPU (generation, Level 1 cache, and clock speed), motherboard architecture (PCI versus ISA) and bus speed, quantity and type of Level 2 cache and RAM, type of drive interface (EIDE versus SCSI).”
Clearly, such an approach is not feasible over the long span used here. In the present study, we examined only the price of a single characteristic, performance. This decision reflects the fact that only two of the six performance characteristics discussed in the last paragraph (performance and RAM) can be tracked back for more than a few decades. Network connectivity is a brand-new feature, while operating systems have evolved from tangles of wires to Windows-type operating systems with tens of millions of lines of high-level code. This discussion indicates that computers have experienced not only rapid improvements in speed but also in breadth through a growing array of goods and services.
How do the performance-based indexes used here compare with price indexes for computers? A summary table of different price indexes for recent periods is provided in Table 10. There are six variants of computer price indexes prepared by the government, either for the national income and product accounts by the Bureau of Economic Analysis (BEA) or for the Producer Price Index by the Bureau of Labor Statistics.30
See Landefeld and Grimm, “Note,” pp. 17–22, for a discussion and a compilation of studies. The data are available at http://www.bea.doc.gov/.
During the 1969–2004 period, for which we have detailed price indexes from the BEA, the price index for computers fell by 23 percent per year relative to the GDP price index (using the logarithmic growth rate), while the real BLS price index for personal workstations and computers fell by 31 percent. Academic studies, using hedonic approaches or performance measures, show larger decreases, between 35 and 40 percent. Our real price index of the price of computer power fell by between 50 and 58 percent depending upon the subperiod.
How might we reconcile the significant discrepancy between the hedonic measures and the performance-based prices reported here? A first possible discrepancy arises because government price indexes for computers are based on the prices of inputs into computers, while the measures presented here are indexes of the cost of specified tasks. The hedonic measures will only be accurate to the extent that the prices of components accurately reflect the marginal contribution of different components to users' valuation of computer power. It is worth noting that current government hedonic indexes of computers contain no performance measure.31
The variables in an earlier BLS hedonic regression for personal desktop computers (designed in 1999 but discontinued after 2003) contained one performance proxy (clock speed), two performance-related proxies (RAM and size of hard drive), an array of feature dummy variables (presence of Celeron CPU, ZIP drive, DVD, fax modem, speakers, and software), three company dummy variables, and a few other items. It contained no performance measures such as the SPEC benchmark. The new BLS pricing approach contains no performance measures at all and instead uses attribute values available on the Internet as a basis to determine appropriate quality adjustment amounts.
A second and more important difference is that computers increasingly are doing much more than computing, so the prices used in this study include many non-computational features. To illustrate, in late 2005, an Intel® Pentium® 4 Processor 630 with HT (3GHz, 2M, 800MHz FSB) was priced at $218 while the Dell OptiPlex™ GX620 Mini-Tower personal computer in which it was embedded cost $809. The $591 difference reflects ancillary features such as hard drives, ports, CD/DVD readers, pre-loaded software, assembly, box, and so forth. A perfectly constructed hedonic price index will capture this changing mix of components. To illustrate this point, assume that a 2005 computer is 25 percent computation and 75 percent ancillary enhancements while a 1965 computer such as the DEC PDP-8 was 100 percent computation. Excluding the ancillary items would change the real price decline from 45 percent per year to 48 percent over the four decades. It seems unlikely that the prices of the noncomputational components are falling as rapidly as the computational parts, a point that has been emphasized by Kenneth Flamm.32
Flamm, “Technological Advance,” pp. 13–61; and Flamm, “Digital Convergence,” p. 267.
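The illustration above, in which a 2005 computer is 25 percent computation while a 1965 computer is 100 percent computation, can be worked through in a few lines of Python. The sketch is a reconstruction of the arithmetic under that stated assumption, with a function name chosen here: if the computational share of the price falls from 1 to s over T years, the computation-only price declines faster than the whole-machine price by ln(1/s)/T per year.

```python
import math

def computation_only_decline(total_decline, final_computation_share, years):
    """Logarithmic annual price decline for the computational part alone.

    Assumes the machine was 100 percent computation at the start of the
    period and `final_computation_share` computation at the end.
    """
    return total_decline + math.log(1.0 / final_computation_share) / years

# Processor share of the late-2005 system price cited above ($218 of $809).
processor_share = 218 / 809
print(f"processor share of system price: {processor_share:.2f}")

adjusted = computation_only_decline(0.45, 0.25, 40)
print(f"decline excluding ancillary items: {adjusted:.3f} per year")
```

The adjustment adds about 3.5 percentage points per year, which is consistent with the move from 45 to 48 percent.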
Supercomputing
While this study has emphasized familiar species of computers, it will be useful to devote a moment's attention to the elephants of the computer kingdom. Scientists and policy makers often emphasize supercomputing as the “frontier” aspect of computation, the “grand challenges of computation,” or the need for “high performance computing.” These are the romantic moon shots of the computer age. What are the grand challenges? Generally, supercomputers are necessary for the simulation or solution of extremely large nonlinear dynamic systems. In a recent report, the National Research Council listed some of the important current and prospective applications of supercomputers. These included defense and intelligence, climate prediction, plasma physics, transportation engineering, bioinformatics and computational biology, environmental science, earthquake modeling, petroleum exploration, astrophysics, nanotechnology, and macroeconomics.33
Graham, Snir, and Patterson, Eds., Getting Up to Speed, chapter 4.
The progress in supercomputing has paralleled that in smaller computers. As of November 2006, for example, the largest supercomputer (IBM's Blue Gene/L with 131,072 processors) operated at a maximum speed of 280,600 gigaflops (billions of floating-point operations per second or Gflops). Using a rough conversion ratio of 475 CPS per Flop, this machine is therefore approximately a 133,000,000,000 MCPS machine and therefore about 53,000 times more powerful than the top personal computer in our list as of 2006. The performance improvement for supercomputers has been tracked by an on-line consortium called “TOP500.” It shows that the top machine's performance grew from 59.7 Gflops in June 1993 to 280,600 Gflops in November 2006.34
See www.top500.org.
The price of supercomputing is generally unfavorable relative to personal computers. IBM's stock model supercomputer, called “Blue Horizon,” is clocked at 1,700 Gflops and had a list price in 2002 of $50 million—about $30,000 per Gflop—which makes it approximately 34 times as expensive on a pure performance basis as a Dell personal computer in 2004.
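The supercomputing figures quoted in the two preceding paragraphs can be reproduced with the following Python sketch. It is illustrative only; the variable and function names are chosen here, and the elapsed time between the two TOP500 lists is taken as 13 years and 5 months.

```python
import math

CPS_PER_FLOP = 475  # rough conversion ratio quoted above

def gflops_to_mcps(gflops):
    """Convert billions of floating-point operations per second to MCPS."""
    return gflops * 1e9 * CPS_PER_FLOP / 1e6

def log_growth_rate(start, end, years):
    """Average logarithmic growth rate per year between two levels."""
    return math.log(end / start) / years

blue_gene_mcps = gflops_to_mcps(280_600)
print(f"Blue Gene/L: {blue_gene_mcps:.3g} MCPS")

# June 1993 to November 2006.
years_elapsed = (2006 - 1993) + (11 - 6) / 12
top_rate = log_growth_rate(59.7, 280_600, years_elapsed)
print(f"top-machine growth: {top_rate:.2f} per year (logarithmic)")

# Blue Horizon list price per Gflop.
cost_per_gflop = 50e6 / 1_700
print(f"Blue Horizon: ${cost_per_gflop:,.0f} per Gflop")
```

The implied logarithmic growth rate for the top machine, a little over 60 percent per year, is of the same order as the rates reported for smaller computers over the same period.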
CONCLUSIONS
The key purpose of this study is to extend estimates of the price of computers and computation back in time to the earliest computers and calculators as well as to manual calculations. Along the way, we have developed performance-based measures of price and output that can be compared with input-based or component-based measures. This final section discusses some reservations and then the major implications of the analysis.
Although we have provided performance-based measures of different devices, we note that the measures are generally extremely limited in their purview. They capture primarily computational capacity and generally omit other important aspects of modern computers such as connectivity, reliability, size, and portability. In one sense, we are comparing the transportation skills of the computer analogs of mice and men without taking into account many of the “higher” functions that modern computers perform relative to mice like the IBM 1620 or ants like the Hollerith tabulator.
In addition, we emphasize that some of the data used in the analysis, particularly those for devices before 1945, are relatively crude. Additionally, the measures of performance or computer power used for early computers have been superseded by more sophisticated benchmarks. While conventional equivalence scales exist and are used when possible, the calibrations are imperfect. Subject to these reservations, six points emerge from the analysis.
First, there has been a phenomenal increase in computer power over the twentieth century. Performance in constant dollars has improved relative to manual calculations by a factor on the order of $2 \times 10^{12}$ (2 trillion). Most of the increase has taken place since 1945, when the average rate of improvement has been 45 percent per year. The record shows virtually continuous extremely rapid productivity improvement over the last six decades. These increases in productivity are far larger than those for any other good or service in the historical record.35
Scholars have sometimes compared productivity growth in computers with that in electricity. In fact, this is a cheetah-to-snails comparison. Over the half-century after the first introduction of electricity, its price fell 6 percent per year relative to wages, whereas for the six decades after World War II the price of computer power fell 47 percent per year relative to wages.
Second, the data show a sharp break in trend around 1945—at the time when the technological transition occurred from mechanical calculators to what are recognizably the ancestors of modern computers. There was only modest progress—perhaps a factor of 10—in general computational capability from the skilled clerk to the mechanical calculators of the 1920s and 1930s. Around the beginning of World War II, all the major elements of the first part of the computer revolution were developed, including the concept of stored programs, the use of relays, vacuum tubes, and eventually the transistor, improved software, along with a host of other components. Dating from about 1945, computational speed increased and costs decreased rapidly up to the present. The most rapid pace of improvement was in the periods 1945–1955 and 1985–1995.
Third, these estimates of the growth in computer power, or the decline in calculation costs, are more rapid than price measures for computers used in the official government statistics. There are likely to be two reasons for the difference: first, the measures developed here are indexes of performance, whereas the approaches used by governments are based on the prices of components or inputs; and, second, “computers” today are doing much more than computation.
Fourth, the phenomenal increases of computer power and declines in the cost of computation over the last three decades have taken place through improvements of a given underlying technology: stored programs using the von Neumann architecture of 1946 and hardware based on Intel microprocessors descended from the Intel 4004 of 1971. The fact that this extraordinary growth in productivity took place in a relatively stable industry, in the world's most stable country, relying on a largely unchanged core architecture, is provocative for students of industrial organization to consider.36
Bresnahan and Greenstein emphasize competition among platforms rather than among firms as key to the rapid productivity growth. See Bresnahan and Greenstein, “Technological Competition,” pp. 1–40.
Fifth, these results imply that there has been a rapid deepening of computer capital in the United States. Because of the growth in both the power and scope of computer power, the capital-labor ratio for computer capital has risen sharply. To provide an order-of-magnitude idea of the amount of capital deepening that has occurred, I estimate the amount of computer power available per hour of work. Using estimates of the number of machines and computer power per machine, I estimate that there was approximately 0.001 unit of (manual-equivalents of) computer power available per hour worked in 1900. That increased to about one unit of computer power per hour by the middle of the twentieth century. By 2005, computational power had increased to about $10^{12}$ per hour worked.37
Data on capital deepening are at Nordhaus, Online Appendix, page “Capital_deep.”
At the same time, and as a sixth point, this enormous growth in computer power does not imply that there were correspondingly large increases in economic welfare all along the way. The rapid increase in productivity reflected an equally rapid decline in the cost of computation, and the decline was probably matched by a similar decline in the marginal productivity of computing.
More importantly, the contribution of computer power to aggregate economic welfare depends upon the relative size of computer capital as well as its rate of productivity improvement. To a first approximation, the contribution of productivity growth of a particular kind of capital to overall economic welfare is proportional to the share of that particular capital in the total capital stock. In 1945, computer, software, and office equipment were only 0.1 percent of total capital stock, while by 2000 they were 2.3 percent of capital. Hence, even if productivity were growing at the same very rapid rate in the two periods, it would have contributed an order of magnitude less to economic welfare in the early period. While we have only fragmentary data on the value of calculating machines around 1900, it appears that the share of calculating capital in 1900 was about one-twentieth of its share in 1945. These figures suggest why the rapid rates of computer performance became important for the overall economy only late in the twentieth century.38
Data on aggregate capital and information capital are from BEA at http://www.bea.gov/bea/dn/FA2004/SelectTable.asp, Table 2.1. Data for early calculators from Nordhaus, Online Appendix, pages “Quant_History,” “Capital_Deep,” and “Data.” The proposition in this paragraph assumes that capital goods and rental prices move inversely with productivity growth and that output is produced by a Cobb-Douglas production function.
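The first-order proportionality described in the preceding paragraph can be illustrated with the capital shares cited there. The Python sketch below is an illustration only: it applies the same productivity growth rate to each year, as the text hypothesizes, and uses 45 percent per year (the post-1945 average reported earlier) as an assumed value. The 1900 share is derived from the statement that it was about one-twentieth of the 1945 share. The function name is chosen here.

```python
def welfare_contribution(capital_share, productivity_growth):
    """First-order contribution to aggregate growth: share times growth rate."""
    return capital_share * productivity_growth

ASSUMED_GROWTH = 0.45  # illustrative common rate, logarithmic per year

shares = {
    1900: 0.001 / 20,  # about one-twentieth of the 1945 share
    1945: 0.001,       # 0.1 percent of total capital stock
    2000: 0.023,       # 2.3 percent of total capital stock
}

for year, share in shares.items():
    contribution = welfare_contribution(share, ASSUMED_GROWTH)
    print(f"{year}: share {share:.5f}, contribution {contribution:.6f} per year")

ratio = welfare_contribution(shares[2000], ASSUMED_GROWTH) / welfare_contribution(
    shares[1945], ASSUMED_GROWTH
)
print(f"2000 contribution relative to 1945: {ratio:.0f} times")
```

Under these assumptions the contribution in 2000 is 23 times that in 1945, which is the order-of-magnitude difference referred to in the text.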
What of the future? While forward-looking speculations might seem inappropriate in this JOURNAL, we would note that the history of computing to date shows no slackening of innovation in the fundamental computational processes or in applications of computation throughout the economy. Perhaps, aside from humans, computers and software are the ultimate general purpose technology.39
Bresnahan and Trajtenberg, “General Purpose Technologies,” pp. 83–108.
See Moravec, Mind Children. For further details of the comparison, see Nordhaus, “Progress.”
This is a revised version of an earlier paper entitled, “The Progress of Computing.” The author is grateful for comments from Ernst Berndt, Tim Bresnahan, Paul Chwelos, Iain Cockburn, Ray Fair, Robert Gordon, Steve Landefeld, Phil Long, Chuck Powell, and three anonymous referees. Eric Weese provided extraordinary help with research on computer history.