3 - Superscalar Processors

Published online by Cambridge University Press:  05 June 2012

Jean-Loup Baer
Affiliation:
University of Washington

Summary

From Scalar to Superscalar Processors

In the previous chapter we introduced a five-stage pipeline. The basic concept was that the instruction execution cycle could be decomposed into nonoverlapping stages with one instruction passing through each stage at every cycle. This so-called scalar processor had an ideal throughput of 1, or in other words, ideally the number of instructions per cycle (IPC) was 1.

If we return to the formula giving the execution time, namely,

EX_CPU = Number of instructions × CPI × cycle time

we see that in order to reduce EX_CPU in a processor with the same ISA (that is, without changing the number of instructions, N), we must reduce CPI (that is, increase IPC), reduce the cycle time, or both. Let us look at the two options.
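As a small numerical illustration of the execution-time formula above, the following sketch evaluates it for a baseline and for the two options. The function name and all numbers (one million instructions, a 1 ns cycle time, a halved CPI) are assumptions chosen for illustration and do not come from the chapter.

```python
def execution_time(num_instructions, cpi, cycle_time_ns):
    """Execution time = number of instructions x CPI x cycle time (in ns)."""
    return num_instructions * cpi * cycle_time_ns


# Hypothetical workload: N stays fixed because the ISA is unchanged.
N = 1_000_000

baseline = execution_time(N, cpi=1.0, cycle_time_ns=1.0)      # ideal scalar pipeline
lower_cpi = execution_time(N, cpi=0.5, cycle_time_ns=1.0)     # option 1: IPC raised to 2
faster_clock = execution_time(N, cpi=1.0, cycle_time_ns=0.5)  # option 2: shorter cycle
both = execution_time(N, cpi=0.5, cycle_time_ns=0.5)          # both options combined

for label, t in [("baseline", baseline), ("lower CPI", lower_cpi),
                 ("faster clock", faster_clock), ("both", both)]:
    print(f"{label:>12}: {t:,.0f} ns")
```

With N fixed, halving either factor halves the execution time, and the two improvements multiply when combined.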

The only way to raise the ideal IPC above 1 is to radically modify the structure of the pipeline so that more than one instruction can be in each stage at a given time. In doing so, we make the transition from a scalar processor to a superscalar one. From the microarchitecture viewpoint, we make the pipeline wider, in the sense that its representation is no longer linear. The most evident effect is that we shall need several functional units, but, as we shall see, each stage of the pipeline will be affected.
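The effect of widening the pipeline on the ideal IPC can be sketched with a simple cycle count. This is an idealized model and not the chapter's own: it assumes no hazards or stalls, and that instructions enter the pipeline in groups of `width` per cycle. The function names and the parameters `stages` and `width` are hypothetical.

```python
def ideal_cycles(num_instructions, stages, width):
    """Cycles to run the instructions on a hazard-free pipeline.

    The first group needs `stages` cycles to drain; each later group
    completes one cycle after the previous one.
    """
    if num_instructions == 0:
        return 0
    groups = -(-num_instructions // width)  # ceiling division
    return stages + groups - 1


def ideal_ipc(num_instructions, stages, width):
    """Instructions per cycle under the same idealized assumptions."""
    return num_instructions / ideal_cycles(num_instructions, stages, width)


# A five-stage pipeline, scalar (width 1) versus superscalar (width 2 and 4).
for width in (1, 2, 4):
    cycles = ideal_cycles(1000, stages=5, width=width)
    print(f"width={width}: {cycles} cycles, IPC={ideal_ipc(1000, 5, width):.3f}")
```

Under these assumptions the IPC approaches the pipeline width as the instruction count grows. Real superscalar processors fall short of this bound because of dependencies, branches, and resource conflicts.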

Type: Chapter
Information: Microprocessor Architecture: From Simple Pipelines to Chip Multiprocessors, pp. 75–128
Publisher: Cambridge University Press
Print publication year: 2009

Chapter DOI: https://doi.org/10.1017/CBO9780511811258.004