
References

Published online by Cambridge University Press:  06 January 2017

Zbigniew J. Czech
Affiliation:
Silesia University of Technology, Gliwice, Poland

Type: Chapter
Publisher: Cambridge University Press
Print publication year: 2017

References

Ackerman, W. B. 1982. “Dataflow Languages.” IEEE Computer 15 (2): 15–25.
Adiga, N. R., Blumrich, M. A., Chen, D., et al. 2005. “Blue Gene/L Torus Interconnection Network.” IBM Journal of Research and Development 49 (2/3): 265–276.
Adve, S. V. and Boehm, H. J. 2011. “Memory Models.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1107–1110.
Adve, S. V. and Gharachorloo, K. 1996. “Shared Memory Consistency Models: A Tutorial.” IEEE Computer 29 (12): 66–76.
Agarwal, A. 1991. “Limits on Interconnection Network Performance.” IEEE Transactions on Parallel and Distributed Systems 2 (4): 398–412.
Agerwala, T. and Arvind, N. I. 1982. “Data Flow Systems: Guest Editor's Introduction.” Computer 15 (2): 10–13.
Aho, A. V., Hopcroft, J. E., and Ullman, J. D. 1974. The Design and Analysis of Computer Algorithms. Boston, MA: Addison-Wesley.
Ajima, Y., Sumimoto, S., and Shimizu, T. 2009. “A 6D Mesh/Torus Interconnect for Exascale Computers.” Computer 42 (11): 36–40.
Ajtai, M., Komlós, J., and Szemerédi, E. 1983. “Sorting in c log(n) Parallel Steps.” Combinatorica 3: 1–19.
Akers, S. B. and Krishnamurthy, B. 1989. “A Group-theoretic Model for Symmetric Interconnection Networks.” IEEE Transactions on Computers 38 (4): 555–566.
Akl, S. G. 1989. The Design and Analysis of Parallel Algorithms. Englewood Cliffs, NJ: Prentice Hall.
Akl, S. G. 1997. Parallel Computation. Models and Methods. Upper Saddle River, NJ: Prentice Hall.
Alexander, M. and Gardner, W., eds. 2009. Process Algebra for Parallel and Distributed Processing. Boca Raton, FL: Chapman & Hall/CRC.
Alexandrov, A., Ionescu, M. F., Schauser, K. E., and Scheiman, C. 1995. “LogGP: Incorporating Long Messages into the LogP Model.” Proc. 7th ACM Symposium on Parallel Algorithms and Architectures, Santa Barbara, CA, 95–105.
Allen, R. and Kennedy, K. 2002. Optimizing Compilers for Modern Architectures. San Francisco, CA: Morgan Kaufmann.
Alt, H., Hagerup, T., Mehlhorn, K., and Preparata, F. P. 1987. “Simulation of Idealized Parallel Computers on More Realistic Ones.” SIAM Journal on Computing 16 (5): 808–835.
Amdahl, G. 1967. “Validity of the Single Processor Approach to Achieving Large Scale Computing Capabilities.” AFIPS Conference Proc., vol. 30. Washington D.C.: Thompson Books, 483–485.
Annaratone, M., Arnould, E., Gross, T., et al. 1986. “Warp Architecture and Implementation.” Proc. of 13th Annual International Symposium on Computer Architecture, Computer Science Press, Tokyo, 346–356.
Anderson, D. P., Cobb, J., Korpela, E., et al. 2002. “SETI@home. An Experiment in Public-resource Computing.” Communications of the ACM 45 (11): 56–61.
Anderson, T. E., Culler, D. E., and Patterson, D. 1995. “A Case for NOW (Networks of Workstations).” IEEE Micro 15 (1): 54–56.
Andrews, G. R. 1991. Concurrent Programming: Principles and Practice. Menlo Park, CA: Benjamin/Cummings.
Andrews, G. R. 2000. Foundations of Multithreaded, Parallel, and Distributed Programming. Reading, MA: Addison-Wesley.
Apt, K. R. and Olderog, E-R. 1991. Verification of Sequential and Concurrent Programs. New York: Springer-Verlag.
Arvind, N. I. and Culler, D. E. 1986. “Dataflow Architectures.” Annual Review of Computer Science, vol. 1: 225–253.
Arvind, N. I., Gostelow, K. P., and Plouffe, W. 1978. The ID-Report: An Asynchronous Programming Language and Computing Machine. Technical Report, 114. University of California at Irvine.
Arvind, N. I. and Nikhil, R. S. 1990. “Executing a Program on the MIT Tagged-token Dataflow Architecture.” IEEE Transactions on Computers 39 (3): 300–318.
Attiya, H. and Welch, J. 1998. Distributed Computing: Fundamentals, Simulations and Advanced Topics. London: McGraw-Hill.
Augen, J. 2002. “The Evolving Role of Information Technology in the Drug Discovery Process.” Drug Discovery Today 7 (5): 315–323.
Baase, S. 1988. Computer Algorithms: Introduction to Design and Analysis. Boston, MA: Addison-Wesley.
Bacon, J. and Harris, T. 2003. Operating Systems. Concurrent and Distributed Systems. Harlow, UK: Pearson Education, Addison-Wesley.
Bader, D. A., ed. 2008. Petascale Computing. Algorithms and Applications. Boca Raton, FL: Chapman & Hall/CRC.
Bader, M., Breuer, A., and Schreiber, M. 2013. “Parallel Fully Adaptive Tsunami Simulations.” In Facing the Multicore-challenge III. Aspects of New Paradigms and Technologies in Parallel Computing, Lecture Notes in Computer Science. Vol. 7686, edited by Keller, R., Kramer, D., and Weiss, J-P. (Berlin, Heidelberg: Springer-Verlag), 137–138.
Baer, J-L. 2010. Microprocessor Architecture. Cambridge, NY: Cambridge University Press.
Bahi, J. M. 2008. Parallel Iterative Algorithms. From Sequential to Grid Computing. Boca Raton, FL: Chapman & Hall/CRC.
Barnes, G. H., Brown, R. M., Kato, M., et al. 1968. “The Illiac IV Computer.” IEEE Transactions on Computers 17 (8): 746–757.
Barton, M. L. and Withers, G. R. 1989. “Computing Performance as a Function of the Speed, Quantity and Cost of the Processors.” Supercomputing ’89 Proc., 759–764.
Barz, H. W. 1983. “Implementing Semaphores by Binary Semaphores.” ACM SIGPLAN Notices 18 (2): 39–45.
Batcher, K. E. 1968. “Sorting Networks and Their Applications.” Spring Joint Computer Conference, AFIPS Proc., 32: 307–314.
BBN Advanced Computers Incorporated. 1968. Butterfly Parallel Processor Overview, BBN Report No. 6148, March.
Beecroft, J., Homewood, M., and McLaren, M. 1994. “Meiko CS-2 Interconnect Elan-Elite Design.” Parallel Computing 20 (10–11): 1627–1638.
Bell, G. and Gray, J. 2002. “What's Next in High-performance Computing.” Communications of the ACM 45 (2): 91–95.
Bellman, R. 1957. Dynamic Programming. Princeton, NJ: Princeton University Press.
Ben-Ari, M. 2006. Principles of Concurrent and Distributed Programming, 2nd edn. Boston, MA: Addison-Wesley.
Bharadwaj, V., Ghose, D., Mani, V., and Robertazzi, T. G. 1996. Scheduling Divisible Loads in Parallel and Distributed Systems. Los Alamitos, CA: IEEE Computer Society Press.
Bhatele, A. 2011. “Topology Aware Task Mapping.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2057–2062.
Bilardi, G., Herley, K. T., Pietracaprina, A., Pucci, G., and Spirakis, P. 1996. “BSP vs LogP.” 8th ACM Symposium on Parallel Algorithms and Architectures, Padova, Italy, 25–32.
Bilardi, G., Pietracaprina, A., and Pucci, G. 2008. “Decomposable BSP: A Bandwidth-latency Model for Parallel and Hierarchical Computation.” In Handbook of Parallel Computing. Models, Algorithms and Applications, edited by Rajasekaran, S. and Reif, J. (Boca Raton, FL: Chapman & Hall/CRC), 2-1–2-21.
Bilardi, G. and Pietracaprina, A. 2011. “Models of Computation, Theoretical.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1150–1158.
Bisseling, R. H. 2004. Parallel Scientific Computation. New York: Oxford University Press.
Biswas, R., Aftosmis, M., Kiris, C., and Shen, B-W. 2008. “Petascale Computing: Impact on Future NASA Missions.” In Petascale Computing. Algorithms and Applications, edited by Bader, D. A. (Boca Raton, FL: Chapman & Hall/CRC), 29–46.
Biswas, R., Thigpen, W., Ciotti, R., Mehrotra, P., et al. 2013. “Pleiades: NASA's First Petascale Supercomputer.” In Contemporary High Performance Computing: From Petascale toward Exascale, edited by Vetter, J. S. (Boca Raton, FL: Chapman & Hall/CRC), 309–338.
Bokhari, S. H. 1987. “Multiprocessing the Sieve of Eratosthenes.” Computer, April: 50–58.
Boppana, R. B. 1989. “Optimal Separations between Concurrent-write Parallel Machines.” Proc. of the ACM Symposium on Theory of Computing, 320–326.
Borkar, S., Cohn, R., and Fox, G. 1990. “Supporting Systolic and Memory Communication in iWARP.” Proc. of 17th Annual International Symposium on Computer Architecture, Australia, May 1990, 70–81.
Borodin, A. 1977. “On Relating Time and Space to Size and Depth.” SIAM Journal on Computing 6 (4): 733–744.
Borovska, P., Nakov, O., Markov, S., Ivanova, D., and Filipov, F. 2007. “Performance Evaluation of TOFU System Area Network Design for High-performance Computer Systems.” Proc. 5th European Conference on European Computing Conference, 186–216.
Bovet, D. P. and Crescenzi, P. 1994. Introduction to the Theory of Complexity. Upper Saddle River, NJ: Prentice Hall.
Brent, R. P. 1974. “The Parallel Evaluation of General Arithmetic Expressions.” Journal of the ACM 21 (2): 201–206.
Brinch Hansen, P. 1975. “The Programming Language Concurrent Pascal.” IEEE Transactions on Software Engineering 2: 199–206.
Brooks, E. D. III. 1986. “The Butterfly Barrier.” International Journal of Parallel Programming 15: 295–307.
Brucker, P. 2010. Scheduling Algorithms, 5th edn. Berlin, Heidelberg: Springer-Verlag.
Bruda, S. D. and Zhang, Y. 2009. “Relations between Several Parallel Computational Models.” Scalable Computing: Practice and Experience 10 (2): 163–172.
Burns, A. and Wellings, A. 1998. Concurrency in Ada, 2nd edn. Cambridge: Cambridge University Press.
Buyya, R., Branson, K., Giddy, J., and Abramson, D. 2003. “The Virtual Laboratory: A Toolset to Enable Distributed Molecular Modelling for Drug Design on the World-wide Grid.” Concurrency and Computation: Practice and Experience 15 (1): 1–25.
Carmona, E. A. and Rice, M. D. 1991. “Modeling the Serial and Parallel Fractions of a Parallel Algorithm.” Journal of Parallel and Distributed Computing 13: 286–298.
Carver, R. H. and Tai, K-C. 2006. Modern Multithreading. Implementing, Testing, and Debugging Multi-threaded Java and C++/Pthreads/Win32 Programs. Hoboken, NJ: Wiley-Interscience.
Casanova, H., Legrand, A., and Robert, Y. 2009. Parallel Algorithms. Boca Raton, FL: CRC Press.
Chaderjian, N. M. and Buning, P. G. 2011. “High Resolution Navier-Stokes Simulation of Rotor Wakes.” Proceedings of the American Helicopter Society 67th Annual Forum.
Chaderjian, N. M. and Ahmad, J. U. 2012. “Detached Eddy Simulation of the UH-60 Rotor Wake Using Adaptive Mesh Refinement.” Proceedings of the American Helicopter Society 68th Annual Forum.
Chandra, R., Dagum, L., Kohr, D., et al. 2001. Parallel Programming in OpenMP. San Francisco, CA: Morgan Kaufmann, Academic Press.
Chapman, B., Jost, G., and van der Pas, R. 2008. Using OpenMP. Portable Shared Memory Parallel Programming. Cambridge, MA: MIT Press.
Cheatham, T. E., Fahmy, A., Stepanescu, D., and Valiant, L. 1995. “Bulk Synchronous Parallel Computing – A Paradigm for Transportable Software.” Proc. 28th Annual Hawaii Conference on System Sciences, Vol. II. Hoboken, NJ: IEEE Computer Society Press, 268–275.
Chen, S. S., Price, J. F., Zhao, W., Donelan, M. A., and Walsh, E. J. 2007. “The CBLAST-Hurricane Program and the Next-generation Fully Coupled Atmosphere-wave-ocean Models for Hurricane Research and Prediction.” Bull. Amer. Meteor. Soc. 88 (3): 311–317.
Cheng, J., Grossman, M., and McKercher, T. 2014. Professional CUDA C Programming. New York: John Wiley & Sons, Inc.
Chlebus, B. S., Diks, K., Hagerup, T., and Radzik, T. 1988. “Efficient Simulations between Concurrent-read Concurrent-write PRAM Models.” Proc. of the Symposium on Mathematical Foundations of Computer Science, 231–239.
Close, P. 1988. “The iPSC/2 Node Architecture.” Proc. of the Conference on Hypercube Concurrent Computers and Applications, 43–55.
Cole, R. 1986. “Parallel Merge Sort.” Proc. of the 27th Annual Symposium on Foundations of Computer Science. Hoboken, NJ: IEEE Computer Society Press, 511–516.
Cole, R. 1988. “Parallel Merge Sort.” SIAM Journal on Computing 4: 770–785.
Cole, R. 1993. “Parallel Merge Sort.” In Synthesis of Parallel Algorithms, edited by Reif, J. H. (San Mateo, CA: Morgan Kaufmann), 453–495.
Collins, W. D., Bitz, M. L., Blackmon, M. L., et al. 2006. “The Community Climate System Model version 3 (CCSM3).” Journal of Climate 19: 2122–2143.
Convex Computer Corporation. 1993. Exemplar Architecture. Richardson, TX: Convex Computer Corporation.
Cook, S. A. 1979. “Deterministic CFL's are Accepted Simultaneously in Polynomial Time and Log Squared Space.” Conference Record of the Eleventh Annual ACM Symposium on Theory of Computing, Atlanta, GA, April–May 1979, 338–345.
Cook, S. A., Dwork, C., and Reischuk, R. 1986. “Upper and Lower Time Bounds for Parallel Random Access Machines without Simultaneous Writes.” SIAM Journal on Computing 15: 87–97.
Cormen, T. H., Leiserson, C. E., and Rivest, R. L. 1990. Introduction to Algorithms. Cambridge, MA: MIT Press.
Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. 2009. Introduction to Algorithms, 3rd edn. Cambridge, MA: MIT Press.
Coulouris, G., Dollimore, J., and Kindberg, T. 2005. Distributed Systems: Concepts and Design, 4th edn. Boston, MA: Addison-Wesley.
Courtois, P. J., Heymans, F., and Parnas, D. L. 1971. “Concurrent Control with ‘Readers’ and ‘Writers’.” Communications of the ACM 14 (10): 667–668.
Culler, D., Karp, R., Patterson, D., et al. 1993. “LogP: Towards a Realistic Model of Parallel Computation.” 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, San Diego, CA, May 1993, 1–12.
Culler, D. E., Singh, J. P., and Gupta, A. 1999. Parallel Computer Architecture. San Francisco, CA: Morgan Kaufmann.
Dally, W. J. 1991. “Performance Analysis of k-ary n-cube Interconnection Networks.” IEEE Transactions on Computers 39 (6): 775–785.
Dally, W. J. and Towles, B. 2004. Principles and Practices of Interconnection Networks. San Francisco, CA: Morgan Kaufmann.
Darema-Rogers, F., George, D., Norton, V. A., and Pfister, G. 1984. “VM Parallel Environment.” Proc. of the IBM Kingston Parallel Processing Symposium, November 27–29, 1984 (IBM Confidential).
Darema, F. 2001. “SPMD Model: Past, Present and Future.” Recent Advances in Parallel Virtual Machine and Message Passing Interface, 8th European PVM/MPI Users’ Group Meeting, Santorini/Thera, Greece, LNCS 2131, September 23–26, 2001, p. 1.
Darte, A., Robert, Y., and Vivien, F. 2000. Scheduling and Automatic Parallelization. Boston, MA: Birkhäuser.
Dennis, J. B. 1980. “Dataflow Supercomputers.” IEEE Computer 13: 48–56.
Dennis, J. B. 1983. “Maximum Pipelining of Array Operations on Static Data Flow Machines.” Proc. of the International Conference on Parallel Processing, August 1983, 176–184.
Dennis, J. B. and van Horn, E. C. 1966. “Programming Semantics for Multiprogrammed Computations.” Communications of the ACM 9 (3): 143–155.
Dennis, J. and Loft, R. 2009. “Optimizing High-resolution Climate Variability Experiments on the Cray XT4 and XT5 Systems at NICS and NERSC.” Proceedings of the 51st Cray User Group Conference (CUG), 1–8.
Dijkstra, E. W. 1968. “Cooperating Sequential Processes.” In Programming Languages, edited by Genuys, F. (New York: Academic Press), 43–112.
Dijkstra, E. W. 1971. “Hierarchical Ordering of Sequential Processes.” Acta Informatica 1 (2): 115–138.
Dijkstra, E. W. and Scholten, C. S. 1980. “Termination Detection for Diffusing Computations.” Information Processing Letters 11 (1): 1–4.
Dill, K. A., Ozkan, S. B., Weikl, T. R., Chodera, J. D., and Voelz, V. A. 2007. “The Protein Folding Problem: When Will It Be Solved?” Current Opinion in Structural Biology 17 (3): 342–346.
Domeika, M. 2008. Software Development for Embedded Multi-core Systems. Burlington, MA: Newnes.
Donnellan, A., Mora, P., Matsu'ura, M., and Yin, X-C. 2004. Computational Earthquake Science. Basel: Birkhäuser.
Dongarra, J. 2013. “Visit to the National University for Defense Technology Changsha, China, University of Tennessee, Oak Ridge National Laboratory, June 3, 2013.” http://www.netlib.org/utk/people/JackDongarra/PAPERS/tianhe-2-dongarra-report.pdf.
Dongarra, J., Otto, S. W., Snir, M., and Walker, D. 1995. An Introduction to the MPI Standard, University of Tennessee Technical Report, CS-95-274, January 1995.
Dongarra, J., Foster, I., Fox, G., et al., eds. 2003. Sourcebook of Parallel Computing. San Francisco, CA: Morgan Kaufmann.
Dongarra, J., Sterling, T., Simon, H., and Strohmaier, E. 2005. “High-performance Computing: Clusters, Constellations, MPPs, and Future Directions.” Computing in Science & Engineering, March/April: 51–59.
Dongarra, J. and Luszczek, P. 2011. “LINPACK Benchmark.” In Encyclopedia of Parallel Computing, vol. 2, edited by Padua, D. (New York: Springer-Verlag), 1033–1035.
Dorband, E. N., Hemsendorf, M., and Merritt, D. 2003. “Systolic and Hyper-systolic Algorithms for the Gravitational N-body Problem, with an Application to Brownian Motion.” J. Comput. Phys. 185: 484–511.
Downey, A. B. 2007. “The Little Book of Semaphores,” v. 2.1.2. http://greenteapress.com/semaphores/.
Drake, J. B., Jones, P. W., Vertenstein, M., White, J. B. III, and Worley, P. H. 2008. “Software Design for Petascale Climate Science.” In Petascale Computing. Algorithms and Applications, edited by Bader, D. A. (Boca Raton, FL: Chapman & Hall/CRC), 125–146.
Drozdowski, M. 2004. “Scheduling Parallel Tasks – Algorithms and Complexity.” In Handbook of Scheduling. Algorithms, Models and Performance Analysis, edited by Leung, J. Y-T. (Boca Raton, FL: Chapman & Hall/CRC), 25-1–25-25.
Dubois, M., Annavaram, M., and Stenström, P. 2012. Parallel Computer Organization and Design. Cambridge: Cambridge University Press.
Dumancas, G. G. 2015. “Applications of Supercomputers in Sequence Analysis and Genome Annotation.” In Research and Applications in Global Supercomputing, edited by Segall, R. S., Cook, J. S. and Zhang, Q. (Hershey, PA: IGI Global), 149–175.
Dutot, P-F., Mounié, G., and Trystram, D. 2004. “Scheduling Parallel Tasks Approximation Algorithms.” In Handbook of Scheduling. Algorithms, Models and Performance Analysis, edited by Leung, J. Y-T. (Boca Raton, FL: Chapman & Hall/CRC), 26-1–26-24.
Science. 2005. “Editorial: So Much More to Know.” Science 309: 78–102.
El-Ghazawi, T., Carlson, W., Sterling, T., and Yelick, K. 2005. UPC. Distributed Shared Memory Programming. Hoboken, NJ: John Wiley & Sons, Inc.
Endy, D. and Brent, R. 2001. “Modelling Cellular Behaviour.” Nature 409: 391–395.
Fatahalian, K. and Houston, M. 2008. “A Closer Look at GPUs.” Communications of the ACM 51 (10): 50–57.
Feng, T. Y. 1972. “Some Characteristics of Associative/Parallel Processing.” Proc. of the 1972 Sagamore Computing Conference, 5–16.
Feng, T. Y. 1981. “A Survey of Interconnection Networks.” IEEE Computer, December: 12–27.
Feo, J. T., ed. 1993. A Comparative Study of Parallel Programming Languages: The Salishan Problems. Amsterdam, The Netherlands: North-Holland.
Fich, F. E. 1993. “The Complexity of Computation on the Parallel Random Access Machine.” In Synthesis of Parallel Algorithms, edited by Reif, J. H. (San Mateo, CA: Morgan Kaufmann), 843–899.
Fich, F. E., Ragde, P., and Wigderson, A. 1988. “Relations between Concurrent-write Models of Parallel Computation.” SIAM Journal on Computing 7: 606–627.
Fishman, G. S. 1996. Monte Carlo: Concepts, Algorithms and Applications. New York: Springer-Verlag.
Flatt, H. P. and Kennedy, K. 1989. “Performance of Parallel Processors.” Parallel Computing 12: 1–20.
Flynn, M. J. 1966. “Very High Speed Computers.” Proc. IEEE 54: 1901–1909.
Flynn, M. J. 1972. “Some Computer Organizations and Their Effectiveness.” IEEE Transactions on Computers C-21: 948–960.
Flynn, M. J. 2011. “Flynn's Taxonomy.” In Encyclopedia of Parallel Computing, Vols 1–4 (New York: Springer-Verlag), 689–697.
Fortune, S. and Wyllie, J. 1978. “Parallelism in Random Access Machines.” Proc. 10th Symp. Theory Computing. ACM, New York, 114–118.
Foster, I. T. 1995. Designing and Building Parallel Programs. Concepts and Tools for Parallel Software Engineering. Reading, MA: Addison-Wesley. http://www.mcs.anl.gov/~itf/dbpp/.
Foster, I. and Kesselman, C., eds. 2004. The Grid 2: Blueprint for a New Computing Infrastructure, 2nd edn. San Francisco, CA: Elsevier.
Fountain, T. J. 1994. Parallel Computing Principles and Practice. Cambridge: Cambridge University Press.
Fox, G. C., Williams, R. D., and Messina, P. C. 1994. Parallel Computing Works!. San Francisco, CA: Morgan Kaufmann.
Francez, N. 1980. “Distributed Termination.” ACM Trans. Program. Lang. Syst. 2 (1): 42–55.
Frank, S., Burkhardt, H., and Rothnie, J. 1993. “The KSR1: Bridging the Gap between Shared Memory and MPPs.” Proc. of the COMPCON Digest of Papers, 285–294.
Furst, M., Saxe, J. B., and Sipser, M. 1984. “Parity, Circuits, and the Polynomial-time Hierarchy.” Mathematical Systems Theory 17: 13–27.
Gabriel, E., Fagg, G. E., Bosilca, G., et al. 2004. “Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation.” Proc. 11th European PVM/MPI Users’ Group Meeting, September 2004, Budapest, Hungary, 97–104.
Gajski, D., Padua, D. A., Kuck, D. J., and Kuhn, R. H. 1982. “A Second Opinion on Data Flow Machines and Languages.” IEEE Computer 15 (2): 58–69.
Galvin, P. B., Gagne, G., and Silberschatz, A. 2013. Operating System Concepts, 9th edn. New York: John Wiley & Sons, Inc.
Gara, A. 2005. “Overview of the Blue Gene/L System Architecture.” IBM Journal of Research and Development 49 (2/3): 195–212.
Gara, A. and Moreira, J. E. 2011. “IBM Blue Gene ‘supercomputer’.” In Encyclopedia of Parallel Computing, vol. 2, edited by Padua, D. A. (New York: Springer-Verlag), 891–900.
Garey, M. R. and Johnson, D. S. 1979. Computers and Intractability. A Guide to the Theory of NP-Completeness. New York: W. H. Freeman and Co.
Garland, M. 2011. “NVIDIA GPU.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1339–1345.
Gaudiot, J. and Bic, L. 1989. Advanced Topics in Data-flow Computing. Englewood Cliffs, NJ: Prentice Hall.
Gebali, F. 2011. Algorithms and Parallel Computing. Hoboken, NJ: John Wiley & Sons, Inc.
Geist, A., Beguelin, A., Dongarra, J., et al. 1994. PVM: Parallel Virtual Machine: A User's Guide and Tutorial for Networked Parallel Computing. Cambridge, MA: The MIT Press.
Geist, A. 2011. “PVM (Parallel Virtual Machine).” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1647–1651.
Gent, P. R., Danabasoglu, G., Donner, L. J., et al. 2011. “The Community Climate System Model Version 4.” Journal of Climate 24 (19): 4973–4991.
Ghosh, S. 2007. Distributed Systems. An Algorithmic Approach. Boca Raton, FL: Chapman & Hall/CRC.
Gibbons, A. 1993. “An Introduction to Distributed Memory Models of Parallel Computation.” In Lectures on Parallel Computation, edited by Gibbons, A. and Spirakis, P. (Cambridge: Cambridge University Press), 197–226.
Gibbons, A. and Rytter, W. 1988. Efficient Parallel Algorithms. Cambridge: Cambridge University Press.
Gibbons, A. and Spirakis, P., eds. 1993. Lectures on Parallel Computation. Cambridge: Cambridge University Press.
Gilge, M. 2012. “IBM System Blue Gene Solution: Blue Gene/Q. Application Development.” March. www.ibm.com/redbooks/.
Glauert, J. A. 1978. “A Single Assignment Language for Dataflow Computing.” Master's Thesis, Manchester, UK: University of Manchester.
Goedecker, S. and Hoisie, A. 2001. Performance Optimization of Numerically Intensive Codes. Philadelphia, PA: SIAM Publishing Company.
Goldschlager, L. M. 1982. “A Universal Interconnection Pattern for Parallel Computers.” Journal of ACM 29: 1073–1086.
Goodman, S. E. and Hedetniemi, S. T. 1977. Introduction to Design and Analysis of Algorithms. New York: McGraw-Hill.
Gottlieb, A., Grishman, R., Kruskal, C. P., et al. 1983. “The NYU Ultracomputer – Designing a MIMD Shared Memory Parallel Computer.” IEEE Transactions on Computers 32 (2): 175–189.
Gottlieb, A. 2011. “Ultracomputer, NYU.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2095–2103.
Graham, R. L., Shipman, G. M., Barrett, B. W., et al. 2006. “Open MPI: A High-performance, Heterogeneous MPI.” Proc. 5th International Workshop on Algorithms, Models and Tools for Parallel Computing on Heterogeneous Networks, September 2006, Barcelona, Spain, 1–9.
Grama, A., Gupta, A., Karypis, G., and Kumar, V. 2003. Introduction to Parallel Computing, 2nd edn. Harlow, UK: Addison-Wesley.
Grama, A. Y., Gupta, A., and Kumar, V. 1993. “Isoefficiency: Measuring the Scalability of Parallel Algorithms and Architectures.” IEEE Parallel and Distributed Technology 1 (3): 12–21.
Grama, A. and Kumar, V. 2008. “Scalability of Parallel Programs.” In Handbook of Parallel Computing. Models, Algorithms and Applications, edited by Rajasekaran, S. and Reif, J. (Boca Raton, FL: Chapman & Hall/CRC), 43-1–43-16.
Greenlaw, R. 1993. “Polynomial Completeness and Parallel Computation.” In Synthesis of Parallel Algorithms, edited by Reif, J. H. (San Mateo, CA: Morgan Kaufmann), 901–953.
Greenlaw, R., Hoover, H. J., and Ruzzo, W. L. 1995. Limits to Parallel Computation: P-Completeness Theory. Oxford: Oxford University Press. www.cs.armstrong.edu/-greenlaw/research/PARALLEL/.
Gropp, W. 2011. “MPI (Message Passing Interface).” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1184–1190.
Gropp, W., Huss-Lederman, S., Lumsdaine, A., et al. 1998. MPI – The Complete Reference: Vol. 2. The MPI Extensions, 2nd edn. Cambridge, MA: MIT Press.
Gropp, W., Lusk, E., and Skjellum, A. 1999. Using MPI. Portable Parallel Programming with the Message-passing Interface, 2nd edn. Cambridge, MA: MIT Press.
Gropp, W., Lusk, E., and Thakur, R. 1999. Using MPI-2. Advanced Features of the Message-passing Interface, 2nd edn. Cambridge, MA: MIT Press.
Gupta, A. and Kumar, V. 1993. “Performance Properties of Large Scale Parallel Systems.” Journal of Parallel and Distributed Computing 19: 234–244.
Gurd, J. R., Kirkham, C., and Watson, J. 1985. “The Manchester Prototype Dataflow Computer.” Communications of the ACM 28 (18): 36–45.
Gustafson, J. L. 1988. “Reevaluating Amdahl's Law.” Communications of the ACM 31 (5): 532–533.
Gustafson, J. L., Montry, G. R., and Benner, R. E. 1988. “Development of Parallel Methods for a 1024-processor Hypercube.” SIAM Journal on Scientific and Statistical Computing 9 (4): 609–638.
Gustafson, J. L. 1992. “The Consequences of Fixed Time Performance Measurement.” Proc. of the 25th Hawaii International Conference on System Sciences, Vol. III, 113–124.
Gustafson, J. L. 2011. “Brent's Theorem.” In Encyclopedia of Parallel Computing, vol. 1, edited by Padua, D. A. (New York: Springer-Verlag), 182–185.
Gustafson, J. L. 2011. “Moore's Law.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1177–1184.
Hager, G. and Wellein, G. 2011. Introduction to High Performance Computing for Scientists and Engineers. Boca Raton, FL: Chapman & Hall/CRC.
Halfhill, T. R. 2008. “Parallel Processing with CUDA.” Microprocessor Report, January 28: 1–8 (www.MPRonline.com).
Hamacher, V. V., Vranesic, Z. G., and Zaky, S. G. 2001. Computer Organization, 5th edn. New York: McGraw-Hill.
Handler, W. 1977. “The Impact of Classification Schemes on Computer Architecture.” Proc. of the International Conference on Parallel Processing, August, 7–15.
Handy, J. 1998. The Cache Memory Book, 2nd edn. Orlando, FL: Academic Press.
Harris, T. J. 1994. “A Survey of PRAM Simulation Techniques.” ACM Computing Surveys 26: 187–206.
Hennessy, J. L. and Patterson, D. A. 2007. Computer Architecture. A Quantitative Approach, 4th edn. San Francisco, CA: Morgan Kaufmann.
Hensgen, D., Finkel, R., and Manber, U. 1988. “Two Algorithms for Barrier Synchronization.” International Journal of Parallel Programming 17 (1): 1–16.
Herley, K. T. and Bilardi, G. 1988. “Deterministic Simulations of PRAMs on Bounded-degree Networks.” Proc. of 26th Annual Allerton Conference on Communication, Control and Computation, Monticello, IL, 1084–1093.
Herlihy, M. and Shavit, N. 2008. The Art of Multiprocessor Programming. Burlington, MA: Morgan Kaufmann.
Heroux, M. A., Raghavan, P., and Simon, H. D., eds. 2006. Parallel Processing for Scientific Computing. Philadelphia, PA: SIAM Publishing Company.
Hicks, J., Chiou, D., Ang, B., and Arvind. 1992. Performance Studies of the Monsoon Dataflow Processor. CSF Memo 345-2, MIT, October.
Hill, M. 1998. “Multiprocessors Should Support Simple Memory-consistency Models.” IEEE Computer Magazine 31: 28–34.
Hillis, D. 1985. The Connection Machine. Cambridge, MA: MIT Press.
Hiraki, K., Nishida, K., Sekiguchi, S., Shimada, T., and Tiba, T. 1987. “The SIGMA-1 Dataflow Supercomputer: A Challenge for New Generation Supercomputing Systems.” Journal of Information Processing 10 (4): 219–226.
Hoare, C. A. R. 1974. “Monitors, an Operating System Structuring Concept.” Communications of the ACM 17: 549–557; Erratum. Communications of the ACM 18 (1975): 95.
Hoare, C. A. R. 1978. “Communicating Sequential Processes.” Communications of the ACM 21 (8): 666–677.
Hoffman, F. M. and Hargrove, W. W. 1999. “Multivariate Geographic Clustering Using a Beowulf-style Parallel Computer.” Proc. of the International Conference on Parallel and Distributed Processing Techniques and Applications, June, 1292–1298.
Hromkovič, J. 2003. Algorithmics for Hard Problems. Introduction to Combinatorial Optimization, Randomization, Approximation and Heuristics. Berlin: Springer-Verlag.
Hwang, K. 1993. Advanced Computer Architecture, Parallelism, Scalability, Programmability. New York: McGraw-Hill.
Hwang, K. and Xu, Z. 1998. Scalable Parallel Computing. New York: McGraw-Hill.
Hwang, K., Fox, G. C., and Dongarra, J. J. 2012. Distributed and Cloud Computing. Waltham, MA: Morgan Kaufmann.
Hyndman, D. and Hyndman, D. 2009. Natural Hazards and Disasters, 2nd edn. Belmont, CA: Brooks/Cole.
Inmos Ltd. 1988. Occam 2 Reference Manual. Englewood Cliffs, NJ: Prentice-Hall.
International Human Genome Sequencing Consortium. 2001. “Initial Sequencing and Analysis of the Human Genome.” Nature 409: 860–921.
International Organization for Standardization, Geneva. 1996. Information Technology-Portable Operating System Interface (POSIX) – Part 1: System Application Program Interface (API) [C Language], December.
JáJá, J. 1992. An Introduction to Parallel Algorithms. Reading, MA: Addison-Wesley.
Jha, S. K. and Jana, P. K. 2011. Study and Design of Parallel Algorithms for Interconnection Networks. Saarbrücken, Germany: Lambert Academic Publishing.
Johnson, M. 1991. Superscalar Microprocessor Design. Upper Saddle River, NJ: Prentice-Hall.
Jones, G. A. and Goldsmith, M., 1989. Programming in Occam 2, 2nd edn. Engle-wood Cliffs, NJ: Prentice Hall.
Jordan, H. and Alaghband, G.. 2003. Fundamentals of Parallel Processing. Upper Saddle River, NJ: Prentice Hall.
Kalos, M. H. and Whitlock, P. A.. 2008. Monte Carlo Methods, 2nd edn. Weinheim: Wiley-VCH Verlag.
Kalyanaraman, A., Emrich, S. J., Schnable, P. S., and Aluru, S.. 2007. “Assembling Genomes on Large-scale Parallel Computers.” Journal of Parallel and Distributed Computing 67: 1240–1255.
Karniadakis, G. E. and Kirby, R. M. II. 2007. Parallel Scientific Computing in C++ and MPI. A Seamless Approach to Parallel Algorithms and Their Implementation. New York: Cambridge University Press.
Karp, A. H. and Flatt, H. P.. 1990. “Measuring Parallel Processor Performance.” Communications of the ACM 33 (5): 539–543.
Karp, R. M. and Ramachandran, V.. 1990. “Parallel Algorithms for Shared-memory Machines.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), 870–941.
Keller, R., Kramer, D., Weiss, J-P., eds. 2013. Facing the Multicore-challenge III. Aspects of New Paradigms and Technologies in Parallel Computing. Lecture Notes in Computer Science 7686. Berlin, Heidelberg: Springer-Verlag.
Kennedy, K. and Allen, J. R.. 2001. Optimizing Compilers for Modern Architectures: A Dependence-based Approach. San Francisco, CA: Morgan Kaufmann Pub.
Kessler, R. E. and Schwarzmeier, J. L.. 1993. “Cray T3D: A New Dimension for Cray Research.” Proc. of the IEEE Computer Society International Conference, February, 176–182.
Kiris, C., Housman, J., Gusman, M., et al. 2011. “Best Practices for Aero-Database CFD Simulations of Ares V Ascent.” In 49th AIAA Aerospace Sciences Meeting, 1–21.
Kirk, D. B. and Hwu, W-M. W.. 2013. Programming Massively Parallel Processors. A Hands-on Approach, 2nd edn. Waltham, MA: Morgan Kaufmann.
Klie, H., Bangerth, W., Gail, X., et al. 2006. “Models, Methods and Middleware for Grid-enabled Multiphysics Oil Reservoir Management.” Engineering with Computers 22 (3–4): 349–370.
Knuth, D. E. 1971. “Optimum Binary Search Trees.” Acta Informatica 1 (1): 14–25.
Knuth, D. E. 1998. The Art of Computer Programming, Vol. 3. Sorting and Searching, 2nd edn. Reading, MA: Addison-Wesley.
Kodama, C., Terai, M., Noda, A. T., et al. 2014. “Scalable Rank-mapping Algorithm for an Icosahedral Grid System on the Massive Parallel Computer with a 3-D Torus Network.” Parallel Computing 40: 362–373.
Koelbel, C. H., Loveman, D. B., Schreiber, R. S., Steele, G. L. Jr., and Zosel, M. E.. 1997. The High Performance Fortran Handbook. Cambridge, MA: MIT Press.
Komornicki, A., Mullen-Schulz, G., and Landon, D. 2009. Roadrunner: Hardware and Software Overview, IBM Technical Support Organization. www.redbooks.ibm.com/redpapers/pdfs/redp4477.pdf.
Kontoghiorghes, E. J., ed. 2006. Handbook of Parallel Computing and Statistics. Boca Raton, FL: Chapman & Hall/CRC.
Kruskal, C. P. and Snir, M.. 1986. “A Unified Theory of Interconnection Network Structure.” Theoretical Computer Science 48 (3): 75–94.
Kshemkalyani, A. D. and Singhal, M.. 2008. Distributed Computing. Cambridge: Cambridge University Press.
Kučera, L. 1982. “Parallel Computation and Conflicts in Memory Access.” Information Processing Letters 14: 93–96.
Kumar, V., Grama, A., Gupta, A., and Karypis, G. 1994. Introduction to Parallel Computing. Design and Analysis of Algorithms. Redwood City, CA: Benjamin/Cummings.
Kumar, V. and Gupta, A.. 1994. “Analyzing Scalability of Parallel Algorithms and Architectures.” Journal of Parallel and Distributed Computing 22: 379–391.
Kumar, V. and Singh, V.. 1991. “Scalability of Parallel Algorithms for the All-pairs Shortest-path Problem.” Journal of Parallel and Distributed Computing 13: 124–138.
Kung, H. T. 1988. VLSI Array Processors. Upper Saddle River, NJ: Prentice Hall.
Kung, H. T. and Leiserson, C. E.. 1978. “Systolic Arrays (for VLSI).” In Sparse Matrix Proceedings, Knoxville, TN, SIAM, Philadelphia, edited by Duff, I. S. and Stewart, G. W. (US: Society for Industrial & Applied Mathematics), 256–282.
Kurzak, J., Bader, D. A., and Dongarra, J., eds. 2011. Scientific Computing with Multicore and Accelerators. Boca Raton, FL: Chapman & Hall/CRC.
Kwok, Y-K. and Ahmad, I.. 1999. “Benchmarking and Comparison of the Task Graph Scheduling Algorithms.” Journal of Parallel and Distributed Computing 59: 381–422.
Ladner, R. E. 1975. “The Circuit Value Problem Is Log Space Complete for P.” SIGACT News 7 (1): 18–20.
Lansdowne, S. T., Cousins, R. E., and Wilkinson, D. C.. 1987. “Reprogramming the Sieve of Eratosthenes.” Computer, August: 90–91.
Lastovetsky, A. L. 2003. Parallel Computing on Heterogeneous Networks. Hoboken, NJ: John Wiley & Sons, Inc.
Laudon, J. P. and Lenoski, D.. 1997. “The SGI Origin: A ccNUMA Highly Scalable Server.” Proc. of the 24th International Symposium on Computer Architecture, 241–251.
Lawrie, D. H. 1975. “Access and Alignment of Data in an Array Processor.” IEEE Transactions on Computers C-24 (1): 1145–1155.
Lea, D. 1997. Concurrent Programming in Java. Design Principles and Patterns. Reading, MA: Addison-Wesley.
Karp, R. M. and Ramachandran, V.. 1990. “Parallel Algorithms for Shared-memory Machines.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), chap. 17.
Valiant, L. G. 1990. “General Purpose Parallel Architectures.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), chap. 18.
Leighton, F. T. 1992. Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes. San Mateo, CA: Morgan Kaufmann.
Leiserson, C. E. 1985. “Fat-trees: Universal Networks for Hardware-efficient Supercomputing.” IEEE Transactions on Computers C-34 (10): 892–901.
Leung, J. Y-T., ed. 2004. Handbook of Scheduling. Algorithms, Models and Performance Analysis. Boca Raton, FL: Chapman & Hall/CRC.
Levesque, J. and Wagenbreth, G.. 2011. High Performance Computing. Programming and Applications. Boca Raton, FL: Chapman & Hall/CRC.
Lewis, B. and Berg, D.. 1998. Multithreaded Programming with Pthreads. Mountain View, CA: Sun Microsystems Press.
Li, K. 1986. “Shared Virtual Memory on Loosely Coupled Multiprocessor.” Ph.D. thesis, Department of Computer Science, Yale University.
Li, K. and Hudak, P.. 1989. “Memory Coherence in Shared Virtual Memory Systems.” ACM Transactions on Computer Systems 7: 321–359.
Lillevik, S. L. 1991. “The Touchstone 30 Gigaflop DELTA Prototype.” DMCC April: 671–677.
Lin, C. and Snyder, L.. 2009. Principles of Parallel Programming. Boston, MA: Addison-Wesley.
Lindholm, E., Nickolls, J., Oberman, S., and Montrym, J.. 2008. “NVIDIA Tesla: A Unified Graphics and Computing Architecture.” IEEE Micro 28 (2): 39–55.
Loft, R., Andersen, A., Bryan, F., et al. 2015. “Yellowstone: A Dedicated Resource for Earth System Science.” In Contemporary High Performance Computing: From Petascale toward Exascale, edited by Vetter, J. S. (Chapman & Hall/CRC, Boca Raton, FL), vol. II, 185–224.
Lynch, N. A. 1996. Distributed Algorithms. San Francisco, CA: Morgan Kaufmann.
Lysne, O. and Sem-Jacobsen, F. O.. 2011. “Networks, Multistage.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1316–1321.
Makino, J. 2002. “An Efficient Parallel Algorithm for O(N²) Direct Summation Method and Its Variations on Distributed-memory Parallel Machines.” New Astron. 7: 373–384.
Manber, U. 1989. Introduction to Algorithms—A Creative Approach. Boston, MA: Addison-Wesley.
Mandelbrot, B. B. 1980. “Fractal Aspects of the Iteration of z → λz(1 − z) for complex λ, z.” Annals of the New York Academy of Sciences 357: 249–259.
Marinescu, D. C. and Rice, J. R.. 1994. “On High Level Characterization of Parallelism.” Journal of Parallel and Distributed Computing 20: 107–113.
Marsh, D. R., Mills, M. J., Kinnison, D. E., et al. 2013. “Climate Change from 1850 to 2005 Simulated in CESM1 (WACCM).” Journal of Climate 26 (19): 7372–7391.
Matsu'ura, M., Furumura, T., Okuda, H., et al. 2006. “Integrated Predictive Simulation System for Earthquake and Tsunami Disaster.” SIAM 12th Conference on Parallel Processing for Scientific Computing (PP06), San Francisco, 2006, and also: Annual Report of the Earth Simulator Center, April 2005–March 2006, 407–410.
Mattson, T. G. 2003. “How Good Is OpenMP?” Scientific Programming 11: 81–93.
Mattson, T. G., Sanders, B. A., and Massingill, B. L.. 2005. Patterns for Parallel Programming. Boston, MA: Addison-Wesley.
McKee, S. A. and Wisniewski, R. W.. 2011. “Memory Wall.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1110–1116.
Mellor-Crummey, J. M. and Scott, M. L.. 1991. “Algorithms for Scalable Synchronization on Shared-memory Multiprocessors.” ACM Transactions on Computer Systems 9 (1): 21–65.
Message Passing Interface Forum. 1998. “MPI2: A Message Passing Interface Standard.” International Journal of High Performance Computing Applications 12 (1–2): 1–299.
Message Passing Interface Forum. 2012. “MPI: A Message-Passing Interface Standard, Version 3.0.” High Performance Computing Center Stuttgart (HLRS), September 21.
Milano, J. and Lembke, P. 2012. “IBM System Blue Gene Solution: Blue Gene/Q. Hardware Overview and Installation Planning.” March. www.ibm.com/redbooks.
Miller, R. and Boxer, L.. 2005. Algorithms. Sequential and Parallel. A Unified Approach, 2nd edn. Hingham, MA: Charles River Media Inc.
Mizuta, R., Uchiyama, T., Kamiguchi, K., Kitoh, A., and Noda, A.. 2005. “Changes in Extremes Indices over Japan due to Global Warming Projected by a Global 20-km-mesh Atmospheric Model.” Scientific Online Letters on the Atmosphere (SOLA) 1: 153–156. doi: 10.2151/sola.2005-040.
Mogoules, F., Pan, J., Tan, K-A., and Kumar, A.. 2009. Introduction to Grid Computing. Boca Raton, FL: Chapman & Hall/CRC.
Moin, P. and Kim, J.. 1997. “Tackling Turbulence with Supercomputers.” Scientific American 276: 62–68.
Moldovan, D. I. 1993. Parallel Processing from Applications to Systems. San Mateo, CA: Morgan Kaufmann.
Monacelli, G., Sessa, F., and Milite, A.. 2004. “An Integrated Approach to Evaluate Engineering Simulations and Ergonomic Aspects of a New Vehicle in a Virtual Environment: Physical and Virtual Correlation Methods.” FISITA 2004 30th World Automotive Congress, 2004, Barcelona, Spain, 23–27.
Monien, B. and Sudborough, H.. 1988. “Comparing Interconnection Networks.” Lecture Notes in Computer Science 324: 139–153.
Moore, G. E. 1965. “Cramming More Components onto Integrated Circuits.” Electronics Magazine 38 (8): 114–117.
Morse, H. S. 1994. Practical Parallel Computing. Cambridge, MA: AP Professional.
Mukherjee, S. S., Banno, P., Lang, S., Spink, A., and Webb, D.. 2001. “The Alpha 21364 Network Architecture.” Proc. of the Symposium on Hot Interconnects, August, 113–117.
Nakata, T., Kanoh, Y., Tatsukawa, K., et al. 1998. “Architecture and the Software Environment of Parallel Computer Cenju-4.” NEC Research and Development Journal 39: 385–390.
nCUBE Corporation. 1990. nCUBE Processor Manual.
Nickolls, J. R. 1990. “The Design of the MasPar MP-1: A Cost-effective Massively Parallel Computer.” Proc. COMPCON Digest of Papers, 25–28.
Nicol, D. M. and Willard, F. H.. 1988. “Problem Size, Parallel Architecture, and Optimal Speedup.” Journal of Parallel and Distributed Computing 5: 404–420.
Nikhil, R. S. and Arvind. 1989. “Can Dataflow Subsume von Neumann Computing?” Proc. of the 16th Annual International Symposium on Computer Architecture, 262–272.
Niphanupudi, M. V., Norton, C. D., and Szymanski, B. K.. 1995. “Plasma Simulation on Networks of Workstations Using the Bulk Synchronous Parallel Model.” Proc. of the Conference on Parallel and Distributed Processing Techniques and Applications, Athens, Georgia, 13–22.
Null, L. and Lobur, J.. 2015. The Essentials of Computer Organization and Architecture, 4th edn. Burlington, MA: Jones & Bartlett Learning.
Nussbaum, D. and Agarwal, A.. 1991. “Scalability of Parallel Machines.” Communications of the ACM 34 (3): 57–61.
Nuth, P. R. and Dally, W. J.. 1992. “The J-machine Network.” Proc. of the International Conference on Computer Design, October 1992, 420–423.
Nvidia. 2015. CUDA C Programming Guide, PG-02829-001 v7.5, September. http://docs.nvidia.com/cuda/pdf/CUDA_C_Programming_Guide.pdf.
Nyland, L., Harris, M., and Prins, J.. 2007. “Fast N-body Simulations with CUDA.” In GPU Gems 3 (31), edited by Nguyen, H. (Addison-Wesley, eBook-BBL), 677–695.
Oden, J. T., Belytschko, T., Fish, J., et al. 2006. “Revolutionizing Engineering Science through Simulation.” National Science Foundation Blue Ribbon Panel Report 65: 1–66.
OpenMP Application Program Interface, Version 2.5, May 2005. www.openmp.org.
OpenMP Application Program Interface, Version 3.0, May 2008. www.openmp.org.
OpenMP Application Program Interface, Version 3.1, July 2011. www.openmp.org.
OpenMP Application Program Interface, Version 4.0, July 2013. www.openmp.org.
OpenMP Application Program Interface, Version 4.1, July 2015. www.openmp.org.
Pacheco, P. S. 1997. Parallel Programming with MPI. San Francisco, CA: Morgan Kaufmann.
Pacheco, P. S. 2011. An Introduction to Parallel Programming. Burlington, MA: Morgan Kaufmann.
Padua, D. A., ed. 2011. Encyclopedia of Parallel Computing, Vols 1–4. New York: Springer-Verlag.
Palmer, J. F. 1986. “The NCUBE Family of Parallel Supercomputers.” Proc. of the International Conference on Computer Design, p. 107.
Papadimitriou, C. H. 1994. Computational Complexity. Reading, MA: Addison-Wesley, chap. 15, “Parallel Computing.”
Parberry, I. 1987. Parallel Complexity Theory. London: Pitman/Wiley.
Parhami, B. 1999. Introduction to Parallel Processing. Algorithms and Architectures. New York: Plenum Press.
Parnas, D. L. 1975. “On a Solution to the Cigarette Smokers’ Problem without Conditional Statements.” Communications of the ACM 18: 181–183.
Paterson, M. S. 1990. “Improved Sorting Networks with O(log N) Depth.” Algorithmica 5 (1–4): 75–92.
Patil, S. 1971. Limitations and Capabilities of Dijkstra's Semaphore Primitives for Coordination among Processes. Technical report, Massachusetts Institute of Technology.
Patterson, D. A. and Hennessy, J. L.. 2013. Computer Organization and Design, 5th edn. Burlington, MA: Morgan Kaufmann.
Peitgen, H.-O. and Richter, P.. 1986. The Beauty of Fractals. Heidelberg: Springer-Verlag.
Pfister, G. F. 1998. In Search of Clusters, 2nd edn. Upper Saddle River, NJ: Prentice Hall.
Pfister, G. F., Brantley, W. C., George, D. A., et al. 1985. “The IBM Research Parallel Processor Prototype (RP3): Introduction and Architecture.” Proc. of 1985 International Conference on Parallel Processing, 764–771.
Preparata, F. P. and Vuillemin, J.. 1981. “The Cube-connected Cycles: A Versatile Network for Parallel Computation.” Communications of the ACM 24 (5): 300–309.
President's Information Technology Committee. 2005. Computational Science: Ensuring America's Competitiveness, June: 1–117.
Quinn, M. J. 1987. Designing Efficient Algorithms for Parallel Computers. New York: McGraw-Hill.
Quinn, M. J. 1994. Parallel Computing. Theory and Practice, 2nd edn. New York: McGraw-Hill.
Quinn, M. J. 2004. Parallel Programming in C with MPI and OpenMP. New York: McGraw-Hill.
Rajasekaran, S. and Reif, J., eds. 2008. Handbook of Parallel Computing. Models, Algorithms and Applications. Boca Raton, FL: Chapman & Hall/CRC.
Rajasekaran, S., Fiondella, L., Ahmed, M., and Ammar, R. A., eds. 2014. Multicore Computing. Boca Raton, FL: Chapman & Hall/CRC.
Ranade, A. G. 1987. “How to Emulate Shared Memory.” Proc. of 28th Annual Symposium on the Foundations of Computer Science, Los Angeles, CA, 1987, 185–192.
Rauber, T. and Rünger, G.. 2010. Parallel Programming for Multicore and Cluster Systems. Berlin: Springer-Verlag.
Reif, J. H., ed. 1993. Synthesis of Parallel Algorithms. San Mateo, CA: Morgan Kaufmann.
Reinders, J. R. 2011. “Systolic Arrays.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2002–2011.
Reinders, J. R. 2011. “Warp and iWarp.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2150–2159.
Reingold, E. M., Nievergelt, J., and Deo, N.. 1977. Combinatorial Algorithms: Theory and Practice. New York: Prentice Hall.
Riesen, R. and Maccabe, A. B.. 2011. “MIMD (Multiple Instruction, Multiple Data) Machines.” In Encyclopedia of Parallel Computing, vol. 3, edited by Padua, D. A. (New York: Springer-Verlag), 1140–1149.
Robert, Y. 2011. “Task Graph Scheduling.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 2013–2025.
Roberts, M. J., Vidale, P. L., Mizielinski, M. S., et al. 2015. “Tropical Cyclones in the UPSCALE Ensemble of High-Resolution Global Climate Models.” Journal of Climate 28 (2): 574–596.
Rochkind, M. J. 2004. Advanced UNIX Programming, 2nd edn. Boston, MA: Addison-Wesley.
Roosta, S. H. 2000. Parallel Processing and Parallel Algorithms. Theory and Computation. New York: Springer-Verlag.
Roscoe, A. W. 1998. The Theory and Practice of Concurrency. Upper Saddle River, NJ: Prentice Hall.
Rosner, J. 2015. “Methods of Parallelizing Selected Computer Vision Algorithms for Multi-core Graphics Processors.” Ph.D. thesis, Silesian University of Technology, Gliwice, Poland. http://delibra.bg.polsl.pl/dlibra/.
Rumbaugh, J. 1977. “A Dataflow Multiprocessor.” IEEE Transactions on Computers C-26: 1087–1095.
Sakai, S., Kodama, Y., and Yamaguchi, Y.. 1991. “Prototype Implementation of a Highly Parallel Dataflow Machine EM-4.” Proc. of the International Parallel Processing Symposium, 1991, 278–286.
Sanders, J. and Kandrot, E.. 2010. CUDA by Example. An Introduction to General-purpose GPU Programming. Upper Saddle River, NJ: Addison-Wesley.
Satoh, M., Tomita, H., Yashiro, H., et al. 2014. “The Non-hydrostatic Icosahedral Atmospheric Model: Description and Development.” Progress in Earth and Planetary Science 1 (1): 1.
Savage, J. E. 1998. Models of Computation. Reading, MA: Addison-Wesley.
Savitch, W. J. and Stimson, M. J.. 1979. “Time Bounded Random Access Machines with Parallel Processing.” Journal of the ACM 26: 103–118.
Schauser, K. E. and Scheiman, C. J.. 1995. “Experience with Active Messages on the Meiko CS-2.” Proc. 9th International Symposium on Parallel Processing, April 1995, 140–149.
Schulz, M., Reuding, T., and Ertl, T.. 1998. “Analyzing Engineering Simulations in a Virtual Environment.” IEEE Computer Graphics and Applications 18 (6): 46–52.
Schwartz, J. 1983. A Taxonomic Table of Parallel Computers Based on 55 Designs. New York: Courant Institute, New York University, November 1983.
“Science on a Grand Scale.” 2015. Science & Technology Review, Lawrence Livermore National Laboratory, September, 4–11.
Scott, L. R., Clark, T., and Bagheri, B.. 2005. Scientific Parallel Computing. Princeton, NJ: Princeton University Press.
Scott, S. and Thorson, G.. 1996. “The Cray T3E Network: Adaptive Routing in a High Performance 3D Torus.” Proc. of the Symposium on Hot Interconnects, August 1996, 147–156.
Seitz, C. L. 1985. “The Cosmic Cube.” Communications of the ACM 28 (1): 22–33.
Sharp, J. A. 1985. Dataflow Computing. New York: John Wiley & Sons, Inc.
Shimokawabe, T. and Aoki, T.. 2010. “Multi-GPU Computing for Next-generation Weather Forecasting – 145.0 TFlops 3990 GPUs on TSUBAME 2.0.” TSUBAME e-Science Journal (ESJ) 2: 11–16.
Shiva, S. G. 2006. Advanced Computer Architectures. Boca Raton, FL: CRC Press.
Shonkwiler, R. W. and Lefton, L.. 2006. An Introduction to Parallel and Vector Scientific Computing. New York: Cambridge University Press.
Sima, D. 1997. “Superscalar Instruction Issue.” IEEE Micro Magazine 17: 28–39.
Singh, J. P., Hennessy, J. L., and Gupta, A.. 1993. “Scaling Parallel Programs for Multiprocessors: Methodology and Examples.” IEEE Computer 26 (7): 42–50.
Sinnen, O. 2007. Task Scheduling for Parallel Systems. Hoboken, NJ: John Wiley & Sons, Inc.
Sipser, M. 2006. Introduction to the Theory of Computation, 2nd edn. Boston, MA: Thomson Course Technology.
Skillicorn, D. B. 1988. “A Taxonomy for Computer Architectures.” IEEE Computer 21: 46–57.
Skillicorn, D. B. 2005. Foundations of Parallel Programming. Cambridge: Cambridge University Press.
Skillicorn, D., Hill, J. M. D., and McColl, W. F.. 1997. “Questions and Answers about BSP.” Scientific Programming 6 (3): 249–274.
Slotnick, D. L., Borck, W. C., and McReynolds, R. C.. 1967. “The Solomon Computer.” Proc. of the AFIPS Spring Joint Computer Conference, 22, New York, 1967, 97–107.
Smith, J. R. 1993. The Design and Analysis of Parallel Algorithms. New York: Oxford University Press.
Snir, M. 1985. “On Parallel Searching.” SIAM Journal on Computing 15: 688–708.
Snir, M., Otto, S. W., Huss-Lederman, S., Walker, D. W., and Dongarra, J.. 1998. MPI – The Complete Reference: Vol. 1. The MPI Core, 2nd edn. Cambridge, MA: MIT Press.
Snir, M. 2011. “Reduce and Scan.” In Encyclopedia of Parallel Computing, vol. 4, edited by Padua, D. A. (New York: Springer-Verlag), 1728–1736.
Solihin, Y. 2016. Fundamentals of Parallel Multicore Architecture. Boca Raton, FL: Chapman & Hall/CRC.
Sottile, M. J., Mattson, T. G., and Rasmussen, C. E.. 2010. Introduction to Concurrency in Programming Languages. Boca Raton, FL: Chapman & Hall/CRC.
Stallings, W. 2013. Computer Organization and Architecture, 9th edn. Upper Saddle River, NJ: Pearson Education.
Stallings, W. 2012. Operating Systems. Internals and Design Principles, 8th edn. Upper Saddle River, NJ: Pearson Education.
van der Steen, A. J. and Dongarra, J. J.. 2006, 2007. Overview of Recent Supercomputers. www.top500.org/.
Sterling, T. L., Salmon, J., Becker, D. J., and Savarese, D. F.. 1999. How to Build a Beowulf. Cambridge, MA: MIT Press.
Stojmenović, I. 1996. “Direct Interconnection Networks.” In Parallel and Distributed Computing Handbook, edited by Zomaya, A. Y. (New York: McGraw-Hill), 537–567.
Sullivan, H. and Bashkow, T. R.. 1977. “A Large Scale, Homogeneous, Fully Distributed Parallel Machine.” Proc. of the International Symposium on Computer Architecture, 1977, 105–124.
Sun, X-H. and Gustafson, J. L.. 1991. “Toward a Better Parallel Performance Metric.” Parallel Computing 17: 1093–1109.
Sun, X-H. and Ni, L. M.. 1990. “Another View of Parallel Speedup.” Supercomputing ’90 Proceedings, 324–333.
Sun, X-H. and Ni, L. M.. 1993. “Scalable Problems and Memory-bounded Speedup.” Journal of Parallel and Distributed Computing 19: 27–37.
Sun, X-H. and Zhu, J.. 1995. “Performance Considerations of Shared Virtual Memory Machines.” IEEE Transactions on Parallel and Distributed Systems 6 (11): 1185–1194.
Sun, X-H. and Rover, D. T.. 1994. “Scalability of Parallel Algorithm-machine Combinations.” IEEE Transactions on Parallel and Distributed Systems 5 (6): 599–613.
Talbi, E-G. 2006. Parallel Combinatorial Optimization. Hoboken, NJ: Wiley-Interscience.
Tanenbaum, A. S. 2006. Structured Computer Organization, 5th edn. Upper Saddle River, NJ: Pearson Education, Prentice Hall.
Tanenbaum, A. S. 2009. Modern Operating Systems, 3rd edn. Upper Saddle River, NJ: Prentice Hall.
Tanenbaum, A. S. and van Steen, M.. 2007. Distributed Systems. Principles and Paradigms, 2nd edn. Upper Saddle River, NJ: Pearson Education.
Taubenfeld, G. 2006. Synchronization Algorithms and Concurrent Programming. Harlow, UK: Pearson Education, Prentice Hall.
Tel, G. 1994. Introduction to Distributed Algorithms. Cambridge: Cambridge University Press.
Thekkath, R., Singh, A. P., Singh, J. P., Hennessy, J., and John, S.. 1997. “An Application-driven Evaluation of the Convex Exemplar SP-1200.” Proc. of the International Parallel Processing Symposium, June 1997, 8–17.
Thinking Machines Corporation. 1990. The CM-2 Technical Summary. Cambridge, MA: Thinking Machines Corporation.
Torán, J. 1993. “P-completeness.” In Lectures on Parallel Computation, edited by Gibbons, A. and Spirakis, P. (Cambridge: Cambridge University Press), 177–196.
Treleaven, P. C. 1985. “Control-driven, Data-driven and Demand-driven Computer Architecture.” Parallel Computing 2 (3): 287–288.
Trono, J. A. and Taylor, W. E.. 2000. “Further Comments on ‘A Correct and Unrestrictive Implementation of General Semaphores’.” ACM SIGOPS Operating Systems Review 34 (3): 5–10.
Ungerer, T., Robič, B., and Šilc, J.. 2003. “A Survey of Processors with Explicit Multithreading.” ACM Computing Surveys 35 (1): 29–63.
Valiant, L. G. 1990. “A Bridging Model for Parallel Computation.” Communications of the ACM 33 (8): 103–111.
Valiant, L. G. 1990. “General Purpose Parallel Architectures.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), 944–971.
Van-Catledge, F. A. 1989. “Towards a General Model for Evaluating the Relative Performance Computer Systems.” International Journal of Supercomputer Applications 3 (2): 100–108.
van Emde Boas, P. 1990. “Machine Models and Simulations.” In Handbook of Theoretical Computer Science, vol. A, edited by van Leeuwen, J. (Amsterdam, The Netherlands: Elsevier), 1–66.
Vazirani, V. V. 2003. Approximation Algorithms. Berlin: Springer-Verlag.
Venter, J. C., Adams, M. D., Myers, E. W., et al. 2001. “The Sequence of the Human Genome.” Science 291: 1304–1351.
Vishkin, U. 1983. “Implementation of Simultaneous Memory Address Access in Models that Forbid It.” Journal of Algorithms 4: 45–50.
Vishkin, U., Caragea, G. C., and Lee, B. C.. 2008. “Models for Advancing PRAM and Other Algorithms into Parallel Programs for a PRAM-on-chip Platform.” In Handbook of Parallel Computing. Models, Algorithms and Applications, edited by Rajasekaran, S. and Reif, J. (Boca Raton, FL: Chapman & Hall/CRC): 5-1–5-60.
Vos, J. B., Rizzi, A., Darracq, D., and Hirschel, E. H.. 2002. “Navier-Stokes Solvers in European Aircraft Design.” Progress in Aerospace Sciences 38: 601–697.
Wah, W. and Akl, S. G.. 1992. “Simulating Multiple Memory Accesses in Logarithmic Time and Linear Space.” The Computer Journal 35: 85–88.
Washington, W. M., Buja, L., and Craig, A.. 2009. “The Computational Future for Climate and Earth System Models: On the Path to Petaflop and Beyond.” Phil. Trans. R. Soc. A 367: 833–846. doi:10.1098/rsta.2008.0219.
Wilkinson, B. and Allen, M.. 1999. Parallel Programming. Techniques and Applications Using Networked Workstations and Parallel Computers. Upper Saddle River, NJ: Prentice Hall.
Wilson, G. V. 1993. “A Glossary of Parallel Computing Terminology.” IEEE Parallel & Distributed Technology February: 52–67.
Wilson, G. V. 1995. Practical Parallel Programming. Cambridge, MA: MIT Press.
Wilson, R. J. 1996. Introduction to Graph Theory, 4th edn. Harlow, UK: Addison Wesley Longman Ltd.
Winter, P. C., Hickey, G. J., and Fletcher, H. L.. 2002. Instant Notes. Genetics, 2nd edn. Milton Park, UK: BIOS Scientific Publishers.
Wolfe, M. 1996. High Performance Compilers for Parallel Computing. Redwood City, CA: Addison-Wesley.
Worley, P. H. 1990. “The Effect of Time Constraints on Scaled Speedup.” SIAM Journal on Scientific and Statistical Computing 11 (5): 838–858.
Wulf, W. A. and Bell, C. G.. 1972. “C.mmp-A Multimicroprocessor.” Proc. of AFIPS Conference, 765–777.
Xue, M., Droegemeier, K. K., and Weber, D.. 2008. “Numerical Prediction of High-impact Local Weather: A Driver for Petascale Computing.” In Petascale Computing. Algorithms and Applications, edited by Bader, D. A. (Boca Raton, FL: Chapman & Hall/CRC), 103–124.
Yokokawa, M., Shoji, F., and Hasegawa, Y.. 2015. “The K Computer.” In Contemporary High Performance Computing: From Petascale toward Exascale, edited by Vetter, J. S. (Chapman & Hall/CRC, Boca Raton, FL), vol. II, 115–139.
Zhou, X. 1989. “Bridging the Gap between Amdahl's Law and Sandia Laboratory's Result.” Communications of the ACM 32 (8): 1014–1015.
Zorbas, J. R., Reble, D. J., and VanKooten, R. E.. 1989. “Measuring the Scalability of Parallel Computer Systems.” Supercomputing ’89 Proc., 832–841.

  • References
  • Zbigniew J. Czech
  • Book: Introduction to Parallel Computing
  • Online publication: 06 January 2017
  • Chapter DOI: https://doi.org/10.1017/9781316795835.011