Hostname: page-component-745bb68f8f-kw2vx Total loading time: 0 Render date: 2025-01-26T04:09:02.098Z Has data issue: false hasContentIssue false

Implicitly threaded parallelism in Manticore

Published online by Cambridge University Press:  27 January 2011

MATTHEW FLUET
Affiliation:
Computer Science Department, Rochester Institute of Technology, Rochester, NY, USA (e-mail: [email protected])
MIKE RAINEY
Affiliation:
Department of Computer Science, University of Chicago, Chicago, IL, USA (e-mail: [email protected])
JOHN REPPY
Affiliation:
Department of Computer Science, University of Chicago, Chicago, IL, USA (e-mail: [email protected])
ADAM SHAW
Affiliation:
Department of Computer Science, University of Chicago, Chicago, IL, USA (e-mail: [email protected])
Rights & Permissions [Opens in a new window]

Abstract

Core share and HTML view are not available for this content. However, as you have access to this content, a full PDF is available via the ‘Save PDF’ action button.

The increasing availability of commodity multicore processors is making parallel computing ever more widespread. In order to exploit its potential, programmers need languages that make the benefits of parallelism accessible and understandable. Previous parallel languages have traditionally been intended for large-scale scientific computing, and they tend not to be well suited to programming the applications one typically finds on a desktop system. Thus, we need new parallel-language designs that address a broader spectrum of applications. The Manticore project is our effort to address this need. At its core is Parallel ML, a high-level functional language for programming parallel applications on commodity multicore hardware. Parallel ML provides a diverse collection of parallel constructs for different granularities of work. In this paper, we focus on the implicitly threaded parallel constructs of the language, which support fine-grained parallelism. We concentrate on those elements that distinguish our design from related ones, namely, a novel parallel binding form, a nondeterministic parallel case form, and the treatment of exceptions in the presence of data parallelism. These features differentiate the present work from related work on functional data-parallel language designs, which have focused largely on parallel problems with regular structure and the compiler transformations—most notably, flattening—that make such designs feasible. We present detailed examples utilizing various mechanisms of the language and give a formal description of our implementation.

Type
Articles
Copyright
Copyright © Cambridge University Press 2011

References

Acar, U. A., Blelloch, G. E. & Blumofe, R. D. (2000) The data locality of work stealing. In Proceedings of the 12th ACM Annual Symposium on Parallel Algorithms and Architectures. ACM, pp. 112.Google Scholar
Armstrong, J., Virding, R., Wikström, C. & Williams, M. (1996) Concurrent Programming in ERLANG, 2nd ed. Hertfordshire, UK: Prentice Hall International.Google Scholar
Arora, N. S., Blumofe, R. D. & Plaxton, C. G. (1998) Thread Scheduling for Multiprogrammed Multiprocessors.CrossRefGoogle Scholar
Barth, P., Nikhil, R. S. & Arvind, (1991) M-structures: Extending a parallel, non-strict, functional language with state. In Functional Programming Languages and Computer Architecture (fpca '91) New York, NY. Lecture Notes in Computer Science, vol. 523. Springer-Verlag, pp. 538568.CrossRefGoogle Scholar
Barton, R., Adkins, D., Prokop, H., Frigo, M., Joerg, C., Renard, M., Dailey, D. & Leiserson, C. (1998) Cilk Pousse. Viewed on March 20, 2008 at 2:45 PM.Google Scholar
Bergstrom, L., Fluet, M., Rainey, M., Reppy, J. & Shaw, A. (2010) Lazy tree splitting. In Proceedings of the 15th ACM SIGPLAN International Conference on Functional Programming, New York, NY. ACM, pp. 93104.CrossRefGoogle Scholar
Blelloch, G. E. (1996) Programming parallel algorithms, Commun. ACM, 39 (3): 8597.CrossRefGoogle Scholar
Blelloch, G. E., Chatterjee, S., Hardwick, J. C., Sipelstein, J. & Zagha, M. (1994) Implementation of a portable nested data-parallel language, J. Parallel Distrib. Comput., 21 (1): 414.CrossRefGoogle Scholar
Blelloch, G. E. & Greiner, J. (1996) A provable time and space efficient implementation of NESL. In Proceedings of the 1996 ACM SIGPLAN International Conference on Functional Programming, New York, NY. ACM, pp. 213225.Google Scholar
Blumofe, R. D., Joerg, C. F., Kuszmaul, B. C., Leiserson, C. E., Randall, K. H. & Zhou, Y. (1995) Cilk: An efficient multithreaded runtime system. In Proceedings of the 5th ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, New York, NY. ACM, pp. 207216.Google Scholar
Blumofe, R. D. & Leiserson, C. E. (1999) Scheduling multithreaded computations by work stealing, J. ACM, 46 (5): 720748.CrossRefGoogle Scholar
Boehm, H.-J., Atkinson, R. & Plass, M. (1995) Ropes: An alternative to strings, Software Pract. Ex., 25 (12): 13151330.CrossRefGoogle Scholar
Buck, I., Foley, T., Horn, D., Sugerman, J., Fatahalian, K., Houston, M. & Hanrahan, P. (2004) Brook for GPUs: Stream computing on graphics hardware, Proc. ACM SIGGRAPH 2004, 23 (3): 777786.CrossRefGoogle Scholar
Chakravarty, M. M. T. & Keller, G. (2000) More types for nested data parallel programming. In Proceedings of the Fifth ACM SIGPLAN International Conference on Functional Programming, New York, NY. ACM, pp. 94105.CrossRefGoogle Scholar
Chakravarty, M. M. T., Keller, G., Leshchinskiy, R. & Pfannenstiel, W. (2001) Nepal—Nested data parallelism in Haskell. In Proceedings of the 7th International Euro-Par Conference on Parallel Computing, New York, NY. Lecture Notes in Computer Science, vol. 2150. Springer-Verlag, pp. 524534.Google Scholar
Chakravarty, M. M. T., Leshchinskiy, R., Peyton Jones, S., Keller, G. & Marlow, S. (2007) Data parallel Haskell: A status report. In Proceedings of the ACM SIGPLAN Workshop on Declarative Aspects of Multicore Programming, New York, NY. ACM, pp. 1018.Google Scholar
Dailey, D. & Leiserson, C. E. (2002) Using Cilk to write multiprocessor chess programs, J. Int. Comput. Chess Assoc., 24 (4): 236237.Google Scholar
Danaher, J. S., Lee, I.-T. A. & Leiserson, C. E. (2006) Programming with Exceptions in JCilk, Sci. Comput. Program., 63 (2): 147171.CrossRefGoogle Scholar
Dean, J. & Ghemawat, S. (December 2004) MapReduce: Simplified data processing on large clusters. In Proceedings of the Sixth Symposium on Operating Systems Design and Implementation (OSDI '04), Berkely, CA. USENIX, pp. 137150.Google Scholar
Feeley, M. (1993) An Efficient and General Implementation of Futures on Large Scale Shared-Memory Multiprocessors. PhD thesis, Brandeis University, Waltham, MA, USA.Google Scholar
Fluet, M., Ford, N., Rainey, M., Reppy, J., Shaw, A. & Xiao, Y. (2007a) Status report: The Manticore project. In Proceedings of the 2007 ACM SIGPLAN Workshop on ML, New York, NY. ACM, pp. 1524.CrossRefGoogle Scholar
Fluet, M., Rainey, M. & Reppy, J. (2008a) A scheduling framework for general-purpose parallel languages. In Proceedings of the 13th ACM SIGPLAN International Conference on Functional Programming, New York, NY. ACM, pp. 241252.CrossRefGoogle Scholar
Fluet, M., Rainey, M., Reppy, J. & Shaw, A. (2008b) Implicitly-threaded parallelism in Manticore. In Proceedings of the 13th ACM SIGPLAN International Conference on Functional Programming, New York, NY. ACM, pp. 119130.CrossRefGoogle Scholar
Fluet, M., Rainey, M., Reppy, J., Shaw, A. & Xiao, Y. (2007b) Manticore: A heterogeneous parallel language. In Proceedings of the ACM SIGPLAN Workshop on Declarative Aspects of Multicore Programming, New York, NY. ACM, pp. 3744.Google Scholar
Frigo, M., Leiserson, C. E. & Randall, K. H. (June 1998) The implementation of the Cilk-5 multithreaded language. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation (PLDI '98), New York, NY. ACM, pp. 212223.Google Scholar
Gansner, E. R. & Reppy, J. H. (eds). (2004) The Standard ML Basis Library. Cambridge, England: Cambridge University Press.CrossRefGoogle Scholar
Gaudiot, J.-L., DeBoni, T., Feo, J., Bohm, W., Najjar, W. & Miller, P. (1997) The Sisal model of functional programming and its implementation. In Proceedings of the 2nd AIZU International Symposium on Parallel Algorithms/Architecture Synthesis (pAs '97), Los Alamitos, CA. IEEE Computer Society, pp. 112123.CrossRefGoogle Scholar
GHC. (November 1998) The Glasgow Haskell Compiler [online]. Available at: http://www.haskell.org/ghc Accessed 10 December 2010.Google Scholar
, R. H. Jr. (1984) Implementation of multilisp: Lisp on a multiprocessor. In Conference Record of the 1984 ACM Symposium on Lisp and Functional Programming. New York, NY. ACM, pp. 917.Google Scholar
Halstead, R. H. Jr. (1985) Multilisp: A language for concurrent and symbolic computation, ACM Trans. Program. Lang. Syst., 7: 501538.CrossRefGoogle Scholar
Hammond, K. (1991) Parallel SML: A Functional Language and its Implementation in Dactl. Cambridge, MA: MIT Press.Google Scholar
Harris, T., Marlow, S., Peyton Jones, S. & Herlihy, M. (2005) Composable memory transactions. In Proceedings of the 2005 ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, New York, NY. ACM, pp. 4860.Google Scholar
Hauser, C., Jacobi, C., Theimer, M., Welch, B. & Weiser, M. (December 1993). Using threads in interactive systems: A case study. In Proceedings of the 14th ACM Symposium on Operating System Principles. ACM, pp. 94105.Google Scholar
Hedqvist, P. (June 1998) A Parallel and Multithreaded ERLANG Implementation. MPhil thesis, Computer Science Department, Uppsala University, Uppsala, Sweden.Google Scholar
Jones, M. P. & Hudak, P. (August 1993) Implicit and Explicit Parallel Programming in Haskell. Tech. Rep. YALEU/DCS/RR-982, Yale University.Google Scholar
Krishnamurthy, A., Culler, D. E., Dusseau, A., Goldstein, S. C., Lumetta, S., von Eicken, T. & Yelick, K. (1993) Parallel programming in split-C. In Supercomputing '93: Proceedings of the 1993 ACM/IEEE Conference on Supercomputing, New York, NY. ACM, pp. 262273.CrossRefGoogle Scholar
Le Fessant, F. & Maranget, L. (1998) Compiling join-patterns. In Proceedings of the Third International Workshop on High-Level Concurrent Languages (HLCL '98). Electronic Notes in Theoretical Computer Science, vol. 16, no. 3. Elsevier Science, pp. 205224.Google Scholar
Leroy, X. & Pessaux, F. (2000) Type-based analysis of uncaught exceptions, ACM Trans. Program. Lang. Syst., 22 (2): 340377.CrossRefGoogle Scholar
Leshchinskiy, R., Chakravarty, M. M. T. & Keller, G. (2006) Higher order flattening. In International Conference on Computational Science (ICCS '06), Alexandrov, V., van Albada, D., Sloot, P. & Dongarra, J. (eds), LNCS, no. 3992. New York, NY. Springer-Verlag, pp. 920928.Google Scholar
Mandel, L. & Maranget, L. (December 2008). The JoCaml Language Release 3.11 Documentation and User's Manual [online]. Available at: http://jocaml.inria.fr/manual/index.html Accessed 10 December 2010.Google Scholar
McCarthy, J. (1963) A basis for a mathematical theory of computation. In Computer Programming and Formal Systems, Braffort, P. & Hirschberg, D. (eds), North-Holland, Amsterdam, pp. 3370.CrossRefGoogle Scholar
Milner, R., Tofte, M., Harper, R. & MacQueen, D. (1997) The Definition of Standard ML (revised). Cambridge, MA: MIT Press.CrossRefGoogle Scholar
Mirani, R. & Hudak, P. (1995) First-class schedules and virtual maps. In Fpca '95: Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture, New York, NY. ACM, pp. 7885.CrossRefGoogle Scholar
Mirani, R. & Hudak, P. (2004) First-class monadic schedules, ACM Trans. Program. Lang. Syst., 26 (4): 609651.CrossRefGoogle Scholar
Mohr, E., Kranz, D. A. & Halstead, R. H. Jr. (1990) Lazy task creation: A technique for increasing the granularity of parallel programs. In Conference Record of the 1990 ACM Conference on Lisp and Functional Programming. New York, NY. ACM, pp. 185197.CrossRefGoogle Scholar
Nikhil, R. S. (July 1991). ID Language Reference Manual. Cambridge, MA: Laboratory for Computer Science, MIT.Google Scholar
Nikhil, R. S. & Arvind, (2001) Implicit Parallel Programming in pH San Francisco, CA: Morgan Kaufmann.Google Scholar
Osborne, R. B. (1990) Speculative computation in multilisp. In Conference record of the 1990 ACM Conference on Lisp and Functional Programming. New York, NY. ACM, pp. 198208.CrossRefGoogle Scholar
PeytonJones, S. Jones, S., Gordon, A. & Finne, S. (1996) Concurrent Haskell. In Conference Record of the 23rd Annual ACM Symposium on Principles of Programming Languages (popl '96). New York, NY. ACM, pp. 295308.Google Scholar
PeytonJones, S. Jones, S., Reid, A., Henderson, F., Hoare, T. & Marlow, S. (1999) A semantics for imprecise exceptions. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation (PLDI '99), New York, NY. ACM, pp. 2536.CrossRefGoogle Scholar
Pike, R., Dorward, S., Griesemer, R. & Quinlan, S. (2005) Interpreting the data: Parallel analysis with sawzall, Sci. Program. J., 13 (4): 227298.Google Scholar
Rainey, M. (August 2010) Effective Scheduling Techniques for High-Level Parallel Programming Languages [online]. PhD thesis, University of Chicago. Available at: http://manticore.cs.uchicago.edu Accessed 10 December 2010.Google Scholar
Reppy, J., Russo, C. & Xiao, Y. (2009) Parallel Concurrent ML. In Proceedings of the 14th ACM SIGPLAN International Conference on Functional Programming, New York, NY. ACM, pp. 257268.CrossRefGoogle Scholar
Reppy, J. H. (1991) CML: A higher-order concurrent language. In Proceedings of the SIGPLAN Conference on Programming Language Design and Implementation (PLDI '91), New York, NY. ACM, pp. 293305.Google Scholar
Reppy, J. H. (1999) Concurrent Programming in ML. Cambridge, England: Cambridge University Press.CrossRefGoogle Scholar
Shaw, A. (July 2007). Data Parallelism in Manticore [online]. MPhil thesis, University of Chicago. Available at: http://manticore.cs.uchicago.edu Accessed 10 December 2010.Google Scholar
Spoonhower, D., Blelloch, G. E., Gibbons, P. B. & Harper, R. (2008) Beyond nested parallelism: Tight bounds on work-stealing overheads for parallel futures. In Proceedings of the 20th ACM Annual Symposium on Parallel Algorithms and Architectures, New York, NY. ACM.Google Scholar
Tarditi, D., Puri, S. & Oglesby, J. (2006) Accelerator: Using data parallelism to program GPUs for general-purpose uses, Sigops Pper. Syst. Rev., 40 (5): 325335.CrossRefGoogle Scholar
Trinder, P. W., Hammond, K., Loidl, H.-W. & Peyton Jones, S. L. (1998) Algorithm + strategy = parallelism, J. Funct. Program., 8 (1): 2360.CrossRefGoogle Scholar
Tzannes, A., Caragea, G. C., Barua, R. & Vishkin, U. (2010) Lazy binary-splitting: A run-time adaptive work-stealing scheduler. In Proceedings of the 2010 ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming, New York, NY. ACM, pp. 179190.Google Scholar
Yi, K. (1998) An abstract interpretation for estimating uncaught exceptions in Standard ML programs, Sci. Comput. Program., 31 (1): 147173.CrossRefGoogle Scholar
Submit a response

Discussions

No Discussions have been published for this article.