Jargon of Hadoop MapReduce scheduling techniques: a scientific categorization

Published online by Cambridge University Press:  15 March 2019

Muhammad Hanif
Affiliation:
Division of Computer Science and Engineering, Hanyang University, Seoul, Republic of Korea
Choonhwa Lee
Affiliation:
Division of Computer Science and Engineering, Hanyang University, Seoul, Republic of Korea

Abstract

The valuable knowledge that can be extracted from huge volumes of data (called Big Data) has set in motion the development of frameworks that process data using parallel and distributed computing, including Apache Hadoop, Facebook Corona, and Microsoft Dryad. Apache Hadoop is an open source implementation of Google MapReduce that has attracted strong attention from researchers in both academia and industry. Hadoop MapReduce scheduling algorithms play a critical role in the management of large commodity clusters, controlling QoS requirements by supervising the execution of users, jobs, and tasks. Hadoop MapReduce provides three schedulers: FIFO, Fair, and Capacity. However, the research community has developed new optimizations that account for advances and dynamic changes in hardware and operating environments. Numerous efforts in the literature address network congestion, straggling, data locality, heterogeneity, resource under-utilization, and skew mitigation in Hadoop scheduling. The volume of research on Hadoop scheduling published in journals and conferences has increased steadily, which makes it difficult for researchers to grasp the overall state of the field and to identify areas that require further investigation. In this study, we conduct a scientific literature review to assess prior research contributions to the Apache Hadoop scheduling mechanism. We classify and quantify the main issues addressed in the literature based on their jargon and the areas they address. Moreover, we explain and discuss the various challenges and open issues in Hadoop scheduling optimization.

Type
Review
Copyright
© Cambridge University Press, 2019 
