Presentation | 2016-09-13 Performance Analysis of MapReduce Shuffling Harunobu Daikoku, Hideyuki Kawashima, Osamu Tatebe |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | This paper analyzes the shuffling performance of Apache Spark, one of the most popular MapReduce implementations in recent years. The performance of Sort-based Shuffle and Hash-based Shuffle, the two shuffle implementations provided by Spark 1.6.2, is evaluated and compared in terms of network I/O and disk I/O. The results show little difference between the two implementations with regard to network I/O. Although the Hash-based implementation performed disk I/O operations more frequently than the Sort-based implementation, Hash-based Shuffle outperformed Sort-based Shuffle in overall execution time. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | MapReduce / Shuffle / Apache Spark / Performance Analysis |
Paper # | DE2016-15 |
Date of Issue | 2016-09-06 (DE) |
Conference Information | |
Committee | DE |
---|---|
Conference Date | 2016/9/13 (3 days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | Keio Univ. (Hiyoshi Campus) |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | Big Data Management, Information Retrieval, Knowledge Discovery, etc. |
Chair | Masato Oguchi(Ochanomizu Univ.) |
Vice Chair | Makoto Onizuka(Osaka Univ.) / Masashi Toyoda(Univ. of Tokyo) |
Secretary | Makoto Onizuka(Kyushu Univ.) / Masashi Toyoda(Kogakuin Univ.) |
Assistant | Mayuki Ueda(Univ. of Marketing and Distribution Science) / Shingo Otsuka(Kanagawa Inst. of Tech.) |
Paper Information | |
Registration To | Technical Committee on Data Engineering |
---|---|
Language | JPN |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | Performance Analysis of MapReduce Shuffling |
Sub Title (in English) | |
Keyword(1) | MapReduce |
Keyword(2) | Shuffle |
Keyword(3) | Apache Spark |
Keyword(4) | Performance Analysis |
1st Author's Name | Harunobu Daikoku |
1st Author's Affiliation | University of Tsukuba(Univ. Tsukuba) |
2nd Author's Name | Hideyuki Kawashima |
2nd Author's Affiliation | University of Tsukuba(Univ. Tsukuba) |
3rd Author's Name | Osamu Tatebe |
3rd Author's Affiliation | University of Tsukuba(Univ. Tsukuba) |
Date | 2016-09-13 |
Paper # | DE2016-15 |
Volume (vol) | vol.116 |
Number (no) | DE-214 |
Page | pp.19-24(DE) |
#Pages | 6 |
Date of Issue | 2016-09-06 (DE) |