Presentation 2016-09-13
Performance Analysis of MapReduce Shuffling
Harunobu Daikoku, Hideyuki Kawashima, Osamu Tatebe
Abstract(in Japanese) (See Japanese page)
Abstract(in English) This paper analyzes the shuffling performance of Apache Spark, one of the most popular MapReduce implementations in recent years. The two shuffle implementations provided by Spark 1.6.2, Sort-based Shuffle and Hash-based Shuffle, are evaluated and compared in terms of network I/O and disk I/O. The evaluation showed little difference between the two implementations with regard to network I/O, and more frequent disk I/O operations with Hash-based Shuffle than with Sort-based Shuffle. Nevertheless, Hash-based Shuffle showed better performance than Sort-based Shuffle in terms of overall execution time.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) MapReduce / Shuffle / Apache Spark / Performance Analysis
Paper # DE2016-15
Date of Issue 2016-09-06 (DE)

Conference Information
Committee DE
Conference Date 2016-09-13 (3 days)
Place (in Japanese) (See Japanese page)
Place (in English) Keio Univ. (Hiyoshi Campus)
Topics (in Japanese) (See Japanese page)
Topics (in English) Big Data Management, Information Retrieval, Knowledge Discovery, etc.
Chair Masato Oguchi(Ochanomizu Univ.)
Vice Chair Makoto Onizuka(Osaka Univ.) / Masashi Toyoda(Univ. of Tokyo)
Secretary Makoto Onizuka(Kyushu Univ.) / Masashi Toyoda(Kogakuin Univ.)
Assistant Mayuki Ueda(Univ. of Marketing and Distribution Science) / Shingo Otsuka(Kanagawa Inst. of Tech.)

Paper Information
Registration To Technical Committee on Data Engineering
Language JPN
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) Performance Analysis of MapReduce Shuffling
Sub Title (in English)
Keyword(1) MapReduce
Keyword(2) Shuffle
Keyword(3) Apache Spark
Keyword(4) Performance Analysis
1st Author's Name Harunobu Daikoku
1st Author's Affiliation University of Tsukuba(Univ. Tsukuba)
2nd Author's Name Hideyuki Kawashima
2nd Author's Affiliation University of Tsukuba(Univ. Tsukuba)
3rd Author's Name Osamu Tatebe
3rd Author's Affiliation University of Tsukuba(Univ. Tsukuba)
Date 2016-09-13
Paper # DE2016-15
Volume (vol) vol.116
Number (no) DE-214
Page pp.19-24 (DE)
#Pages 6
Date of Issue 2016-09-06 (DE)