Summary

International Conference on Emerging Technologies for Communications

2020

Session Number:P2

Number:P2-1

Hadoop I/O Performance Improvement with ext4 on HDD in fully distributed mode

Makoto Nakagami, Jose A.B. Fortes, Saneyasu Yamaguchi

Publication Date:2020/12/2

Online ISSN:2188-5079

DOI:10.34385/proc.63.P2-1

Summary:
In our previous work, we showed that the Ext4 filesystem does not actively reuse free space obtained by file deletion, and that this behavior degrades the performance of applications that repeatedly perform sequential storage accesses, such as Hadoop jobs. We then proposed a method that improves the performance of such applications by actively placing files in the faster zones of hard disk drives under Ext4, and evaluated it with a Hadoop application running in stand-alone mode, i.e., pseudo-distributed mode. In this paper, we evaluate the method in a more practical setting, i.e., fully distributed mode, and show that it also works effectively in this setting.