The 2018 International Symposium on Information Theory and Its Applications (ISITA2018)

Session Number: Mo-AM-1-2

Paper Number: Mo-AM-1-2.1

Scalable Machine Learning on Compact Data Representations

Yasuo Tabei

pp. 26-30

Publication Date: 2018/10/18

Online ISSN: 2188-5079

DOI: 10.34385/proc.55.Mo-AM-1-2.1

Summary:
With massive high-dimensional data now commonplace in research and industry, there is a strong and growing demand for more scalable computational techniques for data analysis and knowledge discovery. In this paper, we review scalable algorithms for learning statistical models on high-dimensional data. In particular, we introduce two techniques, one based on lossless compression and one on lossy compression. The first is a method using grammar compression, a lossless compression scheme for text that has been successfully applied to binary data matrices for scalable learning of statistical models. The second is a family of lossy compression methods known as feature maps (FMs). Recently, a large number of FMs for kernel approximation have been proposed and used in practical applications. These methods, of which we present a brief survey in this paper, open the door to large-scale analyses of massive, high-dimensional data.
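To make the grammar-compression idea concrete, below is a minimal sketch in the spirit of Re-Pair, a well-known grammar compressor: the most frequent pair of adjacent symbols is repeatedly replaced by a fresh nonterminal, so repeated substructure in a (binary) sequence collapses into a small set of grammar rules. This is an illustrative sketch only, not the specific construction used in the paper; the function name `repair_compress` and all details are assumptions.

```python
from collections import Counter

def repair_compress(seq):
    """Re-Pair-style grammar compression sketch: repeatedly replace
    the most frequent adjacent symbol pair with a fresh nonterminal
    until no pair occurs at least twice. Returns the compressed
    sequence and the grammar rules (nonterminal -> pair)."""
    seq = list(seq)
    rules = {}
    next_id = 0
    while True:
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        nt = f"R{next_id}"  # fresh nonterminal symbol
        next_id += 1
        rules[nt] = pair
        # One greedy left-to-right pass replacing the chosen pair.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules

compressed, grammar = repair_compress("0101101001011010")
print(compressed)  # short sequence over terminals and nonterminals
print(grammar)     # e.g. {'R0': ('0', '1'), ...}
```

Because learning algorithms can often operate directly on the rules instead of the decompressed matrix, the working-set size scales with the grammar rather than with the raw data.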
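As one concrete example of a feature map for kernel approximation, here is a sketch of random Fourier features (Rahimi and Recht, 2007) for the RBF kernel. This is a standard FM chosen for illustration, not necessarily one of the methods surveyed in the paper, and the parameter names are assumptions.

```python
import numpy as np

def random_fourier_features(X, D, gamma, rng):
    """Map X (n x d) to an explicit D-dimensional feature space whose
    inner products approximate the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies drawn from the kernel's Fourier transform:
    # for this parameterization, a Gaussian with std sqrt(2 * gamma).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
Z = random_fourier_features(X, D=1000, gamma=0.5, rng=rng)

# Inner products of the mapped features approximate the kernel matrix.
K_approx = Z @ Z.T
K_exact = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(-1))
print(np.abs(K_approx - K_exact).max())  # small approximation error
```

A linear model trained on Z then behaves like its kernelized counterpart while training in time linear in the number of examples, which is the scalability payoff the summary alludes to.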