Presentation 2023-12-21
A linear time approximation of Wasserstein distance with word embedding selection
Sho Otao, Makoto Yamada,
PDF Download Page PDF download Page Link
Abstract(in Japanese) (See Japanese page)
Abstract(in English) Wasserstein distance, which can be computed by solving the optimal transport problem, is a powerful method for measuring the distance between distributions. In the NLP community, it is referred as word mover's distance to measure dissimilarity between documents, treating documents as word distributions. One of the key challenges of Wasserstein distance is its computational cost since it needs cubic time. Although the Sinkhorn algorithm is a powerful tool to speed up to compute the Wasserstein distance, it still requires square time. Recently, a linear time approximation of the Wasserstein distance including the sliced Wasserstein and the tree-Wasserstein distance has been proposed. However, the linear time approximation method suffers when the dimensionality of input vectors is high. In this study, we propose a method to combine feature selection and tree approximation of Wasserstein distance to handle high-dimensional problems and compute dissimilarity between documents rapidly. More specifically, we concatenate multiple word embeddings and automatically select useful word embeddings from a concatenated embedding in a tree approximation of Wasserstein distance. To this end, we approximate Wasserstein distance for each word vector by tree approximation technique, and select the discriminative (i.e., large Wasserstein distance) word embeddings by solving an entropic regularized maximization problem. Through our synthetic experiments, we confirmed the efficacy of feature selection in our proposed method. Through our experiments on document classification, our proposed method outperformed the method that directly uses the concatenated embedding and achieved consistently high performance on all datasets.
Keyword(in Japanese) (See Japanese page)
Keyword(in English) Optimal Transport / Group Feature Selection / Document Classification / Word Embedding
Paper # IBISML2023-38
Date of Issue 2023-12-13 (IBISML)

Conference Information
Committee IBISML
Conference Date 2023/12/20(2days)
Place (in Japanese) (See Japanese page)
Place (in English) National Institute of Informatics
Topics (in Japanese) (See Japanese page)
Topics (in English) machine learning, etc.
Chair Masashi Sugiyama(Univ. of Tokyo)
Vice Chair Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo)
Secretary Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.)
Assistant Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Univ.of Tokyo)

Paper Information
Registration To Technical Committee on Information-Based Induction Sciences and Machine Learning
Language ENG-JTITLE
Title (in Japanese) (See Japanese page)
Sub Title (in Japanese) (See Japanese page)
Title (in English) A linear time approximation of Wasserstein distance with word embedding selection
Sub Title (in English)
Keyword(1) Optimal Transport
Keyword(2) Group Feature Selection
Keyword(3) Document Classification
Keyword(4) Word Embedding
1st Author's Name Sho Otao
1st Author's Affiliation Kyoto University(Kyoto Univ.)
2nd Author's Name Makoto Yamada
2nd Author's Affiliation Okinawa Institute of Science and Technology(OIST)
Date 2023-12-21
Paper # IBISML2023-38
Volume (vol) vol.123
Number (no) IBISML-311
Page pp.pp.50-57(IBISML),
#Pages 8
Date of Issue 2023-12-13 (IBISML)