Presentation | 2023-12-21 A linear time approximation of Wasserstein distance with word embedding selection Sho Otao, Makoto Yamada, |
---|---|
PDF Download Page | PDF download Page Link |
Abstract(in Japanese) | (See Japanese page) |
Abstract(in English) | Wasserstein distance, which can be computed by solving the optimal transport problem, is a powerful method for measuring the distance between distributions. In the NLP community, it is referred as word mover's distance to measure dissimilarity between documents, treating documents as word distributions. One of the key challenges of Wasserstein distance is its computational cost since it needs cubic time. Although the Sinkhorn algorithm is a powerful tool to speed up to compute the Wasserstein distance, it still requires square time. Recently, a linear time approximation of the Wasserstein distance including the sliced Wasserstein and the tree-Wasserstein distance has been proposed. However, the linear time approximation method suffers when the dimensionality of input vectors is high. In this study, we propose a method to combine feature selection and tree approximation of Wasserstein distance to handle high-dimensional problems and compute dissimilarity between documents rapidly. More specifically, we concatenate multiple word embeddings and automatically select useful word embeddings from a concatenated embedding in a tree approximation of Wasserstein distance. To this end, we approximate Wasserstein distance for each word vector by tree approximation technique, and select the discriminative (i.e., large Wasserstein distance) word embeddings by solving an entropic regularized maximization problem. Through our synthetic experiments, we confirmed the efficacy of feature selection in our proposed method. Through our experiments on document classification, our proposed method outperformed the method that directly uses the concatenated embedding and achieved consistently high performance on all datasets. |
Keyword(in Japanese) | (See Japanese page) |
Keyword(in English) | Optimal Transport / Group Feature Selection / Document Classification / Word Embedding |
Paper # | IBISML2023-38 |
Date of Issue | 2023-12-13 (IBISML) |
Conference Information | |
Committee | IBISML |
---|---|
Conference Date | 2023/12/20(2days) |
Place (in Japanese) | (See Japanese page) |
Place (in English) | National Institute of Informatics |
Topics (in Japanese) | (See Japanese page) |
Topics (in English) | machine learning, etc. |
Chair | Masashi Sugiyama(Univ. of Tokyo) |
Vice Chair | Toshihiro Kamishima(AIST) / Koji Tsuda(Univ. of Tokyo) |
Secretary | Toshihiro Kamishima(NTT) / Koji Tsuda(Hokkaido Univ.) |
Assistant | Yoshinobu Kawahara(Osaka Univ.) / Taiji Suzuki(Univ.of Tokyo) |
Paper Information | |
Registration To | Technical Committee on Information-Based Induction Sciences and Machine Learning |
---|---|
Language | ENG-JTITLE |
Title (in Japanese) | (See Japanese page) |
Sub Title (in Japanese) | (See Japanese page) |
Title (in English) | A linear time approximation of Wasserstein distance with word embedding selection |
Sub Title (in English) | |
Keyword(1) | Optimal Transport |
Keyword(2) | Group Feature Selection |
Keyword(3) | Document Classification |
Keyword(4) | Word Embedding |
1st Author's Name | Sho Otao |
1st Author's Affiliation | Kyoto University(Kyoto Univ.) |
2nd Author's Name | Makoto Yamada |
2nd Author's Affiliation | Okinawa Institute of Science and Technology(OIST) |
Date | 2023-12-21 |
Paper # | IBISML2023-38 |
Volume (vol) | vol.123 |
Number (no) | IBISML-311 |
Page | pp.pp.50-57(IBISML), |
#Pages | 8 |
Date of Issue | 2023-12-13 (IBISML) |