Conference name
2023 General Conference
Conference code
2023G
Year
2023
Publication date
2023-02-28
Session number
D-20
Session name
Information-Theoretic Learning Theory and Machine Learning
Presentation date
2023/3/8
Venue (room, etc.)
Building 2, Room 2401
Presentation number
D-20-19
Title
Domain Adaptation for Japanese Sentence Embedding Models with Contrastive Learning
Authors
◎Zihao Chen, Hisashi Handa, Kimiaki Shirahama
Keywords
Domain adaptation, Contrastive learning, Data generation, Low-resource language
Abstract
We propose JCSE, a novel Japanese sentence representation framework that creates training data for domain adaptation by generating sentences and synthesizing them with sentences available in a target domain. Specifically, a pre-trained data generator is fine-tuned to the target domain using a corpus that we collected. It is then used to generate contradictory sentence pairs, which are used in contrastive learning to adapt a Japanese language model to a specific task in the target domain. The experimental results show that JCSE achieves a significant performance improvement, surpassing direct transfer and other training strategies. We believe that JCSE paves a practicable way for domain-specific downstream tasks in low-resource languages like Japanese.
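
The abstract does not give the training objective, so the following is only a minimal sketch of how a generated contradictory sentence could act as a hard negative in contrastive learning. It assumes a SimCSE-style InfoNCE loss over cosine similarities with a temperature of 0.05; the function names, the temperature value, and the use of plain Python lists as stand-ins for sentence embeddings are all illustrative assumptions, not details taken from JCSE.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(anchors, positives, hard_negatives, temperature=0.05):
    """Mean InfoNCE loss over a batch (assumed objective, not from the abstract).

    For anchor i, the matching positive is positives[i]. Every other positive
    in the batch and every hard negative (e.g. a generated contradictory
    sentence) competes with it in the softmax denominator.
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], p) / temperature for p in positives]
        logits += [cosine(anchors[i], h) / temperature for h in hard_negatives]
        # Numerically stable log-sum-exp over all candidates.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        # Negative log-probability of picking the true positive.
        total += log_denominator - logits[i]
    return total / n


# Toy 2-d "embeddings": each anchor coincides with its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.0], [0.0, 1.0]]
# Hard negatives that point away from their anchors (well separated).
separated = [[-1.0, 0.0], [0.0, -1.0]]
# Hard negatives that sit on top of another anchor's positive (confusable).
confusable = [[0.0, 1.0], [1.0, 0.0]]

loss_separated = contrastive_loss(anchors, positives, separated)
loss_confusable = contrastive_loss(anchors, positives, confusable)
```

In this toy setup the loss is lower when the hard negatives are embedded far from their anchors, which is the behaviour that training on contradictory pairs is meant to encourage. In practice the embeddings would come from the Japanese language model being adapted.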