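Research Output

Conference Presentation
Paper title: SCOPA: Soft Code-Switching and Pairwise Alignment for Zero-shot Cross-lingual Transfer
Date: 2021.11.01
Conference: 30th ACM International Conference on Information and Knowledge Management
Principal professor:
Type: Oral presentation
First authors: Dohyeon Lee, Jaeseong Lee, Gyewon Lee
Corresponding author: Seung-Won Hwang
Co-authors: Seung-Won Hwang, Byung-Gon Chun
Domestic/International: International
Host country: N/A
Organizing institution: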

The recent advent of cross-lingual embeddings, such as multilingual BERT (mBERT), provides a strong baseline for zero-shot cross-lingual transfer. There is also growing research interest in reducing the alignment discrepancy of cross-lingual embeddings between source and target languages by generating code-switched sentences, in which randomly selected words in the source language are substituted with their counterparts in the target languages. Although these approaches improve performance, naïvely code-switched sentences can have inherent limitations. In this paper, we propose SCOPA, a novel technique to improve the performance of zero-shot cross-lingual transfer. Instead of using the embeddings of code-switched sentences directly, SCOPA mixes them softly with the embeddings of the original sentences. In addition, SCOPA utilizes an additional pairwise alignment objective, which aligns the vector differences of word pairs instead of word-level embeddings, in order to transfer contextualized information between different languages while preserving language-specific information. Experiments on the PAWS-X and MLDoc datasets show the effectiveness of SCOPA.
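
The two ideas in the abstract, soft mixing and pairwise alignment, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the formulas, so the convex interpolation with a coefficient `lam`, the squared-distance loss, and the function names `soft_mix` and `pairwise_alignment_loss` are all assumptions made for illustration. Embeddings are plain lists of floats, one vector per token.

```python
from typing import List, Sequence, Tuple

Vector = List[float]


def soft_mix(original: Sequence[Vector], switched: Sequence[Vector], lam: float) -> List[Vector]:
    """Interpolate per-token embeddings: lam * switched + (1 - lam) * original.

    Assumption: a simple convex combination; the abstract only says "mixes softly".
    """
    if len(original) != len(switched):
        raise ValueError("sentences must have the same number of tokens")
    return [
        [lam * s + (1.0 - lam) * o for o, s in zip(o_vec, s_vec)]
        for o_vec, s_vec in zip(original, switched)
    ]


def pairwise_alignment_loss(
    source: Sequence[Vector],
    target: Sequence[Vector],
    pairs: Sequence[Tuple[int, int]],
) -> float:
    """Mean squared distance between difference vectors of word pairs.

    For each pair (i, j), compares source[i] - source[j] with
    target[i] - target[j]. Assumption: squared Euclidean distance,
    averaged over the given pairs.
    """
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        src_diff = [a - b for a, b in zip(source[i], source[j])]
        tgt_diff = [a - b for a, b in zip(target[i], target[j])]
        total += sum((s - t) ** 2 for s, t in zip(src_diff, tgt_diff))
    return total / len(pairs)


# Toy usage with 2-dimensional embeddings for a two-token sentence.
original = [[0.0, 0.0], [2.0, 2.0]]
switched = [[1.0, 1.0], [2.0, 4.0]]
mixed = soft_mix(original, switched, lam=0.5)

# The target is the source shifted by a constant offset.
source = [[0.0, 0.0], [1.0, 1.0]]
shifted_target = [[5.0, 5.0], [6.0, 6.0]]
offset_loss = pairwise_alignment_loss(source, shifted_target, [(0, 1)])
```

In this toy setup a target that is only a constant shift of the source gives a pairwise loss of zero, because the difference vectors are identical. A word-level alignment objective would penalize that shift, which is one way to read the abstract's claim that aligning differences preserves language-specific information.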

