Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies
Cross-lingual transfer leverages knowledge from a resource-rich source language, commonly English, to enhance performance in less-resourced target languages. Two widely used strategies are: Cross-Lingual Validation (CLV), which involves training on the source language and validating on the target language, and Intermediate Training (IT), where models are first fine-tuned on the source language and then further trained on the target language. While both strategies have been studied, their effects on encoder-based models for classification tasks remain underexplored. In this paper, we systematically compare these strategies across six multilingual classification tasks, evaluating downstream performance, catastrophic forgetting, and both zero-shot and full-shot scenarios. Additionally, we contrast parameter-efficient adapter methods with full-parameter fine-tuning. Our results show that IT generally performs better in the target language, whereas CLV more effectively preserves source-language knowledge across multiple cross-lingual transfers. These findings underscore the trade-offs between optimizing target performance and mitigating catastrophic forgetting.
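To make the two transfer strategies concrete, the sketch below contrasts them on a sequence-classification task. It is a minimal illustration, not the authors' code: it assumes a multilingual encoder (xlm-roberta-base), the HuggingFace Transformers and PyTorch libraries, and toy source/target-language splits standing in for real datasets.

```python
# Illustrative sketch of the CLV and IT transfer strategies described in the
# abstract (assumptions: xlm-roberta-base, HuggingFace Transformers, PyTorch;
# the toy datasets below stand in for real source/target-language splits).
import torch
from torch.utils.data import DataLoader
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)

def make_dataset(texts, labels):
    """Tokenize a toy corpus into a list of feature dicts (placeholder for a real Dataset)."""
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    return [{"input_ids": enc["input_ids"][i],
             "attention_mask": enc["attention_mask"][i],
             "labels": torch.tensor(labels[i])} for i in range(len(texts))]

# Toy stand-ins for the source-language (e.g. English) and target-language splits.
src_train = make_dataset(["good movie", "bad movie"], [1, 0])
src_dev   = make_dataset(["great film"], [1])
tgt_train = make_dataset(["dober film", "slab film"], [1, 0])
tgt_dev   = make_dataset(["odličen film"], [1])

def fine_tune(model, train_set, dev_set, epochs=1, lr=2e-5, batch_size=2):
    """Generic fine-tuning loop; dev_set is used only for model selection."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model.to(device)
    optim = torch.optim.AdamW(model.parameters(), lr=lr)
    loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    for _ in range(epochs):
        model.train()
        for batch in loader:
            batch = {k: v.to(device) for k, v in batch.items()}
            loss = model(**batch).loss
            loss.backward()
            optim.step()
            optim.zero_grad()
        # ... evaluate on dev_set here and keep the best checkpoint ...
    return model

# Cross-Lingual Validation (CLV): train on the source language while
# validating (model selection) on the target-language dev set.
clv_model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
clv_model = fine_tune(clv_model, train_set=src_train, dev_set=tgt_dev)

# Intermediate Training (IT): first fine-tune on the source language,
# then continue training the same weights on the target language.
it_model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
it_model = fine_tune(it_model, train_set=src_train, dev_set=src_dev)
it_model = fine_tune(it_model, train_set=tgt_train, dev_set=tgt_dev)
```

For the parameter-efficient setting contrasted in the paper, the same two schedules would be run while training only adapter modules with the encoder weights frozen, rather than all model parameters as in this full fine-tuning sketch.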
| Published in: | IEEE Access |
|---|---|
| Main Authors: | Boshko Koloski, Blaz Skrlj, Marko Robnik-Sikonja, Senja Pollak |
| Format: | Article |
| Language: | English |
| Published: | IEEE, 2025-01-01 |
| Subjects: | Cross-lingual learning, catastrophic-forgetting, document classification |
| Online Access: | https://ieeexplore.ieee.org/document/10892119/ |
| _version_ | 1849484226642051072 |
|---|---|
| author | Boshko Koloski; Blaz Skrlj; Marko Robnik-Sikonja; Senja Pollak |
| author_facet | Boshko Koloski; Blaz Skrlj; Marko Robnik-Sikonja; Senja Pollak |
| author_sort | Boshko Koloski |
| collection | DOAJ |
| container_title | IEEE Access |
| description | Cross-lingual transfer leverages knowledge from a resource-rich source language, commonly English, to enhance performance in less-resourced target languages. Two widely used strategies are: Cross-Lingual Validation (CLV), which involves training on the source language and validating on the target language, and Intermediate Training (IT), where models are first fine-tuned on the source language and then further trained on the target language. While both strategies have been studied, their effects on encoder-based models for classification tasks remain underexplored. In this paper, we systematically compare these strategies across six multilingual classification tasks, evaluating downstream performance, catastrophic forgetting, and both zero-shot and full-shot scenarios. Additionally, we contrast parameter-efficient adapter methods with full-parameter fine-tuning. Our results show that IT generally performs better in the target language, whereas CLV more effectively preserves source-language knowledge across multiple cross-lingual transfers. These findings underscore the trade-offs between optimizing target performance and mitigating catastrophic forgetting. |
| format | Article |
| id | doaj-art-6cc6dca4e6824ef9a3b501ed0729550c |
| institution | Directory of Open Access Journals |
| issn | 2169-3536 |
| language | English |
| publishDate | 2025-01-01 |
| publisher | IEEE |
| record_format | Article |
| spelling | doaj-art-6cc6dca4e6824ef9a3b501ed0729550c (2025-08-20T03:11:14Z); English; IEEE; IEEE Access, ISSN 2169-3536; 2025-01-01; vol. 13, pp. 33509-33520; DOI 10.1109/ACCESS.2025.3543608; IEEE document 10892119; Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies; Boshko Koloski (https://orcid.org/0000-0002-7330-0579), Blaz Skrlj (https://orcid.org/0000-0002-9916-8756), Marko Robnik-Sikonja (https://orcid.org/0000-0002-1232-3320), Senja Pollak; affiliations: Jožef Stefan Institute, Ljubljana, Slovenia (Koloski, Skrlj, Pollak) and Faculty of Computer and Information Science, University of Ljubljana, Ljubljana, Slovenia (Robnik-Sikonja); https://ieeexplore.ieee.org/document/10892119/; Cross-lingual learning; catastrophic-forgetting; document classification |
| spellingShingle | Boshko Koloski; Blaz Skrlj; Marko Robnik-Sikonja; Senja Pollak; Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies; Cross-lingual learning; catastrophic-forgetting; document classification |
| title | Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies |
| title_full | Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies |
| title_fullStr | Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies |
| title_full_unstemmed | Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies |
| title_short | Measuring Catastrophic Forgetting in Cross-Lingual Classification: Transfer Paradigms and Tuning Strategies |
| title_sort | measuring catastrophic forgetting in cross lingual classification transfer paradigms and tuning strategies |
| topic | Cross-lingual learning; catastrophic-forgetting; document classification |
| url | https://ieeexplore.ieee.org/document/10892119/ |
| work_keys_str_mv | AT boshkokoloski measuringcatastrophicforgettingincrosslingualclassificationtransferparadigmsandtuningstrategies AT blazskrlj measuringcatastrophicforgettingincrosslingualclassificationtransferparadigmsandtuningstrategies AT markorobniksikonja measuringcatastrophicforgettingincrosslingualclassificationtransferparadigmsandtuningstrategies AT senjapollak measuringcatastrophicforgettingincrosslingualclassificationtransferparadigmsandtuningstrategies |
