The Influence of M-BERT and Sizes on the Choice of Transfer Languages in Parsing

In this thesis, we explore the impact of M-BERT and of different transfer sizes on the choice of transfer languages in dependency parsing. To investigate our research questions, we conduct a series of experiments on Universal Dependencies treebanks with UUParser. The main conclusions and contributions of this study are as follows.

First, we train a variety of languages written in several different scripts with M-BERT, a state-of-the-art deep learning model based on the Transformer architecture, added to the parsing framework. In general, M-BERT yields better results than the randomly initialized embeddings in UUParser.

Second, since it is common practice in cross-lingual parsing to choose a source language that is 'close' to the target language, we explore what a 'close' language actually is, as the term has no precise definition. We examine how strongly parsing results correlate with different linguistic distances between source and target languages, queried from the URIEL database. In zero-shot experiments, parsing performance depends more on inventory, syntactic, and featural distance than on geographic, genetic, and phonological distance. In few-shot prediction, parsing accuracy correlates more strongly with inventory and syntactic distance than with the others.

Third, we vary the training sizes in few-shot experiments with M-BERT to see how parsing results are affected. Few-shot experiments clearly outperform zero-shot experiments, and as the source-language training data is cut, all parsing scores decrease, though the drop is not linear.
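The first finding concerns feeding M-BERT representations into the parser. The abstract does not detail the integration, so the following is only a minimal sketch: it assumes the public `bert-base-multilingual-cased` checkpoint and mean-pooling of wordpieces into one vector per word, neither of which is confirmed as the author's exact setup.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# "bert-base-multilingual-cased" is the public M-BERT checkpoint.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def word_embeddings(words):
    """Return one contextual vector per input word by mean-pooling
    the M-BERT wordpieces that belong to that word."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]  # (num_wordpieces, 768)
    vectors = []
    for i in range(len(words)):
        # word_ids() maps each wordpiece to its source word (None = special token)
        piece_idx = [j for j, w in enumerate(enc.word_ids()) if w == i]
        vectors.append(hidden[piece_idx].mean(dim=0))
    return torch.stack(vectors)

print(word_embeddings(["Den", "här", "meningen", "är", "svensk", "."]).shape)
# torch.Size([6, 768])
```

Mean-pooling is one common way to align wordpiece output with the word-level tokens a dependency parser operates on; UUParser's actual M-BERT integration may differ.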

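The second finding rests on querying URIEL for pairwise linguistic distances and correlating them with parsing accuracy. A sketch of that workflow, using lang2vec (the query library distributed with URIEL) and a Spearman rank correlation, might look like the following; the LAS values below are invented placeholders for illustration, not results from the thesis.

```python
import lang2vec.lang2vec as l2v
from scipy.stats import spearmanr

# Hypothetical LAS scores for one target language parsed with
# different source languages (illustrative numbers only).
las = {"swe": 78.4, "deu": 71.2, "rus": 55.9, "fin": 49.3}
target = "eng"

# The six URIEL distance types examined in the thesis.
for dist_type in ["inventory", "syntactic", "featural",
                  "geographic", "genetic", "phonological"]:
    distances = [l2v.distance(dist_type, src, target) for src in las]
    rho, p = spearmanr(distances, list(las.values()))
    print(f"{dist_type:12s} rho={rho:+.2f} p={p:.3f}")
```

A strongly negative rho for a given distance type would mean that closeness on that dimension goes with higher parsing accuracy, which is the kind of relationship the thesis reports for inventory and syntactic distance.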

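The third finding involves shrinking the source-language training data. One simple way to produce such cuts from a CoNLL-U treebank is to keep the first N sentences; the file names and sizes here are illustrative, not the thesis's actual settings.

```python
def head_conllu(in_path, out_path, n_sentences):
    """Write the first n_sentences of a CoNLL-U treebank to out_path.
    Sentences in CoNLL-U are separated by blank lines."""
    kept = 0
    with open(in_path, encoding="utf-8") as src, \
         open(out_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(line)
            if line.strip() == "":  # a blank line ends a sentence
                kept += 1
                if kept == n_sentences:
                    break

for size in (1000, 500, 100, 50):
    head_conllu("sv_talbanken-ud-train.conllu",
                f"sv_talbanken-train-{size}.conllu", size)
```

A random sample rather than a prefix would avoid any ordering bias in the treebank; the abstract does not say which approach the author used.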
Bibliographic Details
Main Author: Zhang, Yifei
Format: Others
Language: English
Published: Uppsala universitet, Institutionen för lingvistik och filologi, 2021
Subjects: Language Technology (Computational Linguistics)
Online Access: http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-446094