Word and Relation Embedding for Sentence Representation

Abstract: In recent years, several methods have been proposed to encode sentences into fixed-length continuous vectors called sentence representations or sentence embeddings. With the recent advancements in various deep learning methods applied in Natural Language Processing (NLP), these representations play a crucial role in tasks such as named entity recognition, question answering and sentence classification. Traditionally, sentence vector representations are learnt from their constituent word representations, also known as word embeddings. Various methods to learn the distributed representation (embedding) of words have been proposed using the notion of Distributional Semantics, i.e. "the meaning of a word is characterized by the company it keeps". However, the principle of compositionality states that the meaning of a sentence is a function of the meanings of its words and also of the way they are syntactically combined. In various recent methods for sentence representation, syntactic information such as the dependencies or relations between words has been largely ignored. In this work, I have explored the effectiveness of sentence representations that are composed of the representations of both the constituent words and the relations between the words in a sentence. The word and relation embeddings are learned based on their context. These general-purpose embeddings can also be used as off-the-shelf semantic and syntactic features for various NLP tasks. Similarity evaluation tasks were performed on two datasets, showing the usefulness of the learned word embeddings. Experiments were conducted on three different sentence classification tasks, showing that our sentence representations outperform the original word-based sentence representations when used with state-of-the-art neural network architectures.
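The abstract gives no implementation details, but the core idea it describes, learning embeddings for dependency relations alongside words from their context and composing both into a fixed-length sentence vector, can be illustrated with a small sketch. The sketch below is an assumption-laden illustration, not the thesis's actual pipeline: the toy pre-parsed corpus, the REL: token prefix, gensim's skip-gram word2vec as the context-based learner, and plain averaging as the composition function are all stand-ins (the thesis evaluates its representations with state-of-the-art neural network architectures rather than simple averaging).

    # Illustrative sketch only (NOT the thesis's method): embed dependency
    # relation labels alongside words by training skip-gram embeddings over
    # mixed word/relation token sequences, then compose a sentence vector
    # by averaging the word and relation vectors.
    import numpy as np
    from gensim.models import Word2Vec

    # Toy dependency-parsed corpus: each sentence is a sequence of word
    # tokens interleaved with relation-label tokens ("REL:" prefix avoids
    # collisions with ordinary words). In practice these sequences would
    # come from a dependency parser; this corpus is hand-written.
    parsed_corpus = [
        ["the", "REL:det", "cat", "REL:nsubj", "sat", "REL:prep", "on",
         "REL:det", "the", "mat"],
        ["a", "REL:det", "dog", "REL:nsubj", "barked", "REL:advmod", "loudly"],
    ]

    # Skip-gram (sg=1) embeddings over the mixed sequences, so words and
    # relation labels are embedded in the same space based on context.
    model = Word2Vec(sentences=parsed_corpus, vector_size=50, window=3,
                     min_count=1, sg=1, seed=0)

    def sentence_vector(tokens):
        """Compose a fixed-length sentence vector from word and relation
        embeddings (here a simple average, purely for illustration)."""
        vecs = [model.wv[t] for t in tokens if t in model.wv]
        if not vecs:
            return np.zeros(model.vector_size)
        return np.mean(vecs, axis=0)

    vec = sentence_vector(parsed_corpus[0])
    print(vec.shape)  # (50,)

A vector like vec could then be fed to a downstream classifier, which is the role the thesis's sentence representations play in its sentence classification experiments.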


Bibliographic Details
Author: Rath, Trideep
Advisor: Baral, Chitta
Committee Members: Li, Baoxin; Yang, Yezhou
Publisher: Arizona State University
Format: Dissertation (Masters Thesis, Computer Science, 58 pages)
Language: English
Published: 2017
Subjects: Computer science; Artificial intelligence; Natural Language Processing; sentence classification; sentence embeddings; word2vec; word and relation embedding
Rights: All Rights Reserved (http://rightsstatements.org/vocab/InC/1.0/)
Online Access: http://hdl.handle.net/2286/R.I.44259