Natural language techniques for error correction

Dealing with human errors such as spelling or grammar mistakes is a necessary part of natural language processing. The aim of this project was to investigate how far error detection and correction could proceed when the system purview was set to a sub-sentential stretch of text. This restriction comes...

Bibliographic Details
Main Author:	Bowden, T. G.
Published:	University of Cambridge 1997
Subjects:	006.3
Online Access:	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.596815

id	ndltd-bl.uk-oai-ethos.bl.uk-596815
record_format	oai_dc
collection	NDLTD
sources	NDLTD
topic	006.3
spellingShingle	006.3 Bowden, T. G. Natural language techniques for error correction
description	Dealing with human errors such as spelling or grammar mistakes is a necessary part of natural language processing. The aim of this project was to investigate how far error detection and correction could proceed when the system purview was set to a sub-sentential stretch of text. This restriction comes from cooperative error handling: detecting and correcting errors just after user entry, while the user is entering further text. Short-context, or shallow, processing is also interesting because it is potentially cheaper and faster than a full-scale parse and because sentential constraints become less reliable when the 'sentence' is ill-formed. There has been no previous report on the effectiveness of local syntactic constraints on general (English) ill-formedness. Additionally, all error processing programs, other than some working in very restricted domains, have been post-processors rather than cooperative. Being post-processors, previous programs have been concerned with errors left undetected after some degree of proofreading. Cooperative processing is also aimed at the errors people spend time backtracking to catch. In the absence of suitable existing data, a corpus of keystrokes made by subjects entering a piece of text was collated; errors were classified as caught or uncaught and various interesting analyses emerged. For context-less processing, a method based on morphological error rules and another based on binary positional trigrams were devised and compared. Then, to incorporate context, local syntactic constraints based on tag information were implemented, using bigram and trigram co-occurrence checks with a Markov tagging procedure. The tag-based constraints were compared with a partial parsing method. These error handlers were evaluated on data from the Keystroke Corpus and on other data manufactured and collected. The morphological error rules and tag-based checks using very short context were the most promising. As far as current comparison allows, there being a scarcity of reported results in this area, the short-context techniques implemented here compared well against full-parsing error handlers. Ideas outlined for future work include a method for further identifying detected word scope errors and a practical, usable cooperative corrector based on an extension of an existing commercial application.
author	Bowden, T. G.
title	Natural language techniques for error correction
publisher	University of Cambridge
publishDate	1997
url	http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.596815
work_keys_str_mv	AT bowdentg naturallanguagetechniquesforerrorcorrection
_version_	1716796569199575040
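
The description mentions a context-less check based on binary positional trigrams. The record does not say how the thesis formulates it, so the following is only a minimal sketch of the general idea as it is commonly described: for each word length and each triple of letter positions, a binary table records which letter triples occur in a known lexicon, and a word is flagged when any of its positional trigrams is missing from the table. The names `LEXICON`, `TABLE`, `positional_trigrams`, `build_table` and `is_suspect` are hypothetical, and the toy lexicon is invented for illustration.

```python
from itertools import combinations

# Hypothetical toy lexicon; a real system would use a large word list.
LEXICON = ["the", "then", "them", "that", "this", "than"]


def positional_trigrams(word):
    """Yield (word length, position triple, letter triple) for a word."""
    n = len(word)
    for i, j, k in combinations(range(n), 3):
        yield (n, (i, j, k), word[i] + word[j] + word[k])


def build_table(lexicon):
    """Collect every positional trigram seen in the lexicon.

    Set membership plays the role of the binary (seen / not seen) table.
    """
    table = set()
    for w in lexicon:
        table.update(positional_trigrams(w.lower()))
    return table


def is_suspect(word, table):
    """Flag a word if any of its positional trigrams was never seen."""
    word = word.lower()
    if len(word) < 3:
        return False  # too short to form a trigram; this sketch lets it pass
    return any(key not in table for key in positional_trigrams(word))


TABLE = build_table(LEXICON)

for candidate in ["then", "thne"]:
    print(candidate, is_suspect(candidate, TABLE))
```

A check of this kind only detects that a word is unlikely; it does not propose a correction, and with a lexicon this small it will also flag many valid words it has never seen.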

Similar Items