From Natural Language Specifications to Program Input Parsers

We present a method for automatically generating input parsers from English specifications of input file formats. We use a Bayesian generative model to capture relevant natural language phenomena and translate the English specification into a specification tree, which is then translated into a C++ i...

Bibliographic Details
Main Authors:	Lei, Tao (Contributor), Long, Fan (Contributor), Barzilay, Regina (Contributor), Rinard, Martin C. (Contributor)
Other Authors:	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science (Contributor)
Format:	Article
Language:	English
Published:	Association for Computational Linguistics (ACL), 2013-07-22T15:40:26Z.
Subjects:	Article
Online Access:	Get fulltext


LEADER	02025 am a22002653u 4500
001	79643
042			\|a dc
100	1	0	\|a Lei, Tao \|e author
100	1	0	\|a Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science \|e contributor
100	1	0	\|a Lei, Tao \|e contributor
100	1	0	\|a Long, Fan \|e contributor
100	1	0	\|a Barzilay, Regina \|e contributor
100	1	0	\|a Rinard, Martin C. \|e contributor
700	1	0	\|a Long, Fan \|e author
700	1	0	\|a Barzilay, Regina \|e author
700	1	0	\|a Rinard, Martin C. \|e author
245	0	0	\|a From Natural Language Specifications to Program Input Parsers
260			\|b Association for Computational Linguistics (ACL), \|c 2013-07-22T15:40:26Z.
856			\|z Get fulltext \|u http://hdl.handle.net/1721.1/79643
520			\|a We present a method for automatically generating input parsers from English specifications of input file formats. We use a Bayesian generative model to capture relevant natural language phenomena and translate the English specification into a specification tree, which is then translated into a C++ input parser. We model the problem as a joint dependency parsing and semantic role labeling task. Our method is based on two sources of information: (1) the correlation between the text and the specification tree and (2) noisy supervision as determined by the success of the generated C++ parser in reading input examples. Our results show that our approach achieves 80.0% F-Score accuracy compared to an F-Score of 66.7% produced by a state-of-the-art semantic parser on a dataset of input format specifications from the ACM International Collegiate Programming Contest (which were written in English for humans with no intention of providing support for automated processing).
520			\|a National Science Foundation (U.S.) (Grant IIS-0835652)
520			\|a Battelle Memorial Institute (PO #300662)
546			\|a en_US
655	7		\|a Article
773			\|t Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (ACL 2013)
