Quantification of textual comprehension difficulty with an information theory-based algorithm

Bibliographic Details
Main Authors: Costa, K.M. (Author), da Silva Filho, M. (Author), Ribeiro, L.B. (Author), Rodrigues, A.R. (Author)
Format: Article
Language: English
Published: Palgrave Macmillan Ltd. 2019
Online Access: View Fulltext in Publisher
LEADER 02038nam a2200169Ia 4500
001 10.1057-s41599-019-0311-0
008 220511s2019 CNT 000 0 eng d
022 |a 2055-1045 (ISSN) 
245 1 0 |a Quantification of textual comprehension difficulty with an information theory-based algorithm 
260 0 |b Palgrave Macmillan Ltd.  |c 2019 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1057/s41599-019-0311-0 
520 3 |a Textual comprehension is often not adequately acquired despite intense didactic efforts, and its quality is mostly evaluated using subjective criteria. Starting from the assumption that word usage statistics may be used to infer the probability of successful semantic representations, we hypothesized that textual comprehension depends on words with high occurrence probability (i.e., a high degree of familiarity), which is typically inversely proportional to their information entropy. We tested this hypothesis by quantifying word occurrences in a bank of words from Portuguese-language academic theses and using information-theoretic tools to infer degrees of textual familiarity. We found that the lower and upper bounds of the database were delimited by low-entropy words with the highest probabilities of causing incomprehension (i.e., nouns and adjectives) or facilitating semantic decoding (i.e., prepositions and conjunctions). We developed an openly available software suite called CalcuLetra for implementing these algorithms and tested it on publicly available denotative text samples (e.g., articles, essays, and abstracts). We propose that the quantitative model presented here may apply to other languages and could be a tool for supporting automated textual comprehension evaluations, potentially assisting the development of teaching materials or the diagnosis of learning disorders. © 2019, The Author(s). 
700 1 |a Costa, K.M.  |e author 
700 1 |a da Silva Filho, M.  |e author 
700 1 |a Ribeiro, L.B.  |e author 
700 1 |a Rodrigues, A.R.  |e author 
773 |t Palgrave Communications
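
Editor's note on the method described in the 520 abstract above: the paper ranks words by occurrence probability in a thesis-derived word bank and uses information theory to score familiarity. The following is a minimal Python sketch of that idea, not CalcuLetra itself (the authors' software). The inline miniature corpus, function names, and the fallback probability for unseen words are illustrative assumptions, and it uses Shannon self-information, -log2 p(w), as the difficulty measure, which may differ from the paper's exact entropy formulation.

import math
import re
from collections import Counter

def word_probabilities(corpus_text):
    """Build p(w) = count(w) / total word count from a reference corpus."""
    words = re.findall(r"\w+", corpus_text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def mean_self_information(text, probs, unseen_prob=1e-6):
    """Average -log2 p(w) over a text's words; more bits = less familiar.
    unseen_prob is an assumed fallback for words absent from the corpus."""
    words = re.findall(r"\w+", text.lower())
    if not words:
        return 0.0
    bits = [-math.log2(probs.get(w, unseen_prob)) for w in words]
    return sum(bits) / len(bits)

# Hypothetical miniature corpus standing in for the thesis word bank.
corpus = ("o estudo apresenta a análise estatística de palavras em teses "
          "a ocorrência de cada palavra define a sua probabilidade no banco de palavras")
probs = word_probabilities(corpus)

# Common, high-probability words score few bits; rare content words score many.
print(mean_self_information("a análise de palavras", probs))
print(mean_self_information("epistemologia hermenêutica", probs))

On this toy corpus the first sample averages only a few bits per word, while the second, composed of unseen content words, approaches -log2(1e-6), roughly 19.9 bits per word, mirroring the abstract's contrast between familiar function words and incomprehension-prone nouns and adjectives.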