Constructing a Chinese text readability formula with multi-level linguistic features

Date

2012-11-15

Authors

Hong, J. F.
Tseng, H. C.
Li, Y. S.
Sung, Y. T.

Abstract

Previous readability research has adopted shallow linguistic features, which cannot fully reflect the complex process of reading comprehension. Given the differences between alphabetic and Chinese writing systems, the current study aims to select adequate features for Chinese texts and to construct a classification model for them. The study adopts 34 linguistic features under 10 dimensions selected from 4 levels: word, semantics, syntax, and cohesion. Traditional linear and non-linear analyses did not address or compare the performance of uni-dimensional and multi-dimensional linguistic features. We therefore adopt discriminant analysis (DA) for the linear analyses and the Support Vector Machine (SVM) for the non-linear analyses to construct readability formulas, with both uni-dimensional and multi-dimensional features. Comparing the four approaches, this study shows that, with multi-dimensional linguistic features, SVM constructs a relatively better mathematical readability model, with a text classification accuracy of 69.95%.
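
The two-by-two comparison described in the abstract (DA versus SVM, each with uni-dimensional and multi-dimensional features) could be sketched roughly as follows. This is only an illustration using scikit-learn, not the authors' implementation. Apart from the count of 34 features, everything here is an assumption: the synthetic data, the three readability classes, the choice of the first four columns as a single feature dimension, the RBF kernel, and the five-fold cross-validation.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a corpus of texts: 34 features as in the abstract.
# The sample size and the three readability classes are assumptions.
X, y = make_classification(
    n_samples=300, n_features=34, n_informative=10, n_redundant=4,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)

# Hypothetical grouping: treat the first 4 columns as one feature dimension.
feature_sets = {
    "uni-dimensional": X[:, :4],
    "multi-dimensional": X,
}

# Linear model (discriminant analysis) and non-linear model (RBF-kernel SVM).
# The kernel choice is an assumption; the abstract does not name one.
models = {
    "DA": lambda: make_pipeline(StandardScaler(), LinearDiscriminantAnalysis()),
    "SVM": lambda: make_pipeline(StandardScaler(), SVC(kernel="rbf")),
}


def compare(feature_sets, models, y, folds=5):
    """Return mean cross-validated accuracy for each (model, feature set) pair."""
    results = {}
    for set_name, features in feature_sets.items():
        for model_name, build in models.items():
            scores = cross_val_score(build(), features, y, cv=folds)
            results[(model_name, set_name)] = scores.mean()
    return results


results = compare(feature_sets, models, y)
for (model_name, set_name), accuracy in sorted(results.items()):
    print(f"{model_name:>3} + {set_name:<17} accuracy = {accuracy:.3f}")
```

The accuracies printed by this sketch come from synthetic data and say nothing about the reported 69.95%; the point is only the shape of the comparison, in which four model and feature-set combinations are scored with the same metric.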
