Fitting Chinese syllable-to-character mapping spectrum by the beta rank function |
| |
Authors: | Wentian Li |
| |
Institution: | The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institute for Medical Research, North Shore LIJ Health System, 350 Community Drive, Manhasset, NY 11030, USA |
| |
Abstract: | We define the syllable-to-character mapping spectrum in Chinese as the normalized number of characters per syllable, ranked from high to low. This spectrum provides a statistical characterization of the relationship between spoken and written Chinese. We show that two functions, the logarithmic function and the beta rank function, fit the syllable-to-character mapping spectrum well. The beta rank function outperforms the logarithmic function on two measures of data-fitting performance: the sum of squared errors and the Akaike information criterion. We comment on why the beta rank function is a good fitting function for many range-limited ranking data, whereas for range-open data it may be outperformed by other functions, such as a power-law function in the case of Zipf’s law. |
| |
Keywords: | Zipf’s law; Beta rank function; Akaike information criterion; Chinese language |
This article is indexed in ScienceDirect and other databases. |
|
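The comparison described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions, since the record above gives no formulas: the beta rank function is taken in its commonly used form y(r) = A (N + 1 - r)^b / r^a, the logarithmic function as y(r) = c + d ln(r), and the Akaike information criterion in its least-squares form n ln(SSE/n) + 2k. The spectrum is synthetic, made-up data, not the Chinese syllable data of the paper. Fitting the beta rank function by linear regression on log-transformed values is a simplification chosen for brevity and may differ from the author's procedure.

```python
import math


def solve_linear(matrix, rhs):
    """Solve a small square system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares(columns, target):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y."""
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * t for p, t in zip(columns[i], target)) for i in range(k)]
    return solve_linear(xtx, xty)


def fit_beta_rank(y):
    """Fit y(r) = A * (N + 1 - r)**b / r**a by regression in log space (assumed form)."""
    n = len(y)
    ranks = range(1, n + 1)
    cols = [[1.0] * n,
            [math.log(r) for r in ranks],
            [math.log(n + 1 - r) for r in ranks]]
    c0, c1, c2 = least_squares(cols, [math.log(v) for v in y])
    big_a, a, b = math.exp(c0), -c1, c2
    pred = [big_a * (n + 1 - r) ** b / r ** a for r in ranks]
    return (big_a, a, b), pred


def fit_logarithmic(y):
    """Fit y(r) = c + d * ln(r) by ordinary least squares (assumed form)."""
    n = len(y)
    log_r = [math.log(r) for r in range(1, n + 1)]
    c, d = least_squares([[1.0] * n, log_r], y)
    return (c, d), [c + d * lr for lr in log_r]


def sse(y, pred):
    """Sum of squared errors on the original (untransformed) scale."""
    return sum((v - p) ** 2 for v, p in zip(y, pred))


def aic(y, pred, k):
    """Least-squares form of AIC with k fitted parameters."""
    n = len(y)
    return n * math.log(sse(y, pred) / n) + 2 * k


# Synthetic spectrum (hypothetical): beta-rank shape with a = 0.5, b = 0.8,
# a small deterministic perturbation, normalized to sum to 1, ranked high to low.
N_ITEMS = 20
raw = [(N_ITEMS + 1 - r) ** 0.8 / r ** 0.5 * (1 + 0.01 * math.sin(r))
       for r in range(1, N_ITEMS + 1)]
spectrum = sorted((v / sum(raw) for v in raw), reverse=True)

beta_params, beta_pred = fit_beta_rank(spectrum)
log_params, log_pred = fit_logarithmic(spectrum)
sse_beta, sse_log = sse(spectrum, beta_pred), sse(spectrum, log_pred)
aic_beta, aic_log = aic(spectrum, beta_pred, 3), aic(spectrum, log_pred, 2)

print("beta rank   (A, a, b):", beta_params, "SSE:", sse_beta, "AIC:", aic_beta)
print("logarithmic (c, d):   ", log_params, "SSE:", sse_log, "AIC:", aic_log)
```

A lower SSE and a lower AIC both indicate the better-fitting function. The AIC additionally penalizes the beta rank function for its extra parameter (three versus two), so it guards against preferring the more flexible function merely because it is more flexible.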