An empirical study of Chinese language networks |
| |
Authors: | Shuigeng Zhou, Guobiao Hu, Zhongzhi Zhang, Jihong Guan |
| |
Institution: | (a) Department of Computer Science and Engineering, Fudan University, Shanghai 200433, China; (b) Shanghai Key Lab of Intelligent Information Processing, Fudan University, Shanghai 200433, China; (c) Department of Computer Science and Technology, Tongji University, 4800 Cao’an Road, Shanghai 201804, China |
| |
Abstract: | Chinese is spoken by more people than any other language in the world and is regarded as one of the most important languages. In this paper, we explore the statistical properties of Chinese language networks (CLNs) within the framework of complex network theory. Based on one of the largest Chinese corpora, the People’s Daily Corpus, we construct two networks (CLN1 and CLN2) from two different perspectives, with Chinese words as nodes. In CLN1, a link between two nodes exists if they appear next to each other in at least one sentence; in CLN2, a link indicates that two nodes appear together in the same sentence. We show that both networks exhibit the small-world effect, scale-free structure, hierarchical organization and disassortative mixing. These results indicate that, in many topological aspects, the Chinese language forms complex networks with organizing principles similar to those of other previously studied language systems, which suggests that different languages may share some common characteristics in their evolution. We believe that our research may shed new light on the Chinese language and have potentially significant implications. |
| |
Keywords: | 89.75.Hc; 89.75.-k; 89.75.Fb |
This article is indexed in ScienceDirect and other databases. |
|
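The two constructions described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `build_networks` and `degrees`, the undirected and unweighted treatment of links, and the two toy sentences are all assumptions made for illustration, and the input is assumed to be already segmented into words.

```python
from itertools import combinations


def build_networks(sentences):
    """Build the edge sets of CLN1 and CLN2 from word-segmented sentences.

    `sentences` is a list of lists of word tokens (segmentation is assumed
    to have been done beforehand). Links are treated as undirected and
    unweighted, which is an assumption of this sketch.
    """
    cln1 = set()  # links between words that appear next to each other
    cln2 = set()  # links between words that appear in the same sentence
    for words in sentences:
        # CLN1: consecutive word pairs within a sentence
        for a, b in zip(words, words[1:]):
            if a != b:  # skip self-loops from repeated words
                cln1.add(frozenset((a, b)))
        # CLN2: every pair of distinct words in the sentence
        for a, b in combinations(set(words), 2):
            cln2.add(frozenset((a, b)))
    return cln1, cln2


def degrees(edges):
    """Return a dict mapping each node to its number of links."""
    deg = {}
    for edge in edges:
        for node in edge:
            deg[node] = deg.get(node, 0) + 1
    return deg


# Hypothetical toy corpus, not taken from the People's Daily Corpus.
toy_sentences = [
    ["我", "爱", "北京"],
    ["北京", "欢迎", "你"],
]
cln1_edges, cln2_edges = build_networks(toy_sentences)
cln1_degrees = degrees(cln1_edges)
cln2_degrees = degrees(cln2_edges)
```

Under these assumptions every CLN1 link is also a CLN2 link, so CLN2 is at least as dense as CLN1. Degree dictionaries like the ones above are the starting point for the degree distributions and mixing patterns the abstract refers to.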