
International Symposium on Nonlinear Theory and its Applications


Session Number: A2L-B



Structural Equivalence Between Co-occurrences of Characters and Words in the Chinese Language

Yuming Shi, Wei Liang, Jing Liu, Chi K. Tse


Publication Date: 2008/9/7

Online ISSN: 2188-5079



Complex networks are constructed to study the co-occurrence of characters and words in the Chinese language. Two types of networks are investigated: in the first, nodes correspond to Chinese characters; in the second, nodes correspond to Chinese words. In both types, an edge connects two characters (or two words) that occur consecutively. The networks are built from a collection of Chinese texts in four different styles, namely essays, novels, popular science articles, and news reports. Their statistical properties are studied in terms of several complex network parameters, including average degree, diameter, average path length, clustering coefficient, degree distribution, and connected subnetworks. It is found that although the two kinds of networks have different parameter values, they display qualitatively similar properties, notably small-world and scale-free features. This qualitative equivalence between the network of Chinese characters and the network of Chinese words provides a valid basis on which either type of network can be used for comparing different languages, regardless of the incompatibility between the linguistic roles that words play in Chinese and in other languages.
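
The construction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say whether edges are directed or weighted, so the sketch assumes an undirected, unweighted network, and it assumes that word segmentation has already been done elsewhere. All function names (build_cooccurrence_network, average_degree, average_clustering, path_statistics) are hypothetical and chosen only for this example.

```python
from collections import deque
from itertools import combinations


def build_cooccurrence_network(sequences):
    """Build an undirected adjacency map from token sequences.

    Each sequence is a string (tokens are characters) or a list of
    pre-segmented words. Two tokens are linked if they occur consecutively.
    Assumption: edges are undirected and unweighted; self-loops are ignored.
    """
    adj = {}
    for seq in sequences:
        for token in seq:
            adj.setdefault(token, set())
        for a, b in zip(seq, seq[1:]):
            if a != b:
                adj[a].add(b)
                adj[b].add(a)
    return adj


def average_degree(adj):
    """Mean number of neighbours per node."""
    return sum(len(nbrs) for nbrs in adj.values()) / len(adj)


def average_clustering(adj):
    """Mean local clustering coefficient (nodes of degree < 2 count as 0)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def bfs_distances(adj, source):
    """Shortest-path lengths from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def path_statistics(adj):
    """Average path length and diameter over all connected node pairs."""
    total, pairs, diameter = 0, 0, 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    return (total / pairs if pairs else 0.0), diameter


# Toy input: one short sentence, once as characters and once as words.
char_net = build_cooccurrence_network(["我爱北京"])
word_net = build_cooccurrence_network([["我", "爱", "北京"]])
for name, net in (("characters", char_net), ("words", word_net)):
    print(name, average_degree(net), average_clustering(net), path_statistics(net))
```

On a real corpus, the same functions would be applied to each text style separately, and the degree distribution could be read off from the neighbour-set sizes in the adjacency map. The all-pairs breadth-first search is quadratic in the number of nodes, so a graph library would be a more practical choice for large networks.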