TY - CHAP
T1 - Applying functional partition in the investigation of lexical tonal-pattern categories in an under-resourced Chinese dialect
AU - Wu, Junru
AU - Chen, Yiya
AU - van Heuven, Vincent J.
AU - Schiller, Niels O.
N1 - Funding Information:
Acknowledgements. J. Wu’s work was supported by a PhD scholarship sponsored by the Talent and Training China-Netherlands Program, by the “Chenguang Program” supported by the Shanghai Education Development Foundation and the Shanghai Municipal Education Commission, and by the Shanghai Philosophy and Social Sciences Fund (Grant number 2017BYY001). We also gratefully acknowledge the support to Yiya Chen from the European Research Council (ERC-Starting Grant 206198).
Publisher Copyright:
© 2018, Springer Nature Singapore Pte Ltd.
PY - 2018
Y1 - 2018
N2 - The present study applied functional partition to investigate disyllabic lexical tonal-pattern categories in an under-resourced Chinese dialect, Jinan Mandarin. A two-stage partitioning procedure was introduced to process, in a semi-automatic way, a multi-speaker corpus that contains irregular lexical variants. In the first stage, a program suggests lexical tonal variants for the recordings of each word, and the phonetician decides among them; the suggestions are based on the result of a functional k-means partitioning algorithm and on tonal information from an available pronunciation dictionary of a related Chinese dialect, i.e. Standard Chinese. The second stage iterates a functional version of k-means partitioning with Silhouette-based criteria to abstract an optimal number of tonal patterns from the whole corpus; it also allows the phoneticians to adjust the results of the automatic procedure in a controlled way and to redo the partitioning for a subset of clusters. The procedure yielded eleven disyllabic tonal patterns for Jinan Mandarin, representing the tonal system used by contemporary Jinan Mandarin speakers from a wide range of age groups. This approach differs from previous linguistic descriptions, which were based on the pronunciations of more elderly speakers. The method incorporates phoneticians’ linguistic knowledge and preliminary linguistic resources into the partitioning procedure. It can improve efficiency and objectivity in the investigation of lexical tonal-pattern categories when building pronunciation dictionaries for under-resourced languages.
KW - K-means partition
KW - Pattern recognition
KW - Phonetics
KW - Pronunciation dictionary
KW - Tone
UR - http://www.scopus.com/inward/record.url?scp=85042122191&partnerID=8YFLogxK
U2 - 10.1007/978-981-10-8111-8_3
DO - 10.1007/978-981-10-8111-8_3
M3 - Contribution to conference proceedings
AN - SCOPUS:85042122191
SN - 9789811081101
T3 - Communications in Computer and Information Science
SP - 24
EP - 35
BT - Man-Machine Speech Communication - 14th National Conference, NCMMSC 2017, Revised Selected Papers
A2 - Li, Ya
A2 - Zheng, Thomas Fang
A2 - Bao, Changchun
A2 - Wang, Dong
A2 - Tao, Jianhua
PB - Springer-Verlag Italia
T2 - 14th National Conference on Man-Machine Speech Communication, NCMMSC 2017
Y2 - 11 October 2017 through 13 October 2017
ER -