Abstract
In this paper we explore Japanese dialects by applying PMI Levenshtein distance to 2400 localities and 37 items of the Linguistic Atlas of Japan Database (LAJDB). Ward’s clustering and t-distributed stochastic neighbor embedding (t-SNE) were applied to the average Levenshtein distances among the 2400 localities. Using these techniques we obtained an area map, which divides the dialect landscape into areas, and an RGB map, which visualizes the dialect landscape as a dialect continuum. The size of the data set allows us to generate very detailed results, and the methodology yields objective results without requiring subjective choices. We found five optimal groups representing the Tohoku dialects, Eastern dialect, Western dialect, Kyushu dialect and Ryukyuan dialect. We focused especially on the Ryukyuan varieties and found a division into three groups: Amami dialects, Okinawan dialects and Southern Ryukyuan dialects. Both at the global level and at a more detailed level, the results mainly confirmed those of earlier studies.
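The distance step described above can be illustrated with a minimal sketch. It assumes plain unit-cost Levenshtein distance rather than the PMI-weighted variant used in the paper, normalizes by the longer form (an assumption, not a detail stated in the abstract), and uses invented locality names, item names and strings that are not drawn from the LAJDB. The function names are hypothetical.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Plain edit distance with unit costs (no PMI weighting)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalized_distance(a: str, b: str) -> float:
    """Edit distance divided by the length of the longer form (assumed normalization)."""
    longest = max(len(a), len(b))
    return levenshtein(a, b) / longest if longest else 0.0


def average_distance(forms_a: dict, forms_b: dict) -> float:
    """Mean normalized distance over the items attested in both localities."""
    shared = [item for item in forms_a if item in forms_b]
    if not shared:
        raise ValueError("localities share no items")
    total = sum(normalized_distance(forms_a[i], forms_b[i]) for i in shared)
    return total / len(shared)


# Invented toy data: locality -> {item: transcription}; not LAJDB material.
toy = {
    "loc_A": {"item1": "kata", "item2": "mizu"},
    "loc_B": {"item1": "kada", "item2": "mizu"},
    "loc_C": {"item1": "hata", "item2": "midu"},
}

# Pairwise average distances between all localities.
distances = {
    (p, q): average_distance(toy[p], toy[q])
    for p, q in combinations(sorted(toy), 2)
}

for pair, value in distances.items():
    print(pair, round(value, 3))
```

A pairwise distance table of this kind, computed over all localities, is the sort of input that a Ward's clustering or t-SNE routine would then take.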
Original language | English |
---|---|
Pages (from-to) | 1-44 |
Number of pages | 44 |
Journal | Studies in Geolinguistics |
Volume | 3 |
Publication status | Published - 13 Oct 2023 |
Keywords
- dialectometry; Japanese dialects; Japanese dialectology; dialect classification; dialect areas