Ukrainian standard variants in the 20th century: stylometry to the rescue
Russian Linguistics (IF 0.9), Pub Date: 2022-10-14, DOI: 10.1007/s11185-022-09262-9
M. Zaidan Lahjouji-Seppälä, Achim Rabus, Ruprecht von Waldenfels

In this study, we use the General Regionally Annotated Corpus of Ukrainian (GRAC, www.uacorpus.org) as a testing ground for stylometric approaches to variationist analysis. In recent years, quantitative methods such as binomial mixed-effects regression models and machine-learning methods such as random forests have gained considerable popularity in corpus linguistics, but methods from stylometry have rarely been used for variationist analysis. Using data from GRAC, we show that a stylometric approach can be useful for analyzing the diachronic development of Standard Ukrainian in the 20th century. Our point of departure is the two main variants of Standard Ukrainian used in the interwar period: that of Soviet Ukraine, on the one hand, and that of Western Ukraine, then part of the Polish Republic, on the other. We ask: what can stylometry tell us about how these standards differed and about their subsequent fate in the enlarged Soviet Ukraine after WWII?

Our analysis shows that certain specifically Western Ukrainian features common during the first decades of the 20th century did not find their way into the post-WWII standard, while others were retained. Moreover, by and large, the stylometric evidence points to stronger continuity of the Eastern than of the Western standard.

Methodologically, we demonstrate that stylometry can be used as a tool to start corpus-linguistic research from a bird’s-eye view and in an inductive manner, without formulating any hypotheses regarding particular variables, and later zoom in on hitherto unknown variables representing regional or diachronic differences.
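
The inductive workflow described here can be illustrated with a minimal sketch. The abstract does not name the distance measure or the tooling used, so the example below substitutes Burrows' Delta, a widely used stylometric distance, as an assumption. The function names, the text labels, and the placeholder tokens are all hypothetical and do not come from GRAC. The sketch first compares whole texts over their most frequent words (the bird's-eye view) and then ranks individual words by how strongly they separate two texts (the zoom-in step).

```python
from collections import Counter
import math


def most_frequent_words(texts, n):
    """Return the n most frequent tokens across all texts (the feature set)."""
    counts = Counter()
    for tokens in texts.values():
        counts.update(tokens)
    return [word for word, _ in counts.most_common(n)]


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in one text."""
    counts = Counter(tokens)
    return [counts[word] / len(tokens) for word in vocabulary]


def zscores(freq_table):
    """Standardize each word's frequency across texts (population SD)."""
    names = list(freq_table)
    n_features = len(freq_table[names[0]])
    result = {name: [] for name in names}
    for j in range(n_features):
        column = [freq_table[name][j] for name in names]
        mean = sum(column) / len(column)
        sd = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
        for name in names:
            # A word used identically everywhere carries no signal.
            z = (freq_table[name][j] - mean) / sd if sd else 0.0
            result[name].append(z)
    return result


def burrows_delta(z_a, z_b):
    """Burrows' Delta: mean absolute difference of z-scores."""
    return sum(abs(a - b) for a, b in zip(z_a, z_b)) / len(z_a)


def distinctive_words(z_a, z_b, vocabulary, k):
    """Zoom in: the k words whose z-scores differ most between two texts."""
    ranked = sorted(
        zip(vocabulary, z_a, z_b),
        key=lambda item: abs(item[1] - item[2]),
        reverse=True,
    )
    return [word for word, _, _ in ranked[:k]]


# Placeholder tokens only; real input would be tokenized corpus texts.
texts = {
    "east_1": "a b a c a b".split(),
    "east_2": "a b a c b a".split(),
    "west_1": "a d a c d a".split(),
}
vocabulary = most_frequent_words(texts, 4)
z = zscores({name: relative_frequencies(toks, vocabulary)
             for name, toks in texts.items()})

print(burrows_delta(z["east_1"], z["east_2"]))
print(burrows_delta(z["east_1"], z["west_1"]))
print(distinctive_words(z["east_1"], z["west_1"], vocabulary, 2))
```

No variable is chosen in advance: the distance is computed over whatever words are most frequent, and candidate variables emerge only from the ranking in the last step. In practice the feature set would hold hundreds of words or n-grams rather than four placeholder tokens.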




Updated: 2022-10-14