Historical Methods: A Journal of Quantitative and Interdisciplinary History (IF 1.6). Pub Date: 2021-11-11. DOI: 10.1080/01615440.2021.1985027
Jonas Helgertz, Joseph Price, Jacob Wellington, Kelly J. Thompson, Steven Ruggles, Catherine A. Fitch
Abstract
This paper presents a probabilistic method of record linkage, developed using the U.S. full-count censuses of 1900 and 1910 but applicable to many sources of digitized historical records. The method links records in two steps. The first step establishes high-confidence matches among men by exploiting a comprehensive set of individual and contextual characteristics. The second step links both men and women by leveraging the links between households established in the first step. Although only the first-stage links are directly comparable to other popular methods in research on the U.S., our method yields both considerably higher linkage rates and greater accuracy, while performing only negligibly worse than other algorithms in resembling the target population.
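The two-step idea described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Person` fields, the similarity functions, the thresholds, and the margin rule are all hypothetical stand-ins for the trained probabilistic model and the individual and contextual characteristics the abstract refers to.

```python
from collections import Counter
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Person:
    """Hypothetical census record; real data carry many more fields."""
    pid: str
    household: str
    first: str
    last: str
    sex: str
    birth_year: int
    birthplace: str


def name_sim(a, b):
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def age_sim(a, b):
    # 1.0 for identical birth years, falling to 0.0 at a 3-year gap.
    return max(0.0, 1 - abs(a.birth_year - b.birth_year) / 3)


def person_score(a, b):
    """Toy step-1 score in [0, 1]; an assumption, not a trained model."""
    if a.sex != b.sex or a.birthplace != b.birthplace:
        return 0.0
    return (name_sim(a.first, b.first) + name_sim(a.last, b.last) + age_sim(a, b)) / 3


def household_score(a, b):
    """Toy step-2 score; ignores surname, which may change between censuses."""
    if a.sex != b.sex:
        return 0.0
    return (name_sim(a.first, b.first) + age_sim(a, b)) / 2


def best_match(a, candidates, score, threshold, margin):
    """Return the top candidate only if it is good enough and clearly ahead."""
    ranked = sorted(((score(a, b), b) for b in candidates),
                    key=lambda t: t[0], reverse=True)
    if not ranked or ranked[0][0] < threshold:
        return None
    if len(ranked) > 1 and ranked[0][0] - ranked[1][0] < margin:
        return None
    return ranked[0][1]


def link_two_step(early, late, threshold=0.85, hh_threshold=0.6, margin=0.1):
    # Step 1: high-confidence links among men only.
    late_men = [p for p in late if p.sex == "M"]
    links = {}
    for a in early:
        if a.sex == "M":
            m = best_match(a, late_men, person_score, threshold, margin)
            if m is not None:
                links[a.pid] = m.pid
    # Drop any later record claimed by more than one earlier record.
    counts = Counter(links.values())
    links = {k: v for k, v in links.items() if counts[v] == 1}

    # Step 2: use the household pairs implied by step 1 to link the
    # remaining members, men and women, within those households.
    household_of = {p.pid: p.household for p in late}
    hh_links = {(a.household, household_of[links[a.pid]])
                for a in early if a.pid in links}
    for h_early, h_late in hh_links:
        for a in early:
            if a.household != h_early or a.pid in links:
                continue
            taken = set(links.values())
            pool = [b for b in late if b.household == h_late and b.pid not in taken]
            m = best_match(a, pool, household_score, hh_threshold, margin)
            if m is not None:
                links[a.pid] = m.pid
    return links


# Invented example records, for illustration only.
census_1900 = [
    Person("a1", "H1", "John", "Smith", "M", 1870, "Ohio"),
    Person("a2", "H1", "Mary", "Smith", "F", 1872, "Ohio"),
    Person("a3", "H1", "Anna", "Smith", "F", 1895, "Ohio"),
]
census_1910 = [
    Person("b1", "K1", "John", "Smith", "M", 1870, "Ohio"),
    Person("b2", "K1", "Mary", "Smith", "F", 1873, "Ohio"),
    Person("b3", "K1", "Annie", "Smith", "F", 1895, "Ohio"),
    Person("b4", "K2", "Peter", "Jones", "M", 1850, "Iowa"),
]
example_links = link_two_step(census_1900, census_1910)
```

In this sketch the second step can afford a looser threshold because its candidate pool is restricted to a single linked household, which is one plausible reading of why household links help; the paper itself should be consulted for the actual features and decision rules.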
Title: A New Strategy for Linking U.S. Historical Censuses: A Case Study for the IPUMS Multigenerational Longitudinal Panel