International Journal of Computer Vision (IF 11.6), Pub Date: 2024-12-02, DOI: 10.1007/s11263-024-02299-x
Zehui Liao, Shishuai Hu, Yutong Xie, Yong Xia
Noise transition matrix estimation is a promising approach for learning with label noise. It can infer clean posterior probabilities, known as the Label Distribution (LD), from noisy ones and thereby reduce the impact of noisy labels. However, this estimation is challenging because ground-truth labels are not always available. Most existing methods estimate a global noise transition matrix using either correctly labeled samples (anchor points) or detected reliable samples (pseudo anchor points). These methods rely heavily on the existence of anchor points or the quality of pseudo ones, and a global noise transition matrix can hardly provide accurate label transition information for each sample, since label noise in real applications is mostly instance-dependent. To address these challenges, we propose an Instance-dependent Label Distribution Estimation (ILDE) method to learn from noisy labels for image classification. The method’s workflow has three major steps. First, we estimate each sample’s noisy posterior probability, supervised by noisy labels. Second, since mislabeling probability closely correlates with inter-class correlation, we compute the inter-class correlation matrix to estimate the noise transition matrix, bypassing the need for (pseudo) anchor points. Moreover, for a precise approximation of the instance-dependent noise transition matrix, we calculate the inter-class correlation matrix using only mini-batch samples rather than the entire training dataset. Third, we transform the noisy posterior probability into an instance-dependent LD by multiplying it with the estimated noise transition matrix, and use the resulting LD as enhanced supervision to prevent deep convolutional neural networks (DCNNs) from memorizing noisy labels. The proposed ILDE method has been evaluated against several state-of-the-art methods on two synthetic and three real-world noisy datasets. Our results indicate that ILDE outperforms all competing methods, regardless of whether the noise is synthetic or real.
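The three-step workflow can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the inter-class correlation is measured or normalized, so the cosine similarity between per-class columns of the mini-batch posteriors, the row normalization, and all function names below are assumptions made purely for illustration.

```python
import numpy as np


def softmax(logits):
    """Step 1 (sketch): noisy posteriors from the logits of a network
    trained on noisy labels."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def batch_transition_matrix(noisy_post):
    """Step 2 (sketch): estimate a transition matrix from one mini-batch.

    ASSUMPTION: inter-class correlation is taken to be the cosine similarity
    between the per-class columns of the batch posterior matrix, and each row
    is normalized to sum to one. The paper may define both differently.
    """
    norms = np.linalg.norm(noisy_post, axis=0, keepdims=True)
    cols = noisy_post / np.maximum(norms, 1e-12)
    corr = cols.T @ cols  # shape (C, C), entries in [0, 1]
    return corr / corr.sum(axis=1, keepdims=True)


def label_distribution(noisy_post, transition):
    """Step 3 (sketch): multiply noisy posteriors by the estimated matrix.

    Each output row sums to one because the input rows sum to one and the
    transition matrix is row-stochastic.
    """
    return noisy_post @ transition


def soft_cross_entropy(pred, target):
    """Cross-entropy of predictions against a soft target such as the LD."""
    return float(-(target * np.log(pred + 1e-12)).sum(axis=1).mean())


# Toy mini-batch: 8 samples, 4 classes, random logits standing in for a DCNN.
rng = np.random.default_rng(0)
noisy_post = softmax(rng.normal(size=(8, 4)))
transition = batch_transition_matrix(noisy_post)
ld = label_distribution(noisy_post, transition)
loss = soft_cross_entropy(noisy_post, ld)
```

In an actual training loop the LD would presumably be treated as a fixed target (no gradient flowing through it) and combined with the noisy-label loss; how the two are weighted is not stated in the abstract.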
Title (translated from Chinese): Instance-dependent Label Distribution Estimation for Learning with Label Noise