Inferring human vision in a human-like way: Key factors influencing the cognitive processing of level-1 visual perspective-taking
Communication Research (IF 4.9), Pub Date: 2024-11-28, DOI: 10.1177/00936502241302569
Song Zhou, Huaqi Yang, Ming Ye, Ning Ding, Tao Liu

The advancement of artificial intelligence (AI) has expanded the potential for human-machine communication and collaboration in complex contexts, which requires AI to exhibit human-like behavior in order to align with its human counterparts. Consequently, understanding human behavioral traits is advantageous for developing AI agents that resemble humans. This study investigated how individuals process visual information from others, with the aim of informing the future design of intelligent vision systems. Across four experiments, participants judged whether a given number matched the number of balls, while the avatar’s gaze direction was manipulated by averting its eyes or altering its head orientation. The results indicate that participant response times were influenced regardless of the avatar’s gaze direction. Specifically, when the avatar was positioned with its back facing the balls, any disparity in participant performance across conditions was eliminated. These findings suggest that implicit level-1 visual perspective-taking may not primarily rely on gaze direction but rather on perceiving affordances within the environment. Such insights contribute to a deeper understanding of the cognitive mechanisms underlying level-1 visual perspective-taking and can serve as a theoretical foundation for advancing AI vision algorithms in human-machine communication and collaboration.

Updated: 2024-11-28