“Anything you can do, I can do”: Examining the use of ChatGPT in situational judgement tests for professional program admission
Journal of Vocational Behavior (IF 5.2), Pub Date: 2024-06-24, DOI: 10.1016/j.jvb.2024.104013
Harley Harwood, Nicolas Roulin, Muhammad Zafar Iqbal

We explored the transformative impact of ChatGPT on applicants' responses and performance in situational judgement tests (SJTs), as well as the role played by faking-prevention mechanisms, in two complementary studies. Study 1 examined how the availability of ChatGPT influenced response content and performance of real applicants (N = 107,805), who completed an SJT for admission before vs. after the release of the technology. We found only small differences in content (e.g., slightly less “authentic” words used) and performance (slight score improvements when controlling for response length, no differences otherwise). In Study 2, we used an experimental approach with Prolific participants (N = 138) completing a mock SJT, while being instructed to use ChatGPT when responding (vs. use online resources or no resources). We found only slightly higher SJT scores for the ChatGPT users, but no difference in response content. Additionally, GPTZero (i.e., a popular AI detection tool) struggled to detect ChatGPT content, and generated many false positives, in both studies. This research advances our understanding of how the release and popularization of ChatGPT can influence applicant behaviors. Given the “arms race” nature of applicant selection, our findings also highlight the importance of designing assessments to prevent or limit faking. Yet, the ever-evolving nature of AI calls for continuous research on the topic.

Updated: 2024-06-24