The study compares LLMs and human interviewers across two experiments involving children recalling a witnessed mock event. It examines both how rapport is established and how questions are formulated during interviews, two key elements in obtaining reliable testimony.
In the first experiment, LLM-led interviews elicited higher verbal engagement from children and more detailed responses. Although children rated human and AI interviewers similarly, the LLM's approach to rapport-building appeared to support more accurate recall in later stages of the interview.
In the second experiment, where interviewers generated their own memory questions, LLMs asked fewer questions and relied less on recommended open-ended prompts than trained human interviewers did. Individual AI-generated questions, however, were often effective at eliciting correct details, though they also produced more errors concerning central aspects of the event.
Overall, the findings suggest that LLMs have potential to support investigative interviewing, particularly in building rapport and sustaining engagement. At the same time, their limitations in question formulation underscore the need for human oversight. The study points towards hybrid approaches, in which AI tools complement rather than replace trained professionals in sensitive interview settings.
Read the article (open access)