Understanding virtual speakers

Research output: Thesis › Doctoral Thesis (compilation)

Abstract

This thesis addresses how verbal comprehension is affected by seeing the speaker, in particular when the speaker is an animated virtual speaker. Two visually co-present people – one talking, the other listening and trying to comprehend what is said – constitute a central and critical scenario for anyone interested in human cognition, communication or learning. Papers I & II focus on how comprehension is affected by seeing a virtual speaker displaying visual speech cues (lip and head movements accompanying speech). The results indicate a positive effect in the presence of background babble noise but no effect in its absence. The results presented in Paper II further indicate that seeing the virtual speaker is at least as beneficial as seeing a real speaker, and that exploiting the visual speech cues of a virtual speaker may require some adaptation but is not affected by the subjective perception of the virtual speaker's social traits. Papers III & IV focus on the effect of the temporal coordination of speech and gesture on memory encoding of speech, and on the feasibility of a novel methodology to address this question. The objective of the methodology is the precise manipulation of individual gestures within naturalistic speech and gesture sequences recorded by motion capture and reproduced by virtual speakers. Results in Paper III indicate that such temporal manipulations can be realized without the animation being perceived as unnatural, as long as the shifted (manipulated) gestural movements temporally overlap with some speech (not a pause or hesitation). Results of Paper IV show that words accompanied by associated gestures, either in their original synchrony or with the gesture arriving earlier, were more likely to be recalled. This mirrors the temporal coordination patterns common in natural speech-gesture production.
Paper V explores how factual topics are comprehended and approached metacognitively when presented in different media, including a video of an animated virtual speaker with synthesized speech. The study made use of an interface in which differences in information transience and navigation options between the media are minimized. Results indicate improved comprehension and a somewhat stronger tendency to repeat material when seeing, rather than only listening to, the virtual speaker. Instances of navigation behaviour were, however, scarce overall, and only tentative conclusions could be drawn regarding differences in metacognitive approaches between media. Paper VI presents a virtual replication of a choice blindness experimental paradigm. The results show that the level of detail in the presentation of a virtual environment and a speaker may affect self-reported presence as well as the level of trust exhibited towards the speaker. The relevance of these findings is discussed with regard to how comprehension is affected by visible speakers in general and virtual speakers specifically, as well as possible consequences for the design and implementation of virtual speakers in educational applications and as research instruments.

Details

Research areas and keywords

Subject classification (UKÄ)

  • Psychology
  • Learning

Keywords

  • Verbal Comprehension, Multimodality, Audiovisual integration, Gesture, Educational Technology
Original language: English
Qualification: Doctor
Awarding Institution
Supervisors/Assistant supervisor
Award date: 2020 Feb 21
Publisher
  • Lund University (Media-Tryck)
Print ISBNs: 978-91-88899-84-2
Electronic ISBNs: 978-91-88899-84-3
Publication status: Published - 2020 Jan 24
Publication category: Research

Bibliographic note

Defence details: Date: 2020-02-21, Time: 10:15, Place: LUX C121. External reviewer: Catherine Pelachaud, Professor, Sorbonne Université.
