About this SRCD poster session
| Panel information |
|---|
| Panel 11. Language, Communication |
Abstract
Adults and children struggle with identifying words in unfamiliarly accented speech (Cristia et al., 2012). One overlooked source of information that listeners might use to compensate for this difficulty is visual speech. The visual speech signal provides both phonetic (i.e., speech sound) and temporal (i.e., onset/offset) information (Baart et al., 2014) that can reinforce or supplement the auditory signal. Previous work has demonstrated that, when conditions provide a challenge for comprehension (e.g., environmental noise), people rely more on visual information (Bejjanki et al., 2011; Sumby & Pollack, 1954; Hollich et al., 2005). To our knowledge, few studies have examined listeners’ reliance on visual input for comprehending unfamiliarly accented speech (Zheng & Samuel, 2018) and none have done so with children.
We examine whether children rely more on visual input when processing non-native vs. native-accented speech. We focus on the “Visual Fill-in Effect” (Jerger et al., 2014), whereby information from the visual signal (e.g., an onset consonant “b”) can be added to the auditory signal (e.g., “az”) to form a complete percept (e.g., “baz”). If children rely on visual speech cues more for non-native accented speech, then we predict there will be a larger Visual Fill-in Effect for non-native accented speech than for native-accented speech.
Five- and six-year-old monolingual English-speaking children view a speaker on a screen and hear them produce non-words. They are asked to repeat each non-word they hear out loud to the researcher. In each trial, participants are presented with one of two stimulus types: 1) audio-visual (video), or 2) audio-only (still picture). For both stimulus types, the first consonant in the word is either intact (e.g., “baz”) or partial (e.g., “az”). In total, participants complete four blocks of 24 trials, with the stimulus types pseudo-randomly presented across trials in each block. Participants’ productions are blind-coded offline by another researcher. The Visual Fill-in Effect is computed as the proportion of partial audiovisual trials in which the child produces an onset consonant minus the proportion of partial audio-only trials in which the child produces an onset consonant.
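The difference score described above can be sketched as follows. This is a minimal illustration with made-up trial records, not the authors' analysis code; the field names (`stimulus`, `onset`, `produced_onset`) are assumptions for the example.

```python
# Hypothetical sketch of the Visual Fill-in Effect computation:
# the proportion of partial audiovisual trials in which the child
# produced an onset consonant, minus the same proportion for
# partial audio-only trials.

def visual_fill_in_effect(trials):
    """Compute the difference in onset-production proportions
    between partial audiovisual and partial audio-only trials."""
    def onset_proportion(stimulus_type):
        relevant = [t for t in trials
                    if t["stimulus"] == stimulus_type and t["onset"] == "partial"]
        produced = sum(t["produced_onset"] for t in relevant)
        return produced / len(relevant) if relevant else 0.0

    return onset_proportion("audiovisual") - onset_proportion("audio_only")

# Example with made-up trials:
trials = [
    {"stimulus": "audiovisual", "onset": "partial", "produced_onset": True},
    {"stimulus": "audiovisual", "onset": "partial", "produced_onset": True},
    {"stimulus": "audio_only",  "onset": "partial", "produced_onset": False},
    {"stimulus": "audio_only",  "onset": "partial", "produced_onset": True},
]
print(visual_fill_in_effect(trials))  # 1.0 - 0.5 = 0.5
```

A positive score indicates that visible speech filled in onset consonants the auditory signal lacked.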
Data collection and analyses for children in the non-native condition are ongoing. Here, we present data for the native condition (n=37). Overall, children in the native condition show a significant Visual Fill-in Effect (p < .001), with no change in the magnitude of the effect as a function of age (in months) or vocabulary size. However, females showed a stronger Visual Fill-in Effect than males (p = .023). This difference was not attributable to attentional differences during the task, as the proportion of onset-consonant productions in the intact trials was not significantly different between males and females (ps > .05). One potential explanation for this difference is that females may rely more on visual information for speech processing (Irwin et al., 2006). The results of the non-native speaker condition will reveal whether the use of visual speech increases for more challenging speech, and whether this differs as a function of vocabulary size and previous exposure to other accent varieties.
Author information
| Author | Role |
|---|---|
| Madeline N. Wiseman, University of Waterloo | Presenting author |
| Ashley Avarino, University of Waterloo | Non-presenting author |
| Katherine White, University of Waterloo | Non-presenting author |
Children’s reliance on visual input for native and non-native accented speech
Submission Type
Individual Poster Presentation
Description
| Field | Value |
|---|---|
| Session Title | Poster Session 12 |
| Poster # | 16 |