Differences in intelligibility of non-native directed speech and hearing impaired directed speech for non-native listeners

Differences in intelligibility of non-native directed speech and hearing impaired directed speech for non-native listeners

Title	Differences in intelligibility of non-native directed speech and hearing impaired directed speech for non-native listeners
Publication Type	thesis
School or College	College of Humanities
Department	Linguistics
Author	Dickman, Sadie Moon
Date	2009-11-13
Description	Within the field of clear speech research, non-native, hearing impaired, and child-directed speech are often referred to as types of 'clear speech.' However, although some research has compared the acoustic properties of these types of speech, no research has directly compared their intelligibility. In the present study, non-native listeners completed a sentence transcription task for non-native and hearing impaired directed speech. Results showed no significant difference in performance between the two speaking conditions, indicating that the phonological adjustments talkers make when addressing non-native versus hearing impaired listeners do not have any significantly different effect on intelligibility.
Type	Text
Publisher	University of Utah
Subject	Speech; Intelligibility
Dissertation Institution	University of Utah
Dissertation Name	MA
Language	eng
Relation is Version of	Digital reproduction of "Differences in intelligibility of non-native directed speech and hearing impaired directed speech for non-native listeners" J. Willard Marriott Library Special Collections P27.5 2009 .D53
Rights Management	© Sadie Moon Dickman
Format	application/pdf
Format Medium	application/pdf
Format Extent	98,934 bytes
Identifier	us-etd2,129528
Source	Original: University of Utah J. Willard Marriott Library Special Collections
Conversion Specifications	Original scanned on Epson GT-30000 as 400 dpi to pdf using ABBYY FineReader 9.0 Professional Edition.
ARK	ark:/87278/s63f549h
DOI	https://doi.org/doi:10.26053/0H-8KB4-XFG0
Setname	ir_etd
ID	193901
OCR Text	Show DIFFERENCES IN INTELLIGIBILITY OF NON-NATIVE HEARING LISTENERS by Sadie Moon Dickman A thesis submitted to the faculty of The University of Utah in partial fulfillment of the requirements for the degree of Master of Arts Department of Linguistics The University of Utah December 2009 DIRECTED SPEECH AND HEARJNG IMPAIRED DIRECTED SPEECH FOR NON-NATIVE the sis in partial fulfillment of the requirements for the degree of Department of Linguistics December 2009 Copyright © Sadie Moon Dickman 2009 All Rights Reserved THE U N I V E R S I T Y OF U T A H G R A D U A T E SCHOOL SUPERVISORY COMMITTEE APPROVAL of a thesis submitted by Sadie Moon Dickman This thesis has been read by each member of the following supervisory committee and by majority vote has been found to be satisfactory. Mary Ann Christison Johahrja Watzmger-Tharp UNIVERSITY UTAH GRADUATE SC HOOL Chair: Rachel Hayes-Harb MaryAnn ~-Tharp THE U N I V E R S I T Y OF U T A H G R A D U A T E SCHOOL FINAL READING APPROVAL To the Graduate Council of the University of Utah: I have read the thesis of Sadie Moon Dickman in i t s f m a i fo rm and have found that (1) its format, citations, and bibliographic style are consistent and acceptable; (2) its illustrative materials including figures, tables, and charts are in place; and (3) the final manuscript is satisfactory to the supervisory committee and is ready for submission to The Graduate School. Date Rachel Hayes-Harb Chair: Supervisory Committee Approved for the Major Department Chair/Dean Approved for the Graduate Council Charles A.*Wight Dean of The Graduate School UN VERS IT Y UTAH GRADUA T E SCHOOL APPROVAL 1 have read the thesis of :--:-__ --,-S_ad-,-i_e_M_o_o_n-:--:-D-:-ic"lan::-an_--;-:_----;_ in its fmal form [ennat, Harh Corrunittee Edwara Rubin ChairlDean A:Wight ABSTRACT Within the field of clear speech research, non-native, hearing impaired, and child-directed speech are often referred to as types of 'clear speech.' However, although some research has compared the acoustic properties of these types of speech, no research has directly compared their intelligibility. In the present study, non-native listeners completed a sentence transcription task for non-native and hearing impaired directed speech. Results showed no significant difference in performance between the two speaking conditions, indicating that the phonological adjustments talkers make when addressing non-native versus hearing impaired listeners do not have any significantly different effect on intelligibility. childdirected of'c1ear speech. ' study. For Heather G. Many thanks to Dr. Rachel Hayes-Harb Harh CONTENTS ABSTRACT iv INTRODUCTION 1 7 Research question 12 METHODS 13 Participants 13 Stimuli 14 Procedures 18 RESULTS 19 DISCUSSION 25 Speaking condition 25 Participants and counterbalancing group 26 Talkers 26 Keyword type 27 IMPLICATIONS 28 LIMITATIONS 29 FUTURE DIRECTIONS 30 Appendices A. STIMULI 31 B. SCRIPTED PASSAGE FOR TARGET LISTENER VIDEO 32 REFERENCES 33 .............................................................. ........................................................ ............................................................................................................. I BACKGROUND .................................................................................................... ........... 3 Non-native directed speech " ................... "" .... , ... ,"'" .... , .. , .......................... ................. Hearing impaired directed speech ............................................................................... 8 Research question ............................................................................................. ......... 12 METHODS ...................................................................................................................... 13 Participants ...................... .......................................................................................... 13 Stimuli ................... ...... ....................................... .... ................................................... 14 Procedures ................................................ .............. ......................................... ........... 18 RESULTS ............ ............................................................................................................ 19 DISCUSSION ......................................................................................... ......................... 25 Speaking condition .................................................................................................... 25 Participants and counterbalancing group ................................................................... 26 Talkers .................... ................. ....................... ................................................. .......... 26 Keyword type .......... ........................................ ................................................. .......... 27 IMPLICATIONS ............................................................................................................. 28 LIMITATIONS ................................................................................................................ 29 FUTURE DIRECTIONS ................................................... .............................................. 30 Appendices A. STIMULI ....... ........ ............. .......................................... ..... .......................... ...... .. ....... 31 B. SCRIPTED PASSAGE FOR TARGET LISTENER VIDEO ................................. .. 32 .................. .............................................................................................. INTRODUCTION Previous research has shown that different populations of listeners have particular communicative needs, and that speakers often make attempts to accommodate these needs, with mixed results (Bradlow, Torretta, and Pisoni, 1996). For example, people talk louder and more slowly in noisy environments (known as the Lombard effect) and talk differently to infants than to adults (Scarborough et al., 2007; Uther, Knoll, and Burnham, 2007). Importantly, these adjustments reflect the speaker's theory about what the listener needs. This theory is likely informed by a number of factors, including the quality and extent of the speaker's experience with target listeners; an instinctive sense of listener needs, which varies greatly across speakers; real-time feedback from the listener; and the look and sound of the listener. A large body of work has examined speakers' ability to phonologically alter their speech in an attempt to accommodate the needs of listeners as well as the measured intelligibility benefits of these changes. Within this 'clear speech' literature, different types of clear speech are typically conflated, including non-native directed and hearing impaired directed speech, resulting in a variety of methods used to elicit the phenomenon referred to generally as 'clear speech'. Some researchers ask talkers to speak as though to a hearing impaired listener (e.g., Picheny et al., 1985; Krause and Braida, 2002), others to a non-native listener (e.g., Uther et al., 2007; Biersack, Kempe, and Knapton, 2005), and still others elicit speech directed to "a hearing impaired or non-native listener" (e.g., Bradlow and Alexander, 2006; Bradlow and Bent, aI., aI., 2 2002). However, although some research has begun to compare the acoustic properties of these types of speech, it is not yet clear whether talkers make different phonological adjustments to their speech when they address different target audiences (e.g., non-native and hearing impaired listeners). The present study directly compares the intelligibility of non-native and hearing impaired directed speech. If these two types of clear speech do result in significantly different intelligibility scores for particular listener populations, this has important implications for the methodological designs of clear speech research as well as theoretical implications for the field of clear speech research. li steners). BACKGROUND The phenomenon of speech accommodation, or 'clear speech', has been the subject of a multitude of linguistic studies for decades. However, the implicit assumption that all clear speech studies are investigating the same phenomenon (or similar phenomena) is complicated by several factors. Researchers use a variety of methods to elicit what they all term 'clear speech.' Variously, researchers elicit speech directed toward children (e.g., Bradlow et al, 2003; Uther et al, 2007; Biersack et al., 2005), the hearing impaired (e.g., Picheny et al. 1985; Picheny, Durlach, and Braida 1986 & 1989; Ferguson 2004 & 2007; Krause and Braida 2002 & 2003) and non-native speakers of English (e.g., Scarborough et al., 2007), of which the latter two are the focus of the present study. In other words, talkers are being asked to do different things across studies, and yet they are all referred to under the umbrella term of 'clear speech.' This issue may be particularly important when we consider that other studies, interested in a larger scope investigation of clear speech, do not distinguish between different target audiences, resulting in a blurring of the distinction between these different types of clear speech (e.g., Bradlow and Alexander, 2006; Bradlow and Bent, 2002; Bradlow et al, 2003). These studies typically ask speakers to address an imagined "native listener with hearing loss or a non-native speaker learning your language." (See Table 1 for elicitation methods across studies.) This final approach to elicitation raises an especially important question: do non-native directed speech (speech addressed to a non-native listener, or NNDS) and aI, aI, aI., ai. aI., ofthe aI, Table 1. Elicitation methods across studies Imaginary Listener Hearing impaired listener Non-native listener Real Listener Goberman & Elmer, 2005 Uther et al., 2007 2004; Picheny et al., 1985 et al., 2005 Bradlow & Alexander, 2006; Bradlow & Bent, 2002; Bradlow et al, 2003 4 I. Ferguson, aI., aI., Biersack ct ai, aI., 5 speech directed toward the hearing impaired (HIDS) differ importantly from each other? If so, this method may result in the elicitation of different registers of clear speech or a mix of registers and the practice of making comparisons across speech studies using different elicitation techniques may be problematic. Alternatively, if it appears the two registers do not differ significantly, this could further support the growing idea in the literature that although listeners have very different needs, speakers are not necessarily good at producing clear speech to accommodate these needs. Additionally, the tendency in clear speech studies to treat the hearing impaired and non-native populations as homogeneous may also be problematic. Clearly, great diversity exists in both of these populations and in the concept that speakers have of them, both of which may result in a variety of uncontrolled variables in the production of HIDS and NNDS. Although this issue is not the main focus of the present study, it warrants important consideration when interpreting the results of studies like this one, and may seriously limit the ability to generalize results across studies and to the larger population. Several studies have investigated the notion that native and non-native speakers do indeed have different communicative needs. In Bradlow and Bent (2002), the researchers elicited conversational and clear speech from native speakers ("read as if speaking to a listener with hearing loss or from a different language background") and compared the intelligibility of the two conditions for both non-native and native regular-hearing listeners. Results showed a much greater clear speech benefit (the intelligibility difference between clear and conversational speech) for native listeners than for non-natives. In other words, native and non-native listeners benefit from different phonological adjustments and have different communicative needs and if NNDS and ofHIDS ofthe ofthe regularhearing nonnatives. different ifNNDS 6 HIDS do not differ, this means that speakers do not recognize the differing needs of the two populations and instead treat them as if they have the same communicative needs. Therefore, as it stands, the term 'clear speech' is largely ambiguous, as it has not been consistently defined across studies and has been invoked to describe speech directed toward different audiences. In one of the earliest clear speech studies, Picheny et al. (1985) provided no explicit definition for clear speech, saying only that "an individual can increase his/her intelligibility" when he/she attempts or is instructed to speak more clearly (p.96). Later studies carry on in this tradition, treating clear speech as a natural phenomenon- something a native speaker can simply do as a part of his/her native competence. Some effort has been made to define clear speech in terms of its acoustic characteristics (e.g., Scarborough et al., 2007) but the number of studies focusing on this is limited and thus far only a few preliminary observations have been made, primarily on the subject of speech rate and prosody (Biersack et al., 2005; Scarborough et al., 2007; Uther et al., 2007). In addition to the use of different methods to elicit clear speech, another methodological complication centers around the methods used to elicit clear speech; most of the above studies use an imaginary listener. (In other words, talkers are not actually talking to anyone when they produce the speech used as stimuli). However, a small number of studies have used a real confederate listener (e.g., Goberman and Elmer, 2005; Uther et al., 2007); in other words, a target listener sitting in the room with the talker. As it turns out, there are problems with both of these methods of elicitation. Scarborough et al. (2007) found that real versus imagined listener conditions have a significant impact on the clear speech effect. More specifically, the Scarborough study examined the effects of ofthe ai. hislher carryon phenomenon-- ofhislher aI., aI., aI., aI., aI., ofthese ai. two different variables (native/nonnative and real/imagined listeners) on the production of clear speech and found that many features that are typical for non-native directed speech, such as slower speech rate and hyperarticulation, are exaggerated when the task involves not a real but an imagined listener. The use of a real confederate listener is equally problematic because it introduces many extraneous variables. For example, the listener may react differently to each speaker. Although this issue of elicitation method is not immediately relevant to the fundamental question in the present study, it brings about important methodological considerations, which are addressed in detail for the present study in the methods section below. Before considering the primary question of the present paper, whether native speakers make different phonological adjustments when producing clear speech for non-native and hearing impaired listeners, it is important to evaluate the more basic claim that NNDS and HIDS are indeed each a register of clear speech in that they are listener-directed and differ significantly from speech directed toward a normal-hearing, native speaking adult (in many research studies, this is termed adult-directed speech, or ADS, which in the context of the present paper is synonymous with 'conversational' speech). In other words, do talkers do different things when they are talking to an audience with specific needs (e.g., non-native or hearing impaired listeners) than when they are talking to normal hearing native speaking adults? Non-native directed speech A great deal of acoustic evidence suggests that the differences between non-native and adult directed speech are significant. They tend to differ in terms of speed, with NNDS having fewer phones per second than ADS (Scarborough et al., 2007) and 7 nonnative listenerdirected nonnative aI., 8 fewer syllables per second (Biersack et al., 2005), greater pause duration (Biersack et al., 2005), and longer vowels (Scarborough et al., 2007; Biersack et al., 2005). Furthermore, Uther et al., (2007) found that NNDS, but not ADS, has a greater vowel area, indicating the use of an expanded vowel space and hyperarticulation. Scarborough et al. (2007) examined several of the above acoustic characteristics of NNDS and concluded it is indeed a subtype of the broader 'clear speech'. Intelligibility studies provide another source of evidence to support the hypothesis that NNDS is different from ADS. Numerous studies have shown that non-native listeners experience an intelligibility benefit for speech directed toward non-native vs. speech directed toward native listeners (e.g., Bradlow and Bent, 2002; Bradlow and Alexander, 2006). Hearing impaired directed speech The acoustic properties of HIDS also differ systematically from ADS. HIDS sentences are, on average, about twice as long as ADS sentences and HIDS is characterized by fewer segment deletions and reductions than ADS (Picheny et al. 1985). Other phonological processes also differ between the two modes, including more epenthesis in HIDS (Picheny et al., 1986). Longer and more numerous pauses appear, and lengthened segments (Picheny et al. 1986; Ferguson, 2007). Picheny et al. (1986) also noted a tendency toward a wider range in fundamental frequency in HIDS than in ADS and, interestingly, Ferguson (2004) found that females are more intelligible in HIDS, but not in ADS. As with non-native listeners, hearing impaired listeners perform better on intelligibility tasks when listening to hearing impaired directed speech versus speech directed toward normal hearing listeners (Picheny et al., 1985). aI., aI., aI., aI., aI., ai. ofNNDS Rearing RIDS RIDS RIDS ai. RIDS aI., ai. ai. RIDS RIDS, aI., 9 It is important to note that the HIDS studies discussed here actually include two different types of listener populations: some actually use hearing impaired native listeners, while others simulate impairment by playing speech-in-noise stimuli for normal hearing native listeners. This is not a distinction relevant to the present discussion, however, as a similar magnitude of (clear speech) benefit has been found for both populations (see Picheny et al., 1985 for sentence stimuli and Uchanski and Choi, 1996 for single-word stimuli). Keeping in mind this evidence that NNDS and HIDS are both types of clear speech, we can return to consider the primary research question: do these two registers differ importantly from each other? Most of the evidence in the affirmative is based on acoustic analysis demonstrating that NNDS and HIDS have different acoustic characteristics. For example, NNDS, but not HIDS, has increased mean pause duration (Biersack et al., 2005), increased vowel duration and fewer phones per second (Scarborough et al., 2007). HIDS, on the other hand, is characterized by an increased max fundamental frequency, increased pitch range (Biersack et al., 2005), higher pitch, exaggerated fundamental frequency contours and higher emotional affect (Uther et al., 2007). Uther et al. (2007) hypothesize that the two types of clear speech are each motivated by the different needs that the speaker identifies for the target listener; non-native listeners, but not hearing impaired listeners, are perceived to be in need of linguistic instruction, motivating the speaker to make different accommodations for NNDS than for HIDS. By examining some of these talker-related characteristics of speech, it becomes clear that NNDS and HIDS have different acoustic characteristics, meaning that talkers oflistener ofthe nonnative 10 know these different audiences have different communicative needs and are attempting to accommodate them. However, some evidence indicates that these accommodations are not necessarily helpful to the listener. In other words, what speakers do might not be what improves the listener's ability to comprehend speech. Several studies have addressed this issue by examining some specific phonological adjustments and their effect on intelligibility (e.g., Uther et al., 2007; Scarborough et al., 2007; Biersack et al., 2005; Smith, 2007). These studies provide substantial evidence about the specific changes that speakers make during production of different types of clear speech, showing that different processes are involved in the production of NNDS and HIDS. It remains to be seen, however, whether these adjustments have any practical effect on intelligibility. Furthermore, it appears that some adjustments may negatively affect intelligibility. Ferguson (2007) found that when addressing hearing impaired listeners, speakers sometimes altered their pitch so that it centered around the range where listeners had the most hearing loss, therefore making more difficult for listeners to perceive the relevant acoustic signal. Derwing (1990) reported that, in a conversation with a non-native listener, those speakers who slowed their speech the most (relative to their normal speaking rate) were in fact the least intelligible. This result is somewhat unexpected, and counter to what talkers believe will help the listener. Of course, a slower speech rate may be confounded with some other factors, such as hyperarticulation on the part of the speaker and requiring longer memory retention on the part of the listener; therefore, although they appear to be correlated, we cannot necessarily attribute a lower level of intelligibility to a slower speech rate. In an examination of the methodologies in studies of speech rate, Griffiths (1990) found that, compared to conversational speech, fast speech resulted in aI., aI., aI., ofNNDS 11 lower intelligibility scores but slow speech did not result in any sort of intelligibility benefit. This may be a function of the method of slowing speech; Blau (1990) found no intelligibility benefit for slower speech, but did find a benefit when pauses were inserted at prosodic boundaries, providing extra processing time. In terms of rate of speech, few of the phonological adjustments examined thus far are actually helpful to listeners; this problem may extend to other properties of clear speech as well. In summary, studies like Bradlow and Bent (2002) suggest that there is a difference between the needs of the native and non-native listeners, but it is unknown whether speakers are sensitive to these needs, whether they actually attempt to make phonological alterations to their speech to accommodate the needs of non-native listeners, or whether these changes result in a speech intelligibility benefit for non-native listeners and whether this benefit is different from the benefit non-native listeners receive from HIDS. So far, we have seen analyses that focus on talkers and what they do when producing HIDS and HHDS; these studies provide evidence that non-native directed speech and hearing impaired directed speech are distinct registers, and evidence against speakers' ability to accurately improve the intelligibility of speech. This conflicting evidence makes it difficult to form a hypothesis about the characteristics of each register and whether they will result in different intelligibility benefits for non-native listeners. In addition, although we know quite a bit about the acoustic characteristics of NNDS and HIDS, no researcher to our knowledge has performed a listener-focused analysis that directly compares the intelligibility of HIDS and NNDS for listeners, or in other words, that measures the effect of the phonological changes that speakers make on intelligibility. In that vein, the research question for the present study is as follows: ofthe conflicting ofNNDS ofthe 12 Research question Do non-native English listeners perform differently on a transcription task when listening to speech produced for non-native listeners versus speech produced for hearing impaired listeners? If results show a difference in intelligibility of NNDS and HIDS for non-native listeners (as measured by transcription accuracy), this might be taken as evidence that speakers do make importantly different accommodations when addressing non-native listeners vs. hearing impaired listeners and, crucially, that these accommodations do affect intelligibility for non-native listeners. However, if results show no difference in intelligibility, the difference in accommodations that speakers make for non-native versus hearing impaired listeners may not significantly affect the level of intelligibility for non-native listeners. perfonn Ifresults ofNNDS affect nonnative METHODS Participants Participants were recruited from the population of students enrolled in ESL classes at the University of Utah in spring 2009. The sample was not controlled for age, gender, or time spent learning English, as these variables have been found to have no significant effect on performance in a listening experiment (Ferguson, 2004 & 2007). Ferguson did find a significant effect of amount of time spent in an English-speaking environment (length or residence, or LOR), but in order to make the results more generalizable, the listener sample was not controlled for time spent living in an English-speaking environment or language background. Thirty-two listeners participated in this study. Of these, one was excluded because she failed to complete at least 90% of the transcription task (i.e., left more than three response lines completely blank) and one was excluded because he reported a hearing disorder. Of the 30 remaining participants, all were normal-hearing non-native English speakers enrolled in writing classes in the ESL or ELI (English Language Institute) programs at the University of Utah. Participants came from a range of different language backgrounds, as follows: Arabic (n=5), Mandarin Chinese (n=4), Japanese (n=5), Korean (n=l 1), Persian (n=l), Portuguese (n=l), Russian (n=2), and Taiwanese (n=l). They ranged in age from 17 to 36 years with a mean of 22.47 and had 1-15 years of classroom instruction in the English language with a mean of 8.4 years. Length of residence ranged Englishspeaking 11), I), I), 14 from one month to four years and eight months. The mean LOR was one year. Stimuli The stimuli were designed to be suitable for a sentence transcription task for non-native speakers of English. To this end, sentences from Bradlow and Alexander (2007) were selected because they were designed for use with non-native listeners, are more authentic (in comparison with other types of sentences used in listening experiments, like nonsense sentences or frame sentences), and the target (final) word in each sentence is controlled for predictability. Predictability of keywords is a concern because of the rating system utilized in the present study. Typically, in a transcription task, each keyword is counted as one point. However, this could be problematic when keywords differ in their contextual predictability, as they do in natural speech, and therefore highly predictable keywords are worth the same number of points as less predictable keywords. The modified Bradlow and Alexander sentences allow for a comparison between highly predictable keywords and 'standard' keywords with a range of predictability to assess whether predictability does indeed have an affect. To ensure that the sentences did not include any vocabulary that may be unfamiliar to the target population, an ESL teacher who is familiar with the proficiency level of the population of ESL learners from which the participants were sampled participated in a familiarity task, reading all 60 Bradlow and Alexander sentences and indicating any words that could be unfamiliar. She identified words in three sentences, which were removed from consideration as stimuli. A pilot study was conducted six months prior to the present study, and because there was a chance that some participants from the pilot were participants in the present study (although in the end this was not the case), the 19 sentences used in the pilot were also nonnative ofpredictability 15 removed from consideration as stimuli. Of the remaining sentences, 24 sentences with high-probability final words were randomly selected for use as stimuli for the present study. In spring, the plants are full of green leaves. An orange is a type of fruit. A bicycle has two wheels. A downfall of these sentences is that each sentence contains only one target word (the highly predictable final keyword), extending the time required to collect sufficient data. Therefore, two to four additional 'standard' keywords (not controlled for probability) were selected from each sentence, with eight sentences having two keywords, eight having three and eight having four, in addition to the final keyword. Consistent with other studies using a keyword transcription task (e.g., BKB sentences in Bradlow and Bent, 2002), only open-class words were selected. These keywords had a range of levels of predictability. Total number of words and number of keywords were counterbalanced across speaking conditions. See Appendix A for a list of sentences. Four female native English speakers between the ages of 53 and 63 with no speech or hearing impediments were recruited as talkers. All had lived in the Salt Lake City area for at least 10 years and all were originally from Utah. The talkers produced each of the 24 sentences three times for each of the three speaking conditions (conversational, non-native directed and hearing impaired directed) for a total of nine repetitions per sentence. Talkers were not informed ahead of time that they would be producing all three types of speech. For each speaking condition, the middle of the three tokens was selected as a stimulus. Talkers were divided randomly into two groups of two talkers each. Both groups produced the conversational speaking style first and talker group determined order of the other two speaking conditions; talker group A produced ~ sufficient oflevels ofthe oftwo 16 speech for the non-native directed speaking condition first, then for hearing impaired directed; talker group B produced speech for hearing impaired speaking condition first, then for non-native. As mentioned in the background section, different studies use different methods of elicitation, none without problems. In the present study, talkers viewed a video depicting a target listener reading a scripted passage and, after watching the video, they were instructed to read sentences "as though talking to the woman in the video." For the NNDS speaking condition, the video showed a female native of China who spoke in Chinese-accented English; talkers were explicitly informed of the speaker's language background. For the HIDS speaking condition, the video showed a 57-year-old female; talkers were explicitly informed that the speaker in the video was hearing impaired. See Appendix B for the text of the scripted passage. Although we can find no evidence that the video technique used in this study has been used in any other study, this technique was chosen for the present study because the alternatives, a confederate or an imaginary listener, pose problems discussed above. The present video method presents a compromise between the two in an attempt to avoid the downfalls of both methods. No video was used for the conversational speaking condition. Rather, talkers were instructed to "speak as though you are talking casually with a close friend." Clearly, using two different styles of elicitation (a video for the NNDS and HIDS speaking conditions and an imaginary listener for the conversational speaking condition) is not ideal because it introduces a confound into the study design. However, it is the convention of clear speech studies to elicit conversational speech in this manner ("speak as though you are talking casually with a close friend") and because each participant has 17 a different background, it is impossible to depict "a close friend" in a video. During the elicitation task, the sentences were displayed one at a time random order on a computer screen, with three practice sentences preceding the list of 24 test sentences to allow talkers to become familiar with the task and one filler sentence at the end to avoid list intonation, for a total of 28 sentences. The talkers read the sentences aloud, speaking into a microphone placed on the table in front of them, and the speech was recorded in mono at a sampling frequency of 32000 Hz using the PRAAT sound recording software (developed at the Institute of Phonetic Sciences at the University of Amsterdam, copyright by Paul Boersma and Paul Weenink). Previous studies have shown amplitude to be a relevant factor for hearing impaired directed speech, and so it was crucial to maintain amplitude within subjects. However, because amplitude differed greatly across talkers, with one talker (D) producing speech at significantly lower amplitude and one talker (L) producing speech at significantly higher amplitude, amplitude was normalized across talkers. Since the conversational speaking condition speech can be viewed as a baseline for amplitude, normalization was achieved by multiplying all tokens in all speaking conditions produced by talkers D and L by a constant factor, making the amplitude of the conversation speaking condition for all four talkers more similar (58-63 dB) without altering the relative differences in amplitude between the three speaking conditions within each speaker. in of24 18 Procedures Participants were tested in a quiet classroom in groups of one to six. Audio files were played in random order at a comfortable listening volume. Participants heard two practice sentences produced by a fifth, nontest speaker, followed by the 24 test sentences in random order (eight sentences from the conversational speaking condition, eight from the non-native directed speaking condition, and eight from the hearing-impaired directed speaking condition). Of the eight sentences in each speaking condition, two were produced by each of the four talkers. Participants were randomly assigned to one of three counterbalancing groups and talker group and speaking condition were counterbalanced across groups. Each sentence was played once, after which the participants transcribed in English what they heard on a numbered response sheet. After the experiment, participants completed a questionnaire about their language background and experience with the English language. Three raters independently scored one third of participant responses for number of keywords transcribed correctly, after which it became apparent that all three raters agreed on participant responses 100% of the time and therefore the remainder of the rating task was carried out by only one of the three raters. Two scores were computed for each sentence: proportion of standard keywords correct (excluding final keywords) and proportion of highly predictable final keywords correct. Ofthe ofthree ofthe RESULTS Proportion of keywords correct was calculated for all participants. An 4-way ANOVA with counterbalancing group (3 levels) as a between-subjects variable and speaking condition (3 levels: Conversational, HIDS, NNDS), talker (4 levels: D, J, L, S) and keyword type (2 levels: standard keyword and highly predictable final keyword) as within-subjects variables revealed a significant effect of speaking condition (F(2,54)=4.429, p= 017, partial eta squared=. 141). Critically, planned pairwise comparisons on percent correct for standard keywords showed no significant difference in proportion correct between HIDS stimuli (x =.771, cr =.171) and NNDS stimuli (x=.780, cr=. 161) (F(l,29)=.133, p=.718, partial eta squared=.005). Participants performed significantly better on HIDS stimuli than on conversational stimuli (x = 669, <T=191) (F(l,29)=14.983, p=.001, partial eta squared=.341) and better on NNDS stimuli than on conversational stimuli (F(l,29)=21.896, p<.0005, partial eta squared=.430) (see Figure 1). The four-way ANOVA also showed a significant effect of talker (F(3,81)=6.818, p<.0005, partial eta squared=.202); follow-up analyses revealed a significant effect of talker for standard keywords (F(3,81)=l 1.940, p<.0005, partial eta squared=.292), but not highly predictable keywords (F(3,87)=.357, p=.784, partial eta squared=.012), and therefore pairwise comparisons were conducted on the effect of talker for standard keywords only (see Figure 2). They showed a significant difference in proportion correct ANOV A 2,S4)=p=.OI7, x=.771, 0"=.0"=.I,OOS). (x =.0"=.191) 1,29)=OOI, 1 ,29)=OOOS, oftalker OOOS, 11.940, OOOS, 3S7, OI2), 20 1 .oo • • .90- .80- Convo HIDS Figure 1. 1.0 .9- D J L S 2. of talker 1. 00 ',----------------------------------, .80 .70 .60 .50 .40 .30 .20 .10 0.00 "-__ _ NNDS J. Effect of speaking condition on proportion correct for standard keywords. 1.0,--------------------------------------, .9 .8 .7 .6 .5 .1 s Figure Effect on proportion correct for standard keywords. 21 between talkers S and D (F(l,29)=26.523, p<.0005, partial eta squared=.478), S and J (F(l,29)=47.594, p<.0005, partial eta squared=.621) S and L (F(l,29)=10.791, p=.003, partial eta squared=.271) and talkers J and L (F(l,29)=4.321, p=.047, partial eta squared=. 130). Results showed no significant difference between talkers D and L (F(l,29)=.267, p= 113, partial eta squared=.084) nor talkers D and J (F(l,29)=1.321, p=.260, partial eta squared=.044). Although there was no significant two-way interaction between talker and speaking condition (F(6,162)=1.508, p=.179, partial eta squared=.053) results did show a significant three-way interaction among talker, speaking condition and counterbalancing group (F(12,162)=3.939, p<.0005, partial eta squared= 226). The ANOVA also revealed a significant effect of keyword type (F(l,27)=l 8.047, p<.0005, partial eta squared=.401). A follow-up analysis revealed that, as expected, participants performed significantly better on the highly predictable keywords than on standard keywords (F(l,719)=23.316, p<.0005, partial eta squared=.031). Although there was no main effect, planned pairwise comparisons were conducted to examine more closely the effect of speaking condition for highly predictable keywords only, with the intent to determine whether the pattern of results was the same for both types of keywords (see Figure 3). Unlike for the standard keywords, these results showed no significant difference in proportion correct between conversational and NNDS (F<1, partial eta squared=.004), conversational and HIDS (F<1, partial eta squared=.003) and HIDS and NNDS (F<1, partial eta squared=.000). The effect of counterbalancing group was not significant (F(2, 27)=2.061, p=.147, partial eta squared=. 132) but there was a significant interaction between talker and Sand 1,29)=1,29)=1,29)=1,29)=1 ,29)=.p=.I13, 1,29)=6, 162)=squared=.ANOV A 1 ,27)= 18.047, 1,719)=I, I, I, OOO). .900- - Convo HIDS NNDS Figure of speaking keywords. 1.0001 1 ~ o U c o 'e o "- ~ .c. :":;; 3. Effect condition on proportion correct for highly predictable 22 23 counterbalancing group (F(6,81)=9.944, p<.0005, partial eta squarecK424). The interactions between speaking condition and counterbalancing group (F(4, 54)= 1.758, p=.151, partial eta squared=.l 15) and between keyword type and counterbalancing group were not significant (F(2,27)=.174, p=.841, partial eta squared=.013). The two-way interactions between talker and keyword type (F(3,81)=6.357, p=.001, partial eta squared=.191) and between speaking condition and keyword type (F(2,54)=4.905, p=.011, partial eta squared=.154) were significant, as were the three-way interactions among talker, keyword type and counterbalancing group (F(6,81)=2.277, p=.044, partial eta squared=.144) speaking condition, keyword type and counterbalancing group (F(4,54)=5.637, p=.001, partial eta squared=.295) and among talker, speaking condition and keyword type (F(6,162)=5.292, p<.0005, partial eta squared=.164). Finally, the four-way interaction between talker, speaking condition, keyword type and counterbalancing group was also significant (F(12,162)=5.851, p<.0005, partial eta squared=.302). Examining individual participant performance offers an additional interesting perspective on the results. The pattern of results for effect of speaking condition varied across participants, with 12 of the 30 participants performing better on HIDS stimuli than NNDS stimuli, and 17 exhibiting the opposite pattern. See Table 2 for a detailed comparison of means for each participant in each speaking condition. squared=.424). p= .151, squared= .115) OOl, Oll, squared= .144) OOl, fourway perfonnance ofthe perfonning Table 2. Mean scores for each participant in each speaking condition Counterbalancing Mean Mean Mean group 1 .8650 .8963 .9588 .6975 .9588 1.0000 3 .8438 1.0000 .8963 6 .6350 .8338 .7925 .6775 .8650 .8025 .7188 .6988 .7088 .8438 1.0000 .9688 .7813 .7713 .7613 .5313 .6350 .7713 .7600 .8338 .8963 .3225 .4475 .6050 1 .7500 .6988 .7613 2 1 .8750 .8225 .7813 1.0000 .9163 .8750 3 .9688 1.0000 .9375 .6663 .5525 .4900 .5938 .7600 .4588 .6775 .4900 .4063 .5213 .5625 .3850 .2925 .4688 .3538 .7500 .8125 .5000 3 1 .5425 .7088 .8963 .7813 .9375 .8125 CO .7813 .9375 .9275 .8650 .9375 .8338 .7913 .8025 .7188 .4175 .5000 .8125 .6775 .5313 .7188 .8125 .7913 .8238 .6775 .8338 .8550 Total .70 .76 .75 24 condition Counter-balancing 9roue Subject Convo. HIDS NNDS 1 1 2 1 1 1 7 1 8 1 9 1 10 1 11 1 12 1 13 1 14 2 2 2 2 4 2 5 2 6 2 7 2 8 2 9 3 2 3 3 3 4 3 5 3 6 3 7 3 8 3 9 DISCUSSION Speaking condition Recall that the main research question was: do non-native English listeners perform differently on a transcription task when listening to speech produced for non-native listeners versus speech produced for hearing impaired listeners? Results showed no significant difference in performance between HIDS stimuli and NNDS stimuli. This could mean that talkers made the same adjustments for both types of speech, but this seems unlikely, given the large number of research studies discussed in the literature review which show that talkers do make different adjustments for different audiences. It is much more likely that talkers did make significantly different adjustments for non-native versus hearing impaired listeners, but that these adjustments did not significantly affect the level of intelligibility for non-native listeners (on a transcription task), indicating that although talkers make different adjustments for hearing impaired vs. non-native listeners and that these adjustments are effective at improving the intelligibility of speech for non-native listeners, talkers are not adept at differentiating between the communicative needs of hearing impaired and non-native listeners. It is also important to specifically note the fact that the adjustments talkers make in HIDS are effective at improving intelligibility for non-native listeners, a nontarget audience with different communicative needs, which is a new finding. nonnative nonnative nonnative different Participants and counterbalancing group Participants were randomly assigned to a counterbalancing group, and the effect of group on proportion correct was not significant, nor was the interaction between counterbalancing group and speaking condition, indicating that all three groups exhibited the same pattern of results as we see overall. However, there was a significant interaction between counterbalancing group and talker, meaning that either because of the experimental conditions or because of the backgrounds of the participants themselves, different counterbalancing groups exhibited a different pattern of results for different talkers. Talkers Considering the general impression that some people are simply more intelligible than others, it comes as no surprise that intelligibility studies often find an effect of talker. For standard keywords in the present study, talker S was significantly more intelligible than all other talkers, and talker J was the least intelligible of the three. This may be in part a function of the amplitude of these talkers' speech. Although amplitude of the conversational speaking condition was normalized across talkers, it is conceivable that this did not remove the influence of recording amplitude on intelligibility, perhaps as a result of background noise being amplified along with the speech signal. However, because there was no main interaction of talker and speaking condition, we can say that although participants performed differently when listening to different talkers, talker had no effect on the pattern of results. 26 ofthe ofthe different because there was no main interaction of talker and speaking condition, we can say that although participants performed differently when listening to different talkers, talker had no effect on the pattern of results. Keyword type Not surprisingly, participants performed significantly better on highly predictable keywords than on standard keywords. In fact, results showed a main effect of speaking condition and of talker for standard keywords, but not for highly predictable keywords, which may indicate a ceiling effect whereby participants performed so well on highly predictable keywords in all speaking conditions and for all talkers that any significant differences in performance between the speaking conditions were 'washed out.' In an attempt to lower overall participant performance on these predictability controlled keywords to, future research will replicate the present study, replacing the highly-predictable keywords with low-predictability keywords. 27 surprisingly. perfonned signi ficantly perfonned perfonnance out. ' perfonnance highJ ypredictable IMPLICATIONS The major finding of the present study is that whatever adjustments speakers make for non-native listeners versus hearing impaired listeners, they do not appear to have a different effect on intelligibility for non-native listeners. In other words, more broadly, even when speakers attempt to make adjustments to their speech to help the listener, they do not seem to be helping intelligibility. This has important implications for the effectiveness of 'clear speech' and the assumption that clear speech is a natural phenomenon, or something a native speaker can simply do as a part of his/her native competence. In addition, considering the results from the present study, it seems that the elicitation method which instructs talkers to 'speak as though you are talking to a non-native or hearing impaired listener' may actually be targeting 'clear speech' itself, and is not at risk of confusing different types of clear speech or eliciting some mix of HIDS and NNDS registers. IMPLICA nONS ofthe ofhislher nonnative LIMITATIONS One major limitation of the present study centers around task effects; intelligibility was measured based on responses from a transcription task, and since researchers have found that language learners' performance is affected, often greatly, by task type, it is difficult if not impossible to generalize learners' performance on a transcription task to other types of tasks (Brindley, 2005). A second limitation arises from the method of eliciting and by extension the definition of 'conversational' speech. The problem here is two-fold; for one, for the HIDS and NNDS speaking conditions the target listener was clearly a stranger, however for the conversational speaking condition talkers were asked to speak as though talking to a 'close friend,' which introduces the variable of familiarity with the target listener which may have unknown effects on production. Secondly, although the target listeners in the HIDS and NNDS speaking conditions were clearly defined (talkers saw and heard the listeners and were given a few details about their backgrounds), talkers completely imagined the target listener for the conversational speaking condition, leaving the target listener unclearly defined for the researcher and perhaps even for the talker; talkers may have been imagining a man or a woman, someone old or young, a native or non-native speaker, etc. and these factors may have had an unintended effect on production. perfonnance perfonnance FUTURE DIRECTIONS Future directions should include replicating the present study with hearing impaired participants and should compare the acoustic characteristics and effect on intelligibility of speech directed toward ".. .a hearing-impaired or non-native listener" with HIDS and NNDS. In order to better identify task effects, future studies should also use a variety of tasks rather than transcription only, and should examine the phenomena of 'clear speech' in languages other than English. Finally, planned future research also includes an acoustic analysis of the stimuli from the present study as well as qualitative analyses of the talkers' knowledge of target-listeners communicative needs and thought process as they were producing the stimuli. These qualitative analyses are especially important considering the diversity of the concept of the 'non-native speaker' and the problematic tendency in research studies such as this one to treat the concept as homogeneous. toward" ... APPENDIX A STIMULI 1. Elephants are very big animals. 2. A chair has four legs. 3. Red and green are colors. 4. A bicycle has two wheels. 5. People wear gloves on their hands. 6. Cut the meat into small pieces. 7. The opposite of hot is cold. 8. Children wear scarves around their necks. 9. The girl wears shoes on her feet. 10. The team was trained by their coach. 11. She laid the meal on the table. 12. An orange is a type of fruit. 13. The lady wears earrings in her ears. 14. She looked at herself in her mirror. 15. A quarter is worth twenty-five cents. 16. Last night, they had beef for dinner. 17. Bob wore a watch on his left wrist. 18. Monday is the very first day of the week. 19. The sick woman went to see a doctor. 20. In spring, the plants are foil of green leaves. 21. The lady uses a hairbrush to brush her hair. 22. After my bath, I dried off with a towel. 23. My clock was wrong so I got to school late. 24. The pan that was in the oven is very hot. Note: Underline signifies a standard keyword: bold signifies a final keyword. APPENDIX A 1. Elephants are very hlg 2. A chair has four legs. 3. Red and green are 4. A bicycle has two wheels. 5. People wear gloves on their 6. Cut the meat into small pieces. 7. The opposite of hot is 9. The girl wears shoes on her Qy 11. She laid the meal on the 12. An orange is a ~ of fruit. 14. She looked at herself in her five 16. Last night, they had beef for 17. Bob wore a watch on his left Ym the the plants are full lady uses a to brush her hair. After my bath, I dried off with a towel. 23. My clock was wrong so I !ill! to school late. 24. The pan that was in the oven is Ym hot. Note: Underline signifies a standard keyword; bold signifies a keyword. APPENDIX B SCRIPTED PASSAGE FOR TARGET LISTENER VIDEO Hi! I'm Sara. Ijust moved here last week, and Salt Lake is great! Yesterday, I went cross-country skiing and ate dinner at Blue Plate Diner. Next week, I want to go to a Jazz game and see the opera. I really like it here in Utah. APPENDIXB J'm Sara. fjus! cross-country J 1 REFERENCES Blau, Eileen K. 1990. The effect of syntax, speed and pauses on listening comprehension. TESOL Quarterly 24. 746-753. Bradlow, Ann R., and Jennifer A. Alexander. 2007. Semantic-contextual and acoustic-phonetic enhancements for English sentence-in-noise recognition by native and non-native listeners. Journal of the Acoustical Society of America 1214. 2339-2349. Bradlow, Ann R., and Tessa Bent. 2002. The clear speech effect for non-native listeners. Journal of the Acoustical Society of America 1121. 272-284. Bradlow, Ann R., and David B. Pisoni. 1999. Recognition of spoken words by native and non-native listeners: Talker-, listener- and item-related factors. Journal of the Acoustical Society of America 1064. 2074-2085. Biersack, Sonja; Vera Kempe, and Lorna Knapton. 2005. Fine-tuning speech registers: A comparison of the prosodic features of child-directed and foreigner-directed speech. ISCA Workshop on Plasticity in Speech Perception, ed. by Valerie Hazan and Paul Iverson, 2401-2404. London: ISCA Archive. Bradlow, Ann R.; Nina Kraus, and Erin Hayes. 2003. Speaking clearly for children with learning disabilities: Sentence perception in noise. Journal of Speech, Language and Hearing Research 46. 80-97. Brindley, Geoff. 2005. Issues in language assessment. Oxford handbook of applied linguistics, ed. by Robert B. Kaplan, William Grabe, Merrill Swain, and G. Richard Tucker, 459-469. Oxford: Oxford University Press. Derwing, Tracey M. 1990. Speech rate is no simple matter: Rate adjustment and NS-NNS communicative success. SSLA 12. 303-313. Ferguson, Sarah Hargus. 2004. Talker differences in clear and conversational speech: Vowel intelligibility for normal hearing listeners. Journal of the Acoustical Society of America 16(4). 2365-2373. Ferguson, Sarah Hargus. 2007. Talker differences in clear and conversational speech: Acoustic characteristics of vowels. Journal of Speech, Language, and Hearing Research 50. 1241-1255. 24.746-acousticphonetic nonnative ofthe ofthe Iverson; NSNNS 12.303-ofthe 34 Goberman, Alexander M., and Lawrence W. Elmer. 2005. Acoustic analysis of clear versus conversational speech in individuals with Parkinson disease. Journal of Communication Disorders 38. 215-230. Griffiths, Roger. 1990. Facilitating listening comprehension through rate-control. RELC Journal 21(1). 55-65. Krause, Jean C , and Louis D. Braida. 2002. Investigating alternative forms of clear speech: The effects of speaking rate and speaking mode on intelligibility. Journal of the Acoustical Society of America 112(5). 2165-2172. Picheny, Michael A.; Nathaniel I. Durlach, and Louis D. Braida. 1985. Speaking clearly for the hard of hearing I: Intelligibility differences between clear and conversational speech. Journal of Speech and Hearing Research 28. 96-103. Picheny, Michael A.; Nathaniel I. Durlach, and Louis D. Braida. 1986. Speaking clearly for the hard of hearing II: Acoustic characteristics of clear and conversational speech. Journal of Speech and Hearing Research 29. 434-446. Picheny, Michael A.; Nathaniel I. Durlach, and Louis D. Braida. 1989. Speaking clearly for the hard of hearing III: An attempt to determine the contribution of speaking rate to differences in intelligibility between clear and conversational speech. Journal of Speech and Hearing Research 32. 600-603. Scarborough, Rebecca; Jason Brenier; Yuan Zhao; Lauren Hall-Lew and Olga Dmitrieva. 2007. An acoustic study of real and imagined foreigner-directed speech. Saarbriicken 6(10). 2165-2168. Smiljanic, Rajka, and Anne R. Bradlow. 2007. Production and perception of clear speech in Croatian and English. Journal of the Acoustical Society of America 118(3). 1677- 1688. Smith, Caroline L. 2007. Prosodic accommodation by French speakers to a non-native interlocutor. Saarbrucken 6(10). 1081-1084. Uchanski, Rosalie M., and Sunkyung S. Choi. 1996. Speaking clearly for the hard of hearing IV : Further studies of the role of speaking rate. Journal of Speech and Hearing Research 39(3). Uther, Maria; Monja A. Knoll, and Denis Burnham. 2007. Do you speak E-N-G-L-I-S-H? A comparison of foreigner- and infant- directed speech. Speech Communication 49. 2-7. c., Nathaniel!. 28.96-Nathaniel!. Nathaniel!. 32.600-SH? 49.2-
Reference URL	https://collections.lib.utah.edu/ark:/87278/s63f549h