Encoding of dynamic pitch by the human auditory nerve

Publication Type	honors thesis
School or College	College of Health
Department	Communication Sciences & Disorders
Faculty Mentor	Skyler G. Jennings
Creator	Thompson, Kristin
Title	Encoding of dynamic pitch by the human auditory nerve
Date	2022
Description	The way the auditory system processes the pitch of a sound has been studied by measuring brain activity in response to stimuli that have clear acoustic cues for pitch. These auditory-evoked potentials (i.e., the frequency following response [FFR]) describe the encoding of pitch by the auditory brainstem; however, pitch encoding by the human auditory nerve has not been well established. An understanding of the neural coding of pitch by the auditory nerve is essential because the auditory nerve forms the foundation of all auditory information sent to the brain. Neural coding of acoustic cues associated with pitch for the human auditory nerve was measured for normal-hearing young adults who were native English speakers. These measurements involved obtaining recordings for the compound action potential (CAP) in response to three dynamic stimuli (up-sweep, down-sweep, and mix). CAP results were compared to auditory evoked potentials associated with brainstem activity (FFR). Results support the hypothesis that the auditory nerve is sensitive to acoustic pitch cues, and that this sensitivity is similar to that of the brainstem. Future work is needed to determine if the encoding of dynamic pitch by the auditory nerve is enhanced in certain subject populations (i.e., native vs. non-native speakers of tonal languages, musicians vs. non-musicians), as has been observed for the auditory brainstem.
Type	Text
Publisher	University of Utah
Subject	auditory; potential
Language	eng
Rights Management	(c) Kristin Thompson
Format Medium	application/pdf
ARK	ark:/87278/s6ceq2g6
Setname	ir_htoa
ID	2930215
OCR Text	TABLE OF CONTENTS: ABSTRACT, INTRODUCTION, METHODS, RESULTS, DISCUSSION, REFERENCES
INTRODUCTION
Electrocochleography (ECochG) is a technique used to record sound-evoked cochlear or auditory nerve potentials. It is measured from an electrode placed on the round window, promontory, eardrum, or external ear canal (Eggermont, 2017). ECochG can be measured in laboratory animals and in humans (Eggermont, 1976).
Extra-tympanic electrodes are typically used in humans, including electrodes that make contact with the ear canal (e.g., TIPtrode) or eardrum (tymptrode) (Ferraro, 2010). The responses recorded with ECochG include the cochlear microphonic (CM), summating potential (SP), and compound action potential (CAP) (Eggermont, 2017). Another potential recorded by ECochG is the auditory nerve neurophonic (ANN), a phase-locked neural response to the fine structure of the acoustic stimulus; a variant of the ANN has been used to estimate frequency-specific thresholds for low-frequency sounds (Lichtenhan et al., 2013). The CM and SP are thought to be generated by the cochlear hair cells (inner hair cells [IHCs] and outer hair cells [OHCs]); however, recent research suggests that the auditory nerve contributes, in part, to the SP (Pappa et al., 2019). The CAP and the ANN are generated by the auditory nerve fibers (Dallos et al., 1982). Electrocochleography is often applied clinically in objective audiometry and in the assessment of disorders such as Meniere’s disease and auditory neuropathy (Eggermont, 2017). The technique can be used to study auditory nerve (AN) encoding of sound in the normal (Eggermont, 1976) and impaired (Eggermont, 1977) auditory systems. Although the encoding of transient stimuli, such as clicks, has been well established, it is unclear to what extent ECochG will reveal how the human cochlea and AN encode the steady-state acoustic cues associated with the temporal fine structure and temporal envelope of sound. The frequency-following response (FFR) reflects phase-locked neural activity synchronized to the temporal fine structure of the waveform (Krishnan et al., 2004). An FFR may be elicited by sounds including tone bursts, two-tone stimuli, binaural stimuli, and sounds whose spectra vary dynamically over time (Krishnan, 2021).
The FFR is a noninvasive method of measuring how the brain encodes sound frequency for frequencies below 1500 Hz. This method is often used to study the encoding of sound frequency by the human auditory system (Krishnan et al., 2004). The FFR has been used to study pitch, which is defined as the aspect of auditory perception whereby sounds can be categorized on a scale from low to high (Moore & Gockel, 2011). The perceived pitch of complex sounds corresponds to the most frequent interspike interval in the AN at any given time (Cariani & Delgutte, 1996). The FFR shows prominent peaks at the fundamental frequency and the first harmonics (Krishnan et al., 2004). The scalp-recorded FFR is thought to carry pitch-relevant information originating from temporal discharge patterns of neurons in the rostral brainstem (Krishnan et al., 2009). Research on the encoding of the acoustic cues of pitch has focused on evoked potentials that are likely generated by neurons in the brainstem and cortex. The FFR reveals pitch encoding for sounds whose pitch is constant over time or whose pitch changes with various velocities and accelerations (Krishnan et al., 2010). Previous studies found that musicians’ brainstems may be differentially tuned by long-term exposure to the pitch patterns inherent to music (Bidelman et al., 2011). Long-term experience with a tonal language may also sharpen the tuning characteristics of neurons along the pitch axis, with enhanced sensitivity to linguistically relevant and rapidly changing sections of pitch contours (Krishnan et al., 2009). Though pitch has been studied using brainstem potentials, the encoding of pitch by the human AN has not been well established. Knowledge of pitch encoding by the AN is essential because the AN carries all auditory information sent to the brain and therefore provides the foundation for pitch encoding.
It is important to understand the encoding of pitch by the AN to determine whether such encoding is affected by language experience, musicianship, hearing impairment, age, and other factors. The aim of this study is to measure the neural coding of acoustic cues associated with pitch for the human AN and compare this coding with that of the brainstem. This aim will be accomplished by measuring the CAP and FFR in response to stimuli with time-varying pitch. The hypothesis of this study is that the results of the CAP and FFR will be similar, consistent with similar encoding of pitch for the AN and brainstem.
METHODS
Participants
Six normal-hearing young adults (female, 19-22 years old) participated in this study. Normal hearing was defined as: 1) pure-tone air-conduction thresholds < 25 dB HL, 2) no more than one air-bone gap > 10 dB for audiometric frequencies tested (250-8000 Hz), and 3) normal middle ear function as determined by tympanometry (i.e., type-A tympanograms). Audiometric testing was completed during an initial intake session. Participants were instructed orally on the study objectives and provided written consent using documents approved by the Institutional Review Board of the University of Utah.
Stimuli
Stimuli consisted of linear upward and/or downward frequency sweeps containing three frequency components and lasting for one second. For the upward frequency sweep, the 300, 600, and 900 Hz components each increased in frequency by 300 Hz over the course of the stimulus. For the downward frequency sweep, the 500, 700, and 900 Hz components each decreased in frequency by 200 Hz. A stimulus that combined both the upward and downward components was also presented (i.e., mixed sweep). All stimuli were presented at 90 dB SPL, and the starting phase for all frequency components was 0 degrees. Spectrograms of the experimental stimuli are displayed in Fig. 1.
Figure 1. Spectrogram of linear frequency sweeps (1 s). “Up” contains three frequency components (300, 600, 900 Hz).
“Down” contains three frequency components (500, 700, 900 Hz). “Mix” is the sum of “Down” and “Up”.
Figure 2. Placement of electrodes (ground, ipsilateral right earlobe, middle forehead (FPZ), and tympanic membrane (TM)).
Subject Preparation and Procedures
Prior to each evoked-potential recording session, otoscopy was conducted to confirm that the ear canals were clear and to determine if cerumen removal was necessary. In the case of excessive cerumen, wax was removed by a clinically trained audiologist. A picture was taken of the eardrum before and after each testing session using a digital video otoscope. Tympanometry was also performed prior to and immediately following each testing session. The scalp and earlobe were prepped (Figure 2) with a disposable alcohol wipe and exfoliated with skin prep gel (NuPrep; Weaver and Company, Aurora, CO). Re-usable disk electrodes for electroencephalography (EEG) were placed on the lower forehead (ground), the middle forehead (active, non-inverting), and the ipsilateral right earlobe (reference, inverting) using electrode paste (Ten20; Weaver and Company, Aurora, CO) and secured in place with medical tape. The impedances for the disk electrodes were maintained below 3 kΩ. Evoked potentials were recorded using Tucker Davis Technologies (TDT) hardware and software, consisting of a bio-amplifier (TDT RA4PA), head stage (TDT RA4LI), signal processor (TDT RZ6), and TDT Synapse software controlled by in-house MATLAB code. Prior to placing the eardrum electrode, the participant was asked to lie horizontally on a massage table within the testing booth. Saline was placed in the ear canal for approximately one minute; the participant then drained the saline by turning their head and dried the excess with a paper towel.
The custom tympanic membrane (TM) electrode (Simpson et al., 2020) was coated with conductive gel (Spectra 360; Parker Laboratories, Inc., Fairfield, NJ) and the tip of the electrode was placed on the eardrum, as verified using a standard otoscope. The impedances measured for the TM electrode ranged from 7 to >20 kΩ; however, previous research suggests that TM electrode impedances are a poor predictor of the recording quality of cochlear potentials (Margolis, 1995; Simpson et al., 2020; Durrant, 1986). Therefore, no attempt was made to maintain TM electrode impedances below a criterion value. The position of the TM electrode was retained by the foam insert that was connected to the transducers for stimulus delivery. A foam insert was not placed in the left (non-test) ear. Ear canal sound pressure was measured simultaneously with electrophysiological recordings in the ipsilateral ear using an ER7C (Etymotic Research, Elk Grove, IL) probe microphone. The foam insert was modified to accommodate the probe microphone tubing. This modification involved creating a bore through the foam using a 1.5 mm manual core drill designed for hearing-aid molds. The tubing of the probe microphone was pulled through a modified foam insert (ER3-14A, 13 mm) for insert earphones (ER3C) until the probe tube tip extended 0-2 mm past the medial surface of the insert. Prior to starting the main experiment, the quality of the TM electrode recordings was verified by recording CAPs in response to a series of clicks presented to the test ear from 40-110 dB peSPL. If the CAP amplitudes were consistent with laboratory normative data (Simpson et al., 2020), the main experiment commenced. Throughout each recording session, participants wore a mechanical oscillator on their ankle, which vibrated every 90 seconds to ensure that each participant remained awake for the duration of the session. When the motor vibrated, the participant clicked a button to stop the vibration.
If the participant failed to stop the vibration, the experiment was paused and a lab assistant ensured that the participant was awake before resuming the recording. The participant was also given an alert button for requesting a break in the recording or speaking to the lab assistant. Of the six participants, only one pressed the button during the experiment. In this case, the participant requested that the experiment be paused so they could reposition.
RESULTS
Temporal Responses to Stimuli
A grand average of responses (TM and forehead electrodes) to the envelope and temporal fine structure recorded for the up-sweep is displayed in Fig. 3 as spectrograms, which are three-dimensional plots of frequency (kHz, y-axis) over time (s, x-axis) with the amplitude indicated by color temperature. Warmer colors (red, yellow) represent larger amplitudes and cooler colors (blue, green) represent smaller amplitudes. The two panels on the top are the brainstem responses (FFR, left; EFR, right), while the bottom two panels represent responses from the cochlear hair cells (CM, left) and the AN (CAP, right). Responses associated with the temporal envelope and the temporal fine structure are indicated by white and red arrows, respectively. The CM responded strongly to the temporal fine structure at each of the three up-sweeping frequencies, while the FFR and CAP responded more weakly. The CAP and EFR were the only measured evoked potentials that responded to the temporal envelope.
Figure 3. Spectrogram representation of the grand average of responses to the temporal envelope and fine structure recorded for the “Up” sweep. The two upper panels (FFR, EFR) refer to the brainstem responses, while the lower two panels (CM, CAP) represent the hair cell and auditory nerve responses, respectively. The CM had the strongest response to the temporal fine structure, whereas the FFR and CAP responded more weakly. The CAP and EFR responded to the envelope.
Cross-Correlation Analysis
The cross-correlation analysis determined the degree of correlation between the evoked responses and the acoustic stimuli. Fig. 4 shows the responses from an example participant compared to the up-sweep stimulus. The evoked potential recordings and acoustic stimulus over time (s) are displayed as red and black lines, respectively. The CM was strongly correlated (r = .77) with the acoustic stimulus; the FFR, CAP, and EFR were weakly correlated with the stimulus for this participant. Table 1 contains the data from all subjects for each acoustic stimulus (up-sweep, down-sweep, mix). Overall, the CM was strongly correlated with each stimulus for nearly all subjects. The FFR, CAP, and EFR were either weakly correlated or not correlated with the up and down frequency sweeps. These potentials were never correlated with the “mix” stimulus.
Figure 4. The responses from an example participant (red) compared to the up-sweep stimulus (black). The CM was strongly correlated with the acoustic stimulus, whereas the FFR, CAP, and EFR had weaker correlations.
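The comparison between a response and the stimulus described in the cross-correlation analysis can be illustrated with a short sketch. The thesis does not state how the coefficients were computed, so everything below is an assumption made for illustration: the helper names (linear_sweep, pearson_at_lag, max_cross_correlation), the 8000 Hz sampling rate, the use of sine phase for the 0-degree starting phase, and the choice to report the largest Pearson coefficient over a range of non-negative lags. The sweep generator follows the stimulus description in the Methods (three components, one second, linear change in frequency).

```python
import math


def linear_sweep(start_hz, delta_hz, duration_s=1.0, fs=8000):
    """Sum of linear frequency sweeps, one per starting frequency.

    Each component starts at its value in start_hz and changes by delta_hz
    over duration_s. Sine phase with a starting phase of 0 is assumed, and
    fs (Hz) is a hypothetical sampling rate, not the one used in the study.
    """
    n = int(round(duration_s * fs))
    rate = delta_hz / duration_s  # sweep rate in Hz per second
    samples = []
    for i in range(n):
        t = i / fs
        # Phase is 2*pi times the integral of the frequency f0 + rate * t.
        samples.append(sum(
            math.sin(2 * math.pi * (f0 * t + 0.5 * rate * t * t))
            for f0 in start_hz))
    return samples


def pearson_at_lag(stimulus, response, lag):
    """Pearson r between stimulus[i] and response[i + lag], for lag >= 0."""
    n = min(len(stimulus), len(response) - lag)
    a = stimulus[:n]
    b = response[lag:lag + n]
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant signal has no defined correlation
    return cov / math.sqrt(var_a * var_b)


def max_cross_correlation(stimulus, response, max_lag):
    """Return (r, lag) for the lag in 0..max_lag with the largest r."""
    return max((pearson_at_lag(stimulus, response, lag), lag)
               for lag in range(max_lag + 1))


# The three stimuli as described in the Methods.
up = linear_sweep((300, 600, 900), 300)     # components rise by 300 Hz
down = linear_sweep((500, 700, 900), -200)  # components fall by 200 Hz
mix = [u + d for u, d in zip(up, down)]     # sum of "Up" and "Down"
```

Calling max_cross_correlation(up, recorded_response, max_lag) on a recorded waveform sampled at the same rate would return the largest coefficient and the lag (in samples) at which it occurs. Restricting the search to non-negative lags assumes that the evoked response follows the stimulus in time.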
CM S1 S2 S3 S4 S5 S6
up 0.66 0.77 0.51 0.75 0.87
down 0.67 0.82 0.57 0.54 0.88 0.81
mix 0.66 0.75 0.29 0.76 0.79
FFR S1 S2 S3 S4 S5 S6
up 0.20 0.24
down 0.27 0.22 -
mix -
CAP S1 S2 S3 S4 S5 S6
up 0.35 0.21 0.26 0.25
down 0.28 0.20 -
mix -
EFR S1 S2 S3 S4 S5 S6
up -
down 0.28 0.21 0.20 -
mix -
Table 1. Data from all participants for each acoustic stimulus (up-sweep, down-sweep, mix). The CM was strongly correlated with the up-sweep and down-sweep for nearly every participant. The FFR, EFR, and CAP were either weakly correlated or not correlated (dash symbol). The mix had no correlations.
DISCUSSION
This study evaluated the hypothesis that the coding of pitch would be similar among neurons in the brainstem and the AN. The hypothesis was evaluated by measuring the CAP, CM, FFR, and EFR. The hypothesized result was that the CAP and FFR would show similar cross-correlation strength with the acoustic stimulus, consistent with the coding from the AN serving as the foundation of pitch information relayed to the brain. The data showed that responses to the temporal fine structure and envelope were stronger for the CAP than for the brainstem potentials (FFR, EFR). This is consistent with the greater phase-locking abilities of the AN compared to the brainstem (Joris et al., 2001). The strong CM responses recorded are consistent with robust encoding of the fine structure by the cochlear hair cells. These results are consistent with the AN serving as the foundation for encoding of pitch information. Moreover, the CM proved to be a useful tool for evaluating the effective signal arriving at the outer hair cells. The CM could also be used in future studies to evaluate the effect of auditory reflexes (medial olivocochlear [MOC] reflex, middle-ear muscle [MEM] reflex) on perception. Cross correlations were weak or absent for the FFR, CAP, and EFR, which suggests that linear frequency sweeps presented at 90 dB SPL are weak elicitors of responses from the AN and brainstem.
These coefficients were considerably smaller than those reported in a previous study by Bidelman and Krishnan (2010). This difference may result from the use of stimuli with multiple frequency components. Other studies (Swaminathan et al., 2008) used stimuli with non-linear frequency sweeps, which may evoke larger FFR amplitudes than linear frequency sweeps. Increasing the level of the stimuli in this study may have resulted in greater CAP, EFR, and FFR amplitudes; however, the stimuli were already presented at a fairly high level (90 dB SPL), and a higher level may be uncomfortable for participants given the duration of the stimulus.
Future Implications
The results of this study support the idea that the effects of dynamic pitch on cochlear hair cells can be assessed by measuring the CM. Based on the comparison of this study with previous studies, the effects of dynamic pitch on neural responses (AN, brainstem) may be best assessed with a single-frequency stimulus that sweeps non-linearly in frequency across time. The encoding of dynamic pitch by the AN may be assessed with the CAP and used to compare pitch encoding across subject populations of interest (musicians vs. non-musicians, native vs. non-native speakers of tonal languages) that have been shown to have enhanced brainstem encoding of pitch.
REFERENCES
Eggermont, J. J. (2017). Ups and downs in 75 years of electrocochleography. Frontiers in Systems Neuroscience, 11, 2.
Eggermont, J. J. (1976). Electrocochleography. In Auditory system (pp. 625-705). Springer, Berlin, Heidelberg.
Ferraro, J. A. (2010). Electrocochleography: a review of recording approaches, clinical applications, and new findings in adults and children. Journal of the American Academy of Audiology, 21(03), 145-152.
Lichtenhan, J. T., Cooper, N. P., & Guinan, J. J. (2013). A new auditory threshold estimation technique for low frequencies: proof of concept. Ear and Hearing, 34(1), 42.
Pappa, A. K., Hutson, K. A., Scott, W. C., Wilson, J. D., Fox, K. E., Masood, M. M., ...
& Fitzpatrick, D. C. (2019). Hair cell and neural contributions to the cochlear summating potential. Journal of Neurophysiology, 121(6), 2163-2180.
Dallos, P., Santos-Sacchi, J., & Flock, Å. (1982). Intracellular recordings from cochlear outer hair cells. Science, 218(4572), 582-584.
Eggermont, J. J. (1977). Compound action potential tuning curves in normal and pathological human ears. The Journal of the Acoustical Society of America, 62(5), 1247-1251.
Krishnan, A., Xu, Y., Gandour, J., & Cariani, P. (2005). Encoding of pitch in the human brainstem is sensitive to language experience. Cognitive Brain Research, 25(1), 161-168.
Krishnan, A., Xu, Y., Gandour, J. T., & Cariani, P. A. (2004). Human frequency-following response: representation of pitch contours in Chinese tones. Hearing Research, 189(1-2), 1-12.
Krishnan, A. (2021). Auditory Brainstem Evoked Potentials: Clinical and Research Applications. Plural Publishing.
Moore, B. C., & Gockel, H. E. (2011). Resolvability of components in complex tones and implications for theories of pitch perception. Hearing Research, 276(1-2), 88-97.
Cariani, P. A., & Delgutte, B. (1996). Neural correlates of the pitch of complex tones. I. Pitch and pitch salience. Journal of Neurophysiology, 76(3), 1698-1716.
Krishnan, A., Gandour, J. T., Bidelman, G. M., & Swaminathan, J. (2009). Experience-dependent neural representation of dynamic pitch in the brainstem. Neuroreport, 20(4), 408-413.
Krishnan, A., Gandour, J. T., & Bidelman, G. M. (2010). The effects of tone language experience on pitch processing in the brainstem. Journal of Neurolinguistics, 23(1), 81-95.
Krishnan, A., Swaminathan, J., & Gandour, J. T. (2009). Experience-dependent enhancement of linguistic pitch representation in the brainstem is not specific to a speech context. Journal of Cognitive Neuroscience, 21(6), 1092-1105.
Bidelman, G. M., Gandour, J. T., & Krishnan, A. (2011).
Musicians demonstrate experience-dependent brainstem enhancement of musical scale features within continuously gliding pitch. Neuroscience Letters, 503(3), 203-207.
Simpson, M. J., Jennings, S. G., & Margolis, R. H. (2020). Techniques for obtaining high-quality recordings in electrocochleography. Frontiers in Systems Neuroscience, 14, 18.
Margolis, R. H., Saly, G. L., & Keefe, D. H. (1999). Wideband reflectance tympanometry in normal adults. The Journal of the Acoustical Society of America, 106(1), 265-280.
Durrant, J. D. (1986, August). Observations on combined noninvasive electrocochleography and auditory brainstem response recording. In Seminars in Hearing (Vol. 7, No. 03, pp. 289-304). Thieme Medical Publishers, Inc.
Swaminathan, J., Krishnan, A., Gandour, J. T., & Xu, Y. (2008). Applications of static and dynamic iterated rippled noise to evaluate pitch encoding in the human auditory brainstem. IEEE Transactions on Biomedical Engineering, 55(1), 281-287.
Name of Candidate: Kristin Thompson
Date of Submission: May 9, 2022
Reference URL	https://collections.lib.utah.edu/ark:/87278/s6ceq2g6