Abstract
Localization of sound sources relies on 2 main binaural cues: interaural time differences (ITD) and interaural level differences. ITD computing is first carried out in tonotopically organized areas of the brainstem nucleus laminaris (NL) in birds and the medial superior olive (MSO) in mammals. The specific way in which ITD are derived was long assumed to conform to a delay line model in which arrays of systematically arranged cells create a representation of auditory space, with different cells responding maximally to specific ITD. This model conforms in many details to the particular case of the high-frequency regions (above 3 kHz) in the barn owl NL. However, data from recent studies in mammals are not consistent with a delay line model. A new model has been suggested in which neurons are not topographically arranged with respect to ITD and coding occurs through assessment of the overall response of 2 large neuron populations - 1 in each brainstem hemisphere. Currently available data comprise mainly low-frequency (<1,500 Hz) recordings in the case of mammals and higher-frequency recordings in the case of birds. This makes it impossible to distinguish between group-related adaptations and frequency-related adaptations. Here we report the first comprehensive data set from low-frequency NL in the barn owl and compare it to data from other avian and mammalian studies. Our data are consistent with a delay line model, so differences between ITD processing systems are more likely to have originated through divergent evolution of different vertebrate groups.
Introduction
Interaural time differences (ITD) originate when a sound comes from one side of the head, arriving at the ipsilateral ear before the contralateral one. Animals and humans rely on ITD for localization of sounds on the horizontal plane and are able to detect ITD of only a few microseconds [Joris and Yin, 2007].
Neurons in the nucleus laminaris (NL) in birds and the medial superior olive (MSO) in mammals first encode ITD in the ascending auditory system. They act as coincidence detectors, firing maximally when the phase of the inputs from both ears is the same [Goldberg and Brown, 1969; Carr and Konishi, 1990]. In order to ‘tune’ a neuron to a specific ITD, a multitude of mechanisms have been suggested that delay the input from one side, thus creating a transmission time mismatch that is compensated for by the matching acoustic ITD [reviewed in Vonderschen and Wagner, 2014]. These mechanisms include differences in the length and/or myelination of input axons [Jeffress, 1948; Cheng and Carr, 2007; Seidl et al., 2010, 2014], precisely timed inhibition [Brand et al., 2002; Grothe et al., 2010], cochlear delays [Shamma et al., 1989; Day and Semple, 2011], asymmetric synaptic rise times [Jercog et al., 2010], asymmetric spectrotemporal tuning of left and right inputs [Fischer et al., 2011], and dynamic changes at the coincidence detection stage itself [Franken et al., 2015].
Data from birds support a time delay system of axonal delay lines first suggested by Jeffress [1948] that creates a topographic array of NL neurons, each responding maximally to sounds with a specific ITD and together forming a map of azimuthal space [reviewed in Ashida and Carr, 2011]. Data from mammals have resulted in an alternative model in which neurons from a given frequency band in the MSO respond maximally to a contralaterally leading ITD that lies outside the naturally heard range (defined by the animal's head size). This places the slope, rather than the peak, of the response curve in the natural ITD range. In addition, neurons in a given tonotopic band have nearly identical response curves, and derivation of a specific azimuthal location then requires a comparison of activity levels between the 2 brainstem hemispheres. This ‘2-channel model’ has been suggested to rely on phase delays created through precisely timed inhibition [reviewed in Grothe et al., 2010].
The intuitive conclusion from these findings is that birds and mammals have evolved different ITD-processing mechanisms [Grothe and Pecka, 2014]. However, the work by Harper and colleagues [Harper and McAlpine, 2004; Harper et al., 2014] on optimal ITD-coding strategies opened up a different interpretation, suggesting that animal head size and the frequency range of coding may be the primary factors that determine the neural code. This can be tested by examining the neural organization in the relevant nuclei as a function of frequency. The barn owl is a prime candidate to address this question. The Jeffress type mechanism it uses for high-frequency ITD coding in the NL is well characterized, undisputed, and consistent with the model prediction by Harper and McAlpine [2004]. However, at frequencies below 3 kHz, this place code model is no longer the clearly optimal solution, and below 800 Hz a change to a population code model was predicted. Low-frequency data are scarce for the barn owl [Wagner et al., 2002, 2007; Carr and Köppl, 2004; Cazettes et al., 2014]. The aim of the present study was to obtain in-vivo recordings from the low-frequency region of the NL to test predictions of optimal coding.
Materials and Methods
Experimental Animals and Preparation
We report data from 11 adult European barn owls (Tyto alba) of both sexes and aged between 4 and 17 months. All protocols and procedures were approved by the authorities of Lower Saxony, Germany (permit No. AZ 33.9-42502-04-11/0337). Animals were anesthetized with an initial dose of ketamine (10 mg/kg) and xylazine (3 mg/kg) via intramuscular injection. Smaller doses of ketamine and xylazine were administered periodically to maintain anesthesia. The depth of anesthesia was constantly monitored via EKG recordings using intramuscular needle electrodes in a wing and in the contralateral leg. Cloacal temperature was monitored and maintained stable at 39°C using a homeothermic blanket system (Harvard Apparatus). The head was firmly held by cementing the skull to a small metal plate connected to a stereotaxic frame (Kopf Instruments, Tujunga, Calif., USA). The skull was opened and the cerebellum aspirated on one side to expose the surface of the brainstem for electrode placement, as guided by visual landmarks.
Electrophysiology and Definition of Recording Types
Recordings were obtained with borosilicate microelectrodes (1.2 mm outer diameter and 0.69 mm inner diameter) filled with either 2 M sodium acetate or artificial cerebrospinal fluid (138 mM NaCl, 2.5 mM KCl, 2.5 mM CaCl2, 1 mM MgCl2, 10 mM HEPES, and 26 mM glucose). Some electrodes were additionally loaded with 5% tracer (10,000 MW dextran labeled with Texas Red). Typical electrode impedances were between 10 and 20 MΩ. Electrodes were positioned under visual control and then advanced into the brainstem remotely using a piezoelectric motor (Burleigh Inchworm). Electrodes were connected to an Intra 767 electrometer (World Precision Instruments, Sunnyvale, Calif., USA).
In early experiments (2 owls), the electrometer was followed by a PC1 spike preconditioner (Tucker Davis Technologies, Alachua, Fla., USA) which amplified and band pass filtered (300-10,000 Hz) the recording. A spike discriminator (SD1; Tucker Davis Technologies) converted neural impulses into transistor-transistor logic pulses for an event timer (ET1; Tucker Davis Technologies), which recorded the timing of the pulses. In parallel, the analog waveforms were fed into a personal computer via an analog-to-digital converter (DD1; Tucker Davis Technologies) with a sampling rate of 48 kHz and a 16-bit resolution. In later experiments, we used a different hardware configuration. The PC1 spike preconditioner was kept in order to provide amplification, but the signal was then passed through a Hum Bug (Quest Scientific Instruments Inc., North Vancouver, B.C., Canada) and into a TDT RX6 multifunction processor. Band pass filtering (50-10,000 Hz) and spike detection were carried out after the signal had been converted from analog to digital (48-kHz sampling rate, 24-bit resolution) using a custom Matlab (vR2012b; MathWorks, Natick, Mass., USA) script.
Single-unit recordings are difficult to obtain in NL and MSO due to the small and variable amplitude of the spikes from neuronal somata [Scott et al., 2005; Funabiki et al., 2011] and the presence of a strong field potential, i.e. the neurophonic [Tsuchitani and Boudreau, 1964; Sullivan and Konishi, 1986]. In order to improve unit isolation, we used the loose-patch technique described by Peña et al. [1996]. For this, a 5-ml glass syringe was connected to the electrode and a slight positive pressure (corresponding to 1 ml) was maintained while advancing the electrode in order to keep its tip clean. When spikes were detected and the presence of a nearby cell was suspected, the positive pressure was released and, if judged necessary, a small negative pressure was applied. On many occasions, this technique greatly improved the isolation of spikes. Subthreshold events were, however, never clearly observed. Well-isolated single units could be held for 20 or more minutes, allowing the full range of measurements. However, stability was a concern, especially for single units, and isolation was sometimes lost without warning. As a practical result, the more time-consuming measurements [characteristic delay (CD) and characteristic frequency (CF) measurements] could not always be recorded.
The type of recording (single unit, multiunit, or neurophonic) was finally defined offline using the recorded analog data. This also defined the response metric analyzed. Traces were classified as spike recordings when they presented consistent action potentials that rose above the background noise and that allowed for flagging using a fixed threshold. Single units were defined as showing no or only very few interspike intervals of 1 ms or below, i.e. within the refractory period. In 2 of 19 single units, the spike sorting script ‘wave_clus’ created by Quiroga et al. [2004] and available from https://vis.caltech.edu/~rodri/Wave_clus/Wave_clus_home.htm was used to separate the response of a single unit within a multiunit spike recording. All responses were tested for a significant neurophonic component using the method described by Köppl and Carr [2008].
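The interspike-interval criterion for single units can be illustrated with a minimal sketch. The function names and the cutoff `max_fraction` are hypothetical: the text above only states that single units showed "no or only very few" intervals of 1 ms or below, so the 1% value is an assumption chosen for illustration.

```python
def short_isi_fraction(spike_times_s, refractory_s=0.001):
    """Fraction of interspike intervals at or below the refractory period."""
    times = sorted(spike_times_s)
    intervals = [later - earlier for earlier, later in zip(times, times[1:])]
    if not intervals:
        return 0.0
    n_short = sum(1 for isi in intervals if isi <= refractory_s)
    return n_short / len(intervals)


def looks_like_single_unit(spike_times_s, max_fraction=0.01):
    """Hypothetical cutoff: accept if at most 1% of intervals are <= 1 ms."""
    return short_isi_fraction(spike_times_s) <= max_fraction
```

A spike train containing one interval of 0.5 ms among four intervals would give a fraction of 0.25 and would be treated as multiunit under this illustrative cutoff.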
Stimulus Generation and Calibration
All recordings were performed in a double-walled sound-attenuating chamber (Industrial Acoustics Corporation, Winchester, UK). Closed, custom-made sound systems were inserted into both ear canals for controlled stimulation. These systems consisted of small earphones (Yuin PK3, Sony MDR-E818) and miniature microphones (Knowles TM-3568, EM-3069, or FG-23329) calibrated using a Brüel and Kjaer microphone (4134; Naerum, Denmark) as the reference. Sound pressure levels (SPL) were then individually calibrated for each ear.
Sound stimuli could be monaural or binaural and were generated separately for both channels by custom-written software and a signal-processing device (AP2 or RX6; Tucker Davis Technologies). Stimuli were fed into the earphones via D/A converters (DD1 or RX6; Tucker Davis Technologies), antialiasing filters (FT6-2 or RX6; Tucker Davis Technologies), and attenuators (PA4 or PA5; Tucker Davis Technologies). All stimuli had a total duration of 50 ms, including 5-ms ramps, and were presented with an interstimulus interval of 120 ms.
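As an illustration of the stimulus timing described above, the following sketch generates a 50-ms tone burst with 5-ms onset and offset ramps for each ear. Several details are assumptions not stated in the text: the raised-cosine ramp shape, the 48-kHz output rate, the ITD being applied to the ongoing phase only (with an undelayed envelope), and the sign convention in which a positive ITD means that the contralateral ear leads. The function names are hypothetical.

```python
import math


def tone_burst(freq_hz, delay_s=0.0, duration_s=0.050, ramp_s=0.005,
               fs_hz=48000):
    """Tone burst with raised-cosine ramps; delay_s shifts the carrier only."""
    n = int(round(duration_s * fs_hz))
    n_ramp = int(round(ramp_s * fs_hz))
    samples = []
    for i in range(n):
        if i < n_ramp:                      # onset ramp
            gain = 0.5 * (1.0 - math.cos(math.pi * i / n_ramp))
        elif i >= n - n_ramp:               # offset ramp (mirror of onset)
            gain = 0.5 * (1.0 - math.cos(math.pi * (n - 1 - i) / n_ramp))
        else:                               # plateau
            gain = 1.0
        t = i / fs_hz - delay_s
        samples.append(gain * math.sin(2.0 * math.pi * freq_hz * t))
    return samples


def binaural_tone(freq_hz, itd_s):
    """Return (ipsi, contra); positive itd_s delays the ipsilateral carrier."""
    ipsi = tone_burst(freq_hz, delay_s=itd_s)
    contra = tone_burst(freq_hz, delay_s=0.0)
    return ipsi, contra
```

Calling `binaural_tone(500.0, 0.0002)` would give two 2,400-sample channels whose carriers differ by 200 µs, i.e. one tenth of the 2,000-µs period at 500 Hz.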
Data Collection Protocols and Analysis
The best frequency (BF), i.e. the frequency that evoked the largest response, was determined by presenting a wide range of frequencies at a fixed SPL of 0-20 dB above the threshold as estimated audiovisually. This test was usually run with identical binaural stimulation; in some cases, however, monaural BF curves were run separately. Randomly inserted silent trials were used to determine the spontaneous rate.
To obtain an estimate of the threshold and the response saturation level, monaural rate-level curves were run at a frequency close to the BF.
Frequency Threshold Curves and CF
Frequency threshold curve (FTC) data were always obtained monaurally. Responses were recorded to a randomly presented matrix of frequencies and SPL, in steps of typically 100 Hz and 5 dB, and over a range of typically 1 kHz and 50 dB. FTC were interpolated from this response matrix after smoothing with a locally weighted algorithm [Köppl, 1997]. For spike recordings, the threshold was defined as a response about 20 spikes/s above the spontaneous rate as determined from randomly inserted silent trials. For neurophonic data, the lowest criterion that gave a coherent curve was used. The frequency at which the criterion response was reached at the lowest SPL defined the CF, and the corresponding SPL defined the threshold at the CF. We also derived, when possible, the Q10 dB and Q40 dB.
Best ITD and Interaural Phase Difference
The best ITD is the ITD that evokes the largest response. The range of tested ITD was ±1 period at or near the BF, in steps of one tenth of a period. The SPL was typically fixed at 0-20 dB above the threshold. For spike recordings, the mean rate was derived at each ITD tested; for neurophonic recordings, we determined the average analog amplitude. A criterion that defined significant response modulation with ITD, i.e. the presence of ITD selectivity, was adopted from Köppl and Carr [2008]. Responses that fulfilled this criterion were fitted with a cosine function at the stimulus frequency to determine the best ITD and the best interaural phase difference (IPD). The best ITD was defined as the ITD closest to 0 µs that elicited a maximum response.
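The cosine fit described above can be sketched as a linear least-squares problem. This is an illustrative reconstruction, not the analysis code used in the study: the parameterization r = offset + b·cos(2πf·ITD) + c·sin(2πf·ITD), the helper names, and the convention that positive values are contralaterally leading are assumptions.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_cosine(itds_s, responses, freq_hz):
    """Fit r = offset + b*cos(w*itd) + c*sin(w*itd) with w = 2*pi*freq_hz.

    Returns (offset, amplitude, best_ipd_cycles, best_itd_s), where the best
    ITD is the fitted peak closest to 0.
    """
    w = 2.0 * math.pi * freq_hz
    basis = [(1.0, math.cos(w * t), math.sin(w * t)) for t in itds_s]
    # Normal equations (A^T A) x = A^T r for the three basis functions.
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
           for i in range(3)]
    atr = [sum(row[i] * r for row, r in zip(basis, responses))
           for i in range(3)]
    offset, b, c = solve_linear(ata, atr)
    amplitude = math.hypot(b, c)
    best_ipd = math.atan2(c, b) / (2.0 * math.pi)  # cycles, within +/-0.5
    best_itd = best_ipd / freq_hz                  # peak closest to 0
    return offset, amplitude, best_ipd, best_itd
```

Because the fitted phase is confined to half a cycle on either side of 0, dividing it by the stimulus frequency directly yields the response peak closest to 0 µs.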
Characteristic Phase and CD
The characteristic phase (CP) and CD were derived by performing ITD tests at several different frequencies for the same unit or neurophonic site. Three to 7 frequencies were used, covering a range of 300-600 Hz around the CF. We determined the best IPD for each frequency as described above and entered them into a linear regression of best IPD as a function of frequency [Yin and Kuwada, 1983]. The y-intercept of this regression corresponds to the CP, and the slope corresponds to the CD. CP values were collapsed into a single cycle (-0.5 to 0.5).
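The regression of best IPD on frequency can be illustrated with a short sketch. The unwrapping step is an assumption: it requires the frequencies to be sorted and the best IPD to change by less than half a cycle between neighboring test frequencies. The function names are hypothetical.

```python
def wrap_cycles(phase):
    """Wrap a phase in cycles into the interval [-0.5, 0.5)."""
    return (phase + 0.5) % 1.0 - 0.5


def unwrap_cycles(phases):
    """Remove whole-cycle jumps between successive phase values."""
    unwrapped = [phases[0]]
    for p in phases[1:]:
        unwrapped.append(unwrapped[-1] + wrap_cycles(p - unwrapped[-1]))
    return unwrapped


def characteristic_delay_and_phase(freqs_hz, best_ipds_cycles):
    """Least-squares line through best IPD (cycles) versus frequency (Hz).

    Returns (cd_s, cp_cycles): the slope is the CD in seconds and the
    y-intercept, collapsed into a single cycle, is the CP.
    """
    ipds = unwrap_cycles(best_ipds_cycles)
    n = len(freqs_hz)
    mean_f = sum(freqs_hz) / n
    mean_p = sum(ipds) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs_hz)
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs_hz, ipds))
    cd_s = sxy / sxx
    cp_cycles = wrap_cycles(mean_p - cd_s * mean_f)
    return cd_s, cp_cycles
```

For example, best IPD values of 0.18, 0.20, 0.22, and 0.24 cycles at 400, 500, 600, and 700 Hz lie on a line with a slope of 200 µs (the CD) and an intercept of 0.1 cycles (the CP).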
Labeling and Histology
Labels were placed iontophoretically at selected recording sites by passing a positive DC current through the electrode. The current amplitude and duration varied between 5 and 500 nA and between 1 and 30 min, respectively. This large variation is due to experimentation to find a set of parameters that resulted in small, specific labels. The set of parameters that yielded the best results was 20 nA for 5 min. At the conclusion of the experiment, the animal was perfused transcardially with 4% paraformaldehyde in phosphate-buffered saline in order to fix the tissue. The brain was extracted and blocked, and the brainstem was cryoprotected by immersion in 30% sucrose in phosphate-buffered saline for 48 h. Fifty-micrometer sections were cut using a cryostat (Leica CM 1950; Leica Biosystems, Wetzlar, Germany) and mounted in Vectashield. Any fluorescent labels were then detected and documented using a Nikon Eclipse 90i epifluorescence microscope with a digital camera attached. After that, sections were remounted and dried on gelatinized slides, counterstained with cresyl violet, dehydrated, and permanently coverslipped with DPX. All sections containing NL were then photographed under standard bright-field illumination. The distance between the medial edge of the NL and the midline, as well as the mediolateral extent of the NL, was measured by carefully following the nucleus' shape in each section. We distinguished medial, caudolateral, and intermediate ‘fold' regions [after Köppl and Carr, 1997] (fig. 1). These measurements were used to create a flattened reconstruction of the NL and to represent the locations of labeled sites in normalized coordinates on the caudorostral, mediolateral, and dorsoventral axes (fig. 1).
Results
We report a total of 129 recordings from barn owl NL, 19 of which were extracellular single-unit recordings (fig. 2a), 10 of which were spike multiunit recordings (fig. 2b), and 100 of which were neurophonic recordings (fig. 2c). An example of each type of recording is shown in figure 2. The neurophonic is an extracellular field potential that mimics the input signal [Tsuchitani and Boudreau, 1964; Weinberger et al., 1970]. This neurophonic potential is unusually strong in the barn owl NL [Sullivan and Konishi, 1986]. The BF recorded ranged from 100 to 3,571 Hz, and 80% (including all single- and multiunit spike recordings) were below 3,000 Hz. These can be considered low frequencies for the owl, since these frequencies are not represented in the main (medial) body of the NL but are rather in its folded and caudolateral regions. Thirty-five percent of the recordings corresponded to frequencies at or below 800 Hz, which is the approximate transition frequency where a change to a 2-channel model was predicted by Harper and McAlpine [2004].
Similarity of Neurophonic and Spike Responses at the Same Site
The presence of a strong neurophonic response was consistent throughout all of the recorded regions. The neurophonic was well modulated as a function of the ITD when the electrode was judged to be inside the cellular region of the nucleus. To test how well neurophonic responses reflected the local neural activity, we analyzed 12 cases of paired recordings where spikes and neurophonics were obtained with the same electrode in close proximity (within 0-150 µm of each other). We determined any mismatch between their best IPD and BF (table 1). The best IPD (as opposed to the best ITD) was chosen to account for the difference in period and thus maximize comparability across sites of very different BF. The BF of the 12 paired recording sites ranged from 100 to 3,500 Hz. Six pairs had best IPD mismatches of 0.1 cycles or less. The remaining 6 had best IPD mismatches between 0.14 and 0.38. These larger IPD mismatches were more often seen at low-frequency (<500 Hz) recording sites than at high-frequency ones (4/7 and 2/5, respectively). Recording sites below 500 Hz also showed larger mismatches between spike and neurophonic BF, i.e. up to 63% of the ipsilateral CF. We suggest that the increased probability of an appreciable mismatch is due to the higher neuron density in low-frequency regions of the NL [Köppl and Carr, 1997].
CF, Thresholds, and Tuning
CF values ranged from 150 to 3,500 Hz. Thresholds were variable, ranging from 13 to 57 dB SPL (fig. 3a). The sharpness of tuning, as measured by the Q10 dB, ranged from 1.5 to 15, with slightly lower values at low frequencies (fig. 3b). The spontaneous rate of single units ranged from 4 to 200 spikes/s (fig. 3c). Ipsi- and contralateral thresholds and Q10 dB values were not significantly different (Wilcoxon's signed-rank test, p = 0.767 and p = 0.794, respectively, n = 25 for the threshold and n = 10 for the Q10 dB). There were, however, significant mismatches between the CF obtained with ipsi- and contralateral stimulation (Wilcoxon's signed-rank test, p = 0.049, n = 26). The extent of these CF mismatches is shown in figure 4a, expressed as a percentage of the ipsilateral CF. We explored whether these differences in CF had any predictive value regarding the best ITD. For this, group delay data from the auditory nerve of the barn owl [fit shown in fig. 10A of Köppl, 1997] were used to predict the latency difference due to the CF mismatch and compare it to the best ITD for that particular recording site (fig. 4b). There was no correlation between the two (Spearman's rank correlation, p = 0.156, n = 22). Indeed, not even the sign of the CF difference consistently predicted the correct side leading.
Best ITD and IPD Distribution
Best ITD values ranged from -1,000 µs (ipsilaterally leading) to 5,000 µs (contralaterally leading). There was a bias towards contralaterally leading ITD, consistent with previously reported data, especially at lower frequencies (fig. 5a). The range of ITD represented clearly increased with decreasing frequency. Harper and McAlpine [2004] defined 3 frequency ranges with different predicted optimal systems for the barn owl given its head size: >3,000 Hz (Jeffress-like place code predicted as optimal), 800-3,000 Hz (ambiguous), and <800 Hz (2-channel population code predicted as optimal). Figure 5c shows the median and range of best ITD values within these 3 frequency ranges for the different recording types. There were no significant differences between recording types in any of these frequency ranges (Kruskal-Wallis test, p > 0.07 in all cases); therefore, the lumped distribution is also shown in figure 5d.
Best IPD values generally ranged from -0.5 to 0.5 (fig. 5b). We found only 3 best IPD values outside the pi limit, a range corresponding to half the period of the stimulus frequency and equivalent to the maximum best ITD that can be generated using phase delays [Vonderschen and Wagner, 2014]. The distribution of best IPD was homogeneous, with no clear clustering around specific values at different frequencies and no frequency-dependent distribution across all frequencies (fig. 5b). There were no significant differences between recording types in any of the frequency ranges (Kruskal-Wallis test, p > 0.27 in all cases).
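Since the pi limit equals half the period of the stimulus frequency, it can be computed directly. The following minimal sketch uses hypothetical function names.

```python
def pi_limit_s(freq_hz):
    """Half the period of the stimulus frequency, in seconds."""
    return 0.5 / freq_hz


def within_pi_limit(best_itd_s, freq_hz):
    """True if the magnitude of the best ITD does not exceed the pi limit."""
    return abs(best_itd_s) <= pi_limit_s(freq_hz)
```

At 500 Hz, for instance, the period is 2,000 µs and the pi limit is therefore 1,000 µs.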
Due to the nature of ITD sensitivity, neural response modulation is cyclical. Thus, with our usual ITD testing range of ±1 period of the stimulus frequency, we expected to see 2 response maxima, one of which would usually lie in the ipsilaterally leading range of ITD and the other in the contralaterally leading range. This causes the best ITD and IPD values to be ambiguous, since it cannot be resolved which of the 2 possibilities truly corresponds to the ITD conveyed by the neuron's inputs. For our analysis, we took the peak closest to 0 as the physiologically relevant one. However, only additional measurements such as taking responses at several different frequencies and determining the common CD (see Materials and Methods) can truly resolve this ambiguity. Among our 129 recording sites, 33 were extensively tested in this way (an example is shown in fig. 6). Of those, only 2 cases emerged in which the disambiguated response maximum was not the one closest to 0.
CP and CD
The relationship between phase and frequency can usually be expressed using a linear equation (an example is shown in fig. 6c). The slope of this equation is the CD and the y-intercept is the CP. Pure time-delay systems like the Jeffress model are expected to show CP close to 0 or 1. Other values indicate that there is some phase delay contribution [Vonderschen and Wagner, 2014].
The distribution of CD values was very similar to the best ITD distribution in that it was strongly contralaterally biased, it was distributed homogeneously, and it showed a greater spread at lower BF (fig. 7a). Phase-frequency relations were quite diverse (fig. 7b). Twenty of 31 cases showed a CP close to 0 or 1 (within ±0.15, indicated by the dashed lines in fig. 7b), which indicates a CD close to the peaks of the ITD curves. Seven cases showed intermediate CP, indicating that the CD occurred at some point along the slopes of the ITD curves. Lastly, 4 of 31 cases had CP values close to 0.5 (within ±0.15), which indicates that the CD occurred near the troughs of the ITD curves. The distribution of CP did not seem to depend on the frequency. There were no significant differences in CD or CP between the recording types (Kruskal-Wallis test, p > 0.14 in all cases).
Anatomical Position of the Labeled Recording Sites
A total of 7 recording sites from 5 different NL were confirmed by labeling and all were located in the caudolateral, low-frequency region of the NL (see fig. 1 for definition of the regions). In the cases where 2 sites were labeled in the same NL, relative entry points of the respective electrode penetrations were used to unambiguously assign the 2 labels. Four labels were deposited at sites where a single unit had been recorded; the label, however, was extracellular in all cases and radiated over 18-240 µm around the injection center.
Our data showed no significant correlations between the best ITD and the caudorostral or dorsoventral coordinates of the label (Spearman's rank correlation, all p > 0.05, n = 7). However, the mediolateral coordinate showed a significant correlation with the best ITD (Spearman's rank correlation, p = 0.003, correlation coefficient ρ = -0.93, n = 7; fig. 8). The best ITD decreased with the distance from the tip of the caudolateral portion, i.e. more medial positions showed smaller best ITD. Regarding tonotopic organization, we found 2 weak (not statistically significant) trends along the caudorostral and mediolateral axes, with the frequency increasing rostrally and laterally.
Discussion
The data reported here constitute the first comprehensive data set of in vivo recordings from the lower frequency regions of the barn owl NL. Their relevance is 2-fold. First, they allow a more direct comparison of the physiology of the barn owl NL with the mammalian MSO, since they cover an overlapping frequency range. Second, they allow us to examine which of the existing ITD-processing models is supported best. We first discuss a technical point regarding the validity of including neurophonic recordings in addition to unit recordings. We then argue that the data are consistent with previous studies in both owls and other birds and they therefore provide no evidence for a fundamental change in ITD coding between low and high frequencies in the barn owl.
Validity of the Neurophonic as a Proxy for NL Responses
Neurophonic responses in the NL and MSO show a clear ITD sensitivity [Wernick and Starr, 1968; Sullivan and Konishi, 1986] and thus reflect the defining characteristic of NL and MSO neurons. Direct, empirical comparisons between local neurophonic and unit responses are rare but have shown good correspondence [Köppl and Carr, 2008]. However, the source of the neurophonic is not entirely clear and very probably differs depending on morphological organization [Kuokkanen et al., 2010; McLaughlin et al., 2010; Goldwyn et al., 2014]. Specifically, the bipolar orientation of dendrites and the associated spatial segregation of ipsi- and contralateral inputs that is typical of the mammalian MSO and the chicken NL have repeatedly been suggested to generate a bipolar electric field [e.g. Galambos et al., 1959; Schwarz, 1992]. Recently, Goldwyn et al. [2014] showed that, depending on the distance from the cell body layer, the neurophonic potential with this particular anatomical configuration may actually be larger for the out-of-phase compared to the coincident arrival of ipsi- and contralateral inputs. In a typical test for ITD selectivity, this would lead to erroneous assignment of the best ITD, depending on the electrode's location along the dipole field. We argue that this prediction does not apply to our recordings from the owl's caudolateral NL because the morphology of this region is unique and resembles neither the chicken NL nor the MSO, nor, indeed, the medial part of the owl NL [Köppl and Carr, 1997]. Neurons in the caudolateral region are not all bipolar and those that are bipolar do not show a uniform orientation; when quantified, only 32% of those cells were oriented orthogonal to the lateral border and 21% were parallel to it [Köppl and Carr, 1997]. In addition, the electrode angle in our recordings was oblique relative to this predominant bipolar dendritic plane, rather than parallel to it, as assumed in the model of Goldwyn et al. [2014]. 
Lastly, as our primary aim was to obtain unit recordings from the cellular area of the NL, data collection was primarily in close proximity to cells, a judgement which was confirmed by the fact that all labeled sites were found within the nucleus.
We carefully compared the responses to ITD and frequency between neurophonic and spike responses obtained in close proximity in this study. We observed an increased range of differences between such pairs in best IPD and BF at frequencies below 500 Hz, but no systematic bias (table 1). This increases the variability of anatomical correlations with physiology (since the BF or the best ITD of a specific low-frequency neurophonic might differ slightly from that of a single unit at the same location) but does not in principle invalidate neurophonic data. Indeed, as observed previously [Köppl and Carr, 2008; Carr et al., 2009], there were no significant differences in best ITD, best IPD, CD, or CP between the populations of single-unit, multiunit, and neurophonic recordings. Thus, all of the conclusions in this paper are consistent with the single-unit data, although they comprised a minority of the sample.
Evidence for Different Types of Input Delays
The distribution of best ITD in the low-frequency region of the barn owl NL showed the same contralaterally leading bias as previously found for the higher-frequency regions of the same nucleus [Sullivan and Konishi, 1986; Carr and Konishi, 1990; Peña et al., 2001]. In addition, the best-ITD distribution within any given frequency band appeared homogeneous over its range (fig. 5a). This is in agreement with previous data on low-frequency responses in the barn owl's core of the inferior colliculus [Wagner et al., 2007]. However, an inhomogeneous representation of the ITD has been observed downstream [Cazettes et al., 2014]. A homogeneous distribution is consistent with a Jeffress type representation of the ITD, based on time-delayed inputs. In contrast, a 2-channel model, based on phase delays, predicts some degree of clustering around a specific ITD value that should decrease with increasing frequency [McAlpine et al., 2001; Brand et al., 2002; Vonderschen and Wagner, 2014]. There was clearly no clustering in our data, and many values fell near 0. However, the overall ITD range broadened with decreasing frequency - a typical finding in both avian/crocodilian and mammalian studies [Carr et al., 2009; Bremen and Joris, 2013]. Such broadening is not predicted by the classic Jeffress model and has thus been advanced in favor of a phase delay system, although this is quite controversial [reviewed in Joris and Yin, 2007]. Intriguingly, a recent study modeling the developmental plasticity of ITD coding circuits concluded that a similar frequency dependence of the best ITD range may result for systems based on time delays and phase delays alike [Fontaine and Brette, 2011]. Therefore, a larger dispersion of best ITD at lower frequencies has little conclusive value and is no grounds for dismissing a delay line model.
A related parameter is whether the data stay confined within the so-called pi limit, a theoretical limit for the best ITD distribution of a phase delay system [Vonderschen and Wagner, 2014] and equal to half the period of the stimulus frequency. Theoretically, the pi limit does not apply to a pure time-delay system such as the Jeffress model, where best ITD should instead be found all across the naturally occurring range, irrespective of the frequency band. Our dataset presented only 3 points outside the pi limit, consistent with a phase-delay system and also consistent with low-frequency IC data in the barn owl [Wagner et al., 2007]. However, as pointed out by Wagner et al. [2007], even in a time-delay system, values above the pi limit are only expected to occur once the naturally heard range of ITD exceeds that limit. At low frequencies, this is never the case. Thus, again, the observed data distribution does not conclusively suggest a time delay or a phase delay system.
Finally, the stereausis model [Schroeder, 1977; Shamma et al., 1989] is based on cochlear delays and monaural inputs to the coincidence detectors that are mismatched in the CF. Our data from the low-frequency region of the owl NL do not support such an origin of delays, which is consistent with previous findings at higher frequencies in the NL [Peña et al., 2001; Fischer and Peña, 2009] and in the inferior colliculus [Singheiser et al., 2010]. While we did observe CF mismatches between some monaural responses in the NL, they appeared random and were not at all predictive of the measured best ITD (fig. 4). Importantly, though, these random CF mismatches could account for a phase delay component indicated by the occurrence of nonzero CP (fig. 7b), even if they are compensated by a time delay mechanism such as axonal delay lines [Day and Semple, 2011].
Topographic Representations of Frequency and ITD in the Barn Owl Caudolateral NL
Our labeling data showed the presence of a systematic, topographic representation of the ITD in the low-frequency, caudolateral region of the NL, with large best ITD located more laterally than ITD close to 0 (fig. 8). Single-unit recordings suggested an additional topography of best ITD along the dorsoventral axis; however, this was not significant for the total data population (not shown).
Regarding tonotopic organization, previous studies have found a change in frequency along the mediolateral axis of the caudolateral region of the NL [Köppl and Carr, 1997; analysis of the termination sites of labeled nucleus magnocellularis afferent axons]. This is consistent with the weak trend observed here and, together, suggests that the tonotopic organization in the caudolateral part of the NL continues the pattern known from the medial, high-frequency region [Takahashi and Konishi, 1988; Carr and Konishi, 1990]: isofrequency bands run at an angle relative to the brain's midline, from caudomedial to rostrolateral. This is also the known pattern in the chicken NL [Rubel and Parks, 1975; Köppl and Carr, 2008]. Thus, there is tentative evidence for a tonotopic organization as previously shown for other parts of the NL and other species.
The topography of the best ITD along the mediolateral axis is consistent with a chicken-like organization in which a topographic representation of the best ITD has been shown along each isofrequency band [Köppl and Carr, 2008]. Unfortunately, it was not possible to plot our data in an exactly comparable way, i.e. along the isofrequency dimension, because of the uncertainties regarding the tonotopic axis. However, a topography along the mediolateral axis, as found here, is predicted with the present plane of sectioning (fig. 9). The tentative topography of the best ITD along the dorsoventral axis that we observed for single-unit data only is not directly predicted from the chicken data; however, the chicken NL lacks this dimension over most of its extent. Intriguingly, the dorsoventral dimension is the main axis of the best ITD topography in the owl medial NL [Sullivan and Konishi, 1986; Carr and Konishi, 1990]. This is a surprising, tentative similarity that warrants further investigation.
In summary, our data provide strong evidence for a topographic representation of best ITD within each isofrequency band in the owl NL. Such an organization is a hallmark prediction of the Jeffress model. Next, we ask whether the physiology of the owl's low-frequency NL further supports a Jeffress-like coding scheme or not.
Relation of ITD Distribution to the Owl's Natural Range
In a 2-channel model of ITD coding, many best ITD values are predicted to fall outside the naturally heard range of the animal [Harper and McAlpine, 2004; Harper et al., 2014], while the Jeffress model predicts all values to fall within that range. What is this range for the barn owl? Its acoustic range of ITD was measured as ±250 to ±300 µs, and it was almost invariant over a broad frequency range [Poganiatz et al., 2001; von Campenhausen and Wagner, 2006; Hausmann et al., 2010]. Estimates derived from cochlear microphonics in the closely related grass owl agree with these values at high frequencies. In contrast, significantly higher ITD ranges, i.e. up to ±400 and ±550 µs, were found at frequencies <1 kHz [Calford and Piddington, 1988; table 2]. This is a consequence of the open, internally coupled middle ears and suggests that, at low frequencies, the ITD range actually perceived by the owl might be significantly enhanced compared to that acoustically measured in the outer ear canal [e.g. Christensen-Dalsgaard, 2005]. Although it is difficult to predict exactly what the owl's ITD range will be at even lower frequencies, it can be expected to increase further [Calford and Piddington, 1988; Larsen et al., 2006]. We thus argue that the majority of best ITD reported here fall within a plausible physiological range. Whether our largest observed values might still be physiological can only be clarified by a more thorough characterization of the ITD cues down to these low frequencies.
Is ITD Coding in the Owl Different for the Low- and High-Frequency Ranges?
Our physiological data did not show any significant deviation from what has been observed in the higher-frequency region of the barn owl NL, either in the 800- to 3,000-Hz frequency range, where the place code model may no longer be optimal, or below 800 Hz, where the presence of a population code was predicted by Harper and McAlpine [2004]. In fact, the distribution of ITD below 800 Hz suggests the presence of a Jeffress-like code, since the median of the best ITD distribution at those frequencies, i.e. 280 µs, was well within the physiological range of the barn owl. Even the interquartile range, which covered the contralateral space from 0 to 750 µs (fig. 5d), is still a plausible physiological range for the barn owl at low frequencies. A model by Fischer and Seidl [2014] estimated the minimum resolvable ITD based on the peak or slope regions, respectively, of idealized single-unit ITD selectivity curves at different BF. Their results suggest that the barn owl should be able to use the peaks of ITD response curves to discriminate ITD at frequencies lower than 500 Hz (table 2). If the slopes had been used, which is in principle also compatible with a Jeffress-like place code, the boundary BF would have been even lower. In addition, we found evidence for a topographic representation of the best ITD. Such an organization is a hallmark of a Jeffress-like delay line model. Taking both anatomical and physiological data together, the most parsimonious interpretation is that there is no fundamental change in organization between the high- and low-frequency regions of the NL in the owl.
A likely explanation for the absence of the change predicted by the optimal code model is that the prediction assumed a physiological range for the barn owl based only on head size, disregarding the effects of internal coupling of the middle ears. This would result in an underestimation of that range at low frequencies, where the internal coupling increases the maximum range of ITD. We conclude that physiology, topographic organization, and theory support a Jeffress-like place code of ITD in the barn owl NL across the entire tonotopic range. The relative role, if any, of the low-frequency NL for sound localization in the owl is a separate and interesting question for future studies.
Acknowledgements
This study was supported by the Deutsche Forschungsgemeinschaft (CRC Active Hearing, project A14). We thank Jose Luis Peña, Sharad Shanbhag, and Go Ashida for the use of and support with custom-written software. Sandra Buschhaus provided expert technical support for histology. Daniel Erlemann participated in the histological analysis as part of an undergraduate project.