Distinct neuronal populations in the human brain combine content and context

Patients and recording

All studies were approved by the Medical Institutional Review Board at the University of Bonn (accession number 095/10 for single-unit recordings in humans in general and 248/11 for the current study in particular). Each patient gave informed written consent both for the implantation of microwires and for participating in the experiment. We recorded from 17 patients with pharmacologically intractable epilepsy (9 male patients; 22–65 years of age), implanted with intracranial electrodes to localize the seizure onset zone for surgical resection (ref. 45). Each depth electrode contained a bundle of nine microwires, comprising one low-impedance reference electrode and eight high-impedance recording electrodes (AdTech), which protruded from the tip of the electrode by approximately 4 mm. All bundles were localized using a post-implantation CT scan co-registered with a pre-implantation MRI scan normalized to Montreal Neurological Institute space. The differential signal from the microwires was amplified using a Neuralynx ATLAS system, filtered between 0.1 Hz and 9,000 Hz, and sampled at 32 kHz. These recordings were stored digitally for further analysis.

Our original dataset consisted of 50 experimental sessions from 17 patients with recordings from the amygdala, parahippocampal cortex, entorhinal cortex and hippocampus. One session from one patient was excluded due to low behavioural performance, resulting in a final sample of 49 sessions from 16 patients. The excluded session stemmed from the only patient for whom no context neurons could be detected (the others exhibited an average of 12.5). For most patients, ten microwire bundles (median of 80 channels) were placed with an average of 69.34 recorded channels (range of 32–80) and a neuron yield of 0.92 per channel. Each experimental session was recorded after a screening for visually selective responses in the morning of the same day. The mean (±s.d.) time interval between screenings and subsequent experiments was 4.71 ± 2.02 h (range of 1.40–9.15 h). In 13 of the 16 patients, more than one session was recorded. The mean (±s.d.) time interval between multiple sessions was 46.94 ± 24.19 h (range of 16.57–118.06 h).

Neural dependence across 33 consecutive pairs of these sessions was assessed using intraclass correlation coefficients (ICC(2,1); refs. 46,47). ICCs were computed across channels (that is, microwires), separately for context and stimulus neurons. ICCs quantify cross-session stability, with lower values indicating weaker dependence. Observed ICCs were compared with null distributions obtained by permuting channels. ICCs did not strongly exceed these null distributions (Hedges’ g: stimulus = 0.3415; context = 0.1568) and only reached significance (α = 0.01) in 7 of 33 consecutive stimulus neuron session pairs, 3 of 31 consecutive context neuron session pairs, and in no session pair for both, all with low overall effect sizes (Extended Data Fig. 12).

We used the spike-sorting software Combinato (ref. 48) with default parameters for the exclusion of noisy recording channels, artefact removal, spike detection and spike sorting. Afterwards, we manually removed remaining artefacts, merged potentially over-clustered units from the same channel and distinguished single from multi-units using the graphical user interface in Combinato. Highly similar spike shapes, inter-spike interval distributions, neural responses to visual stimuli, asymmetric cross-correlations and the absence of neural activity during refractory periods guided this procedure, which was completed before all further analyses.
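The cross-session ICC comparison described above can be illustrated with a minimal Python sketch. It assumes that each channel contributes one value per session (for example, a per-channel neuron count, which is our assumption and not stated in the text) and that the null distribution is built by shuffling the channel order of one session. The function names `icc_2_1` and `icc_permutation_p` and the simulated data are hypothetical.

```python
import numpy as np

def icc_2_1(x):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    x has shape (n_channels, k_sessions); channels are the rated targets.
    """
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ss_rows = k * np.sum((x.mean(axis=1) - grand) ** 2)   # between channels
    ss_cols = n * np.sum((x.mean(axis=0) - grand) ** 2)   # between sessions
    ss_err = np.sum((x - grand) ** 2) - ss_rows - ss_cols  # residual
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

def icc_permutation_p(x, n_perm=1000, seed=0):
    """Compare the observed ICC with a null obtained by permuting channels."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    observed = icc_2_1(x)
    null = np.empty(n_perm)
    for i in range(n_perm):
        shuffled = x.copy()
        # Break the channel correspondence in the second session only.
        shuffled[:, 1] = rng.permutation(shuffled[:, 1])
        null[i] = icc_2_1(shuffled)
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, null, p

# Simulated example: 40 channels whose values barely change across two sessions.
rng = np.random.default_rng(0)
session_1 = rng.poisson(1.0, size=40).astype(float)
session_2 = session_1 + rng.normal(0, 0.1, size=40)
observed, null, p_value = icc_permutation_p(np.column_stack([session_1, session_2]))
```

In this simulated case the two sessions are strongly dependent, so the observed ICC lies far above the permutation null; the study reports the opposite pattern for most session pairs.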

Experimental design

The paradigm was run on a laptop computer, using Psychtoolbox-3 (www.psychtoolbox.org) and Octave (www.octave.org/) on a Debian 8 operating system (www.debian.org) for stimulus delivery. Before the experiment, approximately 100 pictures of people, animals, scenes and objects were presented on a laptop screen in pseudo-random order (presentation for 1 s; 6 or 10 trials). Then, after automatic spike extraction and sorting with Combinato, neural responses to these pictures were evaluated based on raster plots and histograms. The aim of this procedure was to identify a subset of four pictures for the following experiment while maximizing the number of neurons that were expected to respond selectively to only one of the pictures. These four pictures were then presented in the experiment.

Each self-triggered trial contained one out of five questions, a sequence of two of the four pictures with jittered onsets, and an answer prompt displaying ‘1 or 2?’. Patients indicated the sequential position of the picture that best answered the question by pressing key 1 or 2. We resolved ambiguous meanings of each depicted picture before the experiment and elaborated on the meaning of each question in a short test run of the paradigm. The questions were ‘Bigger?’ (volume), ‘Last seen in real life?’, ‘More expensive?’ or ‘Older?’ (if the picture set included a person), ‘Like better?’ and ‘Brighter?’. Patients were instructed to try to stick to one answer for a given picture pair, but to keep mentally computing the answer to each question in the course of the experiment. In total, the experiment consisted of 300 trials in which each of the 5 questions and all 12 possible ordered picture pairs out of 4 pictures were presented equally often and in an unpredictable pseudo-random order. This resulted in 60 trials per question, 25 trials per picture pair and 5 trials per specific combination of question and picture pair.
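The trial counts follow directly from the factorial design: 5 questions × 12 ordered pairs × 5 repetitions = 300 trials. The short Python sketch below reproduces this arithmetic. The picture names are placeholders, and the plain shuffle is an assumption; the exact pseudo-randomization constraints used in the study are not described here.

```python
import itertools
import random
from collections import Counter

QUESTIONS = ["Bigger?", "Last seen in real life?", "More expensive?",
             "Like better?", "Brighter?"]
PICTURES = ["picture_A", "picture_B", "picture_C", "picture_D"]  # placeholders

def build_trials(repetitions=5, seed=0):
    """Return a shuffled list of (question, first_picture, second_picture)."""
    # All ordered pairs of distinct pictures: 4 * 3 = 12.
    ordered_pairs = list(itertools.permutations(PICTURES, 2))
    trials = [
        (question, first, second)
        for question in QUESTIONS
        for (first, second) in ordered_pairs
        for _ in range(repetitions)
    ]
    random.Random(seed).shuffle(trials)  # assumption: unconstrained shuffle
    return trials

trials = build_trials()
per_question = Counter(question for question, _, _ in trials)
per_pair = Counter((first, second) for _, first, second in trials)
per_combination = Counter(trials)
```

Counting the entries confirms 60 trials per question, 25 per ordered picture pair and 5 per question–pair combination.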

Definition of neural populations

For each unit, two-way repeated-measures ANOVAs with factors stimulus (picture), context (question) and their interaction were computed from the activity during picture presentations (100–1,000 ms) at an α-level of 0.001. Units with (at least) a significant main effect of stimulus or context are referred to as stimulus or context neurons, respectively. If a unit showed only (at most) a significant main effect of either stimulus or context, it is termed an MS or an MC neuron, and if both main effects were significant, it is called a contextual stimulus neuron. Conversely, stimulus–context interaction neurons were defined by the presence of a significant interaction. We determined whether the observed number of neurons in each of these populations, or the effect sizes across all neurons, exceeded chance in the following manner. First, the probability of obtaining the observed numbers of significant neurons was assessed for each population separately by binomial tests with an expected false-positive probability of 0.001, corresponding to the α-level above. Second, to account for potential statistical dependencies between factors or neurons, repeated-measures ANOVA effect sizes of all neurons were compared with stratified label-shuffling controls, irrespective of statistical significance (Fig. 2e, see insets). Specifically, null distributions of partial η² were obtained either by permuting picture labels for each question separately (factor stimulus) or question labels for each picture separately (factor context or stimulus–context interaction). Afterwards, effect sizes of data and controls were averaged for each patient (see Fig. 2e in lavender and grey) and their distributions compared (two-sided Wilcoxon signed-rank test, Bonferroni corrected). To additionally account for patient-level variability, we fit separate linear mixed-effects models for each ANOVA factor across all recorded neurons. Each model included a fixed intercept (capturing grand population effect size differences) and patient-specific random intercepts. Neural populations remained largely consistent, regardless of whether they were identified via label-shuffled ANOVA effect sizes or analytical ANOVA P values.
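Two of the chance-level controls above lend themselves to a brief illustration: the binomial test on the number of significant neurons, and the stratified label shuffle used to build null effect sizes. The Python sketch below uses only the standard library. The function names are hypothetical, the binomial test is shown as a one-sided upper tail (an assumption), and the ANOVA step itself is omitted.

```python
import random
from math import exp, lgamma, log

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p) with 0 < p < 1, summed in log space."""
    def log_pmf(i):
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log(1 - p))
    return sum(exp(log_pmf(i)) for i in range(k, n + 1))

def shuffle_within_strata(labels, strata, seed=0):
    """Permute labels only among trials that share the same stratum.

    For the stimulus factor, labels are pictures and strata are questions,
    so the question structure of the design stays intact.
    """
    rng = random.Random(seed)
    shuffled = list(labels)
    indices_by_stratum = {}
    for index, stratum in enumerate(strata):
        indices_by_stratum.setdefault(stratum, []).append(index)
    for indices in indices_by_stratum.values():
        permuted = [labels[i] for i in indices]
        rng.shuffle(permuted)
        for i, label in zip(indices, permuted):
            shuffled[i] = label
    return shuffled

# Illustrative design: 5 questions x 4 pictures x 15 repetitions.
questions = [q for q in range(5) for _ in range(60)]
pictures = [i % 4 for i in range(300)]
null_pictures = shuffle_within_strata(pictures, questions, seed=1)

# Probability of 10 or more false positives among 1,000 units at alpha = 0.001.
p_chance = binomial_tail(10, 1000, 0.001)
```

Recomputing the ANOVA effect size on `null_pictures` many times would yield one null distribution of partial η² per unit; swapping the roles of the two arguments gives the corresponding control for the context factor.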

To assess the stability of stimulus representations across pre-screenings and main experiments, SVM decoders were trained on main experiment activity and tested either on pre-screening (across) or main experiment activity (within, leave-out cross-validation with matched trial counts). For 45 recording sessions, all four pictures of the main experiment were part of the pre-screening before the experiment. Picture-decoding accuracies far exceeded chance, particularly for microwire channels that contained stimulus neurons compared with those that did not (Extended Data Fig. 11). There were no significant differences in decoding accuracy between testing on pre-screening data (across) and testing on main experiment activity (within), either for stimulus neuron channels or for non-stimulus neuron channels (two-sided Wilcoxon signed-rank test). Finally, Spearman correlations were computed between stimulus–context neuron proportions and stimulus or context neuron proportions across all session-site combinations (n = 168; Extended Data Fig. 3). Site-specific correlations were assessed using permutation tests (1,000 permutations of stimulus–context proportions, one-sided test for positive associations).
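The one-sided permutation test on Spearman correlations can be sketched as follows. This is a minimal NumPy illustration on simulated proportions, not the authors' code; the function names and the example data are hypothetical, and the p value uses the common "+1" correction, which is an assumption.

```python
import numpy as np

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    sorted_values = values[order]
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and sorted_values[j + 1] == sorted_values[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation as the Pearson correlation of average ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])

def spearman_permutation_p(x, y, n_perm=1000, seed=0):
    """One-sided test for a positive association by permuting y."""
    rng = np.random.default_rng(seed)
    y = np.asarray(y, dtype=float)
    observed = spearman(x, y)
    null = np.array([spearman(x, rng.permutation(y)) for _ in range(n_perm)])
    p = (1 + np.sum(null >= observed)) / (1 + n_perm)
    return observed, p

# Simulated proportions for 30 session-site combinations.
rng = np.random.default_rng(0)
stimulus_proportion = rng.random(30)
interaction_proportion = stimulus_proportion + rng.normal(0, 0.1, size=30)
rho, p_value = spearman_permutation_p(stimulus_proportion, interaction_proportion)
```

Permuting only one variable keeps both marginal distributions fixed while destroying the pairing, which is what makes the shuffled correlations a valid null for the association.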

SVM population decoding

Linear SVM-decoding accuracies of context were computed from session-wise population activity of context neurons (Fig. 3a,c,d,f; see Fig. 3a for decoding from all neurons) and averaged for each patient, except in Fig. 3b, in which a pooled decoding scheme was used (30 random subsamples of 151 stimulus or context neurons, corresponding to 75% of context neurons, were drawn to estimate variance). Figure 3a depicts these accuracies for each context separately. In general, decoders were implemented with functions from the LIBSVM library (v3.24) using standard parameters (unless stated otherwise) in custom scripts written in MATLAB R2021a or Python (scikit-learn 0.24.1), and a fivefold cross-validation scheme with five repetitions and no overlap between training and testing data was used. Generalization of context decoding across picture identities (Fig. 3b, left in red), of stimulus decoding across contexts (Fig. 3b, right in green) or of either decoding across serial picture positions (in blue) was assessed. This was achieved by training on activity in response to each picture, each picture–context combination or each picture–position combination and decoding from the respective remaining ones. Training was performed either during picture presentations (Fig. 3a,b; 100–1,000 ms), question presentations (Fig. 3e) or during the course of the experiment (Fig. 3c,d,f; see below). The decoding accuracies thus obtained were statistically compared with chance (one-fifth for context and one-quarter for visual stimuli; two-sided Wilcoxon signed-rank test in Fig. 3a,e; two-sided cluster permutation tests P < 0.01 in Fig. 3d,f) or with each other (Mann–Whitney U-test in Fig. 3a; and two-sided cluster permutation tests in Fig. 3d).
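As a rough illustration of the two decoding schemes described above, the following scikit-learn sketch runs a repeated fivefold cross-validation with a linear SVM and a cross-condition generalization test (train on trials of one picture, test on the remaining pictures). It is a sketch on simulated firing rates under default SVM settings, not the authors' pipeline; the helper names and all data are hypothetical.

```python
import numpy as np
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.svm import SVC

def cross_validated_accuracy(features, labels, seed=0):
    """Mean accuracy of a linear SVM over 5 folds x 5 repetitions."""
    cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=5, random_state=seed)
    scores = cross_val_score(SVC(kernel="linear"), features, labels, cv=cv)
    return float(scores.mean())

def generalization_accuracy(features, labels, conditions, train_condition):
    """Train on one condition (e.g. one picture), test on all others."""
    train = conditions == train_condition
    model = SVC(kernel="linear").fit(features[train], labels[train])
    return float(model.score(features[~train], labels[~train]))

# Simulated session: 300 trials, 20 context neurons, 5 contexts, 4 pictures.
rng = np.random.default_rng(0)
contexts = np.repeat(np.arange(5), 60)   # decoding target
pictures = np.tile(np.arange(4), 75)     # condition to generalize across
context_tuning = rng.normal(size=(5, 20))
rates = context_tuning[contexts] + rng.normal(size=(300, 20))

within_accuracy = cross_validated_accuracy(rates, contexts)
across_accuracy = generalization_accuracy(rates, contexts, pictures, train_condition=0)
```

Because the simulated tuning depends on context only, context decoding generalizes across pictures here; with real data, a drop from `within_accuracy` to `across_accuracy` would indicate picture-specific context coding.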

Session-wise context-decoding accuracies computed from binned population activity (400 ms, stride = 100 ms) at differing training and decoding times throughout the trial of all context neurons were visualized as a heat map (Fig. 3c). Context-decoding accuracies from identical decoding and training times (diagonals) were then plotted with standard errors for all 16 patients with context neurons (Fig. 3d, top in green) and compared with a label-shuffling control condition (Fig. 3d, top red). The latter was obtained by randomly permuting question context labels for each session before training and decoding. Time periods of significant differences of decoding accuracies from real data and controls were evaluated with a cluster permutation test (Fig. 3d, solid line below). Specifically, paired-samples Student’s t-tests (two-sided) were computed for each time bin (α = 0.01) to determine clusters of contiguous decoding accuracy time bin differences. Sums of t-values within each cluster were assessed relative to the null distribution of t-value sums obtained from 1,000 permutations of data and control labels (or chance levels, see below). If t-value sums fell into the top percentile of the permutation distribution, differences during the time period of the respective cluster were considered significant. Decoding accuracies of either first (Fig. 3d, bottom in grey) or second pictures (Fig. 3d, bottom in lavender) obtained by patient-wise training and decoding with the activity of stimulus neurons in the course of the experiment were plotted and compared with chance (one-quarter) in an analogous manner (Fig. 3d, bottom). In addition, we examined whether neural activity patterns of stimulus neurons reflected content rather than feature representations (Extended Data Fig. 4). 
Session-wise linear SVMs were trained on stimulus neuron activity (400–800 ms after first-picture onset) to decode first picture identities from neural activity throughout second picture presentations (400-ms bins, stride = 100 ms) for each of the five question contexts. Statistical significance across sessions was assessed using cluster-based, two-sided permutation tests against chance (α = 0.05, chance = one-quarter).
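The cluster permutation procedure described above can be sketched in a few lines of NumPy. This version works on paired differences (data minus control, or data minus chance) and builds the null by randomly flipping the sign of each patient's difference, which is equivalent to swapping data and control labels within a pair. Using the largest absolute cluster sum per permutation as the null statistic is a common convention that we assume here. The t threshold is passed in explicitly; 2.947 is the two-sided critical value for α = 0.01 with 15 degrees of freedom. All names and the simulated data are hypothetical.

```python
import numpy as np

def t_values(diff):
    """One-sample t statistic per time bin; diff is (n_subjects, n_bins)."""
    n = diff.shape[0]
    return diff.mean(axis=0) / (diff.std(axis=0, ddof=1) / np.sqrt(n))

def find_clusters(t, threshold):
    """Return (start, stop, t_sum) for contiguous same-sign supra-threshold bins."""
    found, start = [], None
    for i in range(len(t) + 1):
        above = i < len(t) and abs(t[i]) > threshold
        same_sign = above and start is not None and np.sign(t[i]) == np.sign(t[start])
        if start is not None and not same_sign:
            found.append((start, i, float(t[start:i].sum())))
            start = None
        if above and start is None:
            start = i
    return found

def cluster_permutation_test(diff, threshold, n_perm=1000, seed=0):
    """Return (start, stop, t_sum, p) for every observed cluster."""
    rng = np.random.default_rng(seed)
    observed = find_clusters(t_values(diff), threshold)
    null_max = np.zeros(n_perm)
    for k in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(diff.shape[0], 1))
        permuted = find_clusters(t_values(diff * signs), threshold)
        null_max[k] = max((abs(s) for _, _, s in permuted), default=0.0)
    return [(a, b, s, float(np.mean(null_max >= abs(s)))) for a, b, s in observed]

# Simulated example: 16 patients, 30 time bins, an effect in bins 10 to 19.
rng = np.random.default_rng(0)
accuracy_minus_chance = rng.normal(0, 0.05, size=(16, 30))
accuracy_minus_chance[:, 10:20] += 0.2
results = cluster_permutation_test(accuracy_minus_chance, threshold=2.947)
```

A cluster would be reported as significant when its p value falls below 0.01, matching the "top percentile" criterion in the text.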
