ENS is hosting Kaggle-style data challenges for machine learning classification and regression problems at challengedata.ens.fr. With the support of Ed Lalor and James O’Sullivan, we have posted our own contribution:
Attentional Selection at a Cocktail Party
Short Description
Use electroencephalography (EEG) data to identify which speaker a person is paying attention to.
Challenge Context
Auditory scenes with multiple sources (such as a noisy restaurant or a cocktail party) pose a challenging listening environment, particularly for those with hearing impairments. Current state-of-the-art hearing aids can help in these environments by using algorithms to guess which sound the wearer is trying to listen to and amplifying that sound over the background. However, this guess is not always correct. A brain-computer interface that identifies the sound to which the listener is attending would pave the way for a new generation of ‘smart’ hearing aids.
Challenge Goals
In this data challenge, a number of subjects were presented with two classic works of fiction at the same time: one in the left ear and the other in the right. Each subject was asked to attend to one of the two stories while their EEG data was collected. A questionnaire was used to verify that the subjects were attending to the correct story. The goal of this challenge is to use data from the subjects in the training set to determine which story an individual drawn from a separate test set is listening to. More information on the experiment from which the data was taken is available from its resulting publications:
Power A.J., et al., “At what time is the cocktail party? A late locus of selective attention to natural speech”, Eur. J. Neurosci., vol. 35, no. 9, pp. 1497-503, 2012.
O’Sullivan J.A., et al., “Attentional selection in a cocktail party can be decoded from single-trial EEG”, Cereb. Cortex, vol. 25, no. 7, pp. 1697-706, 2015.
In addition to improving upon the speech envelope decoding strategy presented in the O’Sullivan paper, the contestants should feel free to explore other methods.
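As a starting point, the sketch below illustrates the envelope-decoding idea in deliberately simplified form: band-pass filter the EEG to the low frequencies where envelope tracking tends to be strongest, correlate each channel with the envelope of each story, and pick the story with the higher average correlation. This is not the decoder from the O’Sullivan paper (which trains a linear model to reconstruct the attended envelope from the EEG); the function name, the 2–8 Hz band, and the reliance on the Signal Processing and Statistics Toolboxes (butter, filtfilt, corr) are all assumptions made for illustration.

% Crude envelope-correlation baseline (NOT the O'Sullivan et al. decoder):
% band-pass the EEG, correlate each channel with each story's envelope,
% and report which story correlates more strongly.
%   eeg        [time x channel] EEG, same sampling rate as the envelopes
%   envL, envR [time x 1] speech envelopes of the left/right stories
%   fsEEG      EEG sampling rate in Hz
% Returns 1 (left story) or 2 (right story).
function side = crude_envelope_classifier(eeg, envL, envR, fsEEG)
    [b, a] = butter(3, [2 8] / (fsEEG / 2), 'bandpass');  % assumed low-frequency band
    eegF = filtfilt(b, a, eeg);          % zero-phase filter every channel
    rL = mean(corr(eegF, envL(:)));      % mean correlation across channels
    rR = mean(corr(eegF, envR(:)));
    if rL >= rR
        side = 1;
    else
        side = 2;
    end
end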
Data Description
The training data consists of 60 cells in MATLAB format. Each cell contains the 128-channel EEG response to a ~1 min long segment of audio from one of twenty subjects. Each subject was asked to attend either to the story presented in the left ear or to the story presented in the right ear. In 50% of the cells the subject attended left, and in the other 50% the subject attended right. Each subject contributes 3 cells of data: the subject in cell 1 is the same subject as in cells 2 and 3, the subject in cell 4 is the same subject as in cells 5 and 6, and so on.
The structure of the data in each cell is as follows:
.eeg{1} % [time x channel] EEG data
.wav{1} % [time x channel] audio data (channel 1 = left story, channel 2 = right story)
.dim % contains information about dimension order and channel labels
.fsample % contains EEG and audio sampling rates
.event.eeg % contains trigger information (EEG time point when audio starts)
.id % ID of data segment
.tgt % whether the subject was asked to attend to the left (1) or right (2) story
The .event.eeg field contains two arrays. The ‘value’ array holds a list of trigger values, and the ‘sample’ array holds the EEG time sample at which each trigger occurred. The trigger value ‘200’ indicates the start of the audio. For example, if .event.eeg.value{5} equals 200, then the audio started playing at the EEG time sample stored in .event.eeg.sample(5).
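A minimal loading sketch is shown below. The file and variable names are placeholders, and the iscell check simply covers both ways the trigger values might be stored; everything else follows the field descriptions above.

% Sketch: load one training cell and trim the EEG so that it starts at the
% audio onset (trigger value 200). File/variable names are placeholders.
S = load('training_data.mat');                 % placeholder file name
d = S.data{1};                                 % first of the 60 cells

vals = d.event.eeg.value;
if iscell(vals), vals = cell2mat(vals); end    % handle cell or numeric storage
idx   = find(vals == 200, 1);                  % first trigger marking audio start
onset = d.event.eeg.sample(idx);               % EEG sample at which audio begins

eeg    = d.eeg{1}(onset:end, :);   % [time x 128] EEG aligned to audio onset
audio  = d.wav{1};                 % [time x 2]; column 1 = left, column 2 = right story
target = d.tgt;                    % 1 = attend left, 2 = attend right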
The test data is similar to the training data, except that each cell contains only 25 s of EEG and audio, and no information is provided regarding which side the subject was attending to (i.e. there is no .tgt field). The data has already been aligned so that the audio starts at the first EEG time sample; as such, the .event field has been omitted as well. The goal is to determine, for each of the 200 cells in the test dataset, the story to which the subject was attending. You will be scored on the percentage of correct classifications.
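Putting the pieces together, an end-to-end loop over the test set might look like the sketch below: extract the Hilbert envelope of each story, resample it to the EEG rate, and classify each cell with the crude baseline from earlier. The variable name test, the .fsample sub-field names, the Hilbert-envelope extraction, and the use of .id as the submission ID are all assumptions.

% Sketch: classify every test cell with the crude baseline above and collect
% (ID, prediction) pairs for the submission file. 'test' is a placeholder
% cell array holding the 200 test cells.
nTest = numel(test);
ids   = cell(nTest, 1);
preds = zeros(nTest, 1);
for k = 1:nTest
    d     = test{k};
    fsEEG = d.fsample.eeg;                                  % assumed sub-field name
    fsWav = d.fsample.wav;                                  % assumed sub-field name
    envL  = resample(abs(hilbert(d.wav{1}(:, 1))), fsEEG, fsWav);   % left-story envelope
    envR  = resample(abs(hilbert(d.wav{1}(:, 2))), fsEEG, fsWav);   % right-story envelope
    T     = min([size(d.eeg{1}, 1), numel(envL), numel(envR)]);     % match lengths
    preds(k) = crude_envelope_classifier(d.eeg{1}(1:T, :), envL(1:T), envR(1:T), fsEEG);
    ids{k}   = d.id;                                        % assumed to hold text such as 'ID1001'
end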
The submission will be a .csv file with the following format:
ID, TARGET
ID1001, 2
ID1002, 1
…
ID1200, 2
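A file in this format can then be written with a few lines, for example using the ids and preds collected in the loop above (this assumes .id is stored as text; check the exact header and separator expected by the challenge page):

% Sketch: write the predictions to submission.csv in the required format.
fid = fopen('submission.csv', 'w');
fprintf(fid, 'ID, TARGET\n');
for k = 1:numel(ids)
    fprintf(fid, '%s, %d\n', ids{k}, preds(k));
end
fclose(fid);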
Both the training and test data have been preprocessed to minimize 50 Hz line noise and eye-blink and muscle-movement artifacts, and the EEG has been average referenced.