
This project was developed during the 2021 hackathon organized by G-Tec. Over these two days, we analyzed EEG data recorded during stroke rehabilitation, specifically a rehabilitation based on motor imagery Brain-Computer Interfaces (BCIs).

In recent years, BCIs have been widely used as rehabilitation tools to support recovery from stroke. Thanks to BCIs, we can detect brain signals and use them to recognize the subject's intention to move a part of their body, such as a hand. In fact, before performing a movement, we rehearse it in our minds and activate a specific brain area. This process is called motor imagery (MI), and the resulting brain signal can be recorded and interpreted to enhance therapy.

During the rehabilitation process, it is fundamental that the patient feels ownership over the movement of the limbs. For this reason, the treatment involves Functional Electrical Stimulation (FES) to stimulate the forearm muscles and move the limb. After the brain signal is interpreted, the FES is responsible for moving the correct limb, in this case the right or left hand. Consequently, the patient obtains positive feedback and reinforcement each time they perform the right motor imagery.

Data description

The three patients involved in this experiment had upper extremity hemiparesis. Each participant underwent three months of MI training. Data were recorded before and after this intervention to evaluate the improvement produced by the treatment. For each patient, we have four datasets:

Before rehabilitation:

  • Training set
  • Testing set

After rehabilitation:

  • Training set
  • Testing set

The data we received contained the recorded signal for each of the 16 channels, sampled at 256 Hz, together with a label vector whose values 0, 1, and -1 correspond to a movement of the right hand, a movement of the left hand, or no movement at all.

Methods

Preprocessing

An EEG signal is a sum of different potentials generated by neuronal activity. EEG signals generally contain spurious artifacts not related to the task of interest, for example eye blinks, head movements, or simply powerline noise. Therefore, the first step in dealing with EEG signals consists of filtering in both the time and spatial domains.

Preparation and imagination of movement produce an event-related desynchronization (ERD) or synchronization (ERS) over the sensorimotor areas. This phenomenon originates from power changes in the mu and beta rhythms (i.e., around the 8–30 Hz frequency band). For this reason, we applied an eighth-order Butterworth bandpass filter between these frequencies. Before filtering, we cut the first 10-15 seconds of the acquired data, since no task was performed during this time and it contained no relevant information. For spatial filtering, we used independent component analysis (ICA), a linear decomposition method that allows the removal of artifacts such as eye movements or facial muscle activation. It operates by identifying the projection patterns of each source onto the scalp surface, which makes it possible to localize unwanted artifacts and eliminate them. During our analysis, we did not locate any artifacts: the BCI hardware had probably already filtered them out effectively.
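As a sketch of the temporal filtering step, the bandpass could be implemented with SciPy as follows (the function name and the exact trim length are illustrative; the sampling rate and frequency band come from the description above):

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

FS = 256                # sampling rate (Hz), from the data description
LOW, HIGH = 8.0, 30.0   # mu and beta band (Hz)

def bandpass(eeg, fs=FS, low=LOW, high=HIGH, order=8):
    """Zero-phase eighth-order Butterworth bandpass, applied per channel.

    eeg: array of shape (n_channels, n_samples)
    """
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    # sosfiltfilt runs the filter forward and backward, avoiding phase distortion
    return sosfiltfilt(sos, eeg, axis=-1)

# drop the first ~10 seconds, where no task was performed:
# eeg = eeg[:, 10 * FS:]
```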

ICA applied to the sixteen channels:

Once we had filtered our signals, we split both the filtered and unfiltered data into epochs of 2, 6, and 8 seconds.
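A sketch of this epoching step with NumPy, assuming the label convention from the data description and keeping only windows covered by a single label (the helper name is ours):

```python
import numpy as np

def make_epochs(eeg, labels, fs=256, epoch_sec=2):
    """Split a continuous recording into fixed-length epochs.

    eeg:    (n_channels, n_samples)
    labels: (n_samples,) with 0 = right hand, 1 = left hand, -1 = no movement
    Returns (epochs, epoch_labels); epochs has shape
    (n_epochs, n_channels, epoch_sec * fs).
    """
    step = epoch_sec * fs
    n_windows = eeg.shape[1] // step
    epochs, epoch_labels = [], []
    for i in range(n_windows):
        seg = slice(i * step, (i + 1) * step)
        lab = labels[seg]
        # keep the window only if a single label spans it entirely
        if np.all(lab == lab[0]):
            epochs.append(eeg[:, seg])
            epoch_labels.append(lab[0])
    return np.stack(epochs), np.array(epoch_labels)
```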

Feature extraction

To enhance the classification process, we extracted features from the data using Common Spatial Patterns (CSP). The objective is to separate the data by projecting them onto directions that maximize the variance for one class while simultaneously minimizing it for the other. This is achieved by applying spatial filters that map the channels into a new space where the difference in variance is much more discriminative. To better understand this, let's look at the filtered data before and after applying CSP. In Figure 1, we plot the signals of channels 1 and 16 in the original space: as we can see, the two classes strongly overlap.

Representation of the data before CSP: it is impossible to distinguish the two classes.

In Figure 2, we plot the same data rotated by CSP. As expected, the two classes are now well separated: the blue class has high variance along the vertical axis, while the orange class has high variance along the horizontal axis. Thus, by looking at the variance, we can easily distinguish the two classes.

Plotted data after CSP: the separation is easily identified.
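The CSP rotation described above can be sketched as a generalized eigenvalue problem. This is textbook CSP written with NumPy/SciPy, not necessarily the exact implementation we used; the log-variance features it produces are the usual input to the classifiers:

```python
import numpy as np
from scipy.linalg import eigh

def csp_filters(X_a, X_b, n_components=2):
    """Common Spatial Patterns via a generalized eigenvalue problem.

    X_a, X_b: epochs of the two classes, shape (n_epochs, n_channels, n_samples)
    Returns spatial filters W of shape (n_components, n_channels).
    """
    def avg_cov(X):
        return np.mean([np.cov(ep) for ep in X], axis=0)

    C_a, C_b = avg_cov(X_a), avg_cov(X_b)
    # solve C_a w = lambda (C_a + C_b) w; the extreme eigenvalues give the
    # filters where one class's variance dominates the other's
    vals, vecs = eigh(C_a, C_a + C_b)
    order = np.argsort(vals)
    picks = np.concatenate([order[:n_components // 2],
                            order[-(n_components - n_components // 2):]])
    return vecs[:, picks].T

def csp_features(X, W):
    """Log-variance of CSP-projected epochs, a common MI feature."""
    Z = np.einsum("fc,ecs->efs", W, X)
    return np.log(np.var(Z, axis=-1))
```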

Classification

The model has to be trained before it can make predictions. In this case, the classification is supervised, since we train the model on labeled data. Our goal was to find the best classifier for the data we were given. In particular, we employed Linear Discriminant Analysis (LDA), a Support Vector Machine (SVC), Logistic Regression (LogReg), and a Multilayer Perceptron (MLP). One way to train different models and select the best-performing one is a cross-validated grid search, which tunes each model to find its best hyperparameters.

In that way, we can find the best model and evaluate it on the test data, measuring accuracy on data the model has never seen. To visualize the accuracies, we plotted them for the three patients and the different datasets (the combinations of filtered/raw and pre/post), as reported in the image.
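A cross-validated grid search over several classifier families might look like the following scikit-learn sketch. The hyperparameter grids are illustrative, not the exact ones from the hackathon:

```python
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Illustrative grids only, not the hackathon's actual search space.
CANDIDATES = {
    "lda": (LinearDiscriminantAnalysis(), {"clf__solver": ["svd", "lsqr"]}),
    "logreg": (LogisticRegression(max_iter=1000), {"clf__C": [0.1, 1, 10]}),
    "svc": (SVC(), {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}),
    "mlp": (MLPClassifier(max_iter=500), {"clf__hidden_layer_sizes": [(32,), (64,)]}),
}

def best_model(X_train, y_train, cv=5):
    """Grid-search each classifier family with cross-validation and
    return the fitted search with the highest mean CV accuracy."""
    searches = []
    for name, (clf, grid) in CANDIDATES.items():
        pipe = Pipeline([("scale", StandardScaler()), ("clf", clf)])
        search = GridSearchCV(pipe, grid, cv=cv)
        search.fit(X_train, y_train)
        searches.append(search)
    return max(searches, key=lambda s: s.best_score_)
```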

Accuracies for the datasets from different patients.

Discussion

The first evaluation, made with the LDA classifier, confirmed the results reported in the paper, reaching an accuracy of around 60–70%. We then tried to obtain a better classification of the data. For this reason, we performed a grid search over different classifiers to find the best hyperparameters. We followed two roads: first, we trained classifiers for each single patient, and then we tried to find a single model valid for all patients. In both cases, we assessed the best combination of classifier and signal-processing steps for two different scenarios:

  • short-epoch BCI application (2 sec)
  • long-epoch BCI application (6 sec)

The grid search was performed over the following hyperparameters:

All the accuracies found were higher than those obtained with LDA. The best model differed between patients, and also between the filtered and unfiltered datasets. Notably, on unfiltered data our model requires no preliminary filtering, which makes the process easier and quicker and could prove useful in treatment and practice. To check that the model was not overfitting, we evaluated it on both the test set and the training set: the accuracies were comparable, reassuring us about the process. To quantify the model's performance, we used the AUC-ROC, the area under the ROC curve. By looking at this graph, we can understand how well our model separates the classes.

ROC curve for the two-second epoch division.

Another way to quantify the results is the confusion matrix, which compares predicted and actual values.

Confusion matrix obtained for the two-second epoch division.
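Both metrics are available in scikit-learn. A minimal sketch for a binary right-vs-left split (the helper name is ours):

```python
from sklearn.metrics import confusion_matrix, roc_auc_score

def evaluate(model, X_test, y_test):
    """Return (AUC-ROC, confusion matrix) for a fitted binary classifier."""
    y_pred = model.predict(X_test)
    # use continuous scores for the AUC so it reflects ranking quality,
    # not just the hard class decisions
    if hasattr(model, "predict_proba"):
        scores = model.predict_proba(X_test)[:, 1]
    else:
        scores = model.decision_function(X_test)
    return roc_auc_score(y_test, scores), confusion_matrix(y_test, y_pred)
```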

Conclusions

According to our results:

  • for the patient-specific model, we achieved better results with filtered data (LogReg classifier for short epochs; MLP for long epochs);
  • for the generalized model, we achieved better results with raw data (MLP for both scenarios).

We reached two important and valuable conclusions.

First of all, we found models that classify the movements better than the LDA used in the paper. Secondly, we built a model based on multiple patients. That is hard to obtain because of the large variability between stroke patients.

We nevertheless managed to apply it successfully to our three patients. This should be investigated further on a wider dataset with many more patients: in this way, we could verify whether the approach truly works or we were simply lucky, and whether involving more patients yields an even better model. The fact that the generalized model works better on unfiltered data could mean that filtering removes not only noise but also important pieces of information.

Acknowledgments

Beatrice Villata, Eleonora Misino, Caterina Putti, Lorenzo Migliori, and I worked together on this project for 24 awesome (and crazy) hours: it was a very valuable and interesting experience!

Code

If you’re interested, take a look at the code on my GitHub.