.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "auto_examples/biosignal-mi/plot_augmented.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_auto_examples_biosignal-mi_plot_augmented.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_auto_examples_biosignal-mi_plot_augmented.py:


====================================================================
Augmented Covariance Matrix
====================================================================

This example compares classification pipelines based on covariance maxtrix (CM)
versus augmented covariance matrix (ACM) [1]_.

.. GENERATED FROM PYTHON SOURCE LINES 9-36

.. code-block:: Python

    # Authors: Igor Carrara <igor.carrara@inria.fr>
    #
    # License: BSD (3-clause)

    import matplotlib.pyplot as plt
    from mne import Epochs, pick_types, events_from_annotations
    from mne.io import concatenate_raws
    from mne.io.edf import read_raw_edf
    from mne.datasets import eegbci
    import numpy as np
    import pandas as pd
    import seaborn as sns
    from sklearn.base import clone, BaseEstimator, TransformerMixin
    from sklearn.metrics import accuracy_score
    from sklearn.model_selection import (
        train_test_split,
        StratifiedKFold,
        GridSearchCV,
    )
    from sklearn.pipeline import Pipeline
    from sklearn.svm import SVC

    from pyriemann.classification import MDM
    from pyriemann.estimation import Covariances
    from pyriemann.tangentspace import TangentSpace


.. GENERATED FROM PYTHON SOURCE LINES 37-39

Define the Augmented Covariance
-------------------------------

.. GENERATED FROM PYTHON SOURCE LINES 39-67

.. code-block:: Python


    class AugmentedDataset(BaseEstimator, TransformerMixin):
        """This transformation creates an embedding version of the current dataset.

        The implementation and the application is described in [1]_.
        """
        def __init__(self, order=1, lag=1):
            self.order = order
            self.lag = lag

        def fit(self, X, y=None):
            return self

        def transform(self, X):
            if self.order == 1:
                return X

            X_new = np.concatenate(
                [
                    X[:, :, p * self.lag: -(self.order - p) * self.lag]
                    for p in range(0, self.order)
                ],
                axis=1,
            )

            return X_new


.. GENERATED FROM PYTHON SOURCE LINES 68-70

Load EEG data
-------------

.. GENERATED FROM PYTHON SOURCE LINES 70-119

.. code-block:: Python


    # Avoid classification of evoked responses by using epochs that start 1s after
    # cue onset.
    tmin, tmax = 1., 2.
    event_id = dict(hands=2, feet=3)
    subject = 7
    runs = [6, 10, 14]  # motor imagery: hands vs feet

    raw_files = [
        read_raw_edf(f, preload=True)
        for f in eegbci.load_data(subject, runs, update_path=True)
    ]
    raw = concatenate_raws(raw_files)

    picks = pick_types(
        raw.info, meg=False, eeg=True, stim=False, eog=False, exclude="bads")
    # subsample elecs
    picks = picks[::2]

    # Apply band-pass filter
    raw.filter(7., 35., method="iir", picks=picks)

    events, _ = events_from_annotations(raw, event_id=dict(T1=2, T2=3))

    # Read epochs (train will be done only between 1 and 2s)
    # Testing will be done with a running classifier
    epochs = Epochs(
        raw,
        events,
        event_id,
        tmin,
        tmax,
        proj=True,
        picks=picks,
        baseline=None,
        preload=True,
        verbose=False)

    # get epochs
    X = 1e6 * epochs.get_data(copy=False)
    y = epochs.events[:, -1] - 2

    X_train, X_test, y_train, y_test = train_test_split(X, y,
                                                        test_size=0.3,
                                                        random_state=0,
                                                        shuffle=True,
                                                        stratify=y)


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Downloading EEGBCI data
    Download complete in 09s (2.5 MB)
    Extracting EDF parameters from /home/docs/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S007/S007R06.edf...
    Setting channel info structure...
    Creating raw.info structure...
    Reading 0 ... 19999  =      0.000 ...   124.994 secs...
    Extracting EDF parameters from /home/docs/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S007/S007R10.edf...
    Setting channel info structure...
    Creating raw.info structure...
    Reading 0 ... 19999  =      0.000 ...   124.994 secs...
    Extracting EDF parameters from /home/docs/mne_data/MNE-eegbci-data/files/eegmmidb/1.0.0/S007/S007R14.edf...
    Setting channel info structure...
    Creating raw.info structure...
    Reading 0 ... 19999  =      0.000 ...   124.994 secs...
    Filtering a subset of channels. The highpass and lowpass values in the measurement info will not be updated.
    Filtering raw data in 3 contiguous segments
    Setting up band-pass filter from 7 - 35 Hz

    IIR filter parameters
    ---------------------
    Butterworth bandpass zero-phase (two-pass forward and reverse) non-causal filter:
    - Filter order 16 (effective, after forward-backward)
    - Cutoffs at 7.00, 35.00 Hz: -6.02, -6.02 dB

    Used Annotations descriptions: [np.str_('T1'), np.str_('T2')]


.. GENERATED FROM PYTHON SOURCE LINES 120-126

Defining pipelines
------------------

Define pipelines based on augmented covariance matrix (ACM),
which is an expansion of the current EEG signal using theory of phase space
reconstruction [1]_

.. GENERATED FROM PYTHON SOURCE LINES 126-171

.. code-block:: Python


    pipelines = {}
    pipelines["ACM(Grid)+TGSP+SVM"] = Pipeline(steps=[
        ("augmenteddataset", AugmentedDataset()),
        ("Covariances", Covariances("oas")),
        ("Tangent_Space", TangentSpace(metric="riemann")),
        ("SVM", SVC(kernel="linear"))
    ])

    pipelines["ACM(Grid)+MDM"] = Pipeline(steps=[
        ("augmenteddataset", AugmentedDataset()),
        ("Covariances", Covariances("oas")),
        ("MDM", MDM(metric=dict(mean="riemann", distance="riemann")))
    ])

    # Define the inner CV scheme for the nested cross-validation of
    # hyper-parameter search
    inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)

    # Define the hyper-parameters to test in the nested cross-validation
    param_grid = {
        "augmenteddataset__order": [2, 3, 4],
        "augmenteddataset__lag": [5, 7, 9],
    }

    pipelines_grid = {}
    pipelines_grid["ACM(Grid)+TGSP+SVM"] = GridSearchCV(
        pipelines["ACM(Grid)+TGSP+SVM"],
        param_grid,
        refit=True,
        cv=inner_cv,
        n_jobs=-1,
        scoring="accuracy",
    )

    pipelines_grid["ACM(Grid)+MDM"] = GridSearchCV(
        pipelines["ACM(Grid)+MDM"],
        param_grid,
        refit=True,
        cv=inner_cv,
        n_jobs=-1,
        scoring="accuracy",
    )


.. GENERATED FROM PYTHON SOURCE LINES 172-173

Run the inner CV for getting the hyper-parameter

.. GENERATED FROM PYTHON SOURCE LINES 173-205

.. code-block:: Python

    results_grid = []
    best_estimator = []
    for ppn, ppl in pipelines_grid.items():
        cvclf = clone(ppl)
        score = cvclf.fit(X_train, y_train)
        res = {
            "pipeline": ppn,
            "order": score.best_params_["augmenteddataset__order"],
            "lag": score.best_params_["augmenteddataset__lag"]
        }

        res_best = {
            "pipeline": ppn,
            "best_estimator": score.best_estimator_
        }
        best_estimator.append(res_best)
        results_grid.append(res)

    results_grid = pd.DataFrame(results_grid)
    best_estimator = pd.DataFrame(best_estimator)
    print(results_grid)

    # Update the pipeline with the best pipeline obtained with GridSearch process
    pipelines["ACM(Grid)+TGSP+SVM"] = best_estimator.loc[
        best_estimator["pipeline"] == "ACM(Grid)+TGSP+SVM",
        "best_estimator"].values[0]

    pipelines["ACM(Grid)+MDM"] = best_estimator.loc[
        best_estimator["pipeline"] == "ACM(Grid)+MDM",
        "best_estimator"].values[0]


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

                 pipeline  order  lag
    0  ACM(Grid)+TGSP+SVM      3    7
    1       ACM(Grid)+MDM      3    7


.. GENERATED FROM PYTHON SOURCE LINES 206-207

Define pipelines with usual covariance matrix (CM)

.. GENERATED FROM PYTHON SOURCE LINES 207-219

.. code-block:: Python

    pipelines["CM+TGSP+SVM"] = Pipeline(steps=[
        ("Covariances", Covariances("oas")),
        ("Tangent_Space", TangentSpace(metric="riemann")),
        ("SVM", SVC(kernel="linear"))
    ])

    pipelines["CM+MDM"] = Pipeline(steps=[
        ("Covariances", Covariances("cov")),
        ("MDM", MDM(metric=dict(mean="riemann", distance="riemann")))
    ])


.. GENERATED FROM PYTHON SOURCE LINES 220-227

Evaluation
----------

Compare classification pipelines:

- CM+TGSP+SVM versus ACM(Grid)+TGSP+SVM
- CM+MDM versus ACM(Grid)+MDM

.. GENERATED FROM PYTHON SOURCE LINES 227-247

.. code-block:: Python


    results = []
    for ppn, ppl in pipelines.items():
        cvclf = clone(ppl)
        cvclf.fit(X_train, y_train)
        y_pred = cvclf.predict(X_test)
        score = accuracy_score(y_test, y_pred)
        res = {
            "score": score,
            "pipeline": ppn,
            "Features": ppn.split("+")[0],
            "Classifiers": ppn.split("+", 1)[1]
        }
        results.append(res)
    results = pd.DataFrame(results)

    for _, row in results.sort_values(by="score", ascending=False).iterrows():
        print(f"{row.pipeline}, score: {row.score:.4f}")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    ACM(Grid)+TGSP+SVM, score: 0.9286
    CM+TGSP+SVM, score: 0.8571
    ACM(Grid)+MDM, score: 0.7857
    CM+MDM, score: 0.7143


.. GENERATED FROM PYTHON SOURCE LINES 248-250

Plot
----

.. GENERATED FROM PYTHON SOURCE LINES 250-261

.. code-block:: Python


    sns.pointplot(
        data=results,
        x="Features",
        y="score",
        hue="Classifiers",
        order=["CM", "ACM(Grid)"]
    )
    plt.show()


.. image-sg:: /auto_examples/biosignal-mi/images/sphx_glr_plot_augmented_001.png
   :alt: plot augmented
   :srcset: /auto_examples/biosignal-mi/images/sphx_glr_plot_augmented_001.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 262-267

References
----------
.. [1] `Classification of BCI-EEG based on augmented covariance matrix
   <https://doi.org/10.48550/arXiv.2302.04508>`_
   Carrara, I., & Papadopoulo, T., arXiv, 2023


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 20.302 seconds)


.. _sphx_glr_download_auto_examples_biosignal-mi_plot_augmented.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_augmented.ipynb <plot_augmented.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_augmented.py <plot_augmented.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: plot_augmented.zip <plot_augmented.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_