Do not 😴 on traditional ML

Simple and Interpretable Techniques Are Competitive to Deep Learning for Sleep Scoring

Code from the paper Do Not Sleep on Traditional Machine Learning: Simple and Interpretable Techniques Are Competitive to Deep Learning for Sleep Scoring.

Preprint: https://arxiv.org/abs/2207.07753
Published article: https://doi.org/10.1016/j.bspc.2022.104429

Citation:

@article{vanderdonckt2023donotsleep,
  title={Do not sleep on traditional machine learning: Simple and interpretable techniques are competitive to deep learning for sleep scoring},
  author={Van Der Donckt, Jeroen and Van Der Donckt, Jonas and Deprost, Emiel and Vandenbussche, Nicolas and Rademaker, Michael and Vandewiele, Gilles and Van Hoecke, Sofie},
  journal={Biomedical Signal Processing and Control},
  volume={81},
  pages={104429},
  year={2023},
  publisher={Elsevier}
}

How is the code structured?

For each dataset you can find a separate notebook in the notebooks folder.

The notebooks make it possible to reproduce the results, as they contain:

  1. data loading (see code in src folder)
  2. pre-processing & feature extraction
  3. (seeded) machine learning experiments

| notebook | dataset |
| --- | --- |
| SleepEDF-SC +- 30min.ipynb | SC-EDF-20 & SC-EDF-78 |
| SleepEDF-ST | SC-EDF-ST |
| MASS-SS3 | MASS SS3 |
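
The three notebook steps listed above can be illustrated with a minimal, standard-library-only sketch. It is not the code from the notebooks or the src folder: the 30-second epoch length, the three features, the toy sine-wave "recording", and all function names are assumptions chosen for illustration.

```python
import math
import random
import statistics

EPOCH_SECONDS = 30  # assumed epoch length for this sketch


def split_into_epochs(signal, sampling_rate_hz, epoch_seconds=EPOCH_SECONDS):
    """Cut a 1-D signal into non-overlapping epochs, dropping any trailing remainder."""
    samples_per_epoch = int(sampling_rate_hz * epoch_seconds)
    n_epochs = len(signal) // samples_per_epoch
    return [
        signal[i * samples_per_epoch:(i + 1) * samples_per_epoch]
        for i in range(n_epochs)
    ]


def extract_features(epoch):
    """Compute a few simple, interpretable time-domain features for one epoch."""
    mean = statistics.fmean(epoch)
    # Count sign changes of the mean-centered signal between consecutive samples.
    zero_crossings = sum(
        1 for a, b in zip(epoch, epoch[1:]) if (a - mean) * (b - mean) < 0
    )
    return {
        "mean": mean,
        "std": statistics.pstdev(epoch),
        "zero_crossings": zero_crossings,
    }


def seeded_split(items, test_fraction, seed):
    """Shuffle with a fixed seed so the train/test split is reproducible."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy stand-in for a loaded recording: 5 minutes of a 1 Hz sine sampled at 100 Hz.
sampling_rate_hz = 100
signal = [
    math.sin(2 * math.pi * t / sampling_rate_hz)
    for t in range(5 * 60 * sampling_rate_hz)
]

epochs = split_into_epochs(signal, sampling_rate_hz)          # 1. (toy) data loading
features = [extract_features(epoch) for epoch in epochs]      # 2. feature extraction
train_ids, test_ids = seeded_split(range(len(features)), 0.2, seed=42)  # 3. seeded split
```

Fixing the seed means rerunning the sketch yields the same split, which is the property the "(seeded)" experiments rely on. In a real sleep-scoring setup the split would typically be made per patient rather than per epoch, to avoid leakage between train and test data.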

Additional experiments

The notebooks/other folder contains some additional experiments:

| notebook | experiment description |
| --- | --- |
| inputs_SleepEDF-SC +- 30min.ipynb | evaluate impact of signal combination on performance for SC-EDF-20 & SC-EDF-78 |
| inputs_SleepEDF-ST.ipynb | evaluate impact of signal combination on performance for SC-EDF-ST |
| inputs_SleepEDF-MASS.ipynb | evaluate impact of signal combination on performance for MASS SS3 |
| feature_selection.ipynb | show the (little to no) impact of feature selection on performance |
| feature_space_visualization.ipynb | PCA and t-SNE visualization of the feature vector for SleepEDF-SC +/- 30min |

A table showing the impact of signal combination on performance can be found in notebooks/other/signal_combination_impact.md.
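
As a rough illustration of what evaluating signal combinations involves, the sketch below enumerates every non-empty subset of a set of input signals and ranks them with a caller-supplied scoring function. The signal names, the function names, and the placeholder scorer are all assumptions for illustration; the notebooks define the actual inputs and metrics.

```python
from itertools import combinations

# Hypothetical signal names for illustration; the notebooks define the actual inputs.
SIGNALS = ["EEG Fpz-Cz", "EEG Pz-Oz", "EOG horizontal", "EMG submental"]


def signal_combinations(signals):
    """Return every non-empty subset of the signals, smallest subsets first."""
    return [
        combo
        for size in range(1, len(signals) + 1)
        for combo in combinations(signals, size)
    ]


def evaluate_combinations(signals, score_fn):
    """Score each signal subset with score_fn and return (subset, score) pairs, best first."""
    results = [(combo, score_fn(combo)) for combo in signal_combinations(signals)]
    return sorted(results, key=lambda item: item[1], reverse=True)


# Placeholder scorer (NOT a real metric): it simply counts the signals in the subset.
ranking = evaluate_combinations(SIGNALS, score_fn=len)
```

In practice `score_fn` would train and evaluate a model on the features of the selected signals. With n signals there are 2^n - 1 subsets, so exhaustive evaluation is only feasible for a handful of inputs.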


How to install the requirements?

This repository uses poetry as its dependency manager.
A specification of the dependencies is provided in the pyproject.toml and poetry.lock files.

You can install the dependencies in your Python environment by executing the following steps:

  1. Install poetry: https://python-poetry.org/docs/#installation
  2. Install the dependencies by calling poetry install

How to download the datasets?

This work uses 4 (sub)sets of data:

  • SC-EDF-20: first 20 patients (40 recordings) of Sleep-EDFx - Sleep Cassette
  • SC-EDF-78: all 78 patients (153 recordings) of Sleep-EDFx - Sleep Cassette
  • ST-EDF: all 22 patients (44 recordings) of Sleep-EDFx - Sleep Telemetry
  • MASS SS3: all 62 patients (62 recordings) of the MASS - SS3 subset

Sleep-EDFx contains the SC-EDF-20, SC-EDF-78, and ST-EDF subsets.

You can download & extract the data via the following commands:

mkdir data
# Download the data
wget https://physionet.org/static/published-projects/sleep-edfx/sleep-edf-database-expanded-1.0.0.zip -P data
# Extract all data
unzip data/sleep-edf-database-expanded-1.0.0.zip -d data
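
For environments without wget or unzip, the same download and extraction can be sketched in Python with the standard library. The function names are hypothetical, and `count_recordings` assumes that recording files are named like `SC…-PSG.edf` (Sleep Cassette) and `ST…-PSG.edf` (Sleep Telemetry); verify this against the extracted files before relying on it.

```python
import urllib.request
import zipfile
from pathlib import Path

# Same archive URL as in the wget command above.
SLEEP_EDF_URL = (
    "https://physionet.org/static/published-projects/sleep-edfx/"
    "sleep-edf-database-expanded-1.0.0.zip"
)


def download(url, data_dir):
    """Download url into data_dir (created if missing) and return the local path."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / url.rsplit("/", 1)[-1]
    if not target.exists():  # skip the download when the archive is already present
        urllib.request.urlretrieve(url, target)
    return target


def extract(zip_path, data_dir):
    """Extract every member of the zip archive into data_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(data_dir)


def count_recordings(data_dir, prefix):
    """Count PSG files below data_dir whose name starts with prefix.

    Assumes recordings are stored as '<prefix>...-PSG.edf' (an assumption of this sketch).
    """
    return sum(1 for _ in Path(data_dir).rglob(f"{prefix}*-PSG.edf"))
```

Calling `extract(download(SLEEP_EDF_URL, "data"), "data")` mirrors the three shell commands. If the file-naming assumption holds, `count_recordings("data", "SC")` and `count_recordings("data", "ST")` should then match the 153 and 44 recordings listed above.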

MASS contains the MASS SS3 subset.

To access the data, submit a request as described here: http://ceams-carsm.ca/mass/