
DH USE CASE Pipeline

Pipeline that uses EDDL and ECVL to train a CNN on three different datasets (MNIST, ISIC and PNEUMOTHORAX), applying different image augmentations, for both classification and segmentation tasks.
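The image augmentations are built with ECVL augmentation containers. As a rough orientation, a training-time augmentation chain looks like the following minimal sketch (the class names are ECVL's, but the specific augmentations and parameter values here are illustrative, not the exact ones used by the pipeline, and the header location may differ slightly between ECVL versions):

    #include <memory>
    #include "ecvl/augmentations.h"   // header location may differ between ECVL versions

    // Minimal sketch of an ECVL training augmentation chain (illustrative values).
    auto MakeTrainingAugmentations()
    {
        return std::make_shared<ecvl::SequentialAugmentationContainer>(
            ecvl::AugResizeDim({224, 224}),   // resize every image to the network input size
            ecvl::AugMirror(0.5),             // horizontal flip with probability 0.5
            ecvl::AugRotate({-180, 180})      // random rotation, angle drawn from [-180, 180] degrees
        );
    }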

Requirements

  • CMake 3.13 or later
  • C++ Compiler with C++17 support (e.g. GCC 6 or later, Clang 5.0 or later, Visual Studio 2017 or later)
  • (Optional) ISIC dataset.
  • (Optional) Pneumothorax dataset.

Datasets

The YAML dataset format is described here. Each dataset listed below contains both the data and its YAML description, but they can also be downloaded separately: ISIC classification, ISIC segmentation and Pneumothorax segmentation.
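For orientation, a YAML description can be opened from C++ through ECVL's DLDataset class roughly as follows (a minimal sketch; the constructor arguments and header shown here are assumptions and may differ between ECVL versions):

    #include <iostream>
    #include "ecvl/support_eddl.h"   // DLDataset is provided by ECVL's EDDL support module

    int main()
    {
        // Placeholder path: point it to one of the YAML descriptions listed above.
        ecvl::DLDataset d("/path/to/isic_classification.yml", /*batch_size=*/12);

        // The dataset object exposes the classes and splits declared in the YAML file.
        std::cout << "number of classes: " << d.classes_.size() << "\n";
        return 0;
    }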

MNIST

Automatically downloaded and extracted by CMake.

Classification: Download it from here and extract it. Change the dataset path in the skin_lesion_classification_training.cpp source file accordingly. To perform inference only, change the dataset path in the skin_lesion_classification_inference.cpp source file and download the checkpoints here (best accuracy on validation in 50 epochs).

Segmentation: Download it from here and extract it. Change the dataset path in the skin_lesion_segmentation_training.cpp source file accordingly. To perform inference only, change the dataset path in the skin_lesion_segmentation_inference.cpp source file and download the checkpoints here (best Mean Intersection over Union on validation in 50 epochs).

PNEUMOTHORAX

Dataset taken from a Kaggle challenge (more details here).

  1. Download the training and test images here.
  2. Download the ground truth masks and the YAML dataset file from here.
  3. To copy the ground truth masks into the directories of the corresponding images, edit the cpp/copy_ground_truth_pneumothorax.cpp file with the path to the downloaded dataset and the ground truth directory, then run it (a minimal sketch of this copy step is shown below). Move the YAML file into the siim dataset folder.
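Conceptually, this step just places every ground truth mask next to its corresponding image; a minimal std::filesystem sketch of the idea follows (the paths are placeholders, and the provided cpp/copy_ground_truth_pneumothorax.cpp should be used for the real dataset layout):

    #include <filesystem>
    #include <iostream>

    namespace fs = std::filesystem;

    int main()
    {
        // Placeholder paths: point them to the downloaded images and ground truth masks.
        const fs::path images_dir = "/path/to/siim/train_images";
        const fs::path masks_dir  = "/path/to/ground_truth_masks";

        for (const auto& entry : fs::directory_iterator(masks_dir)) {
            if (!entry.is_regular_file()) continue;
            // Copy each mask into the images directory, keeping its file name,
            // so image and ground truth end up side by side.
            fs::copy_file(entry.path(), images_dir / entry.path().filename(),
                          fs::copy_options::overwrite_existing);
        }
        std::cout << "masks copied\n";
        return 0;
    }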

A short video in which these steps are shown is available.

From the 2669 distinct training images with a mask, 200 are randomly sampled as the validation set.

  • Training set: 3086 total images - 80% with mask and 20% without mask.
  • Validation set: 250 total images - 80% with mask and 20% without mask.

To perform inference only on the test set, change the dataset path in the pneumothorax_segmentation_inference.cpp source file and download the checkpoint here for EDDL versions >= 0.4.3 or here for EDDL versions <= 0.4.2 (best Dice Coefficient on validation in 50 epochs).
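For reference, loading a downloaded checkpoint into an EDDL network before running inference follows this general pattern (a minimal sketch with a toy one-layer model; the real network, loss, metric and checkpoint file name are the ones defined in pneumothorax_segmentation_inference.cpp, and the loss/metric names and exact EDDL calls may vary with the version):

    #include "eddl/apis/eddl.h"

    using namespace eddl;

    int main()
    {
        // Toy model only for illustration; the pipeline builds its own segmentation network.
        layer in  = Input({1, 512, 512});
        layer out = Sigmoid(Conv(in, 1, {3, 3}, {1, 1}, "same"));
        model net = Model({in}, {out});

        build(net,
              adam(0.0001),         // optimizer
              {"cross_entropy"},    // loss (illustrative)
              {"mse"},              // metric (illustrative)
              CS_GPU({1}));         // or CS_CPU() without a GPU

        // Load the downloaded checkpoint (placeholder file name), then run inference
        // on batches as done in pneumothorax_segmentation_inference.cpp.
        load(net, "pneumothorax_checkpoint.bin");
        return 0;
    }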

CUDA

On Linux systems, starting from CUDA 10.1, the cuBLAS libraries are installed in /usr/lib/<arch>-linux-gnu/ or /usr/lib64/ instead of the CUDA toolkit directory, so the build may fail to find libcublas. Create a symlink to resolve the issue:

sudo ln -s /usr/lib/<arch>-linux-gnu/libcublas.so /usr/local/cuda-10.1/lib64/libcublas.so

Building

  • *nix

    • Building from scratch, assuming the CUDA driver is already installed if you want to use GPUs (a video shows these steps performed in a clean nvidia docker image):

      sudo apt update
      sudo apt install wget git make gcc-8 g++-8
      
      # cmake version >= 3.13 is required for ECVL
      wget https://cmake.org/files/v3.13/cmake-3.13.5-Linux-x86_64.tar.gz
      tar -xf cmake-3.13.5-Linux-x86_64.tar.gz
      
      # symbolic link for cmake
      sudo ln -s /<path/to>/cmake-3.13.5-Linux-x86_64/bin/cmake /usr/bin/cmake
      # symbolic link for cublas if we have cuda >= 10.1
      sudo ln -s /usr/lib/<arch>-linux-gnu/libcublas.so /usr/local/cuda-10.1/lib64/libcublas.so
      
      # if other versions of gcc (e.g., gcc-7) are present, set a higher priority to gcc-8 so that it is chosen as the default
      sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 80 --slave /usr/bin/g++ g++ /usr/bin/g++-8
      sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 70 --slave /usr/bin/g++ g++ /usr/bin/g++-7
      
      git clone https://github.com/deephealthproject/use_case_pipeline.git
      cd use_case_pipeline
      
      # install dependencies as sudo so that they will be installed in "standard" system directories
      chmod +x install_dependencies.sh
      sudo ./install_dependencies.sh
      
      # install EDDL, OpenCV, ECVL and build the pipeline
      chmod +x build_pipeline.sh
      ./build_pipeline.sh
    • Building with all the dependencies already installed:

      git clone https://github.com/deephealthproject/use_case_pipeline.git
      cd use_case_pipeline
      mkdir build && cd build
      
      # if ECVL is not installed in a "standard" system directory (like /usr/local/) you have to provide the installation directory
      cmake -Decvl_DIR=/<path/to>/ecvl/build/install ..
      make
  • Windows

    • Building assuming cmake >= 3.13, git, Visual Studio 2017 or 2019, and the CUDA driver (if you want to use GPUs) are already installed:
      # install EDDL and all its dependencies, OpenCV, ECVL and build the pipeline
      git clone https://github.com/deephealthproject/use_case_pipeline.git
      cd use_case_pipeline
      build_pipeline.bat

N.B. EDDL is built for GPU by default.

Training and inference

  • The project creates different executables: MNIST_BATCH, SKIN_LESION_CLASSIFICATION_TRAINING, SKIN_LESION_SEGMENTATION_TRAINING, SKIN_LESION_CLASSIFICATION_INFERENCE, SKIN_LESION_SEGMENTATION_INFERENCE, PNEUMOTHORAX_SEGMENTATION_TRAINING and PNEUMOTHORAX_SEGMENTATION_INFERENCE.

    1. MNIST_BATCH and SKIN_LESION_CLASSIFICATION_TRAINING train the neural network, loading the dataset in batches (needed when the dataset is too large to fit in memory); a minimal sketch of this batch loop is shown at the end of this section.
    2. SKIN_LESION_SEGMENTATION_TRAINING trains the neural network loading the dataset (images and their ground truth masks) in batches for the segmentation task.
    3. PNEUMOTHORAX_SEGMENTATION_TRAINING trains the neural network loading the dataset (images and their ground truth masks) in batches with a custom function for this specific segmentation task.
    4. SKIN_LESION_CLASSIFICATION_INFERENCE, SKIN_LESION_SEGMENTATION_INFERENCE and PNEUMOTHORAX_SEGMENTATION_INFERENCE perform inference only on the classification or segmentation task, loading weights from a previous training process.
  • Examples of output for the pre-trained models provided:

    1. ISIC segmentation test set:

      The red line represents the prediction processed by ECVL to obtain contours that are overlaid on the original image.

    2. Pneumothorax segmentation validation set:

      The red area represents the prediction, the green area the ground truth. The yellow area therefore represents the correctly predicted pixels.
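The batch-based loading used by the training executables above boils down to the usual loop over an ECVL DLDataset feeding an EDDL model: select the split, reset the batch counters, load a batch, run one training step. A minimal sketch, assuming an already-built model and dataset (method and function names follow EDDL/ECVL but may differ slightly between versions):

    #include "ecvl/support_eddl.h"
    #include "eddl/apis/eddl.h"

    // Sketch of one training epoch over an ECVL DLDataset with an EDDL model.
    void TrainOneEpoch(eddl::model net, ecvl::DLDataset& d, int batch_size, int num_batches)
    {
        // Tensors that will hold one batch of images and labels
        // (3 channels, 224x224 here; must match the network input and the resize augmentation).
        Tensor* x = new Tensor({batch_size, 3, 224, 224});
        Tensor* y = new Tensor({batch_size, static_cast<int>(d.classes_.size())});

        d.SetSplit(ecvl::SplitType::training);  // select the training split
        d.ResetAllBatches();                    // restart batch counters for a new epoch

        for (int b = 0; b < num_batches; ++b) {
            d.LoadBatch(x, y);                  // fill x with images and y with labels
            eddl::train_batch(net, {x}, {y});   // one forward/backward/update step
        }

        delete x;
        delete y;
    }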
