REF: use torchscript models from huggingface hub (#144)

* use pytorch 2.0.0 as base image
* install g++
* do not remove gcc
* import torch to please jit compiler
* move custom model impls to custom_models namespace
* refactor to use wsinfer-zoo
* run isort and then black
* rm modeldefs + make modellib and patchlib public (no underscore)
* do not use torch.compile on torchscript models
* Fix/issue 131 (#133)
* use tifffile in lieu of large_image
* run isort
* make outputs float or None
* changes to please mypy
* add newline at end of document
* add openslide-python and tifffile to core deps
* add back roi support and mps device caps
* black formatting
* rm unused file
* add wsinfer-zoo to deps
* predownload registry JSON + install system deps in early layer
* scale step size and print info

  Fixes #135
* add patchlib presets to package data and rm modeldefs
* set default step_size to None
* only allow step-size=patch-size
* allow custom step sizes
* update mpp print logs to slide mpp
* add tiff mpp via openslide
* resize patches to prescribed patch size and spacing
* add model config schema
* add schemas to package data
* fix error messages

  Replace `--model-name` with `--model`.
* create OpenSlide obj in worker_init func

  Fixes #137

  The OpenSlide object is no longer created in `__init__`. Previously the
  openslide object was shared across workers. Now each worker creates its
  own OpenSlide object. I hypothesize that this will allow multi-worker
  data loading on Windows.
* handle num_workers=0
* ADD choice of backends (tiffslide or openslide) (#139)
* replace openslide with tiffslide
* patch zarr to avoid decoding tiles in duplicate

  This implements the change proposed in zarr-developers/zarr-python#1454
* rm openslide-python and add tiffslide
* do not stitch because it imposes a performance penalty
* ignore types in vis_params
* add isort and tiffslide to dev deps
* add NoBackendException
* run isort
* use wsinfer.wsi module instead of slide_utils and add tiffslide and openslide backends
* use wsinfer.wsi.WSI as generic entrypoint for whole slides
* replace PathType with "str | Path"
* add logging and backend selection to cli
* add "from __future__ import annotations"
* TST: update tests for dev branch (#143)
* begin to update tests
* do not resize images prior to transform

  This introduces subtle differences from the current stable version of
  wsinfer.
* fix for issue #125
* do not save slide path in model outputs csv
* add test_cli_run_with_registered_models
* add reference model outputs

  These reference outputs were created using a patched version of 0.3.6
  wsinfer. The patches involved padding the patches from large-image to be
  the expected patch size. Large image does not pad images by default,
  whereas openslide and tiffslide pad with black.
* skip jit tests and cli with custom config
* deprecate python 3.7
* install openslide and tiffslide
* remove WSIType object
* remove dense grid creation

  fixes #138
* remove timm and custom models

  We will focus on using TorchScript models only. In the future, we can
  also look into using ONNX as a backend.

  fixes #140
* limit click versions to please mypy

  related to pallets/click#2558
* satisfy mypy
* fix cli args for wsinfer run
* fail loudly with dev pytorch + fix jit compile tests
* fix test of issue 89
* move wsinfer imports to beginning of file
* add test of mutually exclusive cli args
* use -p shorthand for model-path
* mark that we support typing
* add py.typed to package data
* run test-package on windows, macos, and linux
* fix test of patching
* install openslide differently on different systems
* close the case statement
* fix the way we install openslide on different envs
* fix matrix.os test
* get line length with python for cross-platform
* test "wsinfer run" differently for unix and windows
* fix windows test
* fix path to csv
* skip windows tests for now because tissue segmentation is different
* run "wsinfer run" on windows but do not test file length
* add test of local model with config
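
The `worker_init` item in the message above (Fixes #137) can be illustrated with a minimal sketch. This is not the code from the commit: `SlidePatches`, `FakeSlide`, and the `worker_init` method are hypothetical names, and `FakeSlide` stands in for a real OpenSlide handle so the example runs without any slide libraries. The point is only that `__init__` stores the path while each process opens its own handle later. With PyTorch, a hook of this kind would typically be wired in through the `worker_init_fn` argument of `DataLoader`; since that hook does not run when `num_workers=0`, the sketch also opens the handle lazily on first access.

```python
import os
from typing import List, Optional, Tuple


class FakeSlide:
    """Hypothetical stand-in for a slide handle (e.g. an OpenSlide object)."""

    def __init__(self, path: str) -> None:
        self.path = path
        # Record which process opened the handle, to show it is per-process.
        self.opened_in_pid = os.getpid()

    def read_region(self, x: int, y: int, size: int) -> Tuple[int, int, int]:
        # A real backend would return pixel data; this echoes the request.
        return (x, y, size)


class SlidePatches:
    """Dataset-like object that defers opening the slide until it is needed."""

    def __init__(self, path: str, coords: List[Tuple[int, int]], size: int) -> None:
        self.path = path
        self.coords = coords
        self.size = size
        # Do NOT open the slide here: this object may be copied to workers.
        self.slide: Optional[FakeSlide] = None

    def worker_init(self, worker_id: Optional[int] = None) -> None:
        # Meant to run once per worker process, so each gets its own handle.
        self.slide = FakeSlide(self.path)

    def __len__(self) -> int:
        return len(self.coords)

    def __getitem__(self, idx: int) -> Tuple[int, int, int]:
        if self.slide is None:
            # Fallback for single-process loading, where no init hook runs.
            self.worker_init()
        assert self.slide is not None
        x, y = self.coords[idx]
        return self.slide.read_region(x, y, self.size)


dataset = SlidePatches("slide.svs", coords=[(0, 0), (350, 0)], size=350)
handle_before = dataset.slide  # nothing is opened in __init__
first_patch = dataset[0]  # opens the handle lazily, then reads
```

Keeping the handle out of `__init__` means the object that is copied to each worker carries only plain data (a path and coordinates), which is the property the commit message relies on.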
SBU-BMI · Jul 16, 2023 · 8629418
1 parent 3afff37
commit 8629418
Showing 63 changed files with 2,428 additions and 2,882 deletions.
diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
@@ -11,7 +11,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
+        python-version: ["3.8", "3.9", "3.10", "3.11"]
     steps:
       - uses: actions/checkout@v3
       - name: Set up Python ${{ matrix.python-version }}
@@ -20,11 +20,14 @@ jobs:
           python-version: ${{ matrix.python-version }}
       - name: Install the package
         run: |
+          sudo apt update
+          sudo apt install -y libopenslide0
           python -m pip install --upgrade pip setuptools wheel
-          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
-          python -m pip install --editable .[dev] --find-links https://girder.github.io/large_image_wheels
+          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu openslide-python tiffslide
+          python -m pip install --editable .[dev]
       - name: Run tests
         run: python -m pytest --verbose tests/
+
   test-pytorch-nightly:
     runs-on: ubuntu-latest
     steps:
@@ -35,14 +38,16 @@ jobs:
           python-version: "3.10"
       - name: Install the package
         run: |
+          sudo apt update
+          sudo apt install -y libopenslide0
           python -m pip install --upgrade pip setuptools wheel
-          python -m pip install --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cpu
-          python -m pip install --editable .[dev] --find-links https://girder.github.io/large_image_wheels
+          python -m pip install --pre torch torchvision --extra-index-url https://download.pytorch.org/whl/nightly/cpu openslide-python tiffslide
+          python -m pip install --editable .[dev]
       - name: Check types
         run: python -m mypy --install-types --non-interactive wsinfer/
       - name: Run tests
-        continue-on-error: true
         run: python -m pytest --verbose tests/
+
   test-docker:
     runs-on: ubuntu-latest
     steps:
@@ -60,35 +65,62 @@ jobs:
           wget -q https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/JP2K-33003-1.svs
           cd ..
           docker run --rm --shm-size=512m --volume $(pwd):/work --workdir /work wsinferimage run \
-            --wsi-dir slides/ --results-dir results/ --model resnet34 --weights TCGA-BRCA-v1
+            --wsi-dir slides/ --results-dir results/ --model breast-tumor-resnet34.tcga-brca
           test -f results/run_metadata_*.json
           test -f results/patches/JP2K-33003-1.h5
           test -f results/model-outputs/JP2K-33003-1.csv
           test $(wc -l < results/model-outputs/JP2K-33003-1.csv) -eq 653
+
+  # This is run on multiple operating systems.
   test-package:
-    runs-on: ubuntu-latest
+    strategy:
+        matrix:
+            os: [ubuntu-latest, windows-latest, macos-latest]
+    runs-on: ${{ matrix.os }}
     steps:
       - uses: actions/checkout@v3
       - name: Set up Python 3.10
         uses: actions/setup-python@v4
         with:
           python-version: "3.10"
-      - name: Install the package
+      - name: Install OpenSlide on Ubuntu
+        if: matrix.os == 'ubuntu-latest'
+        run: sudo apt update && sudo apt install -y libopenslide0
+      - name: Install OpenSlide on macOS
+        if: matrix.os == 'macos-latest'
+        run: brew install openslide
+      - name: Install the wsinfer python package
         run: |
           python -m pip install --upgrade pip setuptools wheel
-          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
-          python -m pip install . --find-links https://girder.github.io/large_image_wheels
-      - name: Run the wsinfer command in a new directory
+          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu openslide-python tiffslide
+          python -m pip install .
+      - name: Run 'wsinfer run' on Unix
+        if: matrix.os != 'windows-latest'
         run: |
           mkdir newdir && cd newdir
           mkdir slides && cd slides
           wget -q https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/JP2K-33003-1.svs
           cd ..
-          wsinfer run --wsi-dir slides/ --results-dir results/ --model resnet34 --weights TCGA-BRCA-v1
+          wsinfer run --wsi-dir slides/ --results-dir results/ --model breast-tumor-resnet34.tcga-brca
           test -f results/run_metadata_*.json
           test -f results/patches/JP2K-33003-1.h5
           test -f results/model-outputs/JP2K-33003-1.csv
           test $(wc -l < results/model-outputs/JP2K-33003-1.csv) -eq 653
+      # FIXME: tissue segmentation has different outputs on Windows. The patch sizes
+      # are the same but the coordinates found are different.
+      - name: Run 'wsinfer run' on Windows
+        if: matrix.os == 'windows-latest'
+        run: |
+          mkdir newdir && cd newdir
+          mkdir slides && cd slides
+          Invoke-WebRequest -URI https://openslide.cs.cmu.edu/download/openslide-testdata/Aperio/JP2K-33003-1.svs -OutFile JP2K-33003-1.svs
+          cd ..
+          wsinfer run --wsi-dir slides/ --results-dir results/ --model breast-tumor-resnet34.tcga-brca
+          Test-Path -Path results/run_metadata_*.json -PathType Leaf
+          Test-Path -Path results/patches/JP2K-33003-1.h5 -PathType Leaf
+          Test-Path -Path results/model-outputs/JP2K-33003-1.csv -PathType Leaf
+          # test $(python -c "print(sum(1 for _ in open('results/model-outputs/JP2K-33003-1.csv')))") -eq 653
+
   style-and-types:
     runs-on: ubuntu-latest
     steps:
@@ -99,28 +131,33 @@ jobs:
           python-version: "3.10"
       - name: Install the package
         run: |
+          sudo apt update
+          sudo apt install -y libopenslide0
           python -m pip install --upgrade pip setuptools wheel
-          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
-          python -m pip install .[dev] --find-links https://girder.github.io/large_image_wheels
+          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu openslide-python tiffslide
+          python -m pip install .[dev]
       - name: Check style (flake8)
         run: python -m flake8 wsinfer/
       - name: Check style (black)
         run: python -m black --check wsinfer/
       - name: Check types
         run: python -m mypy --install-types --non-interactive wsinfer/
+
   docs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v3
-      - name: Set up Python 3.10
+      - name: Set up Python
         uses: actions/setup-python@v4
         with:
           python-version: "3.10"
       - name: Install the package
         run: |
+          sudo apt update
+          sudo apt install -y libopenslide0
           python -m pip install --upgrade pip setuptools wheel
-          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
-          python -m pip install .[docs] --find-links https://girder.github.io/large_image_wheels
+          python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu openslide-python tiffslide
+          python -m pip install .[docs]
       - name: Build docs
         run: |
           cd docs

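The Unix steps in the workflow above verify outputs with `test -f` and `wc -l`, and the commented-out line in the Windows step counts lines with Python instead. A minimal sketch of that idea as a small cross-platform check follows. The helper name `check_outputs` and the throwaway demo directory are assumptions for illustration; only the file layout (`run_metadata_*.json`, `patches/`, `model-outputs/`) is taken from the workflow.

```python
import tempfile
from pathlib import Path
from typing import Optional


def check_outputs(
    results: Path, slide_stem: str, expected_lines: Optional[int] = None
) -> int:
    """Check that wsinfer-style outputs exist and return the CSV line count."""
    # Similar to `test -f results/run_metadata_*.json`: require a matching file.
    if not any(p.is_file() for p in results.glob("run_metadata_*.json")):
        raise FileNotFoundError("no run_metadata_*.json file found")
    h5 = results / "patches" / f"{slide_stem}.h5"
    csv = results / "model-outputs" / f"{slide_stem}.csv"
    for path in (h5, csv):
        if not path.is_file():
            raise FileNotFoundError(str(path))
    # Similar to `wc -l < file`, without relying on a Unix shell.
    with csv.open() as f:
        n_lines = sum(1 for _ in f)
    if expected_lines is not None and n_lines != expected_lines:
        raise ValueError(f"expected {expected_lines} lines, found {n_lines}")
    return n_lines


# Demo on a temporary directory that mimics the layout (not real outputs).
with tempfile.TemporaryDirectory() as tmp:
    results = Path(tmp) / "results"
    (results / "patches").mkdir(parents=True)
    (results / "model-outputs").mkdir()
    (results / "run_metadata_20230716.json").write_text("{}")
    (results / "patches" / "demo.h5").write_bytes(b"")
    (results / "model-outputs" / "demo.csv").write_text("a,b\n1,2\n3,4\n")
    line_count = check_outputs(results, "demo", expected_lines=3)
```

Because the FIXME in the workflow notes that tissue segmentation yields different coordinates on Windows, a check like this would presumably be called there without `expected_lines`.
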
diff --git a/.gitignore b/.gitignore
@@ -170,4 +170,4 @@ cython_debug/
 .idea/
 
 # Extras
-.DS_Store
+.DS_Store
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
@@ -10,7 +10,7 @@ build:
   jobs:
     post_create_environment:
       - python -m pip install torch torchvision --extra-index-url https://download.pytorch.org/whl/cpu
-      - python -m pip install .[docs] --find-links https://girder.github.io/large_image_wheels
+      - python -m pip install .[docs]
     post_install:
       # Re-run the installation to ensure we have an appropriate version of sphinx.
       # We might not want to use the latest version.

diff --git a/Dockerfile b/Dockerfile
@@ -1,17 +1,32 @@
+# FIXME: when using the torch 2.0.1 image, we get an error
+#   OSError: /lib/x86_64-linux-gnu/libgobject-2.0.so.0: undefined symbol: ffi_type_uint32, version LIBFFI_BASE_7.0
+# The error is fixed by
+#   LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libffi.so.7
+
 FROM pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime
-WORKDIR /opt/wsinfer
-COPY . .
+ARG DEBIAN_FRONTEND="noninteractive"
+ENV TZ=Etc/UTC
+
+# Install system dependencies.
 RUN apt-get update \
     && apt-get install -y --no-install-recommends gcc g++ git libopenslide0 \
-    && python -m pip install --no-cache-dir --editable . \
-        --find-links https://girder.github.io/large_image_wheels \
     && rm -rf /var/lib/apt/lists/*
+
+# Install wsinfer.
+WORKDIR /opt/wsinfer
+COPY . .
+RUN python -m pip install --no-cache-dir --editable . openslide-python tiffslide
+
 # Use a writable directory for downloading model weights. Default is ~/.cache, which is
 # not guaranteed to be writable in a Docker container.
 ENV TORCH_HOME=/var/lib/wsinfer
 RUN mkdir -p "$TORCH_HOME" \
     && chmod 777 "$TORCH_HOME" \
     && chmod a+s "$TORCH_HOME"
+
+# Test that the program runs (and also download the registry JSON file).
+RUN wsinfer --help
+
 WORKDIR /work
 ENTRYPOINT ["wsinfer"]
 CMD ["--help"]

diff --git a/README.md b/README.md
@@ -23,15 +23,13 @@ We do not install these dependencies automatically because their installation ca
 on a user's system. Then use the command below to install this package.
 
 ```
-python -m pip install --find-links https://girder.github.io/large_image_wheels wsinfer
+python -m pip install wsinfer
 ```
 
 To use the _bleeding edge_, use
 
 ```
-python -m pip install \
-    --find-links https://girder.github.io/large_image_wheels \
-    git+https://github.com/SBU-BMI/wsinfer.git
+python -m pip install git+https://github.com/SBU-BMI/wsinfer.git
 ```
 
 ## Developers
@@ -41,7 +39,7 @@ Clone this GitHub repository and install the package (in editable mode with the
 ```
 git clone https://github.com/SBU-BMI/wsinfer.git
 cd wsinfer
-python -m pip install --editable .[dev] --find-links https://girder.github.io/large_image_wheels
+python -m pip install --editable .[dev]
 ```
 
 # Cutting a release

diff --git a/docs/installing.rst b/docs/installing.rst
@@ -6,7 +6,7 @@ Installing and getting started
 Prerequisites
 -------------
 
-WSInfer supports Python 3.7+ and has been tested on Linux.
+WSInfer supports Python 3.8+ and has been tested on Linux.
 
 Install PyTorch before installing WSInfer. Please see
 `PyTorch's installation instructions <https://pytorch.org/get-started/locally/>`_.
@@ -21,11 +21,9 @@ the type of hardware a user has.
 Manual installation
 -------------------
 
-After having installed PyTorch, install releases of WSInfer from `PyPI <https://pypi.org/project/wsinfer/>`_.
-Be sure to include the line :code:`--find-links https://girder.github.io/large_image_wheels` to ensure
-dependencies are installed properly. ::
+After having installed PyTorch, install releases of WSInfer from `PyPI <https://pypi.org/project/wsinfer/>`_. ::
 
-    pip install wsinfer --find-links https://girder.github.io/large_image_wheels
+    pip install wsinfer
 
 This installs the :code:`wsinfer` Python package and the :code:`wsinfer` command line program. ::
 

diff --git a/setup.cfg b/setup.cfg
@@ -18,7 +18,6 @@ classifiers =
     Operating System :: OS Independent
     Programming Language :: Python :: 3
     Programming Language :: Python :: 3 :: Only
-    Programming Language :: Python :: 3.7
     Programming Language :: Python :: 3.8
     Programming Language :: Python :: 3.9
     Programming Language :: Python :: 3.10
@@ -29,35 +28,36 @@ classifiers =
 
 [options]
 packages = find:
-python_requires = >= 3.7
+python_requires = >= 3.8
 install_requires =
-    click>=8.0,<9
+    click>=8.0,<9,!=8.1.4,!=8.1.5
     geojson
     h5py
-    # OpenSlide and TIFF readers should handle all images we will encounter.
-    large-image[openslide,tiff]>=1.8.0
     numpy
     opencv-python-headless>=4.0.0
     pandas
     pillow
     pyyaml
     shapely
-    timm
+    tifffile
+    tiffslide
     # The installation fo torch and torchvision can differ by hardware. Users are
     # advised to install torch and torchvision for their given hardware and then install
     # wsinfer. See https://pytorch.org/get-started/locally/.
     torch>=1.7
     torchvision
     tqdm
+    wsinfer-zoo
 
 [options.extras_require]
 dev =
     black
     flake8
     imagecodecs  # for tifffile
+    isort
     mypy
     pytest
-    tifffile
+    tiffslide
     types-Pillow
     types-PyYAML
     types-tqdm
@@ -73,8 +73,9 @@ console_scripts =
 
 [options.package_data]
 wsinfer =
-    _patchlib/presets/*.csv
-    modeldefs/*.yaml
+    py.typed
+    patchlib/presets/*.csv
+    schemas/*.json
 
 [flake8]
 max-line-length = 88
@@ -84,8 +85,6 @@ exclude = wsinfer/_version.py
 [mypy]
 [mypy-h5py]
 ignore_missing_imports = True
-[mypy-large_image]
-ignore_missing_imports = True
 [mypy-cv2]
 ignore_missing_imports = True
 [mypy-geojson]
@@ -96,12 +95,16 @@ ignore_missing_imports = True
 ignore_missing_imports = True
 [mypy-pandas]
 ignore_missing_imports = True
-[mypy-timm.*]
+[mypy-safetensors.*]
 ignore_missing_imports = True
 [mypy-scipy.stats]
 ignore_missing_imports = True
 [mypy-shapely.*]
 ignore_missing_imports = True
+[mypy-tifffile]
+ignore_missing_imports = True
+[mypy-zarr.storage]
+ignore_missing_imports = True
 
 [versioneer]
 VCS = git
@@ -110,3 +113,8 @@ versionfile_source = wsinfer/_version.py
 versionfile_build = wsinfer/_version.py
 tag_prefix = v
 parentdir_prefix = wsinfer
+
+
+[isort]
+profile = black
+force_single_line = True
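
With `tiffslide` now listed in `install_requires` and `openslide-python` installed separately in the CI jobs, a slide-reading backend has to be chosen at runtime. The sketch below shows one way such a choice could be made. It is an assumption for illustration, not the code in `wsinfer.wsi`: `NoBackendException` is named in the commit message, while `pick_backend`, `installed_backends`, the `available` parameter, and the preference order are hypothetical.

```python
import importlib.util
from typing import Iterable, List, Optional

# Import names of the two backends, in an assumed order of preference.
BACKENDS = ("tiffslide", "openslide")


class NoBackendException(Exception):
    """Raised when no slide-reading backend can be used."""


def installed_backends() -> List[str]:
    # find_spec returns None when a top-level module cannot be found.
    return [n for n in BACKENDS if importlib.util.find_spec(n) is not None]


def pick_backend(
    requested: Optional[str] = None, available: Optional[Iterable[str]] = None
) -> str:
    """Return the backend to use, honoring an explicit request when possible."""
    avail = list(installed_backends() if available is None else available)
    if requested is not None:
        if requested not in BACKENDS:
            raise ValueError(f"unknown backend: {requested!r}")
        if requested not in avail:
            raise NoBackendException(f"backend {requested!r} is not installed")
        return requested
    for name in BACKENDS:
        if name in avail:
            return name
    raise NoBackendException("install tiffslide or openslide-python")


# `available` is passed explicitly so the demo does not depend on the host.
chosen = pick_backend(available=["openslide"])
```

Note that `find_spec` only reports whether the Python package can be located. `openslide-python` also needs the OpenSlide shared library (the `libopenslide0` and `brew install openslide` steps in the workflow), so a fuller check would attempt the import itself.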