Plugin Update: Theia bleedthrough estimation (#451)
* feat: updated theia plugin to new standards

* build: updates with bfio, preadator, tensorflow, and filepattern

* build: bumped version 0.5.0-dev0 -> 0.5.0-dev1
nishaq503 authored Feb 5, 2024
1 parent d085265 commit 7709175
Showing 19 changed files with 1,505 additions and 0 deletions.
29 changes: 29 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/.bumpversion.cfg
@@ -0,0 +1,29 @@
[bumpversion]
current_version = 0.5.0-dev1
commit = False
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<dev>\d+))?
serialize =
{major}.{minor}.{patch}-{release}{dev}
{major}.{minor}.{patch}

[bumpversion:part:release]
optional_value = _
first_value = dev
values =
dev
_

[bumpversion:part:dev]

[bumpversion:file:pyproject.toml]
search = version = "{current_version}"
replace = version = "{new_version}"

[bumpversion:file:plugin.json]

[bumpversion:file:VERSION]

[bumpversion:file:README.md]

[bumpversion:file:src/polus/plugins/regression/theia_bleedthrough_estimation/__init__.py]
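The `parse` regex above controls how bumpversion splits a version string into parts. As a quick illustration (a sketch using Python's `re` directly, not bumpversion itself), the current version `0.5.0-dev1` decomposes as:

```python
import re

# The `parse` pattern from the bumpversion config above.
PARSE = r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<dev>\d+))?"

match = re.fullmatch(PARSE, "0.5.0-dev1")
print(match.groupdict())
# {'major': '0', 'minor': '5', 'patch': '0', 'release': 'dev', 'dev': '1'}
```

A plain release such as `0.5.0` matches the same pattern with the optional `release`/`dev` groups left as `None`, which is what lets the two `serialize` forms coexist.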
171 changes: 171 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/.dockerignore
@@ -0,0 +1,171 @@
################################################################################
# Local Files and Folders
################################################################################
/data
requirements.txt
**/__pycache__
**/*.so

################################################################################
# Python Template from github
################################################################################
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
1 change: 1 addition & 0 deletions regression/theia-bleedthrough-estimation-plugin/.gitignore
@@ -0,0 +1 @@
/.vscode
53 changes: 53 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/Dockerfile
@@ -0,0 +1,53 @@
FROM tensorflow/tensorflow:2.12.0-gpu

# Output from `cat /etc/os-release` in the base container
#
# NAME="Ubuntu"
# VERSION="20.04.5 LTS (Focal Fossa)"
# ID=ubuntu
# ID_LIKE=debian
# PRETTY_NAME="Ubuntu 20.04.5 LTS"
# VERSION_ID="20.04"
# HOME_URL="https://www.ubuntu.com/"
# SUPPORT_URL="https://help.ubuntu.com/"
# BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
# PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
# VERSION_CODENAME=focal
# UBUNTU_CODENAME=focal

# Install Python 3.9
RUN apt update && \
apt install software-properties-common -y && \
add-apt-repository ppa:deadsnakes/ppa && \
apt install python3.9 python3.9-distutils curl -y && \
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python3.9 get-pip.py && \
apt autoremove -y && \
rm -rf /var/lib/apt/lists/*

# Symbolic link to python3.9
RUN ln -sf /usr/bin/python3.9 /usr/bin/python3

# environment variables defined in polusai/bfio
ENV EXEC_DIR="/opt/executables"
ENV DATA_DIR="/data"
ENV POLUS_LOG="INFO"
ENV POLUS_IMG_EXT=".ome.tif"
ENV POLUS_TAB_EXT=".csv"

RUN mkdir /.cache && chmod 777 /.cache

# Work directory defined in the base container
WORKDIR ${EXEC_DIR}

# Copy the python package
COPY pyproject.toml ${EXEC_DIR}
COPY VERSION ${EXEC_DIR}
COPY README.md ${EXEC_DIR}
COPY src ${EXEC_DIR}/src

# Install the python package
RUN pip3 install ${EXEC_DIR} --no-cache-dir

ENTRYPOINT ["python3", "-m", "polus.plugins.regression.theia_bleedthrough_estimation"]
CMD ["--help"]
90 changes: 90 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/README.md
@@ -0,0 +1,90 @@
# Theia Bleedthrough Estimation (v0.5.0-dev1)

This WIPP plugin estimates the bleed-through in a collection of 2d images.
It uses the Theia algorithm from [this repo](https://github.com/PolusAI/theia).

## File Patterns

This plugin uses [file-patterns](https://filepattern.readthedocs.io/en/latest/Examples.html#what-is-filepattern) to create subsets of an input collection.
In particular, a filename variable is enclosed in `{}`, and the number of characters the variable occupies in the filename is denoted by repeating the variable's name, one character per position.
For example, if all filenames follow the structure `prefix_tTTT.ome.tif`, where `TTT` indicates the time-point of capture of the image, then the file-pattern would be `prefix_t{ttt}.ome.tif`.
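The translation from a file-pattern to a concrete filename match can be sketched as follows. This is an illustrative Python version only; the real `filepattern` package is far more general:

```python
import re

def pattern_to_regex(pattern: str) -> str:
    """Translate a file-pattern like 'prefix_t{ttt}.ome.tif' into a regex.

    Illustrative sketch only; the `filepattern` package handles many more
    pattern features (variable reuse, non-numeric variables, etc.).
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "{":
            j = pattern.index("}", i)
            var = pattern[i + 1 : j]
            # '{ttt}' -> exactly three digits, captured under the name 't'.
            out.append(f"(?P<{var[0]}>\\d{{{len(var)}}})")
            i = j + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)

regex = pattern_to_regex("prefix_t{ttt}.ome.tif")
match = re.fullmatch(regex, "prefix_t042.ome.tif")
print(match.group("t"))  # 042
```

Here `prefix_t{ttt}.ome.tif` becomes the regex `prefix_t(?P<t>\d{3})\.ome\.tif`, so `prefix_t042.ome.tif` matches with `t = 042` while `prefix_t42.ome.tif` (only two digits) does not.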

## Optional Parameters

### --groupBy

This parameter can be used to group images into subsets.
This plugin will apply bleed-through correction to each subset.
Each subset should contain all channels for one image/tile/FOV.
The images in each subset should all have the same size (in pixels) and the same number of dimensions.

If no `--groupBy` is specified, then the plugin will assume that all images in the input collection are part of the same subset.

### --selectionCriterion

Which method to use to rank and select tiles in images.
The available options are:

1. `"MeanIntensity"`: Select tiles with the highest mean pixel intensity. This is the default.
2. `"Entropy"`: Select tiles with the highest entropy.
3. `"MedianIntensity"`: Select tiles with the highest median pixel intensity.
4. `"IntensityRange"`: Select tiles with the largest difference in intensity of the brightest and dimmest pixels.

We rank-order all tiles based on one of these criteria and then select some of the best few tiles from each channel.
If the images are small enough, we select all tiles from each channel.
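The four ranking criteria above can be sketched with NumPy as follows. This is a sketch of the ranking idea only, with a hypothetical function name; the plugin's own scoring code may differ in detail:

```python
import numpy as np

def rank_tiles(tiles: list, criterion: str = "MeanIntensity") -> list:
    """Return tile indices, best first, under the named selection criterion."""
    if criterion == "MeanIntensity":
        scores = [float(t.mean()) for t in tiles]
    elif criterion == "MedianIntensity":
        scores = [float(np.median(t)) for t in tiles]
    elif criterion == "IntensityRange":
        scores = [float(t.max() - t.min()) for t in tiles]
    elif criterion == "Entropy":
        def entropy(t):
            # Shannon entropy of the tile's intensity histogram.
            hist, _ = np.histogram(t, bins=256)
            p = hist / hist.sum()
            p = p[p > 0]
            return float(-(p * np.log2(p)).sum())
        scores = [entropy(t) for t in tiles]
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    # Rank-order all tiles, highest score first.
    return sorted(range(len(tiles)), key=lambda i: scores[i], reverse=True)
```

For example, a flat zero tile, a flat bright tile, and a high-contrast gradient tile would be ranked differently by `"MeanIntensity"` (bright flat tile wins) than by `"IntensityRange"` or `"Entropy"` (gradient tile wins).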

### --channelOrdering

By default, we assume that the order of channel numbers is the same as the order, in increasing wavelength, of the emission filters for those channels.
If this is not the case, use this parameter to specify, as a string of comma-separated integers, the wavelength-order of the channels.

For example, if the channels are `0, 1, 2, 3, 4` and they correspond to wavelengths (of the emission filter) of `420nm, 350nm, 600nm, 510nm, 580nm`, then `--channelOrdering` should be `"1,0,3,4,2"`.

If this parameter is not specified, then we assume that the channel numbers are in increasing wavelength order.
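The ordering in the example above can be derived mechanically by sorting the channel indices by their emission wavelengths. A minimal sketch, assuming the wavelengths from the example:

```python
import numpy as np

# Emission-filter wavelengths (nm) for channels 0..4, from the example above.
wavelengths = [420, 350, 600, 510, 580]

# Channel indices sorted by increasing wavelength give the ordering string.
order = np.argsort(wavelengths)
channel_ordering = ",".join(str(c) for c in order)
print(channel_ordering)  # 1,0,3,4,2
```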

If you do not know the channel ordering, you can use the `--channelOverlap` parameter to specify a higher number of adjacent channels to consider as contributors to bleed-through.

### --channelOverlap

For each channel in the image, we assume that the only noticeable bleed-through is from a small number of adjacent channels.
By default, we consider only `1` adjacent channel on each side of the wavelength scale as contributors to bleed-through.

For example, for channel 3, we would consider channels 2 and 4 to contribute bleed-through components.

Use a higher value for `--channelOverlap` to have our model look for bleed-through from farther channels.
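The neighborhood rule described above can be sketched as a small helper (a hypothetical function name for illustration; the plugin's internals may differ):

```python
def bleedthrough_sources(channel: int, num_channels: int, overlap: int = 1) -> list:
    """Wavelength-adjacent channels considered as bleed-through contributors.

    Edge channels simply have fewer neighbors on the clipped side.
    """
    lo = max(0, channel - overlap)
    hi = min(num_channels - 1, channel + overlap)
    return [c for c in range(lo, hi + 1) if c != channel]

# For channel 3 of 5 channels with the default overlap of 1:
print(bleedthrough_sources(3, 5))  # [2, 4]
```

With `overlap=2`, channel 2 of 5 would instead draw from channels 0, 1, 3, and 4.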

### --kernelSize

We learn a convolutional kernel for estimating the bleed-through from each channel to each neighboring channel.
This parameter specifies the size of those kernels.

We recommend one of `3`, `5`, or `7` and use `3` as the default.

## TODOs:

1. Handle case where each image file contains all channels.
2. Extend to 3d images.

## Build the plugin

To build the Docker image for this plugin, run `./build-docker.sh`.

## Install WIPP Plugin

In WIPP, navigate to the plugins page and add a new plugin.
Paste the contents of `plugin.json` into the pop-up window and submit.

## Options

This plugin takes 7 input arguments and 1 output argument:

| Name | Description | I/O | Type | Default |
| ---------------------- | ---------------------------------------- | ------ | ------- | --------------- |
| `--inpDir` | Input image collection. | Input | String | N/A |
| `--filePattern` | File pattern to subset images. | Input | String | ".*" |
| `--groupBy` | Variables to group together. | Input | String | "" |
| `--channelOrdering` | Channel ordering by wavelength scale. | Input | String | "" |
| `--selectionCriterion` | Method to use for selecting tiles. | Input | Enum | "MeanIntensity" |
| `--channelOverlap` | Number of adjacent channels to consider. | Input | Integer | 1 |
| `--kernelSize` | Size of convolutional kernels to learn. | Input | Integer | 3 |
| `--outDir` | Output image collection. | Output | String | N/A |
1 change: 1 addition & 0 deletions regression/theia-bleedthrough-estimation-plugin/VERSION
@@ -0,0 +1 @@
0.5.0-dev1
