Plugin Update: Theia bleedthrough estimation (#451)
* feat: updated theia plugin to new standards

* build: updates with bfio, preadator, tensorflow, and filepattern

* build: bumped version 0.5.0-dev0 -> 0.5.0-dev1
nishaq503 authored Feb 5, 2024
1 parent d085265 commit 7709175
Showing 19 changed files with 1,505 additions and 0 deletions.
29 changes: 29 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/.bumpversion.cfg
@@ -0,0 +1,29 @@
[bumpversion]
current_version = 0.5.0-dev1
commit = False
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<dev>\d+))?
serialize =
{major}.{minor}.{patch}-{release}{dev}
{major}.{minor}.{patch}

[bumpversion:part:release]
optional_value = _
first_value = dev
values =
dev
_

[bumpversion:part:dev]

[bumpversion:file:pyproject.toml]
search = version = "{current_version}"
replace = version = "{new_version}"

[bumpversion:file:plugin.json]

[bumpversion:file:VERSION]

[bumpversion:file:README.md]

[bumpversion:file:src/polus/plugins/regression/theia_bleedthrough_estimation/__init__.py]
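The `parse` regex above controls how bumpversion splits a version string into parts. As a quick illustration (a sketch using Python's `re` directly, not bumpversion itself), the current version `0.5.0-dev1` decomposes as:

```python
import re

# The `parse` pattern from the bumpversion config above.
PARSE = r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\-(?P<release>[a-z]+)(?P<dev>\d+))?"

match = re.fullmatch(PARSE, "0.5.0-dev1")
print(match.groupdict())
# {'major': '0', 'minor': '5', 'patch': '0', 'release': 'dev', 'dev': '1'}
```

A plain release such as `0.5.0` matches the same pattern with the optional `release`/`dev` groups left as `None`, which is what lets the two `serialize` forms coexist.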
171 changes: 171 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/.dockerignore
@@ -0,0 +1,171 @@
################################################################################
# Local Files and Folders
################################################################################
/data
requirements.txt
**/__pycache__
**/*.so

################################################################################
# Python Template from github
################################################################################
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
.idea/
1 change: 1 addition & 0 deletions regression/theia-bleedthrough-estimation-plugin/.gitignore
@@ -0,0 +1 @@
/.vscode
53 changes: 53 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/Dockerfile
@@ -0,0 +1,53 @@
FROM tensorflow/tensorflow:2.12.0-gpu

# Output from `cat /etc/os-release` in the base container
#
# NAME="Ubuntu"
# VERSION="20.04.5 LTS (Focal Fossa)"
# ID=ubuntu
# ID_LIKE=debian
# PRETTY_NAME="Ubuntu 20.04.5 LTS"
# VERSION_ID="20.04"
# HOME_URL="https://www.ubuntu.com/"
# SUPPORT_URL="https://help.ubuntu.com/"
# BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
# PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
# VERSION_CODENAME=focal
# UBUNTU_CODENAME=focal

# Install Python 3.9
RUN apt update && \
apt install software-properties-common -y && \
add-apt-repository ppa:deadsnakes/ppa && \
apt install python3.9 python3.9-distutils curl -y && \
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py && \
python3.9 get-pip.py && \
apt autoremove -y && \
rm -rf /var/lib/apt/lists/*

# Symbolic link to python3.9
RUN ln -sf /usr/bin/python3.9 /usr/bin/python3

# environment variables defined in polusai/bfio
ENV EXEC_DIR="/opt/executables"
ENV DATA_DIR="/data"
ENV POLUS_LOG="INFO"
ENV POLUS_IMG_EXT=".ome.tif"
ENV POLUS_TAB_EXT=".csv"

RUN mkdir /.cache && chmod 777 /.cache

# Work directory defined in the base container
WORKDIR ${EXEC_DIR}

# Copy the python package
COPY pyproject.toml ${EXEC_DIR}
COPY VERSION ${EXEC_DIR}
COPY README.md ${EXEC_DIR}
COPY src ${EXEC_DIR}/src

# Install the python package
RUN pip3 install ${EXEC_DIR} --no-cache-dir

ENTRYPOINT ["python3", "-m", "polus.plugins.regression.theia_bleedthrough_estimation"]
CMD ["--help"]
90 changes: 90 additions & 0 deletions regression/theia-bleedthrough-estimation-plugin/README.md
@@ -0,0 +1,90 @@
# Theia Bleedthrough Estimation (v0.5.0-dev1)

This WIPP plugin estimates the bleed-through in a collection of 2d images.
It uses the Theia algorithm from [this repo](https://github.com/PolusAI/theia).

## File Patterns

This plugin uses [file-patterns](https://filepattern.readthedocs.io/en/latest/Examples.html#what-is-filepattern) to create subsets of an input collection.
In particular, a filename variable is enclosed in `{}`, and the number of characters the variable occupies in the filename is denoted by repeating the variable's name, one character per position.
For example, if all filenames follow the structure `prefix_tTTT.ome.tif`, where `TTT` indicates the time-point of capture of the image, then the file-pattern would be `prefix_t{ttt}.ome.tif`.
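The translation from a file-pattern to a concrete filename match can be sketched as follows. This is an illustrative Python version only; the real `filepattern` package is far more general:

```python
import re

def pattern_to_regex(pattern: str) -> str:
    """Translate a file-pattern like 'prefix_t{ttt}.ome.tif' into a regex.

    Illustrative sketch only; the `filepattern` package handles many more
    pattern features (variable reuse, non-numeric variables, etc.).
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern[i] == "{":
            j = pattern.index("}", i)
            var = pattern[i + 1 : j]
            # '{ttt}' -> exactly three digits, captured under the name 't'.
            out.append(f"(?P<{var[0]}>\\d{{{len(var)}}})")
            i = j + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)

regex = pattern_to_regex("prefix_t{ttt}.ome.tif")
match = re.fullmatch(regex, "prefix_t042.ome.tif")
print(match.group("t"))  # 042
```

Here `prefix_t{ttt}.ome.tif` becomes the regex `prefix_t(?P<t>\d{3})\.ome\.tif`, so `prefix_t042.ome.tif` matches with `t = 042` while `prefix_t42.ome.tif` (only two digits) does not.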

## Optional Parameters

### --groupBy

This parameter can be used to group images into subsets.
This plugin will apply bleed-through correction to each subset.
Each subset should contain all channels for one image/tile/FOV.
The images in each subset should all have the same size (in pixels) and the same number of dimensions.

If no `--groupBy` is specified, then the plugin will assume that all images in the input collection are part of the same subset.

### --selectionCriterion

Which method to use to rank and select tiles in images.
The available options are:

1. `"MeanIntensity"`: Select tiles with the highest mean pixel intensity. This is the default.
2. `"Entropy"`: Select tiles with the highest entropy.
3. `"MedianIntensity"`: Select tiles with the highest median pixel intensity.
4. `"IntensityRange"`: Select tiles with the largest difference in intensity of the brightest and dimmest pixels.

We rank-order all tiles based on one of these criteria and then select some of the best few tiles from each channel.
If the images are small enough, we select all tiles from each channel.
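The four ranking criteria above can be sketched with NumPy as follows. This is a sketch of the ranking idea only, with a hypothetical function name; the plugin's own scoring code may differ in detail:

```python
import numpy as np

def rank_tiles(tiles: list, criterion: str = "MeanIntensity") -> list:
    """Return tile indices, best first, under the named selection criterion."""
    if criterion == "MeanIntensity":
        scores = [float(t.mean()) for t in tiles]
    elif criterion == "MedianIntensity":
        scores = [float(np.median(t)) for t in tiles]
    elif criterion == "IntensityRange":
        scores = [float(t.max() - t.min()) for t in tiles]
    elif criterion == "Entropy":
        def entropy(t):
            # Shannon entropy of the tile's intensity histogram.
            hist, _ = np.histogram(t, bins=256)
            p = hist / hist.sum()
            p = p[p > 0]
            return float(-(p * np.log2(p)).sum())
        scores = [entropy(t) for t in tiles]
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    # Rank-order all tiles, highest score first.
    return sorted(range(len(tiles)), key=lambda i: scores[i], reverse=True)
```

For example, a flat zero tile, a flat bright tile, and a high-contrast gradient tile would be ranked differently by `"MeanIntensity"` (bright flat tile wins) than by `"IntensityRange"` or `"Entropy"` (gradient tile wins).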

### --channelOrdering

By default, we assume that the order of channel numbers is the same as the order, in increasing wavelength, of the emission filters for those channels.
If this is not the case, use this parameter to specify, as a string of comma-separated integers, the wavelength-order of the channels.

For example, if the channels are `0, 1, 2, 3, 4` and they correspond to wavelengths (of the emission filter) of `420nm, 350nm, 600nm, 510nm, 580nm`, then `--channelOrdering` should be `"1,0,3,4,2"`.

If this parameter is not specified, then we assume that the channel numbers are in increasing wavelength order.
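The ordering in the example above can be derived mechanically by sorting the channel indices by their emission wavelengths. A minimal sketch, assuming the wavelengths from the example:

```python
import numpy as np

# Emission-filter wavelengths (nm) for channels 0..4, from the example above.
wavelengths = [420, 350, 600, 510, 580]

# Channel indices sorted by increasing wavelength give the ordering string.
order = np.argsort(wavelengths)
channel_ordering = ",".join(str(c) for c in order)
print(channel_ordering)  # 1,0,3,4,2
```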

If you do not know the channel ordering, you can use the `--channelOverlap` parameter to specify a higher number of adjacent channels to consider as contributors to bleed-through.

### --channelOverlap

For each channel in the image, we assume that the only noticeable bleed-through is from a small number of adjacent channels.
By default, we consider only `1` adjacent channel on each side of the wavelength scale as contributors to bleed-through.

For example, for channel 3, we would consider channels 2 and 4 to contribute bleed-through components.

Use a higher value for `--channelOverlap` to have our model look for bleed-through from farther channels.
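The neighborhood rule described above can be sketched as a small helper (a hypothetical function name for illustration; the plugin's internals may differ):

```python
def bleedthrough_sources(channel: int, num_channels: int, overlap: int = 1) -> list:
    """Wavelength-adjacent channels considered as bleed-through contributors.

    Edge channels simply have fewer neighbors on the clipped side.
    """
    lo = max(0, channel - overlap)
    hi = min(num_channels - 1, channel + overlap)
    return [c for c in range(lo, hi + 1) if c != channel]

# For channel 3 of 5 channels with the default overlap of 1:
print(bleedthrough_sources(3, 5))  # [2, 4]
```

With `overlap=2`, channel 2 of 5 would instead draw from channels 0, 1, 3, and 4.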

### --kernelSize

We learn a convolutional kernel for estimating the bleed-through from each channel to each neighboring channel.
This parameter specifies the size of those kernels.

We recommend one of `3`, `5`, or `7` and use `3` as the default.

## TODOs:

1. Handle case where each image file contains all channels.
2. Extend to 3d images.

## Build the plugin

To build the Docker image for this plugin, run `./build-docker.sh`.

## Install WIPP Plugin

In WIPP, navigate to the plugins page and add a new plugin.
Paste the contents of `plugin.json` into the pop-up window and submit.

## Options

This plugin takes 7 input arguments and 1 output argument:

| Name | Description | I/O | Type | Default |
| ---------------------- | ---------------------------------------- | ------ | ------- | --------------- |
| `--inpDir` | Input image collection. | Input | String | N/A |
| `--filePattern` | File pattern to subset images. | Input | String | ".*" |
| `--groupBy` | Variables to group together. | Input | String | "" |
| `--channelOrdering` | Channel ordering by wavelength scale. | Input | String | "" |
| `--selectionCriterion` | Method to use for selecting tiles. | Input | Enum | "MeanIntensity" |
| `--channelOverlap` | Number of adjacent channels to consider. | Input | Integer | 1 |
| `--kernelSize` | Size of convolutional kernels to learn. | Input | Integer | 3 |
| `--outDir` | Output image collection. | Output | String | N/A |
1 change: 1 addition & 0 deletions regression/theia-bleedthrough-estimation-plugin/VERSION
@@ -0,0 +1 @@
0.5.0-dev1
