Tools: Docker
An infrastructure that containerizes programs to make them easy to distribute and run.
- https://www.docker.com/ Project site
- wikipedia: https://en.wikipedia.org/wiki/Docker_(software)
Wrapping complex programs and their environments for distribution. Currently aimed at:
- `getpapers`, to avoid installing Node
- `ami` and `amidict`, to distribute code, dictionaries, stylesheets etc.
- KNIME, to distribute precompiled workflows
- bundling resources (e.g. articles, derived material, etc.)
- Follow instructions on: https://docs.docker.com/docker-for-windows/install/
Problems have been reported on Windows 10 Home, but not on Windows Pro. The following hardware prerequisites are required to run WSL 2 successfully on Windows 10 Home:
- 64 bit processor with Second Level Address Translation (SLAT)
- 4GB system RAM
- BIOS-level hardware virtualization support must be enabled in the BIOS settings.
If the system does not meet these requirements, the workaround is to install [Docker Toolbox](https://docs.docker.com/toolbox/toolbox_install_windows/), which may be a tad more complicated.
- Anugrah?
- We can set up Docker Hub Automated Builds to make it easy to save, distribute and install the container images.
- Any version or platform concerns?
- Any resource constraints? (e.g. memory)
- How to document and maintain?
- How to distribute gigabyte files?
Make sure the `hello-world` example of Docker works in your shell/command prompt.
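A quick way to make this check from a Bash-like shell is sketched below. The helper names `docker_cli_status` and `check_docker` are made up for illustration; `docker run hello-world` itself is Docker's standard smoke test, which pulls a tiny image and prints a greeting when the installation works.

```shell
#!/usr/bin/env bash

# Hypothetical helper: report whether the docker CLI is on PATH.
docker_cli_status() {
  if command -v docker >/dev/null 2>&1; then
    echo "found"
  else
    echo "missing"
  fi
}

# Hypothetical helper: run the smoke test only when the CLI is available.
check_docker() {
  if [ "$(docker_cli_status)" = "found" ]; then
    docker --version
    # --rm removes the stopped hello-world container afterwards
    docker run --rm hello-world
  else
    echo "docker not found on PATH; install Docker first" >&2
    return 1
  fi
}
```

After sourcing the file, call `check_docker`. If the CLI is found but the command still fails, the Docker daemon is probably not running.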
Primary instructions from: https://github.com/bauhuasbadguy/getpapers
- Create a file named `Dockerfile`, with no `.txt` extension, using Notepad++.
- Write the following commands in the `Dockerfile`:

```dockerfile
FROM node:slim
WORKDIR /usr/src/app
RUN npm install --global getpapers
```
- Now run the following command:

```shell
docker build -t paper_getter .
```
A SECURITY WARNING may be shown. It means that the getpapers JavaScript library is using outdated tools, but this is only a problem if you are using those tools for a public-facing server.
To download PMC articles from Europe PMC (EPMC), use the command

```shell
docker run -it --rm --name get-papers -v <path>/<Cproject>:/<Cproject> paper_getter getpapers -k <limit> -p -x -o /<Cproject> --query 'viral epidemics'
```

where
- `<path>` is the path of the directory containing the `Dockerfile`
- `<limit>` is the number of entries needed
- `<Cproject>` is the name of the directory in which the files are stored
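To see how the placeholders fit together, the sketch below fills them in and prints the resulting command instead of running it. The function name `getpapers_docker_cmd` and the example values (`$HOME/projects`, `viral_epidemics`, `20`) are assumptions chosen for illustration, not part of getpapers or Docker.

```shell
#!/usr/bin/env bash

# Example values only; replace them with your own.
host_dir="$HOME/projects"    # stands in for <path>
cproject="viral_epidemics"   # stands in for <Cproject>
limit=20                     # stands in for <limit>

# Hypothetical helper: print the docker command with the placeholders filled in.
# $1 = host directory, $2 = project folder name, $3 = limit, $4 = query
getpapers_docker_cmd() {
  printf 'docker run -it --rm --name get-papers -v %s/%s:/%s paper_getter getpapers -k %s -p -x -o /%s --query "%s"\n' \
    "$1" "$2" "$2" "$3" "$2" "$4"
}

getpapers_docker_cmd "$host_dir" "$cproject" "$limit" "viral epidemics"
```

Once the printed line looks right, copy it into the terminal. The project folder name appears three times because the same folder is mounted into the container and then used as the getpapers output directory.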
AMI was Dockerised for the `forest-plot` subset. The code cannot be found at the moment.
See also https://github.com/petermr/ami3/blob/master/Dockerfile.
```dockerfile
FROM maven:3-jdk-8 AS builder

# Alternative: clone remote code > not used because Dockerfile lives in code repo
# get and build ami3 (integrating https://github.com/petermr/cephis and https://github.com/petermr/normami)
# WORKDIR /app
# RUN git clone --depth 1 https://github.com/petermr/ami3.git && \

WORKDIR /app/ami3
COPY src src
COPY pom.xml pom.xml
RUN mvn -Dmaven.test.skip=true install

# unstaged build with just this PATH adjustment works!
ENV PATH=/app/ami3/target/appassembler/bin/:${PATH}

# remove unused .bat files
WORKDIR /app/ami3/target/appassembler/bin/
RUN rm *.bat

FROM openjdk:8

# would like to copy more specifically the jar and binary files here, if possible, see #2
COPY --from=builder /app/ami3/target/ /bin/ami3/
ENV PATH=/bin/ami3/appassembler/bin/:${PATH}

# Add additional tools needed to handle PDF workflows and support Python tools:
RUN apt-get update && \
    apt-get -y install tesseract-ocr gocr default-jre python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*

## And install Tika and GROBID:
RUN curl -k -o /opt/tika.jar https://www.mirrorservice.org/sites/ftp.apache.org/tika/tika-app-1.24.jar && \
    curl -k -L -o /opt/grobid-src-0.5.3.zip https://github.com/kermitt2/grobid/archive/0.5.3.zip && \
    curl -k -L -o /opt/grobid-core-0.5.3-onejar.jar https://github.com/kermitt2/grobid/releases/download/0.5.3/grobid-core-0.5.3-onejar.jar && \
    cd /opt && unzip -o grobid-src-0.5.3.zip && mkdir -p /opt/grobid-0.5.3/grobid-core/build/libs/ && mv /opt/grobid-core-0.5.3-onejar.jar /opt/grobid-0.5.3/grobid-core/build/libs/ && \
    mkdir -p /opt/grobid-0.5.3/grobid-home/tmp && chmod a+rwx /opt/grobid-0.5.3/grobid-home/tmp

CMD [ "/bin/bash" ]

# docker build --tag ami3 .
# docker run --rm -it norami-docker:0.0.1 ami-test a b c
```
to be added
To run getpapers from a Docker container, the following Dockerfile may be used:
(PMR: Please upload a copy to the openVirus site)

```dockerfile
FROM node:slim
WORKDIR /usr/src/app
RUN npm install --global getpapers
```
This may be built and run from a terminal using the following shell script:

```shell
#!/usr/bin/env bash

docker build -t paper_getter .

docker run -it --rm --name get-papers \
    -v $(pwd)/results:/results \
    paper_getter \
    getpapers -o /results --query 'c4 photosynthesis flaveria'
```
Here the query is set by `docker run`. The same command also sets the folder in which the results are stored, using the mounted volume. This allows queries to be run automatically if desired.
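One possible way to automate several queries is sketched below, assuming the `paper_getter` image has already been built. The helper names `slugify` and `run_query`, the `DRY_RUN` switch and the query list are all assumptions made for this example. The sketch drops `-it` and `--name`, because an unattended run (for example from cron) usually has no terminal attached and a fixed container name would clash if two runs overlapped.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default in this sketch) only prints the commands;
# set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Turn a query into a folder-friendly name by replacing every
# character that is not a letter or digit with an underscore.
slugify() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9' '_'
}

# Run (or print) one getpapers container for a single query,
# writing to its own subfolder of ./results.
run_query() {
  query="$1"
  slug="$(slugify "$query")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker run --rm -v $(pwd)/results:/results paper_getter getpapers -o /results/$slug --query '$query'"
  else
    mkdir -p "$(pwd)/results/$slug"
    docker run --rm -v "$(pwd)/results:/results" paper_getter \
      getpapers -o "/results/$slug" --query "$query"
  fi
}

# Hypothetical query list
for q in 'c4 photosynthesis flaveria' 'viral epidemics'; do
  run_query "$q"
done
```

Giving each query its own subfolder keeps the results of one search from overwriting those of another.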
The first step is to clone the ami repository:

```shell
git clone https://github.com/petermr/openVirus.git
```

Then move into the freshly cloned repo and build the Docker image using the command

```shell
docker build -t ami3 .
```

This might take a bit of time since it's quite a complicated image. Now we can start using the image.
To run the image, we run Docker using the following format:

```shell
docker run -it --rm --name test-ami3 \
    -v $(pwd)/<tree-folder>:/<tree-folder> \
    ami3 \
    ami <ami_command>
```

For example, if we have written our downloaded files into `$(pwd)/xml_results`:

```shell
docker run -it --rm --name test-ami3 \
    -v $(pwd)/xml_results:/xml_results \
    ami3 \
    ami -p ./xml_results pdfbox
```
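The general format can be wrapped in a small function, sketched below. The name `ami_docker` and the `DRY_RUN` switch are assumptions made for this example. The wrapper checks that the host folder exists first, because Docker's `-v` option normally creates a missing host folder as an empty directory, and `ami` would then run on nothing without complaint. It also passes an absolute path to `-p`, so the command does not depend on the container's working directory.

```shell
#!/usr/bin/env bash
# DRY_RUN=1 (the default in this sketch) only prints the command;
# set DRY_RUN=0 to execute it.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical wrapper: mount one host folder at /<folder-name> in the
# container and pass the remaining arguments to ami.
# $1 = absolute path of the host folder, remaining arguments = ami command
ami_docker() {
  host_dir="$1"
  shift
  if [ ! -d "$host_dir" ]; then
    echo "no such folder: $host_dir" >&2
    return 1
  fi
  name="$(basename "$host_dir")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "docker run -it --rm --name test-ami3 -v $host_dir:/$name ami3 ami -p /$name $*"
  else
    docker run -it --rm --name test-ami3 -v "$host_dir:/$name" ami3 ami -p "/$name" "$@"
  fi
}
```

With this in place, the example above becomes `ami_docker "$(pwd)/xml_results" pdfbox`.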
Running an ami search using Docker:

```shell
docker run -it --rm --name test-ami3 \
    -v $(pwd)/../scraper/xml_results:/xml_results \
    -v $(pwd)/output:/output \
    -v $(pwd)/logs:/logs \
    ami3 \
    ami -p xml_results/ search --dictionary country
```
Creating a dictionary using Docker. This will put the dictionary into the folder `target`.

```shell
docker run -it --rm --name test-ami3 \
    -v $(pwd)/../scraper/xml_results:/xml_results \
    -v $(pwd)/output:/output \
    -v $(pwd)/logs:/logs \
    -v $(pwd)/target:/target \
    ami3 \
    amidict --dictionary myterpenes --directory=target/dictionary/create --inputname miniterpenes \
    create --informat list --terms thymol menthol borneol junkolol --wikilinks wikidata wikipedia --outformats xml
```