Tools: Docker
petermr edited this page Jun 23, 2020 · 16 revisions
An infrastructure that containerizes programs to make them easy to distribute and run.
- Project site: https://www.docker.com/
- Wikipedia: https://en.wikipedia.org/wiki/Docker_(software)
Wrapping complex programs and their environments for distribution. Currently aimed at:
- getpapers, to avoid installing Node
- ami and amidict, to distribute code, dictionaries, stylesheets, etc.
- KNIME, to distribute precompiled workflows
- bundling resources (e.g. articles, derived material, etc.)
- Follow the instructions on https://www.docker.com/ (to be confirmed)
- We can set up Docker Hub Automated Builds to make it easy to save, distribute and install the container images.
- Any version or platform concerns?
- Any resource constraints? (e.g. memory)
- How to document and maintain?
- How to distribute gigabyte files?
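On the Docker Hub item above: Automated Builds would build and push the image on Docker Hub's side whenever the source repository changes. The manual equivalent can be sketched roughly as follows. The repository name `example/getpapers` is a placeholder chosen for illustration (no published image is implied), and the function names are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: REPO is a hypothetical Docker Hub repository, not a published image.
REPO="${REPO:-example/getpapers}"
TAG="${TAG:-latest}"

# Full image reference, e.g. example/getpapers:latest
image_ref() {
  printf '%s:%s\n' "$REPO" "$TAG"
}

# Maintainer side: tag a locally built image (first argument) and push it
# to Docker Hub. Requires a prior `docker login`.
publish_image() {
  docker tag "$1" "$(image_ref)" && docker push "$(image_ref)"
}

# User side: download the image; only Docker itself is needed on the machine.
install_image() {
  docker pull "$(image_ref)"
}
```

A maintainer would call `publish_image <local-tag>` after `docker build`, and a user would call `install_image`. Pinning `TAG` to a specific version rather than `latest` is one possible answer to the versioning question above.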
Bundle:
- Node, or an installation script?
- getpapers.js, or an installation script?
AMI was Dockerised for the forest-plot subset. The code cannot currently be found.
See also https://github.com/petermr/ami3/blob/master/Dockerfile.
FROM maven:3-jdk-8 as builder
# Alternative: clone the remote code; not used because this Dockerfile lives in the code repo
# get and build ami3 (integrating https://github.com/petermr/cephis and https://github.com/petermr/normami)
# WORKDIR /app
# RUN git clone --depth 1 https://github.com/petermr/ami3.git && \
WORKDIR /app/ami3
COPY src src
COPY pom.xml pom.xml
RUN mvn -Dmaven.test.skip=true install
# unstaged build with just this PATH adjustment works!
ENV PATH=/app/ami3/target/appassembler/bin/:${PATH}
# remove unused .bat files
WORKDIR /app/ami3/target/appassembler/bin/
RUN rm *.bat
FROM openjdk:8
# would like to copy more specifically the jar and binary files here, if possible, see #2
COPY --from=builder /app/ami3/target/ /bin/ami3/
ENV PATH=/bin/ami3/appassembler/bin/:${PATH}
# Add additional tools needed to handle PDF workflows and support Python tools:
RUN apt-get update && \
    apt-get -y install tesseract-ocr gocr default-jre python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*
## And install Tika and GROBID:
# Note: curl -k skips TLS certificate verification.
RUN curl -k -o /opt/tika.jar https://www.mirrorservice.org/sites/ftp.apache.org/tika/tika-app-1.24.jar && \
    curl -k -L -o /opt/grobid-src-0.5.3.zip https://github.com/kermitt2/grobid/archive/0.5.3.zip && \
    curl -k -L -o /opt/grobid-core-0.5.3-onejar.jar https://github.com/kermitt2/grobid/releases/download/0.5.3/grobid-core-0.5.3-onejar.jar && \
    cd /opt && unzip -o grobid-src-0.5.3.zip && \
    mkdir -p /opt/grobid-0.5.3/grobid-core/build/libs/ && \
    mv /opt/grobid-core-0.5.3-onejar.jar /opt/grobid-0.5.3/grobid-core/build/libs/ && \
    mkdir -p /opt/grobid-0.5.3/grobid-home/tmp && chmod a+rwx /opt/grobid-0.5.3/grobid-home/tmp
CMD [ "/bin/bash" ]
# Build: docker build --tag ami3 .
# Run (using the tag from the build step): docker run --rm -it ami3 ami-test a b c
to be added
To run getpapers from a Docker container, the following Dockerfile may be used:
(PMR: Please upload a copy to the openVirus site)
FROM node:slim
WORKDIR /usr/src/app
RUN npm install --global getpapers
The image may be built and run from a terminal using the following shell script:
#!/usr/bin/env bash
docker build -t paper_getter .
docker run -it --rm --name get-papers \
    -v "$(pwd)/results:/results" \
    paper_getter \
    getpapers -o /results --query 'c4 photosynthesis flaveria'
Here the query is set by the docker run command, which also sets the folder where the results are stored, via the mounted volume. This allows queries to be run automatically if desired.
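As a rough sketch of that automation, the functions below run one container per query and give each query its own subfolder under the mounted results folder. It assumes the image has already been built as `paper_getter`, as in the script above. The function names, the `DRY_RUN` switch and the `queries.txt` file mentioned afterwards are illustrative assumptions, not part of getpapers or Docker.

```shell
#!/usr/bin/env bash
# Sketch: one getpapers container run per query, each into its own subfolder.
# Assumes the image has already been built as `paper_getter`.
IMAGE="${IMAGE:-paper_getter}"
RESULTS_DIR="${RESULTS_DIR:-$(pwd)/results}"   # host folder mounted at /results

# Make a folder-safe name from a query: every non-alphanumeric character becomes "_".
slugify() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9' '_'
}

# Run a single query. With DRY_RUN=1 the command is only printed, not executed.
run_query() {
  q="$1"
  slug="$(slugify "$q")"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "docker run --rm -v ${RESULTS_DIR}:/results ${IMAGE} getpapers -o /results/${slug} --query '${q}'"
  else
    mkdir -p "${RESULTS_DIR}/${slug}"
    # No -it here: unattended runs (e.g. from cron) have no terminal to attach.
    docker run --rm -v "${RESULTS_DIR}:/results" "${IMAGE}" \
      getpapers -o "/results/${slug}" --query "${q}" < /dev/null
  fi
}

# Run every non-empty line of a text file as a separate query.
run_queries() {
  while IFS= read -r line; do
    if [ -n "$line" ]; then
      run_query "$line"
    fi
  done < "$1"
}
```

With a hypothetical `queries.txt` holding one query per line, `DRY_RUN=1 run_queries queries.txt` prints the commands that would be run, and calling `run_queries queries.txt` without `DRY_RUN` executes them.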