Tools: Docker
An infrastructure that containerizes programs to make them easy to distribute and run.
- https://www.docker.com/ Project site
- wikipedia: https://en.wikipedia.org/wiki/Docker_(software)
Wrapping complex programs and their environments for distribution. Currently aimed at:
- getpapers, to avoid installing Node
- ami and amidict, to distribute code, dictionaries, stylesheets etc.
- KNIME, to distribute precompiled workflows
- bundling resources (e.g. articles, derived material, etc.)
- Follow the instructions on https://www.docker.com/ (to be confirmed)
- We can set up Docker Hub Automated Builds to make it easy to save, distribute and install the container images.
- Any version or platform concerns?
- Any resource constraints? (e.g. memory)
- How to document and maintain?
- How to distribute gigabyte files?
Bundle:
- Node / or installation script?
- getpapers.js or installation script?
AMI was Dockerised for the forest-plot subset, but the code cannot be found at the moment.
See also https://github.com/petermr/ami3/blob/master/Dockerfile:
FROM maven:3-jdk-8 as builder
# Alternative: clone remote code > not used because Dockerfile lives in code repo
# get and build ami3 (integrating https://github.com/petermr/cephis and https://github.com/petermr/normami)
# WORKDIR /app
# RUN git clone --depth 1 https://github.com/petermr/ami3.git && \
WORKDIR /app/ami3
COPY src src
COPY pom.xml pom.xml
RUN mvn -Dmaven.test.skip=true install
# unstaged build with just this PATH adjustment works!
ENV PATH /app/ami3/target/appassembler/bin/:${PATH}
# remove unused .bat files
WORKDIR /app/ami3/target/appassembler/bin/
RUN rm *.bat
FROM openjdk:8
# would like to copy more specifically the jar and binary files here, if possible, see #2
COPY --from=builder /app/ami3/target/ /bin/ami3/
ENV PATH /bin/ami3/appassembler/bin/:${PATH}
# Add additional tools needed to handle PDF workflows and support Python tools:
RUN apt-get update && \
    apt-get -y install tesseract-ocr gocr default-jre python3 python3-pip && \
    rm -rf /var/lib/apt/lists/*
## And install Tika and GROBID:
RUN curl -k -o /opt/tika.jar https://www.mirrorservice.org/sites/ftp.apache.org/tika/tika-app-1.24.jar && \
    curl -k -L -o /opt/grobid-src-0.5.3.zip https://github.com/kermitt2/grobid/archive/0.5.3.zip && \
    curl -k -L -o /opt/grobid-core-0.5.3-onejar.jar https://github.com/kermitt2/grobid/releases/download/0.5.3/grobid-core-0.5.3-onejar.jar && \
    cd /opt && unzip -o grobid-src-0.5.3.zip && mkdir -p /opt/grobid-0.5.3/grobid-core/build/libs/ && mv /opt/grobid-core-0.5.3-onejar.jar /opt/grobid-0.5.3/grobid-core/build/libs/ && \
    mkdir -p /opt/grobid-0.5.3/grobid-home/tmp && chmod a+rwx /opt/grobid-0.5.3/grobid-home/tmp
CMD [ "/bin/bash" ]
# docker build --tag ami3 .
# docker run --rm -it norami-docker:0.0.1 ami-test a b c
to be added
In order to run getpapers from a docker container the following Dockerfile may be used:
(PMR: Please upload a copy to the openVirus site)
FROM node:slim
WORKDIR /usr/src/app
RUN npm install --global getpapers
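This may be run from a terminal using the following shell script:
#!/usr/bin/env bash
# Build the image from the Dockerfile in the current directory
docker build -t paper_getter .
# Run one query; the host folder ./results is mounted at /results in the container
docker run -it --rm --name get-papers \
  -v "$(pwd)/results":/results \
  paper_getter \
  getpapers -o /results --query 'c4 photosynthesis flaveria'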
Here the query is set on the docker run command line. The same command also sets the folder in which the results are stored, via the mounted volume. This allows queries to be run automatically if desired.
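Running queries automatically could look like the following sketch. It is not part of getpapers or of this project: the helper names slugify and run_query, the DRY_RUN switch and the per-query results folders are assumptions made for illustration. It assumes the paper_getter image has already been built, and it leaves out the -it flags because no terminal is attached in an unattended run (for example from cron).

```shell
#!/usr/bin/env bash
# Sketch only: slugify, run_query and DRY_RUN are illustrative names, not part of getpapers.
# Assumes the paper_getter image was already built with `docker build -t paper_getter .`.

# Turn a query into a folder-safe name (runs of non-alphanumerics become "_").
slugify() {
  printf '%s' "$1" | tr -cs 'A-Za-z0-9' '_' | sed 's/^_//; s/_$//'
}

# Run one getpapers query in a fresh container, writing to results/<slug> on the host.
# With DRY_RUN=1 the docker command is printed instead of executed.
run_query() {
  query=$1
  out="results/$(slugify "$query")"
  # Positional parameters are local to the function, so they can hold the command.
  set -- docker run --rm \
    -v "$(pwd)/$out:/results" \
    paper_getter \
    getpapers -o /results --query "$query"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    mkdir -p "$out" && "$@"
  fi
}
```

For example, `DRY_RUN=1 run_query 'c4 photosynthesis flaveria'` prints the command that would be run. A loop such as `while IFS= read -r q; do run_query "$q"; done < queries.txt` would run one container per line of a hypothetical queries.txt file.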
The first step is to clone the repository:
git clone https://github.com/petermr/openVirus.git
Then move into the freshly cloned repo and build the Docker image using the command:
docker build -t ami3 .
This might take a while, since it's quite a complicated image. Once it is built, we can start using the image.
To run the image, call docker in the following format:
docker run -it --rm --name test-ami3 \
  -v "$(pwd)/<folder>":/<folder> \
  ami3 \
  ami <ami_command>
For example, if we have written our downloaded files into an xml_results folder:
docker run -it --rm --name test-ami3 \
  -v "$(pwd)/../scraper/xml_results":/xml_results \
  ami3 \
  ami -p /xml_results pdfbox
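The run pattern above can be wrapped in a small shell function so that the mount point and the path given to ami stay in step. This is a sketch under assumptions: ami_in_docker and DRY_RUN are hypothetical names, the ami3 image is assumed to be built already, and the `ami -p <folder> <command>` form is taken from the example above.

```shell
#!/usr/bin/env bash
# Sketch only: ami_in_docker and DRY_RUN are illustrative names, not part of ami3.

# Usage: ami_in_docker <host_folder> <ami_command> [args...]
ami_in_docker() {
  # Resolve to an absolute path: a bind mount given to -v needs one.
  host_dir=$(cd "$1" && pwd) || return 1
  name=$(basename "$host_dir")
  shift
  # Mount the host folder at /<name> and point ami at the same path.
  set -- docker run --rm \
    -v "$host_dir:/$name" \
    ami3 \
    ami -p "/$name" "$@"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

With this in place, `ami_in_docker ../scraper/xml_results pdfbox` would mount that folder at /xml_results and run pdfbox on it. Setting DRY_RUN=1 prints the docker command instead of running it.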