## Deployment
You can test the DQAstats package without needing to install anything except Docker. To try out the package, follow these instructions:

- Make sure you have Docker installed.
- Clone the DQAstats repo:

  ```bash
  git clone https://gitlab.miracum.org/miracum/dqa/dqastats.git dqastats
  cd dqastats
  ```

- Run the containerized setup using

  ```bash
  docker-compose -f ./docker/docker-compose.yml up
  ```

- Go to `./docker/output/` and see the created report.
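
If you prefer to run this quick test non-interactively, the steps above condense to a short shell sketch (assuming the default compose file and that the results land in `./docker/output/`; the exact report file name depends on the configuration):

```bash
# Minimal sketch: run the bundled DQAstats demo analysis and check for the report.
docker-compose -f ./docker/docker-compose.yml up   # blocks until the analysis container exits
ls -l ./docker/output/                             # the created report(s) should appear here
```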
Similarly, you can test the DQAgui package without needing to install anything except Docker. To try it out, follow these instructions:

- Make sure you have Docker installed.
- Clone the DQAgui repo:

  ```bash
  git clone --depth 1 -b development --single-branch https://gitlab.miracum.org/miracum/dqa/dqagui.git dqagui
  cd dqagui/docker
  ```

- Run the containerized setup using

  ```bash
  docker-compose up -d
  ```

- Access the GUI under `localhost:3839`. For a quick intro into the GUI using the demo data, see the DQAgui Intro.
- To stop, run

  ```bash
  docker-compose down
  cd ../..
  ```
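
If the GUI does not show up right away, a quick check from the host can confirm that the container is running and serving the app (a minimal sketch, assuming the default port 3839 from the provided compose file):

```bash
# Minimal sketch: verify that the DQAgui container is up and reachable.
docker-compose ps                              # the dqagui service should be listed as "Up"
curl -sSI http://localhost:3839 | head -n 1    # expect an HTTP response line once the Shiny app is ready
```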
If you want to use your own docker-compose and .env file(s), you can simply pass them to the command:

```bash
docker-compose \
  -f docker-compose_miracum.yml \
  --env-file ../dqastats.env \
  up --build
```
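
What belongs into such a .env file depends on your metadata repository (MDR) and the data systems you connect; the only variables shown in this section are the two demo-data paths used in the debugging example below. As an illustration only (the container paths are placeholders, not taken from the DQAstats docs):

```bash
# Illustrative .env sketch -- placeholder paths, adapt to wherever your data
# is mounted inside the container:
EXAMPLECSV_SOURCE_PATH=/data/input/demo_data
EXAMPLECSV_TARGET_PATH=/data/input/demo_data
```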
These snippets might be helpful for debugging if something goes wrong:

```bash
## Open a console inside the container:
docker run -it ghcr.io/miracum/dqastats:latest //bin//bash

## Installed R packages are stored in:
## "/usr/local/lib/R/site-library" and
## "/usr/local/lib/R/library"
```

```r
## Run example data:
Sys.setenv("EXAMPLECSV_SOURCE_PATH" = system.file("demo_data", package = "DQAstats"))
Sys.setenv("EXAMPLECSV_TARGET_PATH" = system.file("demo_data", package = "DQAstats"))

tmp <- DQAstats::dqa(
  source_system_name = "exampleCSV_source",
  target_system_name = "exampleCSV_target",
  utils_path = "/usr/local/lib/R/site-library/DQAstats/demo_data/utilities",
  mdr_filename = "mdr_example_data.csv",
  output_dir = "/data/output",
  logfile_dir = "/data/logs"
)
```
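
When running the example above inside a throwaway container, the results written to `/data/output` and `/data/logs` are lost once the container is removed. One way to keep them (an assumption, not part of the official setup) is to bind-mount host folders onto those paths before opening the console:

```bash
# Minimal sketch: bind-mount host folders so the output and logs written by the
# R example above end up directly on the host (folder names are placeholders).
docker run -it \
  -v "$(pwd)/debug-output:/data/output" \
  -v "$(pwd)/debug-logs:/data/logs" \
  ghcr.io/miracum/dqastats:latest //bin//bash
```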
The manifest `./docker/dqastats-workflow.yaml` uses Argo Workflows to schedule the dockerized version of DQAstats to run a data quality (DQ) analysis on a regular basis.
- Install KinD (Kubernetes in Docker).
- Create a local cluster for testing:

  ```bash
  kind create cluster
  ```

- Install Argo Workflows:

  ```bash
  ## Add the HELM repo for Argo:
  helm repo add bitnami https://charts.bitnami.com/bitnami

  ## Install Argo Workflows with own presets:
  helm install argo-wf bitnami/argo-workflows \
    --set server.serviceAccount.name=argo-wf-san
  ```

- Follow the instructions in the console to obtain the Bearer token; these might be similar to the following:

  ```bash
  ## Note: If you changed the name `argo-wf` of the deployment
  ## in the `helm install ...` command above,
  ## you also need to change it here:
  SECRET=$(kubectl get sa argo-wf-san -o=jsonpath='{.secrets[0].name}')
  ARGO_TOKEN="Bearer $(kubectl get secret $SECRET -o=jsonpath='{.data.token}' | base64 --decode)"
  echo "$ARGO_TOKEN"
  ```

- Change the manifest `./docker/dqastats-workflow.yaml` to your needs or keep the current one for demo purposes.
- Send the secret and the workflow to the cluster:

  ```bash
  kubectl apply -f ./docker/dqastats-secret.yaml
  kubectl apply -f ./docker/dqastats-workflow.yaml
  ```
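
To see whether the workflow was actually accepted and is being executed, a few standard kubectl commands are usually enough (a sketch; the resource and pod names depend on what `./docker/dqastats-workflow.yaml` defines):

```bash
# Minimal sketch: inspect the submitted Argo resources and their pods.
kubectl get workflows,cronworkflows     # CRDs installed by the Argo Workflows chart above
kubectl get pods                        # each DQAstats run is executed in its own pod
kubectl logs <name-of-a-workflow-pod>   # inspect the DQ analysis output
```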
🎉 Big thanks to @christian.gulden / @chgl for all the Kubernetes support! The draft of this "How to ..." section is borrowed from him, originally from here: https://gitlab.miracum.org/miracum/charts/-/blob/master/README.md.