
System Setup on Debian

Jan Tomášek edited this page Oct 29, 2024 · 8 revisions

This page provides instructions for installing and setting up ARCLib on Debian-like systems (including Ubuntu). It is not an officially supported guide and should be considered complementary to the official System Setup. The instructions may be outdated and are provided without warranty, but they may still be helpful to administrators of Debian-like systems.

Tested on: Debian 10.7

Basic dependencies

Git

Git will be used to get necessary packages.

root@arclib:~# aptitude install git

Java

Java 17 and Maven are required. Note that the default JDK in Debian 10 is Java 11.

root@arclib:~# aptitude install default-jdk
root@arclib:~# aptitude install maven
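
Because the default JDK on Debian 10 is older than the required Java 17, it can help to check the active JVM before building. The following is a minimal sketch; the function names java_major and installed_java_version are hypothetical helpers and not part of ARCLib.

```shell
# Hypothetical helpers for checking the Java version before building.

# Print the major version for a Java version string.
# Handles both the legacy "1.8.0_292" scheme and the modern "17.0.9" scheme.
java_major() {
    case "$1" in
        1.*) printf '%s\n' "$1" | cut -d. -f2 ;;
        *)   printf '%s\n' "$1" | cut -d. -f1 ;;
    esac
}

# Print the version string reported by the java binary on the PATH.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Example check (run on the target machine):
#   [ "$(java_major "$(installed_java_version)")" -ge 17 ] || echo "Java 17 or newer is required" >&2
```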

Database

PostgreSQL is required; the default version (15) works properly.

root@arclib:~# aptitude install postgresql
root@arclib:~# systemctl start postgresql
root@arclib:~# systemctl enable postgresql

ClamAV

root@arclib:~# aptitude install clamav

DROID

DROID is not part of the official Debian repository, so it must be installed manually:

root@arclib:~# mkdir /opt/droid-6.5 && cd /opt/droid-6.5
root@arclib:~# wget https://github.com/digital-preservation/droid/releases/download/droid-6.5/droid-binary-6.5-bin.zip
root@arclib:~# unzip droid-binary-6.5-bin.zip
root@arclib:~# cd /opt && ln -s droid-6.5/ droid && ln -s /opt/droid/droid.sh /usr/local/bin/droid.sh && ln -s /opt/droid/droid.sh /usr/local/bin/droid

Solr

ARCLib requires Solr 9.7.0, which must be installed manually:

root@arclib:~# cd /opt
root@arclib:~# wget https://archive.apache.org/dist/solr/solr/9.7.0/solr-9.7.0.tgz
root@arclib:~# tar -xvf solr-9.7.0.tgz
root@arclib:~# adduser solr
root@arclib:~# chown -R solr:solr solr-9.7.0
root@arclib:~# ln -s solr-9.7.0 solr
root@arclib:~# usermod -d /opt/solr/server/solr solr

An initial setup of Solr is needed. Solr has to run in cloud mode, which can be done as follows (when prompted, accept the defaults):

root@arclib:~# su - solr
solr@arclib:~$ cd /opt/solr
solr@arclib:~$ ./bin/solr -e cloud

The Solr server is then started in cloud mode.

LDAP

Users (including the administrator) can be stored in LDAP or in the local database. If you choose LDAP and do not have an LDAP server, you need to install and set up a local one:

root@arclib:~# aptitude install slapd ldap-utils

During installation, Debian asks you to set up an admin user (including a password, referred to as ldappassword later in this guide); this user can also be used for testing purposes in ARCLib. You can inspect the database using slapcat. Creating users and similar tasks are beyond the scope of this guide.

Yarn, NodeJS

Neither Yarn nor NodeJS is in the standard repositories. Both are needed for the ARCLib GUI.

root@arclib:~# curl -sL https://deb.nodesource.com/setup_14.x | bash -
root@arclib:~# apt-get install -y nodejs
root@arclib:~# curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
root@arclib:~# echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
root@arclib:~# aptitude update
root@arclib:~# aptitude install yarn

Apache HTTP server

Apache HTTP server (or another HTTP server) is needed to serve the ARCLib GUI.

root@arclib:~# aptitude install apache2

Archival Storage

Follow the instructions in the next chapter.

Archival Storage

Archival Storage is a required "module" for ARCLib and serves as a kind of clever database backend for storing ingested packages. First, create the OS system user arclib; it will later be used for ARCLib as well. When creating it, accept the defaults (e.g. /home/arclib). See the official documentation.

root@arclib:~# adduser arclib
root@arclib:~# mkdir -p /opt/ais/archival-storage
root@arclib:~# mkdir -p /opt/ais/logs
root@arclib:~# mkdir -p /opt/ais/archival-storage/tmp
root@arclib:~# chown -R arclib:arclib /opt/ais
root@arclib:~# su - postgres
postgres@arclib:~$ psql postgres -c "CREATE USER arcstorage; ALTER USER arcstorage WITH PASSWORD 'arcstoragepassword'"
postgres@arclib:~$ createdb arcstorage -O arcstorage -E 'utf-8'
postgres@arclib:~$ exit
root@arclib:~# su - arclib
arclib@arclib:~$ git clone https://github.com/LIBCAS/ARCLib-Archival-Storage.git
arclib@arclib:~$ cd ARCLib-Archival-Storage
arclib@arclib:~$ mvn clean package -DskipTests
arclib@arclib:~$ cp target/archival-storage.jar /opt/ais/archival-storage
arclib@arclib:~$ cp src/main/resources/application.yml /opt/ais/archival-storage/

The /opt/ais/archival-storage/application.yml should be edited to set up basic properties. Check especially for the PostgreSQL settings (spring.datasource and liquibase).

The systemd unit for Archival Storage service:

[Unit]
Description=Archival Storage
After=syslog.target

[Service]
Type=simple
User=arclib
Group=arclib
WorkingDirectory=/opt/ais/archival-storage
ExecStart=/opt/ais/archival-storage/archival-storage.jar
SuccessExitStatus=143

[Install]
WantedBy=multi-user.target

Copy this to /etc/systemd/system/archival-storage.service, then run:

root@arclib:~# systemctl daemon-reload
root@arclib:~# systemctl enable archival-storage.service
root@arclib:~# systemctl start archival-storage.service

The users for communication with ARCLib must be created. A BCrypt generator can be used to encode the chosen password; let's set it to 'arcstorageuserpassword' (it will be written to the ARCLib config file later in this guide). The result may be $2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS. BCrypt internally generates a random salt while encoding a password and stores that salt along with the hashed password, so you will get a different encoded result for the same input every time.
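
One possible way to produce such a hash on the command line is sketched below. It assumes that htpasswd (from the apache2-utils package) is installed and that the $2a$ prefix shown in the example hash is the expected one; the function names bcrypt_hash and is_bcrypt_hash are hypothetical. Note that the password is passed as a command-line argument, so it is briefly visible in the process list.

```shell
# Hypothetical helpers; bcrypt_hash needs htpasswd from apache2-utils.

# Print a BCrypt hash (cost 12) of the password given as $1.
# htpasswd prints ":<hash>" for an empty user name and uses the $2y$ prefix,
# which is rewritten to $2a$ to match the example hash in this guide.
bcrypt_hash() {
    htpasswd -bnBC 12 "" "$1" | tr -d ':\n' | sed 's/^\$2y\$/$2a$/'
}

# Succeed if $1 has the shape of a BCrypt hash:
# prefix, two-digit cost, then 53 characters of salt and digest.
is_bcrypt_hash() {
    printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# Example (run on the target machine):
#   hash=$(bcrypt_hash 'arcstorageuserpassword')
#   is_bcrypt_hash "$hash" && printf '%s\n' "$hash"
```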

root@arclib:~# su - postgres
postgres@arclib:~$ psql arcstorage
INSERT INTO arcstorage_user VALUES ('1', NOW(), NOW(), NULL, 'arclib-read', '$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS', 'dataspace', 'ROLE_READ', 'root@localhost');
INSERT INTO arcstorage_user VALUES ('2', NOW(), NOW(), NULL, 'arclib-read-write', '$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS', 'dataspace', 'ROLE_READ_WRITE', 'root@localhost');
INSERT INTO arcstorage_user VALUES ('0', NOW(), NOW(), NULL, 'admin', '$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS', 'dataspace', 'ROLE_ADMIN', 'root@localhost');

The last step is to create at least one datastore; for testing purposes, use the local file system. Copy the JSON below to the file data_space.json:

{
	"id": "4fddaf00-43a9-485f-b81a-d3a4bcd6dd83",
	"name": "dataspace",
	"host": "localhost",
	"port": 0,
	"priority": 10,
	"storageType": "FS",
	"note": null,
	"config": "{\"rootDirPath\":\"/opt/storage/\"}",
	"reachable": true
}

then call:

arclib@arclib:~$ curl -H "Content-Type: application/json" -H "Authorization: Basic YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==" -d @data_space.json http://localhost:8081/api/administration/storage

The string YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA== is the Base64 encoding of 'admin:arcstorageuserpassword'.
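
The encoding can be reproduced and checked with a short round trip. This is a minimal sketch; basic_auth_token is a hypothetical helper name, and the credentials are the example values from this guide.

```shell
# Hypothetical helper: build the Basic auth token for a user name and password.
basic_auth_token() {
    # tr removes the line wrapping that base64 adds for long inputs
    printf '%s:%s' "$1" "$2" | base64 | tr -d '\n'
}

# Encode the admin credentials used in this guide, then decode them again.
token=$(basic_auth_token admin arcstorageuserpassword)
printf '%s\n' "$token"
printf '%s' "$token" | base64 -d
printf '\n'
```

The first line printed is the token used in the Authorization header above; the second is the decoded user:password pair.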

Finally create the storage dir with correct permissions:

root@arclib:~# mkdir /opt/storage && chown -R arclib:arclib /opt/storage

ARCLib (install and set up)

This part of the installation is described in the official guide; the instructions below might not be correct for the current version of ARCLib.

Database preparation

su - postgres
createuser arclib -d -P
createdb arclib -O arclib -E 'utf-8'
exit

Solr set up

main/resources/

Download config sets to /opt and create sets in Solr:

root@arclib:~# cd /opt
root@arclib:~# cp -va arclib-arclibXmlC-schema/ arclib-managed/ /opt/solr/server/solr/configsets/
root@arclib:~# export SOLR_BIN=/opt/solr/bin/solr
root@arclib:~# $SOLR_BIN delete -c arclibDomainC && $SOLR_BIN delete -c formatC && $SOLR_BIN delete -c ingestIssueC && $SOLR_BIN delete -c arclibXmlC
root@arclib:~# $SOLR_BIN create -c arclibDomainC -d arclib-managed && $SOLR_BIN create -c formatC -d arclib-managed && $SOLR_BIN create -c ingestIssueC -d arclib-managed && $SOLR_BIN create -c arclibXmlC -d arclib-arclibXmlC-schema
root@arclib:~# $SOLR_BIN zk -cmd upconfig -n arclibXmlC -d /opt/solr/server/solr/configsets/arclib-arclibXmlC-schema/conf -z localhost:9983 -collection arclibXmlC
root@arclib:~# curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=arclibXmlC'

ARCLib build and install

root@arclib:~# mkdir /opt/arclib
root@arclib:~# mkdir /opt/logs
root@arclib:~# chown arclib:arclib /opt/arclib /opt/logs
root@arclib:~# mkdir -p /opt/arclib/multipart_tmp && chown www-data:arclib /opt/arclib/multipart_tmp/ && chmod g+w /opt/arclib/multipart_tmp/
root@arclib:~# mkdir -p /opt/arclib/fileStorage/prod1 && chown -R www-data:arclib /opt/arclib/fileStorage && chmod -R g+w /opt/arclib/fileStorage
root@arclib:~# su - arclib
arclib@arclib:~$ echo "export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64" >> .bashrc
arclib@arclib:~$ source .bashrc
arclib@arclib:~$ git clone https://github.com/LIBCAS/ARCLib.git
arclib@arclib:~$ (cd ARCLib && mvn clean package -DskipTests)
arclib@arclib:~$ cp ARCLib/system/target/arclib.jar /opt/arclib
arclib@arclib:~$ cp ARCLib/system/src/main/resources/application.yml /opt/arclib/
arclib@arclib:~$ exit

Now edit the /opt/arclib/application.yml and set up necessary properties. Most important are:

spring:
  datasource:
    url: jdbc:postgresql://localhost:5432/arclib
    driver-class-name: org.postgresql.Driver
    name: mainPool
    username: arclib
    password: <thepasswordusedforthearclibuser> 
liquibase:
    changeLog: classpath:/dbchangelog.arclib.xml
    url: jdbc:postgresql://localhost:5432/arclib
    user: arclib
    password: <thepasswordusedforthearclibuser>
arclib:
  version: 1.0
  path:
    workspace: /opt/arclib/workspace
    quarantine: /opt/arclib/workspace/quarantine
    fileStorage: /opt/arclib/fileStorage
ldap:
    enabled: true
    server: ldap://localhost:389
    startTls: false
    bind.dn: cn=admin,...    
    bind.pwd: ldappassword

Connection to Archival storage - generate strings for basic auth:

echo -n "arclib-read:arcstorageuserpassword" | base64
echo -n "arclib-read-write:arcstorageuserpassword" | base64
echo -n "admin:arcstorageuserpassword" | base64

and copy them to

archivalStorage:
  api: http://localhost:8081/api      
  debugLocation: arcStorageData
  authorization.basic:
    read: YXJjbGliLXJlYWQ6YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==
    readWrite: YXJjbGliLXJlYWQtd3JpdGU6YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==
    admin: YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==

Set up the systemd unit: copy the unit below to /etc/systemd/system/arclib.service.

[Unit]
Description=ARCLib
Wants=network-online.target
After=network-online.target

[Service]
WorkingDirectory=/opt/arclib

User=arclib
Group=arclib

ExecStart=/usr/bin/java -Xms2g -Xmx6512m -jar /opt/arclib/arclib.jar

StandardOutput=journal
StandardError=inherit

# Disable timeout logic and wait until process is stopped
TimeoutStopSec=0

# SIGTERM signal is used to stop the Java process
KillSignal=SIGTERM

# Send the signal only to the JVM rather than its control group
KillMode=mixed

# Java process is never killed
SendSIGKILL=no

# When a JVM receives a SIGTERM signal it exits with code 143
SuccessExitStatus=143

[Install]
WantedBy=multi-user.target

Enable the ARCLib service:

root@arclib:~# systemctl daemon-reload
root@arclib:~# systemctl enable arclib.service

Now it is ready for a first run:

root@arclib:~# echo -e "\nspring.jpa.hibernate.ddl-auto: none\nspring.liquibase.enabled: false" >> /opt/arclib/application.yml
root@arclib:~# systemctl start arclib
root@arclib:~# sed -i 's/spring.jpa.hibernate.ddl-auto: none//' /opt/arclib/application.yml 
root@arclib:~# sed -i 's/spring.liquibase.enabled: false//' /opt/arclib/application.yml
root@arclib:~# systemctl restart arclib
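
The append-then-remove cycle above can also be sketched as two small functions that drop the temporary lines entirely instead of leaving empty lines behind. This is only an illustration: the function and variable names are hypothetical, and the two override lines are the ones used in this guide.

```shell
# Hypothetical helper names; the override lines are taken from this guide.
OVERRIDE_DDL='spring.jpa.hibernate.ddl-auto: none'
OVERRIDE_LIQUIBASE='spring.liquibase.enabled: false'

# Append the two temporary overrides to the file given as $1.
add_first_run_overrides() {
    printf '\n%s\n%s\n' "$OVERRIDE_DDL" "$OVERRIDE_LIQUIBASE" >> "$1"
}

# Remove the two overrides from the file given as $1.
# Only lines that match exactly are dropped; everything else is kept.
remove_first_run_overrides() {
    tmp=$(mktemp) || return 1
    # grep exits with 1 when nothing is left to print; that is not an error here
    grep -vxF -e "$OVERRIDE_DDL" -e "$OVERRIDE_LIQUIBASE" "$1" > "$tmp" || [ $? -eq 1 ]
    cat "$tmp" > "$1"   # overwrite in place to keep owner and permissions
    rm -f "$tmp"
}

# Example (run as root on the target machine):
#   add_first_run_overrides /opt/arclib/application.yml
#   remove_first_run_overrides /opt/arclib/application.yml
```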

ARCLib GUI

Build and install the frontend (to /opt/www):

root@arclib:~# aptitude install gnupg
root@arclib:~# curl -sL https://deb.nodesource.com/setup_14.x | bash -
root@arclib:~# aptitude install -y nodejs
root@arclib:~# curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
root@arclib:~# echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
root@arclib:~# aptitude update && aptitude install yarn
root@arclib:~# mkdir /opt/www && chown -R arclib /opt/www
root@arclib:~# su - arclib
arclib@arclib:~$ git clone https://github.com/LIBCAS/ARCLib-frontend.git
arclib@arclib:~$ cd ARCLib-frontend
arclib@arclib:~$ yarn
arclib@arclib:~$ yarn build-prod
arclib@arclib:~$ cp -va build/* /opt/www

Set up Apache server to serve the /opt/www location:

root@arclib:~# a2enmod proxy_http

Copy the configuration below to /etc/apache2/sites-available/000-default.conf:

<VirtualHost *:80>
	ServerName yourservername

	ServerAdmin youremail
	DocumentRoot /opt/www/

	ProxyPreserveHost On
	ProxyRequests off
	ProxyPass "/api" "http://127.0.0.1:8080/api"
	ProxyPassReverse "/api" "http://127.0.0.1:8080/api"

	<Directory /opt/www/>
		Require all granted
	</Directory>

	ErrorLog ${APACHE_LOG_DIR}/error.log
	CustomLog ${APACHE_LOG_DIR}/access.log combined

</VirtualHost>

# vim: syntax=apache ts=4 sw=4 sts=4 sr noet

Do not forget to replace yourservername and youremail with appropriate values.
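
If the configuration is kept as a template, the two placeholders can be filled in with sed. The sketch below is one way to do it; the function name fill_vhost_placeholders, the template file name, and the example values are hypothetical.

```shell
# Hypothetical helper: print a copy of the virtual host file with the two
# placeholders from this guide replaced.
#   $1: path to the configuration file
#   $2: server name
#   $3: administrator email
# The values must not contain "|", "&" or "\" because they are passed to sed.
fill_vhost_placeholders() {
    sed -e "s|yourservername|$2|g" -e "s|youremail|$3|g" "$1"
}

# Example (hypothetical values and file name, run as root):
#   fill_vhost_placeholders vhost.template arclib.example.org admin@example.org \
#       > /etc/apache2/sites-available/000-default.conf
```

After writing the file, apache2ctl configtest can be used to check the syntax before restarting Apache.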

Restart the Apache server and try to log in with credentials from LDAP (any user will do; for testing purposes you can use the admin user with the ldappassword you set during LDAP installation). On the first login, the initial user is created.

Alternatively, you can log in (or just test the login) from the command line:

arclib@arclib:~$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: */*' --header 'Authorization: Basic YWRtaW46bGRhcHBhc3N3b3Jk' 'http://localhost:8080/api/user/login'

You must assign admin role to the created user:

postgres@arclib:~$ ARCLIBADMIN=$(psql -A -t arclib -c 'select id from arclib_user');
postgres@arclib:~$ psql arclib -c "INSERT INTO arclib_assigned_user_role (arclib_user_id, arclib_role_id) VALUES('$ARCLIBADMIN', 'b7a43ad5-883f-4741-948b-08678fa38604');"
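
The query above returns every row of arclib_user, so it only yields a usable value while exactly one user exists. A small guard, sketched below, makes that assumption explicit; require_single_id is a hypothetical helper name.

```shell
# Hypothetical helper: read ids from stdin, print the id and succeed only if
# exactly one non-empty line was received.
require_single_id() {
    ids=$(grep -v '^[[:space:]]*$' || true)
    count=$(printf '%s\n' "$ids" | grep -c . || true)
    if [ "$count" -ne 1 ]; then
        echo "expected exactly one user id, found $count" >&2
        return 1
    fi
    printf '%s\n' "$ids"
}

# Example (run as the postgres user on the target machine):
#   ARCLIBADMIN=$(psql -A -t arclib -c 'select id from arclib_user' | require_single_id) || exit 1
```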

The installation process is finished. You can start to use ARCLib. See the sample ingest or learn about the ingest workflow.
