System Setup on Debian
This page provides instructions for installing and setting up ARCLib on Debian-like systems (including Ubuntu). It is not an officially supported guide and should be treated as a complement to the official System Setup. The instructions may be outdated and are provided without warranty, but they may still be helpful for administrators of Debian-like systems.
Tested on: Debian 10.7
Git will be used to get necessary packages.
root@arclib:~# aptitude install git
Java 17 and Maven are required. Note that Java 11 is the default in Debian 10, so check which version default-jdk actually installs (java -version).
root@arclib:~# aptitude install default-jdk
root@arclib:~# aptitude install maven
PostgreSQL is required; the default version (15) works properly.
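Because this guide requires Java 17 while Debian 10 defaults to Java 11, it can help to check the installed major version before building. The following is a minimal sketch, not part of the official setup; the helper name java_major is hypothetical, and it only parses the version string printed by java -version.

```shell
# Print the major version from a Java version string.
# Handles both the legacy "1.8.0_292" scheme and the modern "17.0.9" scheme.
java_major() {
    v=$1
    case $v in
        1.*) v=${v#1.} ;;   # legacy scheme: drop the leading "1."
    esac
    echo "${v%%[!0-9]*}"    # keep only the leading digits
}

# Usage: `java -version` prints to stderr, so redirect it before parsing.
if command -v java >/dev/null 2>&1; then
    version=$(java -version 2>&1 | sed -n 's/.* version "\([^"]*\)".*/\1/p')
    major=$(java_major "$version")
    if [ -n "$major" ] && [ "$major" -ge 17 ]; then
        echo "Java $version is new enough"
    else
        echo "Java '$version' is older than 17 or could not be parsed"
    fi
fi
```

If the check reports an older version, a newer JDK has to be installed by other means; which package provides it depends on the Debian release.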
root@arclib:~# aptitude install postgresql
root@arclib:~# systemctl start postgresql
root@arclib:~# systemctl enable postgresql
root@arclib:~# aptitude install clamav
DROID is not part of the official Debian repository, so it will be installed manually:
root@arclib:~# mkdir /opt/droid-6.5 && cd /opt/droid-6.5
root@arclib:~# wget https://github.com/digital-preservation/droid/releases/download/droid-6.5/droid-binary-6.5-bin.zip
root@arclib:~# unzip droid-binary-6.5-bin.zip
root@arclib:~# cd /opt && ln -s droid-6.5/ droid && ln -s /opt/droid/droid.sh /usr/local/bin/droid.sh && ln -s /opt/droid/droid.sh /usr/local/bin/droid
ARCLib requires Solr 9.7.0, which will be installed manually:
root@arclib:~# cd /opt
root@arclib:~# wget https://archive.apache.org/dist/solr/solr/9.7.0/solr-9.7.0.tgz
root@arclib:~# tar -xvf solr-9.7.0.tgz
root@arclib:~# adduser solr
root@arclib:~# chown -R solr:solr solr-9.7.0
root@arclib:~# ln -s solr-9.7.0 solr
root@arclib:~# usermod -d /opt/solr/server/solr solr
Solr needs an initial setup and has to run in cloud mode. This can be done as follows (when prompted, accept the defaults):
root@arclib:~# su - solr
solr@arclib:~$ cd /opt/solr
solr@arclib:~$ ./bin/solr -e cloud
The Solr server is then running in cloud mode.
Users (including the administrator) can be stored in LDAP or in the local database. If you choose LDAP and do not have an LDAP server, you need to install and set up a local one:
root@arclib:~# aptitude install slapd ldap-utils
Debian will ask you to set up an admin user (including its password, referred to as ldappassword in this guide, which will be needed later). This user can also be used for testing purposes in ARCLib. You can check the database using slapcat.
Creating users and similar tasks is beyond the scope of this guide.
Neither Yarn nor Node.js is in the standard repositories. Both are needed for the ARCLib GUI.
root@arclib:~# curl -sL https://deb.nodesource.com/setup_14.x | bash -
root@arclib:~# apt-get install -y nodejs
root@arclib:~# curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
root@arclib:~# echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
root@arclib:~# aptitude update
root@arclib:~# aptitude install yarn
The Apache HTTP server (or another HTTP server) is needed to run the ARCLib GUI.
root@arclib:~# aptitude install apache2
Follow the instructions in the next chapter.
Archival Storage is a required "module" for ARCLib and serves as a kind of smart database backend for storing ingested packages. First, the OS user arclib is created; it will later be used for ARCLib as well. When creating it, accept the defaults (e.g. /home/arclib). See the official documentation.
root@arclib:~# adduser arclib
root@arclib:~# mkdir -p /opt/ais/archival-storage
root@arclib:~# mkdir -p /opt/ais/logs
root@arclib:~# mkdir -p /opt/ais/archival-storage/tmp
root@arclib:~# chown -R arclib:arclib /opt/ais
root@arclib:~# su - postgres
postgres@arclib:~$ psql postgres -c "CREATE USER arcstorage; ALTER USER arcstorage WITH PASSWORD 'arcstoragepassword'"
postgres@arclib:~$ createdb arcstorage -O arcstorage -E 'utf-8'
postgres@arclib:~$ exit
root@arclib:~# su - arclib
arclib@arclib:~$ git clone https://github.com/LIBCAS/ARCLib-Archival-Storage.git
arclib@arclib:~$ cd ARCLib-Archival-Storage
arclib@arclib:~$ mvn clean package -DskipTests
arclib@arclib:~$ cp target/archival-storage.jar /opt/ais/archival-storage
arclib@arclib:~$ cp src/main/resources/application.yml /opt/ais/archival-storage/
The file /opt/ais/archival-storage/application.yml should be edited to set the basic properties. Check the PostgreSQL settings in particular (spring.datasource and liquibase).
The systemd unit for Archival Storage service:
[Unit]
Description=Archival Storage
After=syslog.target
[Service]
Type=simple
User=arclib
Group=arclib
WorkingDirectory=/opt/ais/archival-storage
ExecStart=/opt/ais/archival-storage/archival-storage.jar
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
Copy this to /etc/systemd/system/archival-storage.service, then run:
root@arclib:~# systemctl daemon-reload
root@arclib:~# systemctl enable archival-storage.service
root@arclib:~# systemctl start archival-storage.service
The users for communication with ARCLib must be created. A BCrypt generator can be used to encode the chosen password. Let's set it to 'arcstorageuserpassword' (it will be written to the ARCLib config file later in this guide); the result may be $2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS (BCrypt internally generates a random salt while encoding a password and stores that salt along with the hashed password, so you will get a different encoded result for the same input every time).
root@arclib:~# su - postgres
postgres@arclib:~$ psql arcstorage
INSERT INTO arcstorage_user VALUES ('1', NOW(), NOW(), NULL, 'arclib-read', '$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS', 'dataspace', 'ROLE_READ', 'root@localhost');
INSERT INTO arcstorage_user VALUES ('2', NOW(), NOW(), NULL, 'arclib-read-write', '$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS', 'dataspace', 'ROLE_READ_WRITE', 'root@localhost');
INSERT INTO arcstorage_user VALUES ('0', NOW(), NOW(), NULL, 'admin', '$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS', 'dataspace', 'ROLE_ADMIN', 'root@localhost');
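Since a mistyped or truncated hash only shows up later as a failed login, a quick shape check before running inserts like those above can save time. This is a minimal sketch under the assumption that the hash uses the common $2a$, $2b$ or $2y$ prefix; the function name is_bcrypt_hash is hypothetical, and the check validates the format only, not that the hash matches the password.

```shell
# Return success if the argument looks like a BCrypt hash:
# "$2a$", "$2b$" or "$2y$", a two-digit cost, then 53 characters of salt and hash.
is_bcrypt_hash() {
    printf '%s\n' "$1" | grep -Eq '^\$2[aby]\$[0-9]{2}\$[./A-Za-z0-9]{53}$'
}

# The example hash from this guide; single quotes keep the shell from expanding "$".
hash='$2a$12$iVcd2tjoDqagfeg45OWfluTSEVZoljql0Y20HQZIjlwqYH90k/VRS'
if is_bcrypt_hash "$hash"; then
    echo "hash looks well-formed"
else
    echo "hash is malformed" >&2
fi
```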
The last step is to create at least one datastore; for testing purposes, use the local file system. Copy the JSON below to the file data_space.json:
{
  "id": "4fddaf00-43a9-485f-b81a-d3a4bcd6dd83",
  "name": "dataspace",
  "host": "localhost",
  "port": 0,
  "priority": 10,
  "storageType": "FS",
  "note": null,
  "config": "{\"rootDirPath\":\"/opt/storage/\"}",
  "reachable": true
}
then call:
arclib@arclib:~$ curl -H "Content-Type: application/json" -H "Authorization: Basic YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==" -d @data_space.json http://localhost:8081/api/administration/storage
The string YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA== is the Base64 encoding of 'admin:arcstorageuserpassword'.
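To confirm which credentials a token carries, it can be decoded again. The sketch below assumes the base64 utility from coreutils; the helper name basic_auth_token is hypothetical.

```shell
# Build the Basic auth token for a user and a password.
# printf is used so that no trailing newline ends up in the encoded value.
basic_auth_token() {
    printf '%s:%s' "$1" "$2" | base64
}

# Encode the admin credentials used in this guide.
basic_auth_token admin arcstorageuserpassword

# Decode a token to check which credentials it carries.
printf '%s' 'YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==' | base64 -d
echo
```

If you chose a different password, the token in the curl call has to be regenerated in the same way.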
Finally create the storage dir with correct permissions:
root@arclib:~# mkdir /opt/storage && chown -R arclib:arclib /opt/storage
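The directory created here has to match the rootDirPath value, which sits inside an escaped JSON string in the config field of data_space.json. The following sketch reads that value back and checks that the directory exists. It assumes the single-line layout of the config field shown in this guide and is not a general JSON parser; the function names are hypothetical.

```shell
# Print the rootDirPath value found inside the escaped "config" string.
# Assumes the layout used in this guide; this is a sed match, not a JSON parser.
root_dir_path() {
    sed -n 's/.*\\"rootDirPath\\":\\"\([^\\"]*\)\\".*/\1/p' "$1"
}

# Succeed only if the configured path is non-empty and is an existing directory.
check_storage_dir() {
    dir=$(root_dir_path "$1")
    if [ -n "$dir" ] && [ -d "$dir" ]; then
        echo "storage directory $dir exists"
    else
        echo "storage directory '$dir' is missing" >&2
        return 1
    fi
}

# Usage: run in the directory that holds data_space.json.
if [ -f data_space.json ]; then
    check_storage_dir data_space.json || echo "fix the path or create the directory" >&2
fi
```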
This part of the installation is described in the official guide; the instructions below might not be correct for the current version of ARCLib.
root@arclib:~# su - postgres
postgres@arclib:~$ createuser arclib -d -P
postgres@arclib:~$ createdb arclib -O arclib -E 'utf-8'
postgres@arclib:~$ exit
main/resources/
Download config sets to /opt and create sets in Solr:
root@arclib:~# cd /opt
root@arclib:~# cp -va arclib-arclibXmlC-schema/ arclib-managed/ /opt/solr/server/solr/configsets/
root@arclib:~# export SOLR_BIN=/opt/solr/bin/solr
root@arclib:~# $SOLR_BIN delete -c arclibDomainC && $SOLR_BIN delete -c formatC && $SOLR_BIN delete -c ingestIssueC && $SOLR_BIN delete -c arclibXmlC
root@arclib:~# $SOLR_BIN create -c arclibDomainC -d arclib-managed && $SOLR_BIN create -c formatC -d arclib-managed && $SOLR_BIN create -c ingestIssueC -d arclib-managed && $SOLR_BIN create -c arclibXmlC -d arclib-arclibXmlC-schema
root@arclib:~# $SOLR_BIN zk -cmd upconfig -n arclibXmlC -d /opt/solr/server/solr/configsets/arclib-arclibXmlC-schema/conf -z localhost:9983 -collection arclibXmlC
root@arclib:~# curl 'http://localhost:8983/solr/admin/collections?action=RELOAD&name=arclibXmlC'
root@arclib:~# mkdir /opt/arclib
root@arclib:~# mkdir /opt/logs
root@arclib:~# chown arclib:arclib /opt/arclib /opt/logs
root@arclib:~# mkdir -p /opt/arclib/multipart_tmp && chown www-data:arclib /opt/arclib/multipart_tmp/ && chmod g+w /opt/arclib/multipart_tmp/
root@arclib:~# mkdir -p /opt/arclib/fileStorage/prod1 && chown -R www-data:arclib /opt/arclib/fileStorage && chmod -R g+w /opt/arclib/fileStorage
root@arclib:~# su - arclib
arclib@arclib:~$ echo "export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-amd64" >> .bashrc
arclib@arclib:~$ source .bashrc
arclib@arclib:~$ git clone https://github.com/LIBCAS/ARCLib.git
arclib@arclib:~$ cd ARCLib
arclib@arclib:~$ mvn clean package -DskipTests
arclib@arclib:~$ cd ..
arclib@arclib:~$ cp ARCLib/system/target/arclib.jar /opt/arclib
arclib@arclib:~$ cp ARCLib/system/src/main/resources/application.yml /opt/arclib/
arclib@arclib:~$ exit
Now edit /opt/arclib/application.yml and set the necessary properties. The most important ones are:
spring:
datasource:
url: jdbc:postgresql://localhost:5432/arclib
driver-class-name: org.postgresql.Driver
name: mainPool
username: arclib
password: <thepasswordusedforthearclibuser>
liquibase:
changeLog: classpath:/dbchangelog.arclib.xml
url: jdbc:postgresql://localhost:5432/arclib
user: arclib
password: <thepasswordusedforthearclibuser>
arclib:
version: 1.0
path:
workspace: /opt/arclib/workspace
quarantine: /opt/arclib/workspace/quarantine
fileStorage: /opt/arclib/fileStorage
ldap:
enabled: true
server: ldap://localhost:389
startTls: false
bind.dn: cn=admin,...
bind.pwd: ldappassword
For the connection to Archival Storage, generate the strings for basic authentication:
echo -n "arclib-read:arcstorageuserpassword" | base64
echo -n "arclib-read-write:arcstorageuserpassword" | base64
echo -n "admin:arcstorageuserpassword" | base64
and copy them into the configuration:
archivalStorage:
api: http://localhost:8081/api
debugLocation: arcStorageData
authorization.basic:
read: YXJjbGliLXJlYWQ6YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==
readWrite: YXJjbGliLXJlYWQtd3JpdGU6YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==
admin: YWRtaW46YXJjc3RvcmFnZXVzZXJwYXNzd29yZA==
Set up the systemd unit by copying the unit below to /etc/systemd/system/arclib.service.
[Unit]
Description=ARCLib
Wants=network-online.target
After=network-online.target
[Service]
WorkingDirectory=/opt/arclib
User=arclib
Group=arclib
ExecStart=/usr/bin/java -jar -Xms2g -Xmx6512m /opt/arclib/arclib.jar
StandardOutput=journal
StandardError=inherit
# Disable timeout logic and wait until process is stopped
TimeoutStopSec=0
# SIGTERM signal is used to stop the Java process
KillSignal=SIGTERM
# Send the signal only to the JVM rather than its control group
KillMode=mixed
# Java process is never killed
SendSIGKILL=no
# When a JVM receives a SIGTERM signal it exits with code 143
SuccessExitStatus=143
[Install]
WantedBy=multi-user.target
Enable the ARCLib service:
root@arclib:~# systemctl daemon-reload
root@arclib:~# systemctl enable arclib.service
The system is now ready for its first run:
root@arclib:~# echo -e "\nspring.jpa.hibernate.ddl-auto: none\nspring.liquibase.enabled: false" >> /opt/arclib/application.yml
root@arclib:~# systemctl start arclib
root@arclib:~# sed -i 's/spring.jpa.hibernate.ddl-auto: none//' /opt/arclib/application.yml
root@arclib:~# sed -i 's/spring.liquibase.enabled: false//' /opt/arclib/application.yml
root@arclib:~# systemctl restart arclib
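The append-and-remove step above can be tried on a scratch file before touching the real configuration. This is a minimal sketch with hypothetical function names. Unlike the sed commands above, which blank the two lines, it deletes them entirely; it assumes GNU sed for the -i option.

```shell
# Append the two temporary first-run overrides to a config file.
add_first_run_overrides() {
    printf '\nspring.jpa.hibernate.ddl-auto: none\nspring.liquibase.enabled: false\n' >> "$1"
}

# Delete the override lines again (whole lines, not just their content).
remove_first_run_overrides() {
    sed -i -e '/^spring\.jpa\.hibernate\.ddl-auto: none$/d' \
           -e '/^spring\.liquibase\.enabled: false$/d' "$1"
}

# Demonstration on a scratch file, so /opt/arclib/application.yml stays untouched.
cfg=$(mktemp)
printf '# existing settings\n' > "$cfg"
add_first_run_overrides "$cfg"
echo "with overrides:"
cat "$cfg"
remove_first_run_overrides "$cfg"
echo "after cleanup:"
cat "$cfg"
rm -f "$cfg"
```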
Build and install the frontend (to /opt/www):
root@arclib:~# aptitude install gnupg
root@arclib:~# curl -sL https://deb.nodesource.com/setup_14.x | bash -
root@arclib:~# aptitude install -y nodejs
root@arclib:~# curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
root@arclib:~# echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
root@arclib:~# aptitude update && aptitude install yarn
root@arclib:~# mkdir /opt/www && chown -R arclib /opt/www
root@arclib:~# su - arclib
arclib@arclib:~$ git clone https://github.com/LIBCAS/ARCLib-frontend.git
arclib@arclib:~$ cd ARCLib-frontend
arclib@arclib:~$ yarn
arclib@arclib:~$ yarn build-prod
arclib@arclib:~$ cd ..
arclib@arclib:~$ cp -va ARCLib-frontend/build/* /opt/www
Set up the Apache server to serve the /opt/www location:
root@arclib:~# a2enmod proxy_http
Copy the configuration below to /etc/apache2/sites-available/000-default.conf:
<VirtualHost *:80>
ServerName yourservername
ServerAdmin youremail
DocumentRoot /opt/www/
ProxyPreserveHost On
ProxyRequests off
ProxyPass "/api" "http://127.0.0.1:8080/api"
ProxyPassReverse "/api" "http://127.0.0.1:8080/api"
<Directory /opt/www/>
Require all granted
</Directory>
ErrorLog ${APACHE_LOG_DIR}/error.log
CustomLog ${APACHE_LOG_DIR}/access.log combined
</VirtualHost>
# vim: syntax=apache ts=4 sw=4 sts=4 sr noet
Do not forget to replace yourservername and youremail with appropriate values.
Restart the Apache server and try to log in with credentials from LDAP (as any user you want; for testing purposes you can use the admin user with the ldappassword you set during the LDAP installation). On the first login, the initial user is created.
Alternatively, you can log in (or just test the login) from the command line:
arclib@arclib:~$ curl -X POST --header 'Content-Type: application/json' --header 'Accept: */*' --header 'Authorization: Basic YWRtaW46bGRhcHBhc3N3b3Jk' 'http://localhost:8080/api/user/login'
You must assign the admin role to the created user:
postgres@arclib:~$ ARCLIBADMIN=$(psql -A -t arclib -c 'select id from arclib_user');
postgres@arclib:~$ psql arclib -c "INSERT INTO arclib_assigned_user_role (arclib_user_id, arclib_role_id) VALUES('$ARCLIBADMIN', 'b7a43ad5-883f-4741-948b-08678fa38604');"
The installation process is finished. You can start to use ARCLib. See the sample ingest or learn about the ingest workflow.