verdi storage backup #6069

Merged
merged 63 commits into from
Apr 18, 2024
63 commits
d9169c7
preliminary backup script and verdi profile dbdump command
eimrek Jul 5, 2023
10b85b7
convert the script to 'verdi storage backup' command
eimrek Aug 3, 2023
7c277d4
reorganize; separate utility functions to backup_utils
eimrek Sep 7, 2023
d7fa508
use backup utilities from disk_objectstore
eimrek Nov 2, 2023
078ad40
adapt to latest disk-objectstore PR161
eimrek Dec 7, 2023
b8ddf42
rm dbdump cli command
eimrek Jan 12, 2024
839aa1c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 12, 2024
28ea090
implement sphuber's review
eimrek Jan 12, 2024
32a7a39
config.json - only the backed up profile
eimrek Jan 12, 2024
10f25ed
add a minimal backup pytest
eimrek Jan 18, 2024
63bf76b
test psql_dos backup; raise NotImplementedError for other backends
eimrek Jan 23, 2024
db08d0c
docs: update backup instructions
eimrek Jan 24, 2024
0304201
temporarily install `disk-objectstore` from repo
sphuber Jan 24, 2024
78ed0da
docs: restore tui section
sphuber Jan 24, 2024
ec913e9
Update docs/source/howto/installation.rst
eimrek Jan 24, 2024
c4d52bf
Update src/aiida/cmdline/commands/cmd_storage.py
eimrek Jan 24, 2024
877827d
Update src/aiida/cmdline/commands/cmd_storage.py
eimrek Jan 24, 2024
07f8168
Update src/aiida/cmdline/commands/cmd_storage.py
eimrek Jan 24, 2024
de74427
Update tests/cmdline/commands/test_storage.py
eimrek Jan 24, 2024
980f74c
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Jan 24, 2024
c4c67ae
Update tests/storage/psql_dos/test_backend.py
eimrek Jan 24, 2024
78fa311
storage_backend backup non-abstract
eimrek Jan 24, 2024
c656feb
docs: add backup mention in FAQ
eimrek Jan 24, 2024
4a154ff
docs: update backup section
eimrek Jan 24, 2024
b30120a
adapt docs
eimrek Jan 24, 2024
c6834ce
config.json check at the start; doc changes
eimrek Jan 24, 2024
d6bd0b1
cli: pg-dump-exe comment
eimrek Jan 25, 2024
26203bd
add backup_utils.BackupManager to docs nitpick-exceptions
eimrek Jan 25, 2024
7028dbf
Update src/aiida/cmdline/commands/cmd_storage.py
eimrek Feb 7, 2024
6dd3915
backup: turn off compression for maintain to match default cli cmd
eimrek Feb 7, 2024
740ff42
correct pass exception message
eimrek Feb 7, 2024
f5501cc
fix live-backup created as a file
eimrek Feb 7, 2024
61cde3d
remove rsync and pg_dump arguments from CLI, check them in the backend
eimrek Feb 7, 2024
8700cbe
add logger.report commands to indicate different steps
eimrek Feb 7, 2024
1644007
Logging: Add the `disk_objectstore` logger to the config
sphuber Feb 22, 2024
966b90c
adapt to latest disk-objectstore
eimrek Feb 22, 2024
a63756f
rm CLI profile dbdump command that was added by mistake
eimrek Feb 23, 2024
6755b97
catch NotImplementedError
eimrek Feb 23, 2024
5cafc17
keep default None, which keeps all backups
eimrek Mar 1, 2024
52110d9
aiida-backup.json: checks profile match or empty dest
eimrek Mar 1, 2024
7ecfdc0
disk-objectstore dependency to 1.1 instead of master
eimrek Mar 14, 2024
3ddbaaa
make verbosity exception to affect disk_objectstore logger
eimrek Mar 14, 2024
4537751
fix pre-commit
eimrek Mar 14, 2024
2231f2b
Remove the backup of config.json
eimrek Mar 14, 2024
2424b20
manually update requirements files
eimrek Mar 15, 2024
1952eda
fix tui
eimrek Mar 15, 2024
35c6ef5
adapt psql_dos backup test
eimrek Mar 15, 2024
c8b26d9
test failure on non-empty backup destination
eimrek Mar 15, 2024
1eaf97d
Update tests/cmdline/commands/test_storage.py
eimrek Mar 15, 2024
48adfd6
add failure test on backup profile mismatch
eimrek Mar 18, 2024
517a53b
Update src/aiida/cmdline/commands/cmd_storage.py
eimrek Apr 1, 2024
59ad950
test for keep argument
eimrek Apr 1, 2024
98f8456
raise NotImplementedError for sqlite_dos
eimrek Apr 1, 2024
95ffe14
check is_backup_implemented before creating the folder
eimrek Apr 10, 2024
d4d5dee
Merge branch 'main' into backup-script
sphuber Apr 11, 2024
2428fc5
replace top-level aiida-backup.json with config.json
eimrek Apr 15, 2024
029c0c4
adapt docs slightly
eimrek Apr 15, 2024
b6aeb0d
docs: nitpick typing_extensions.Literal
eimrek Apr 15, 2024
8f4bc6e
Merge remote-tracking branch 'origin/main' into backup-script
sphuber Apr 18, 2024
573c8fa
Alternative solution to `is_backup_implemented`
sphuber Apr 17, 2024
7ed4313
Use settings variable for `config.json` literal
sphuber Apr 18, 2024
47430f9
Merge pull request #6355 from sphuber/fix/backup-script-alternative-i…
sphuber Apr 18, 2024
413a957
Merge branch 'main' into backup-script
sphuber Apr 18, 2024
6 changes: 6 additions & 0 deletions docs/source/howto/faq.rst
@@ -111,3 +111,9 @@ When the SSH key pair expires, AiiDA will fail to connect to the remote computer
This will cause all calculations submitted on that computer to pause.
To restart them, one needs to generate a new SSH key pair and play the paused processes using ``verdi process play --all``.
Typically, this is all one needs to do - AiiDA will re-establish the connection to the computer and will continue following the calculations.

How to back up AiiDA data?
=============================================================================

The most convenient way to back up an AiiDA profile is to use the ``verdi --profile <name> storage backup`` command.
For more information, see :ref:`how-to:installation:backup`.
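For scripted use (for example from a scheduled job), the command can be assembled and run from Python. The following is a minimal sketch, not part of this PR: the helper names `build_backup_command` and `run_backup` are hypothetical, and the only CLI features relied upon are the `--profile` option shown above and the `--keep` option of `verdi storage backup`.

```python
import subprocess
from typing import List, Optional


def build_backup_command(profile: str, dest: str, keep: Optional[int] = None) -> List[str]:
    """Return the argv list for ``verdi --profile <profile> storage backup <dest>``."""
    command = ['verdi', '--profile', profile, 'storage', 'backup']
    if keep is not None:
        # ``--keep`` limits the number of previous backups retained in the destination.
        command += ['--keep', str(keep)]
    command.append(dest)
    return command


def run_backup(profile: str, dest: str, keep: Optional[int] = None) -> None:
    """Run the backup; raises ``subprocess.CalledProcessError`` on a non-zero exit code."""
    subprocess.run(build_backup_command(profile, dest, keep), check=True)
```

Passing the command as a list avoids shell quoting problems with destinations that contain spaces.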
58 changes: 37 additions & 21 deletions docs/source/howto/installation.rst
@@ -547,20 +547,33 @@ See the :doc:`../reference/_changelog` for a list of breaking changes.

.. _how-to:installation:backup:

Backing up your installation
Backing up your data
============================

A full backup of an AiiDA instance and AiiDA managed data requires a backup of:
General information
-----------------------------------------

* the AiiDA configuration folder, which is named ``.aiida``.
The location of the folder is shown in the output of ``verdi status``.
This folder contains, among other things, the ``config.json`` configuration file and log files.
The most convenient way to back up the data of a single AiiDA profile is to use:

* the data stored for each profile.
Where the data is stored, depends on the storage backend used by each profile.
.. code:: bash

The panels below provide instructions for storage backends provided by ``aiida-core``.
To determine what storage backend a profile uses, call ``verdi profile show``.
$ verdi --profile <profile_name> storage backup /path/to/destination

This command automatically manages a subfolder structure of previous backups. New backups are created efficiently: files that are unchanged since the previous backup are hard-linked to it by ``rsync`` instead of being copied again.
The command backs up everything that's needed to restore the profile later:

* the AiiDA configuration file ``.aiida/config.json``, from which other profiles are removed (see ``verdi status`` for exact location);
* all the data of the backed up profile (which depends on the storage backend).

The exact procedure, and whether the command is implemented at all, depends on the storage backend.

.. note::
The ``verdi storage backup`` command is designed to be as safe as possible to use while AiiDA is running, meaning that it will most likely produce an uncorrupted backup even when data is being modified. However, the exact guarantees depend on the specific storage backend, so to err on the safe side, only perform a backup when the profile is not in use.
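To illustrate why hard-linking to the previous backup keeps repeated backups cheap, here is a minimal pure-Python sketch of the idea. It is not the implementation used by this PR, which delegates to ``rsync`` through the ``disk_objectstore`` backup utilities; the function name `hardlink_backup` and the file-by-file comparison are assumptions made for illustration only.

```python
import filecmp
import os
import shutil
from pathlib import Path
from typing import Optional


def hardlink_backup(source: Path, dest: Path, previous: Optional[Path] = None) -> None:
    """Copy ``source`` into ``dest``, hard-linking files that are unchanged in ``previous``."""
    for path in source.rglob('*'):
        relative = path.relative_to(source)
        target = dest / relative
        if path.is_dir():
            target.mkdir(parents=True, exist_ok=True)
            continue
        target.parent.mkdir(parents=True, exist_ok=True)
        old = previous / relative if previous is not None else None
        if old is not None and old.is_file() and filecmp.cmp(path, old, shallow=False):
            # Unchanged file: share the data on disk with the previous backup.
            os.link(old, target)
        else:
            # New or modified file: store a fresh copy.
            shutil.copy2(path, target)
```

Each backup folder looks like a complete copy, yet unchanged files occupy disk space only once. Deleting an old backup folder does not affect newer ones, because the data is freed only when the last hard link to it is removed.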

Storage backend specific information
-----------------------------------------

As an alternative to the CLI command, a backup can also be created manually. This requires a backup of the configuration file ``.aiida/config.json`` and of the storage backend. The panels below provide instructions for storage backends provided by ``aiida-core``. To determine what storage backend a profile uses, call ``verdi profile show``.

.. tip:: Before creating a backup, it is recommended to run ``verdi storage maintain``.
This will optimize the storage which can significantly reduce the time required to create the backup.
@@ -605,44 +618,47 @@ To determine what storage backend a profile uses, call ``verdi profile show``.

.. _how-to:installation:backup:restore:

Restoring your installation
===========================
Restoring data from a backup
==================================

Restoring a backed up AiiDA installation requires:
Restoring a backed up AiiDA profile requires:

* restoring the backed up ``.aiida`` folder, with at the very least the ``config.json`` file it contains.
It should be placed in the path defined by the ``AIIDA_PATH`` environment variable.
To test the restoration worked, run ``verdi profile list`` to verify that all profiles are displayed.
* restoring the profile information in the AiiDA ``config.json`` file. Simply copy the ``profiles`` entry from
the backed up ``config.json`` to the one of the running AiiDA instance (see ``verdi status`` for exact location).
Some information (e.g. the database parameters) might need to be updated.

* restoring the data of each backed up profile.
* restoring the data of the backed up profile according to the ``config.json`` entry.
Like the backup procedure, this is dependent on the storage backend used by the profile.

The panels below provide instructions for storage backends provided by ``aiida-core``.
To determine what storage backend a profile uses, call ``verdi profile show``.
To test if the restoration worked, run ``verdi -p <profile-name> status`` to verify that AiiDA can successfully connect to the data storage.

.. tab-set::

.. tab-item:: psql_dos

To fully backup the data stored for a profile using the ``core.psql_dos`` backend, you should restore the associated database and file repository.
To restore the backed up data for a profile using the ``core.psql_dos`` backend, you should restore the associated database and file repository.

**PostgreSQL database**

To restore the PostgreSQL database from the ``.psql`` file that was backed up, first you should create an empty database following the instructions described in :ref:`database <intro:install:database>` skipping the ``verdi setup`` phase.
To restore the PostgreSQL database from the ``db.psql`` file that was backed up, first you should create an empty database following the instructions described in :ref:`database <intro:install:database>` skipping the ``verdi setup`` phase.
The backed up data can then be imported by calling:

.. code-block:: console

psql -h <database_hostname> -p <database_port> -d <database_name> -W < aiida_backup.psql
psql -h <db_hostname> -p <db_port> -U <db_user> -d <db_name> -W < db.psql

where the parameters need to match the corresponding profile entry in the AiiDA ``config.json``.

**File repository**

To restore the file repository, simply copy the directory that was backed up to the location indicated by the ``storage.config.repository_uri`` key returned by the ``verdi profile show`` command.
To restore the file repository, simply copy the directory that was backed up to the location indicated in the AiiDA ``config.json`` (or by the ``storage.config.repository_uri`` key returned by the ``verdi profile show`` command).
Like the backing up process, we recommend using ``rsync`` for this:

.. code-block:: console

rsync -arvz /some/path/aiida_backup <storage.config.repository_uri>
rsync -arvz /path/to/backup/container <storage.config.repository_uri>
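The two restore steps for ``core.psql_dos`` can also be scripted. The sketch below only assembles the ``psql`` and ``rsync`` invocations shown above; the helper names are hypothetical, and `repository_path` is assumed to be the filesystem path that ``storage.config.repository_uri`` points to.

```python
import subprocess
from typing import List


def build_psql_restore_command(hostname: str, port: int, user: str, database: str) -> List[str]:
    """Return the argv list for ``psql``; the dump is supplied on standard input."""
    return ['psql', '-h', hostname, '-p', str(port), '-U', user, '-d', database, '-W']


def build_rsync_restore_command(backup_container: str, repository_path: str) -> List[str]:
    """Return the argv list that copies the backed up container to the repository location."""
    return ['rsync', '-arvz', backup_container, repository_path]


def restore_database(dump_file: str, hostname: str, port: int, user: str, database: str) -> None:
    """Feed the dump file (e.g. ``db.psql``) to ``psql``; ``-W`` makes it prompt for the password."""
    with open(dump_file, 'rb') as handle:
        subprocess.run(
            build_psql_restore_command(hostname, port, user, database),
            stdin=handle,
            check=True,
        )
```

The connection parameters should be taken from the restored profile entry, as described above.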


.. _how-to:installation:multi-user:
1 change: 1 addition & 0 deletions docs/source/nitpick-exceptions
@@ -150,6 +150,7 @@ py:class concurrent.futures._base.TimeoutError
py:class concurrent.futures._base.Future

py:class disk_objectstore.utils.LazyOpener
py:class disk_objectstore.backup_utils.BackupManager

py:class frozenset

1 change: 1 addition & 0 deletions docs/source/reference/command_line.rst
@@ -567,6 +567,7 @@ Below is a list with all available subcommands.
--help Show this message and exit.

Commands:
backup Backup the data storage of a profile.
info Summarise the contents of the storage.
integrity Checks for the integrity of the data storage.
maintain Performs maintenance tasks on the repository.
2 changes: 1 addition & 1 deletion environment.yml
@@ -12,7 +12,7 @@ dependencies:
- circus~=0.18.0
- click-spinner~=0.1.8
- click~=8.1
- disk-objectstore~=1.0
- disk-objectstore~=1.1
- docstring_parser
- get-annotations~=0.1
- python-graphviz~=0.19
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -24,7 +24,7 @@ dependencies = [
'circus~=0.18.0',
'click-spinner~=0.1.8',
'click~=8.1',
'disk-objectstore~=1.0',
'disk-objectstore~=1.1',
'docstring-parser',
'get-annotations~=0.1;python_version<"3.10"',
'graphviz~=0.19',
2 changes: 1 addition & 1 deletion requirements/requirements-py-3.10.txt
@@ -41,7 +41,7 @@ debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
deprecation==2.1.0
disk-objectstore==1.0.0
disk-objectstore==1.1.0
docstring-parser==0.15
docutils==0.20.1
emmet-core==0.57.1
2 changes: 1 addition & 1 deletion requirements/requirements-py-3.11.txt
@@ -41,7 +41,7 @@ debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
deprecation==2.1.0
disk-objectstore==1.0.0
disk-objectstore==1.1.0
docstring-parser==0.15
docutils==0.20.1
emmet-core==0.57.1
2 changes: 1 addition & 1 deletion requirements/requirements-py-3.12.txt
@@ -41,7 +41,7 @@ debugpy==1.8.0
decorator==5.1.1
defusedxml==0.7.1
deprecation==2.1.0
disk-objectstore==1.0.0
disk-objectstore==1.1.0
docstring-parser==0.15
docutils==0.20.1
executing==2.0.0
2 changes: 1 addition & 1 deletion requirements/requirements-py-3.9.txt
@@ -41,7 +41,7 @@ debugpy==1.6.7
decorator==5.1.1
defusedxml==0.7.1
deprecation==2.1.0
disk-objectstore==1.0.0
disk-objectstore==1.1.0
docstring-parser==0.15
docutils==0.20.1
emmet-core==0.57.1
47 changes: 47 additions & 0 deletions src/aiida/cmdline/commands/cmd_storage.py
@@ -165,3 +165,50 @@ def storage_maintain(ctx, full, no_repack, force, dry_run, compress):
    except LockingProfileError as exception:
        echo.echo_critical(str(exception))
    echo.echo_success('Requested maintenance procedures finished.')


@verdi_storage.command('backup')
@click.argument('dest', type=click.Path(file_okay=False), nargs=1)
@click.option(
    '--keep',
    type=int,
    required=False,
    help=(
        'Number of previous backups to keep in the destination, '
        'if the storage backend supports it. If not set, keeps all previous backups.'
    ),
)
@decorators.with_manager
@click.pass_context
def storage_backup(ctx, manager, dest: str, keep: int):
    """Backup the data storage of a profile.

    The backup is created in the destination `DEST`, in a subfolder that follows the naming convention
    backup_<timestamp>_<randstr> and a symlink called `last-backup` is pointed to it.

    Destination (DEST) can either be a local path, or a remote destination (reachable via ssh).
    In the latter case, the remote destination needs to have the following syntax:

        [<remote_user>@]<remote_host>:<path>

    i.e., contain the remote host name and the remote path, separated by a colon (and optionally the
    remote user separated by an @ symbol). You can tune SSH parameters using the standard options given
    by OpenSSH, such as adding configuration options to ~/.ssh/config (e.g. to allow for passwordless
    login - recommended, since this script might ask multiple times for the password).

    NOTE: 'rsync' and other UNIX-specific commands are called, thus the command will not work on
    non-UNIX environments. Which other executables are called depends on the storage backend.
    """

    storage = manager.get_profile_storage()
    profile = ctx.obj.profile
    try:
        storage.backup(dest, keep)
    except NotImplementedError:
        echo.echo_critical(
            f'Profile {profile.name} uses the storage plugin `{profile.storage_backend}` which does not implement a '
            'backup mechanism.'
        )
    except (ValueError, exceptions.StorageBackupError) as exception:
        echo.echo_critical(str(exception))
    echo.echo_success(f'Data storage of profile `{profile.name}` backed up to `{dest}`.')
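The `DEST` syntax described in the docstring, `[<remote_user>@]<remote_host>:<path>`, can be made concrete with a small parser. This is an illustrative sketch under stated assumptions, not the parsing used by this PR or by `disk_objectstore`: the names `Destination`, `REMOTE_PATTERN` and `parse_destination` are hypothetical, and anything that does not match the remote pattern is simply treated as a local path.

```python
import re
from typing import NamedTuple, Optional


class Destination(NamedTuple):
    user: Optional[str]  # remote user, if given
    host: Optional[str]  # remote host, ``None`` for a local path
    path: str


# [<remote_user>@]<remote_host>:<path>
REMOTE_PATTERN = re.compile(r'^(?:(?P<user>[^@:/]+)@)?(?P<host>[^@:/]+):(?P<path>.+)$')


def parse_destination(dest: str) -> Destination:
    """Split ``dest`` into user, host and path; local paths have no user or host."""
    match = REMOTE_PATTERN.match(dest)
    if match is None:
        return Destination(user=None, host=None, path=dest)
    return Destination(match['user'], match['host'], match['path'])
```

For example, `parse_destination('user@host:/backups')` yields the user `user`, the host `host` and the path `/backups`, while `/path/to/destination` is returned unchanged as a local path.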
5 changes: 5 additions & 0 deletions src/aiida/common/exceptions.py
@@ -48,6 +48,7 @@
'OutputParsingError',
'HashingError',
'StorageMigrationError',
'StorageBackupError',
'LockedProfileError',
'LockingProfileError',
'ClosedStorage',
@@ -218,6 +219,10 @@ class StorageMigrationError(DatabaseMigrationError):
"""Raised if a critical error is encountered during a storage migration."""


class StorageBackupError(AiidaException):
"""Raised if a critical error is encountered during a storage backup."""


class DbContentError(AiidaException):
"""Raised when the content of the DB is not valid.
This should never happen if the user does not play directly
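The new `StorageBackupError` is what the `verdi storage backup` command catches, next to `ValueError` and `NotImplementedError`. The sketch below mirrors that handling with stand-in classes so that it runs without AiiDA installed; the storage classes and the `backup_or_message` helper are hypothetical, and the real command reports through `echo.echo_critical` and `echo.echo_success` rather than returning strings.

```python
class StorageBackupError(Exception):
    """Stand-in for ``aiida.common.exceptions.StorageBackupError``."""


class UnsupportedStorage:
    """Hypothetical backend without a backup implementation."""

    def backup(self, dest, keep=None):
        raise NotImplementedError


class FailingStorage:
    """Hypothetical backend whose backup fails part-way."""

    def backup(self, dest, keep=None):
        raise StorageBackupError(f'could not write to `{dest}`')


class WorkingStorage:
    """Hypothetical backend whose backup succeeds."""

    def backup(self, dest, keep=None):
        return None


def backup_or_message(storage, dest, keep=None) -> str:
    """Run the backup and return the message the user would see."""
    try:
        storage.backup(dest, keep)
    except NotImplementedError:
        return 'storage plugin does not implement a backup mechanism'
    except (ValueError, StorageBackupError) as exception:
        return str(exception)
    return f'backed up to `{dest}`'
```

Raising a dedicated exception lets a backend signal an expected failure with a readable message, while unexpected errors still surface with a full traceback.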
7 changes: 6 additions & 1 deletion src/aiida/common/log.py
@@ -101,6 +101,11 @@ def get_logging_config():
'level': lambda: get_config_option('logging.verdi_loglevel'),
'propagate': False,
},
'disk_objectstore': {
'handlers': ['console'],
'level': lambda: get_config_option('logging.disk_objectstore_loglevel'),
'propagate': False,
},
'plumpy': {
'handlers': ['console'],
'level': lambda: get_config_option('logging.plumpy_loglevel'),
@@ -221,7 +226,7 @@ def configure_logging(with_orm=False, daemon=False, daemon_log_file=None):
# can still configure those manually beforehand through the config options.
if CLI_LOG_LEVEL is not None:
for name, logger in config['loggers'].items():
if name in ['aiida', 'verdi']:
if name in ['aiida', 'verdi', 'disk_objectstore']:
logger['level'] = CLI_LOG_LEVEL

# Add the `DbLogHandler` if `with_orm` is `True`
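The effect of adding `disk_objectstore` to the list of overridden loggers can be shown in isolation. This is a simplified sketch: `LOGGERS`, `OVERRIDDEN_BY_CLI` and `apply_cli_log_level` are hypothetical names, the `REPORT` level for the `aiida` logger is an assumption, and the real code resolves the levels lazily from the config options.

```python
import copy
from typing import Optional

# Simplified stand-in for the ``loggers`` section of the logging configuration.
LOGGERS = {
    'aiida': {'level': 'REPORT'},  # assumed default
    'verdi': {'level': 'REPORT'},
    'disk_objectstore': {'level': 'INFO'},
    'plumpy': {'level': 'WARNING'},
}

# Loggers whose level is replaced when a log level is set on the command line.
OVERRIDDEN_BY_CLI = ('aiida', 'verdi', 'disk_objectstore')


def apply_cli_log_level(loggers: dict, cli_log_level: Optional[str]) -> dict:
    """Return a copy of ``loggers`` with the CLI log level applied where it is allowed to."""
    result = copy.deepcopy(loggers)
    if cli_log_level is not None:
        for name, logger in result.items():
            if name in OVERRIDDEN_BY_CLI:
                logger['level'] = cli_log_level
    return result
```

With this change, a log level set on the command line also applies to messages emitted by `disk_objectstore` during a backup, while other loggers such as `plumpy` keep their configured level.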
3 changes: 3 additions & 0 deletions src/aiida/manage/configuration/config.py
@@ -80,6 +80,9 @@ class ProfileOptionsSchema(BaseModel, defer_build=True):
logging__verdi_loglevel: LogLevels = Field(
'REPORT', description='Minimum level to log to console when running a `verdi` command.'
)
logging__disk_objectstore_loglevel: LogLevels = Field(
'INFO', description='Minimum level to log to daemon log and the `DbLog` table for the `disk_objectstore` logger.'
)
logging__db_loglevel: LogLevels = Field('REPORT', description='Minimum level to log to the DbLog table.')
logging__plumpy_loglevel: LogLevels = Field(
'WARNING', description='Minimum level to log to daemon log and the `DbLog` table for the `plumpy` logger.'