diff --git a/docs/source/howto/faq.rst b/docs/source/howto/faq.rst
index b7e9f029dc..c283f1cbe3 100644
--- a/docs/source/howto/faq.rst
+++ b/docs/source/howto/faq.rst
@@ -111,3 +111,9 @@ When the SSH key pair expires, AiiDA will fail to connect to the remote computer
 This will cause all calculations submitted on that computer to pause.
 To restart them, one needs to generate a new SSH key pair and play the paused processes using ``verdi process play --all``.
 Typically, this is all one needs to do - AiiDA will re-establish the connection to the computer and will continue following the calculations.
+
+How to back up AiiDA data?
+=============================================================================
+
+The most convenient way to back up an AiiDA profile is to use the ``verdi --profile <profile> storage backup`` command.
+For more information, see :ref:`how-to:installation:backup`.
diff --git a/docs/source/howto/installation.rst b/docs/source/howto/installation.rst
index 534582963a..a5dc6c00a1 100644
--- a/docs/source/howto/installation.rst
+++ b/docs/source/howto/installation.rst
@@ -547,20 +547,33 @@ See the :doc:`../reference/_changelog` for a list of breaking changes.
 
 .. _how-to:installation:backup:
 
-Backing up your installation
+Backing up your data
 ============================
 
-A full backup of an AiiDA instance and AiiDA managed data requires a backup of:
+General information
+-----------------------------------------
 
-* the AiiDA configuration folder, which is named ``.aiida``.
-  The location of the folder is shown in the output of ``verdi status``.
-  This folder contains, among other things, the ``config.json`` configuration file and log files.
+The most convenient way to back up the data of a single AiiDA profile is to use
 
-* the data stored for each profile.
-  Where the data is stored, depends on the storage backend used by each profile.
+.. code:: bash
 
-The panels below provide instructions for storage backends provided by ``aiida-core``.
-To determine what storage backend a profile uses, call ``verdi profile show``.
+
+    $ verdi --profile <profile> storage backup /path/to/destination
+
+This command automatically manages a subfolder structure of previous backups, and new backups are created efficiently (using the ``rsync`` hard-link functionality with respect to the previous backup).
+The command backs up everything that is needed to restore the profile later:
+
+* the AiiDA configuration file ``.aiida/config.json``, from which all other profiles are removed (see ``verdi status`` for its exact location);
+* all the data of the backed up profile (what this includes depends on the storage backend).
+
+The specific procedure performed by the command, and whether it is implemented at all, depends on the storage backend.
+
+.. note::
+    The ``verdi storage backup`` command is designed to be as safe as possible to use while AiiDA is running, meaning that it will most likely produce an uncorrupted backup even when data is being modified in the meantime. However, the exact guarantees depend on the specific storage backend, so to err on the safe side, only perform a backup when the profile is not in use.
+
+Storage backend specific information
+-----------------------------------------
+
+As an alternative to the CLI command, one can also create a backup manually. This requires a backup of the configuration file ``.aiida/config.json`` and of the storage backend. The panels below provide instructions for storage backends provided by ``aiida-core``. To determine what storage backend a profile uses, call ``verdi profile show``.
 
 .. tip:: Before creating a backup, it is recommended to run ``verdi storage maintain``.
    This will optimize the storage which can significantly reduce the time required to create the backup.
 
@@ -605,44 +618,47 @@ To determine what storage backend a profile uses, call ``verdi profile show``.
 
 .. _how-to:installation:backup:restore:
 
-Restoring your installation
-===========================
+Restoring data from a backup
+==================================
 
-Restoring a backed up AiiDA installation requires:
+Restoring a backed up AiiDA profile requires:
 
-* restoring the backed up ``.aiida`` folder, with at the very least the ``config.json`` file it contains.
-  It should be placed in the path defined by the ``AIIDA_PATH`` environment variable.
-  To test the restoration worked, run ``verdi profile list`` to verify that all profiles are displayed.
+* restoring the profile information in the AiiDA ``config.json`` file. Simply copy the ``profiles`` entry from
+  the backed up ``config.json`` to the one of the running AiiDA instance (see ``verdi status`` for its exact location).
+  Some information (e.g. the database parameters) might need to be updated.
 
-* restoring the data of each backed up profile.
+* restoring the data of the backed up profile according to the ``config.json`` entry.
 
 Like the backup procedure, this is dependent on the storage backend used by the profile.
 The panels below provide instructions for storage backends provided by ``aiida-core``.
 To determine what storage backend a profile uses, call ``verdi profile show``.
+To test if the restoration worked, run ``verdi -p <profile> status`` to verify that AiiDA can successfully connect to the data storage.
 
 .. tab-set::
 
     .. tab-item:: psql_dos
 
-        To fully backup the data stored for a profile using the ``core.psql_dos`` backend, you should restore the associated database and file repository.
+        To restore the backed up data for a profile using the ``core.psql_dos`` backend, you should restore the associated database and file repository.
 
        **PostgreSQL database**
 
-        To restore the PostgreSQL database from the ``.psql`` file that was backed up, first you should create an empty database following the instructions described in :ref:`database ` skipping the ``verdi setup`` phase.
+        To restore the PostgreSQL database from the ``db.psql`` file that was backed up, first you should create an empty database following the instructions described in :ref:`database ` skipping the ``verdi setup`` phase.
         The backed up data can then be imported by calling:
 
         .. code-block:: console
 
-            psql -h <hostname> -p <port> -d <dbname> -W < aiida_backup.psql
+            psql -h <hostname> -p <port> -U <username> -d <dbname> -W < db.psql
+
+        where the parameters need to match the corresponding AiiDA ``config.json`` profile entry.
 
        **File repository**
 
-        To restore the file repository, simply copy the directory that was backed up to the location indicated by the ``storage.config.repository_uri`` key returned by the ``verdi profile show`` command.
+        To restore the file repository, simply copy the directory that was backed up to the location indicated in the AiiDA ``config.json`` (or the ``storage.config.repository_uri`` key returned by the ``verdi profile show`` command).
         Like the backing up process, we recommend using ``rsync`` for this:
 
         .. code-block:: console
 
-            rsync -arvz /some/path/aiida_backup
+            rsync -arvz /path/to/backup/container
 
 .. _how-to:installation:multi-user:
diff --git a/docs/source/nitpick-exceptions b/docs/source/nitpick-exceptions
index db08f4f1c4..d3fdc420f0 100644
--- a/docs/source/nitpick-exceptions
+++ b/docs/source/nitpick-exceptions
@@ -150,6 +150,7 @@ py:class concurrent.futures._base.TimeoutError
 py:class concurrent.futures._base.Future
 
 py:class disk_objectstore.utils.LazyOpener
+py:class disk_objectstore.backup_utils.BackupManager
 
 py:class frozenset
diff --git a/docs/source/reference/command_line.rst b/docs/source/reference/command_line.rst
index 2f9a10e12e..d15c3b3ce4 100644
--- a/docs/source/reference/command_line.rst
+++ b/docs/source/reference/command_line.rst
@@ -567,6 +567,7 @@ Below is a list with all available subcommands.
       --help  Show this message and exit.
 
     Commands:
+      backup     Backup the data storage of a profile.
       info       Summarise the contents of the storage.
       integrity  Checks for the integrity of the data storage.
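Stepping outside the diff for a moment: the restore step documented in the ``installation.rst`` hunk above (copying the ``profiles`` entry from the backed up ``config.json`` into the running instance's ``config.json``) can be sketched in Python. The helper ``merge_backup_profile`` below is hypothetical and not part of the changeset; it only illustrates the manual procedure the docs describe:

```python
import json
from pathlib import Path


def merge_backup_profile(backup_config: Path, live_config: Path) -> str:
    """Copy the single profile entry from a backed-up config.json into a live config.json.

    Hypothetical sketch of the documented manual restore step; aiida-core does
    not ship this function.
    """
    backup = json.loads(backup_config.read_text())
    live = json.loads(live_config.read_text())

    # A backup folder created by `verdi storage backup` contains exactly one profile.
    (name, profile), = backup['profiles'].items()

    # Insert it into the running instance's configuration. Parameters such as
    # the database connection details may still need manual adjustment.
    live.setdefault('profiles', {})[name] = profile
    live_config.write_text(json.dumps(live, indent=4))
    return name
```

After this merge, ``verdi -p <profile> status`` would be the natural check that the restored profile is reachable.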
maintain Performs maintenance tasks on the repository. diff --git a/environment.yml b/environment.yml index 559aebcae9..ba2bff4c93 100644 --- a/environment.yml +++ b/environment.yml @@ -12,7 +12,7 @@ dependencies: - circus~=0.18.0 - click-spinner~=0.1.8 - click~=8.1 -- disk-objectstore~=1.0 +- disk-objectstore~=1.1 - docstring_parser - get-annotations~=0.1 - python-graphviz~=0.19 diff --git a/pyproject.toml b/pyproject.toml index 5669c7bc44..4d8543b6e2 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -24,7 +24,7 @@ dependencies = [ 'circus~=0.18.0', 'click-spinner~=0.1.8', 'click~=8.1', - 'disk-objectstore~=1.0', + 'disk-objectstore~=1.1', 'docstring-parser', 'get-annotations~=0.1;python_version<"3.10"', 'graphviz~=0.19', diff --git a/requirements/requirements-py-3.10.txt b/requirements/requirements-py-3.10.txt index e9c9e0a079..bc28d7eb39 100644 --- a/requirements/requirements-py-3.10.txt +++ b/requirements/requirements-py-3.10.txt @@ -41,7 +41,7 @@ debugpy==1.6.7 decorator==5.1.1 defusedxml==0.7.1 deprecation==2.1.0 -disk-objectstore==1.0.0 +disk-objectstore==1.1.0 docstring-parser==0.15 docutils==0.20.1 emmet-core==0.57.1 diff --git a/requirements/requirements-py-3.11.txt b/requirements/requirements-py-3.11.txt index 183e5181b9..dfa35673a8 100644 --- a/requirements/requirements-py-3.11.txt +++ b/requirements/requirements-py-3.11.txt @@ -41,7 +41,7 @@ debugpy==1.6.7 decorator==5.1.1 defusedxml==0.7.1 deprecation==2.1.0 -disk-objectstore==1.0.0 +disk-objectstore==1.1.0 docstring-parser==0.15 docutils==0.20.1 emmet-core==0.57.1 diff --git a/requirements/requirements-py-3.12.txt b/requirements/requirements-py-3.12.txt index 71786d7003..86d44d4c36 100644 --- a/requirements/requirements-py-3.12.txt +++ b/requirements/requirements-py-3.12.txt @@ -41,7 +41,7 @@ debugpy==1.8.0 decorator==5.1.1 defusedxml==0.7.1 deprecation==2.1.0 -disk-objectstore==1.0.0 +disk-objectstore==1.1.0 docstring-parser==0.15 docutils==0.20.1 executing==2.0.0 diff --git 
a/requirements/requirements-py-3.9.txt b/requirements/requirements-py-3.9.txt index 214a70f8d9..d59b8e2f1d 100644 --- a/requirements/requirements-py-3.9.txt +++ b/requirements/requirements-py-3.9.txt @@ -41,7 +41,7 @@ debugpy==1.6.7 decorator==5.1.1 defusedxml==0.7.1 deprecation==2.1.0 -disk-objectstore==1.0.0 +disk-objectstore==1.1.0 docstring-parser==0.15 docutils==0.20.1 emmet-core==0.57.1 diff --git a/src/aiida/cmdline/commands/cmd_storage.py b/src/aiida/cmdline/commands/cmd_storage.py index c86227e8ba..5382ff455f 100644 --- a/src/aiida/cmdline/commands/cmd_storage.py +++ b/src/aiida/cmdline/commands/cmd_storage.py @@ -165,3 +165,50 @@ def storage_maintain(ctx, full, no_repack, force, dry_run, compress): except LockingProfileError as exception: echo.echo_critical(str(exception)) echo.echo_success('Requested maintenance procedures finished.') + + +@verdi_storage.command('backup') +@click.argument('dest', type=click.Path(file_okay=False), nargs=1) +@click.option( + '--keep', + type=int, + required=False, + help=( + 'Number of previous backups to keep in the destination, ' + 'if the storage backend supports it. If not set, keeps all previous backups.' + ), +) +@decorators.with_manager +@click.pass_context +def storage_backup(ctx, manager, dest: str, keep: int): + """Backup the data storage of a profile. + + The backup is created in the destination `DEST`, in a subfolder that follows the naming convention + backup_<timestamp>_<randstr> and a symlink called `last-backup` is pointed to it. + + Destination (DEST) can either be a local path, or a remote destination (reachable via ssh). + In the latter case, remote destination needs to have the following syntax: + + [<remote_user>@]<remote_host>:<path> + + i.e., contain the remote host name and the remote path, separated by a colon (and optionally the + remote user separated by an @ symbol). You can tune SSH parameters using the standard options given + by OpenSSH, such as adding configuration options to ~/.ssh/config (e.g.
to allow for passwordless + login - recommended, since this script might ask multiple times for the password). + + NOTE: 'rsync' and other UNIX-specific commands are called, thus the command will not work on + non-UNIX environments. Which other executables are called depends on the storage backend. + """ + + storage = manager.get_profile_storage() + profile = ctx.obj.profile + try: + storage.backup(dest, keep) + except NotImplementedError: + echo.echo_critical( + f'Profile {profile.name} uses the storage plugin `{profile.storage_backend}` which does not implement a ' + 'backup mechanism.' + ) + except (ValueError, exceptions.StorageBackupError) as exception: + echo.echo_critical(str(exception)) + echo.echo_success(f'Data storage of profile `{profile.name}` backed up to `{dest}`.') diff --git a/src/aiida/common/exceptions.py b/src/aiida/common/exceptions.py index c1250b076b..6fdd1c2620 100644 --- a/src/aiida/common/exceptions.py +++ b/src/aiida/common/exceptions.py @@ -48,6 +48,7 @@ 'OutputParsingError', 'HashingError', 'StorageMigrationError', + 'StorageBackupError', 'LockedProfileError', 'LockingProfileError', 'ClosedStorage', @@ -218,6 +219,10 @@ class StorageMigrationError(DatabaseMigrationError): """Raised if a critical error is encountered during a storage migration.""" +class StorageBackupError(AiidaException): + """Raised if a critical error is encountered during a storage backup.""" + + class DbContentError(AiidaException): """Raised when the content of the DB is not valid.
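The `[<remote_user>@]<remote_host>:<path>` destination syntax described in the ``storage_backup`` help text above is actually handled by ``disk_objectstore.backup_utils``; the function below is only a hypothetical sketch of how such a destination string could be split, not the library's parser:

```python
import re


def split_backup_dest(dest: str) -> dict:
    """Split a backup destination into optional remote user/host and a path.

    Accepts a plain local path or '[user@]host:path', mirroring the syntax in
    the `verdi storage backup` help text. Hypothetical sketch only.
    """
    # The optional '[user@]host:' prefix is tried first; if it cannot match
    # (no colon present), the whole string is treated as a local path.
    match = re.fullmatch(r'(?:(?:(?P<user>[^@:]+)@)?(?P<host>[^@:]+):)?(?P<path>.+)', dest)
    if match is None:
        raise ValueError(f'Invalid destination: {dest!r}')
    return match.groupdict()
```

In this sketch a plain local path yields ``host=None``, which is how a caller could distinguish local destinations from SSH ones.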
This should never happen if the user does not play directly diff --git a/src/aiida/common/log.py b/src/aiida/common/log.py index 3948ca5a5e..fc0f860aef 100644 --- a/src/aiida/common/log.py +++ b/src/aiida/common/log.py @@ -101,6 +101,11 @@ def get_logging_config(): 'level': lambda: get_config_option('logging.verdi_loglevel'), 'propagate': False, }, + 'disk_objectstore': { + 'handlers': ['console'], + 'level': lambda: get_config_option('logging.disk_objectstore_loglevel'), + 'propagate': False, + }, 'plumpy': { 'handlers': ['console'], 'level': lambda: get_config_option('logging.plumpy_loglevel'), @@ -221,7 +226,7 @@ def configure_logging(with_orm=False, daemon=False, daemon_log_file=None): # can still configure those manually beforehand through the config options. if CLI_LOG_LEVEL is not None: for name, logger in config['loggers'].items(): - if name in ['aiida', 'verdi']: + if name in ['aiida', 'verdi', 'disk_objectstore']: logger['level'] = CLI_LOG_LEVEL # Add the `DbLogHandler` if `with_orm` is `True` diff --git a/src/aiida/manage/configuration/config.py b/src/aiida/manage/configuration/config.py index 967e6ea4c3..4b1f032271 100644 --- a/src/aiida/manage/configuration/config.py +++ b/src/aiida/manage/configuration/config.py @@ -80,6 +80,9 @@ class ProfileOptionsSchema(BaseModel, defer_build=True): logging__verdi_loglevel: LogLevels = Field( 'REPORT', description='Minimum level to log to console when running a `verdi` command.' ) + logging__disk_objectstore_loglevel: LogLevels = Field( + 'INFO', description='Minimum level to log to daemon log and the `DbLog` table for `disk_objectstore` logger.' + ) logging__db_loglevel: LogLevels = Field('REPORT', description='Minimum level to log to the DbLog table.') logging__plumpy_loglevel: LogLevels = Field( 'WARNING', description='Minimum level to log to daemon log and the `DbLog` table for the `plumpy` logger.' 
diff --git a/src/aiida/orm/implementation/storage_backend.py b/src/aiida/orm/implementation/storage_backend.py index 601b3ed70d..10a0c96875 100644 --- a/src/aiida/orm/implementation/storage_backend.py +++ b/src/aiida/orm/implementation/storage_backend.py @@ -305,6 +305,132 @@ def maintain(self, full: bool = False, dry_run: bool = False, **kwargs) -> None: :param dry_run: flag to only print the actions that would be taken without actually executing them. """ + def _backup( + self, + dest: str, + keep: Optional[int] = None, + ): + raise NotImplementedError + + def _write_backup_config(self, backup_manager): + import pathlib + import tempfile + + from aiida.common import exceptions + from aiida.common.log import override_log_level + from aiida.manage.configuration import get_config + from aiida.manage.configuration.config import Config + from aiida.manage.configuration.settings import DEFAULT_CONFIG_FILE_NAME + + try: + config = get_config() + profile = config.get_profile(self.profile.name) # Get the profile being backed up + with tempfile.TemporaryDirectory() as tmpdir: + filepath_config = pathlib.Path(tmpdir) / DEFAULT_CONFIG_FILE_NAME + backup_config = Config(str(filepath_config), {}) # Create empty config at temporary file location + backup_config.add_profile(profile) # Add the profile being backed up + backup_config.store() # Write the contents to disk + + # Temporarily disable all logging because the verbose rsync output just for copying the config file + # is a bit much. 
+ with override_log_level(): + backup_manager.call_rsync(filepath_config, backup_manager.path / DEFAULT_CONFIG_FILE_NAME) + except (exceptions.MissingConfigurationError, exceptions.ConfigurationError) as exc: + raise exceptions.StorageBackupError('AiiDA config.json not found!') from exc + + def _validate_or_init_backup_folder(self, dest, keep): + import json + import tempfile + + from disk_objectstore import backup_utils + + from aiida.common import exceptions + from aiida.manage.configuration.config import Config + from aiida.manage.configuration.settings import DEFAULT_CONFIG_FILE_NAME + from aiida.storage.log import STORAGE_LOGGER + + try: + # this creates the dest folder if it doesn't exist + backup_manager = backup_utils.BackupManager(dest, keep=keep) + backup_config_path = backup_manager.path / DEFAULT_CONFIG_FILE_NAME + + if backup_manager.check_path_exists(backup_config_path): + success, stdout = backup_manager.run_cmd(['cat', str(backup_config_path)]) + if not success: + raise exceptions.StorageBackupError(f"Couldn't read {backup_config_path!s}.") + try: + backup_config_existing = json.loads(stdout) + except json.decoder.JSONDecodeError as exc: + raise exceptions.StorageBackupError(f'JSON parsing failed for {backup_config_path!s}: {exc.msg}') + + # create a temporary config file to access the profile info + with tempfile.NamedTemporaryFile() as temp_file: + backup_config = Config(temp_file.name, backup_config_existing, validate=False) + if len(backup_config.profiles) != 1: + raise exceptions.StorageBackupError(f"{backup_config_path!s} doesn't contain exactly 1 profile") + + if ( + backup_config.profiles[0].uuid != self.profile.uuid + or backup_config.profiles[0].storage_backend != self.profile.storage_backend + ): + raise exceptions.StorageBackupError( + 'The chosen destination contains backups of a different profile! Aborting!' 
+ ) + else: + STORAGE_LOGGER.warning('Initializing a new backup folder.') + # make sure the folder is empty + success, stdout = backup_manager.run_cmd(['ls', '-A', str(backup_manager.path)]) + if not success: + raise exceptions.StorageBackupError(f"Couldn't read {backup_manager.path!s}.") + if stdout: + raise exceptions.StorageBackupError("Can't initialize the backup folder, destination is not empty.") + + self._write_backup_config(backup_manager) + + except backup_utils.BackupError as exc: + raise exceptions.StorageBackupError(*exc.args) from exc + + return backup_manager + + def backup( + self, + dest: str, + keep: Optional[int] = None, + ): + """Create a backup of the storage contents. + + :param dest: The path to the destination folder. + :param keep: The number of backups to keep in the target destination, if the backend supports it. + :raises ValueError: If the input parameters are invalid. + :raises StorageBackupError: If an error occurred during the backup procedure. + :raises NotImplementedError: If the storage backend doesn't implement a backup procedure. + """ + from aiida.manage.configuration.settings import DEFAULT_CONFIG_FILE_NAME + from aiida.storage.log import STORAGE_LOGGER + + backup_manager = self._validate_or_init_backup_folder(dest, keep) + + try: + self._backup(dest, keep) + except NotImplementedError: + success, stdout = backup_manager.run_cmd(['ls', '-A', str(backup_manager.path)]) + + if not success: + STORAGE_LOGGER.warning(f'Failed to determine contents of destination folder `{dest}`: not deleting it.') + raise + + # If the backup directory was just initialized for the first time, it should only contain the configuration + # file and nothing else. If anything else is found, do not delete the directory for safety reasons. 
+ if stdout.strip() != DEFAULT_CONFIG_FILE_NAME: + STORAGE_LOGGER.warning(f'The destination folder `{dest}` is not empty: not deleting it.') + raise + + backup_manager.run_cmd(['rm', '-rf', str(backup_manager.path)]) + raise + + STORAGE_LOGGER.report(f'Overwriting the `{DEFAULT_CONFIG_FILE_NAME}` file.') + self._write_backup_config(backup_manager) + def get_info(self, detailed: bool = False) -> dict: """Return general information on the storage. diff --git a/src/aiida/storage/psql_dos/backend.py b/src/aiida/storage/psql_dos/backend.py index 99a5905eca..2431f456dd 100644 --- a/src/aiida/storage/psql_dos/backend.py +++ b/src/aiida/storage/psql_dos/backend.py @@ -14,10 +14,12 @@ from contextlib import contextmanager, nullcontext from typing import TYPE_CHECKING, Iterator, List, Optional, Sequence, Set, Union +from disk_objectstore import Container, backup_utils from pydantic import BaseModel, Field from sqlalchemy import column, insert, update from sqlalchemy.orm import Session, scoped_session, sessionmaker +from aiida.common import exceptions from aiida.common.exceptions import ClosedStorage, ConfigurationError, IntegrityError from aiida.common.log import AIIDA_LOGGER from aiida.manage.configuration.profile import Profile @@ -221,8 +223,6 @@ def _clear(self) -> None: ) def get_repository(self) -> 'DiskObjectStoreRepositoryBackend': - from disk_objectstore import Container - from aiida.repository.backend import DiskObjectStoreRepositoryBackend container = Container(get_filepath_container(self.profile)) @@ -482,3 +482,97 @@ def get_info(self, detailed: bool = False) -> dict: results = super().get_info(detailed=detailed) results['repository'] = self.get_repository().get_info(detailed) return results + + def _backup_storage( + self, + manager: backup_utils.BackupManager, + path: pathlib.Path, + prev_backup: Optional[pathlib.Path] = None, + ) -> None: + """Create a backup of the postgres database and disk-objectstore to the provided path.
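The destination check performed by ``_validate_or_init_backup_folder`` above boils down to comparing the single profile stored in the backup's ``config.json`` with the profile currently being backed up. A simplified, self-contained sketch of that comparison (``check_backup_destination`` is a hypothetical name, not the actual implementation):

```python
def check_backup_destination(existing_config: dict, profile_uuid: str, storage_backend: str) -> None:
    """Raise if an existing backup config.json belongs to a different profile.

    Simplified sketch of the validation done in `_validate_or_init_backup_folder`.
    """
    profiles = existing_config.get('profiles', {})
    if len(profiles) != 1:
        raise ValueError("the backup config.json doesn't contain exactly 1 profile")
    (profile,) = profiles.values()
    # Both the profile UUID and the storage backend must match, otherwise the
    # incremental hard-linking would mix data from unrelated profiles.
    if (
        profile.get('PROFILE_UUID') != profile_uuid
        or profile.get('storage', {}).get('backend') != storage_backend
    ):
        raise ValueError('The chosen destination contains backups of a different profile!')
```

Requiring exactly one profile per backup folder keeps the folder unambiguous: the destination as a whole is tied to a single profile's history.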
+ + :param manager: + BackupManager from backup_utils, containing utilities such as the function for calling rsync. + + :param path: + Path to where the backup will be created. + + :param prev_backup: + Path to the previous backup. Rsync calls will be hard-linked to this path, making the backup + incremental and efficient. + """ + import os + import shutil + import subprocess + import tempfile + + from aiida.manage.profile_access import ProfileAccessManager + + STORAGE_LOGGER.report('Starting backup...') + + # This command calls the `rsync` and `pg_dump` executables. Check that they are in PATH. + for exe in ['rsync', 'pg_dump']: + if shutil.which(exe) is None: + raise exceptions.StorageBackupError(f"Required executable '{exe}' not found in PATH, please add it.") + + cfg = self._profile.storage_config + container = Container(get_filepath_container(self.profile)) + + # check that the AiiDA profile is not locked and request access for the duration of this backup process + # (locked means that possibly a maintenance operation is running that could interfere with the backup) + try: + ProfileAccessManager(self._profile).request_access() + except exceptions.LockedProfileError as exc: + raise exceptions.StorageBackupError('The profile is locked!') from exc + + # step 1: first run the storage maintenance version that can safely be performed while aiida is running + STORAGE_LOGGER.report('Running basic maintenance...') + self.maintain(full=False, compress=False) + + # step 2: dump the PostgreSQL database into a temporary directory + STORAGE_LOGGER.report('Backing up PostgreSQL...') + pg_dump_exe = 'pg_dump' + with tempfile.TemporaryDirectory() as temp_dir_name: + psql_temp_loc = pathlib.Path(temp_dir_name) / 'db.psql' + + env = os.environ.copy() + env['PGPASSWORD'] = cfg['database_password'] + cmd = [ + pg_dump_exe, + f'--host={cfg["database_hostname"]}', + f'--port={cfg["database_port"]}', + f'--dbname={cfg["database_name"]}', + f'--username={cfg["database_username"]}',
'--no-password', + '--format=p', + f'--file={psql_temp_loc!s}', + ] + try: + subprocess.run(cmd, check=True, env=env) + except subprocess.CalledProcessError as exc: + raise backup_utils.BackupError(f'pg_dump: {exc}') + + if psql_temp_loc.is_file(): + STORAGE_LOGGER.info(f'Dumped the PostgreSQL database to {psql_temp_loc!s}') + else: + raise backup_utils.BackupError(f"'{psql_temp_loc!s}' was not created.") + + # step 3: transfer the PostgreSQL database file + manager.call_rsync(psql_temp_loc, path, link_dest=prev_backup, dest_trailing_slash=True) + + # step 4: back up the disk-objectstore + STORAGE_LOGGER.report('Backing up DOS container...') + backup_utils.backup_container( + manager, container, path / 'container', prev_backup=prev_backup / 'container' if prev_backup else None + ) + + def _backup( + self, + dest: str, + keep: Optional[int] = None, + ): + try: + backup_manager = backup_utils.BackupManager(dest, keep=keep) + backup_manager.backup_auto_folders(lambda path, prev: self._backup_storage(backup_manager, path, prev)) + except backup_utils.BackupError as exc: + raise exceptions.StorageBackupError(*exc.args) from exc diff --git a/src/aiida/storage/sqlite_dos/backend.py b/src/aiida/storage/sqlite_dos/backend.py index d93a04664c..890e082914 100644 --- a/src/aiida/storage/sqlite_dos/backend.py +++ b/src/aiida/storage/sqlite_dos/backend.py @@ -13,7 +13,7 @@ from functools import cached_property from pathlib import Path from shutil import rmtree -from typing import TYPE_CHECKING +from typing import TYPE_CHECKING, Optional from uuid import uuid4 from disk_objectstore import Container @@ -146,6 +146,13 @@ def _initialise_session(self): engine = create_sqla_engine(Path(self._profile.storage_config['filepath']) / 'database.sqlite') self._session_factory = scoped_session(sessionmaker(bind=engine, future=True, expire_on_commit=True)) + def _backup( + self, + dest: str, + keep: Optional[int] = None, + ): + raise NotImplementedError + def delete(self) -> None: # type: 
ignore[override] """Delete the storage and all the data.""" filepath = Path(self.profile.storage_config['filepath']) diff --git a/tests/cmdline/commands/test_storage.py b/tests/cmdline/commands/test_storage.py index a4efb508a2..8c374885a2 100644 --- a/tests/cmdline/commands/test_storage.py +++ b/tests/cmdline/commands/test_storage.py @@ -8,6 +8,8 @@ ########################################################################### """Tests for `verdi storage`.""" +import json + import pytest from aiida import get_profile from aiida.cmdline.commands import cmd_storage @@ -177,3 +179,54 @@ def mock_maintain(*args, **kwargs): assert ' > full: True' in message_list assert ' > do_repack: False' in message_list assert ' > dry_run: False' in message_list + + +def tests_storage_backup(run_cli_command, tmp_path): + """Test the ``verdi storage backup`` command.""" + result1 = run_cli_command(cmd_storage.storage_backup, parameters=[str(tmp_path)]) + assert 'backed up to' in result1.output + assert result1.exit_code == 0 + assert (tmp_path / 'last-backup').is_symlink() + # make another backup in the same folder + result2 = run_cli_command(cmd_storage.storage_backup, parameters=[str(tmp_path)]) + assert 'backed up to' in result2.output + assert result2.exit_code == 0 + + +def tests_storage_backup_keep(run_cli_command, tmp_path): + """Test the ``verdi storage backup`` command with the keep argument""" + params = [str(tmp_path), '--keep', '1'] + for i in range(3): + result = run_cli_command(cmd_storage.storage_backup, parameters=params) + assert 'backed up to' in result.output + assert result.exit_code == 0 + # make sure only two copies of the backup are kept + assert len(list((tmp_path.glob('backup_*')))) == 2 + + +def tests_storage_backup_nonempty_dest(run_cli_command, tmp_path): + """Test that the ``verdi storage backup`` fails for non-empty destination.""" + # add a file to the destination + (tmp_path / 'test.txt').touch() + result = run_cli_command(cmd_storage.storage_backup, 
parameters=[str(tmp_path)], raises=True) + assert result.exit_code == 1 + assert 'destination is not empty' in result.output + + +def tests_storage_backup_other_profile(run_cli_command, tmp_path): + """Test that the ``verdi storage backup`` fails for a destination that has been used for another profile.""" + existing_backup_config = { + 'CONFIG_VERSION': {'CURRENT': 9, 'OLDEST_COMPATIBLE': 9}, + 'profiles': { + 'test': { + 'PROFILE_UUID': 'test-uuid', + 'storage': {'backend': 'core.psql_dos'}, + 'process_control': {'backend': 'rabbitmq'}, + } + }, + } + with open(tmp_path / 'config.json', 'w', encoding='utf-8') as fhandle: + json.dump(existing_backup_config, fhandle, indent=4) + result = run_cli_command(cmd_storage.storage_backup, parameters=[str(tmp_path)], raises=True) + assert result.exit_code == 1 + assert 'contains backups of a different profile' in result.output diff --git a/tests/orm/implementation/test_backend.py b/tests/orm/implementation/test_backend.py index b1ea884bc4..001564f057 100644 --- a/tests/orm/implementation/test_backend.py +++ b/tests/orm/implementation/test_backend.py @@ -8,6 +8,10 @@ ########################################################################### """Unit tests for the ORM Backend class.""" +from __future__ import annotations + +import json +import pathlib import uuid import pytest @@ -161,3 +165,46 @@ def test_delete_nodes_and_connections(self): orm.Node.collection.get(id=node_pk) assert len(calc_node.base.links.get_outgoing().all()) == 0 assert len(group.nodes) == 0 + + +def test_backup_not_implemented(aiida_config, backend, monkeypatch, tmp_path): + """Test the backup functionality if the plugin does not implement it.""" + + def _backup(*args, **kwargs): + raise NotImplementedError + + monkeypatch.setattr(backend, '_backup', _backup) + + filepath_backup = tmp_path / 'backup_dir' + + with pytest.raises(NotImplementedError): + backend.backup(str(filepath_backup)) + + # The backup directory should have been initialized but then 
cleaned up when the plugin raised the exception + assert not filepath_backup.is_dir() + + # Now create the backup directory with the config file and add some other content to it. + filepath_backup.mkdir() + (filepath_backup / 'config.json').write_text(json.dumps(aiida_config.dictionary)) + (filepath_backup / 'backup-deadbeef').mkdir() + + with pytest.raises(NotImplementedError): + backend.backup(str(filepath_backup)) + + # The backup directory should not have been deleted + assert filepath_backup.is_dir() + assert (filepath_backup / 'config.json').is_file() + + +def test_backup_implemented(backend, monkeypatch, tmp_path): + """Test the backup functionality if the plugin does implement it.""" + + def _backup(dest: str, keep: int | None = None): + (pathlib.Path(dest) / 'backup.file').touch() + + monkeypatch.setattr(backend, '_backup', _backup) + + filepath_backup = tmp_path / 'backup_dir' + backend.backup(str(filepath_backup)) + assert (filepath_backup / 'config.json').is_file() + assert (filepath_backup / 'backup.file').is_file() diff --git a/tests/storage/psql_dos/test_backend.py b/tests/storage/psql_dos/test_backend.py index 4ac7563080..6fe35ac747 100644 --- a/tests/storage/psql_dos/test_backend.py +++ b/tests/storage/psql_dos/test_backend.py @@ -152,3 +152,20 @@ def test_unload_profile(): assert len(_sessions) == current_sessions - 1, str(_sessions) finally: manager.load_profile(profile_name) + + +def test_backup(tmp_path): + """Test that the backup function creates all the necessary files and folders""" + storage_backend = get_manager().get_profile_storage() + + # note: this assumes that rsync and pg_dump are in PATH + storage_backend.backup(str(tmp_path)) + + last_backup = tmp_path / 'last-backup' + assert last_backup.is_symlink() + + # make sure the necessary files are there + # note: disk-objectstore container backup is already tested in its own repo + contents = [c.name for c in last_backup.iterdir()] + for name in ['container', 'db.psql']: + assert name in
contents
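The incremental scheme these changes rely on uses rsync's hard-link feature: each snapshot is a complete directory tree, but files unchanged since the previous snapshot are hard-linked rather than copied, so only the delta consumes disk space. As a rough illustration of how such an invocation is assembled (``build_snapshot_cmd``, the folder naming, and the flags are hypothetical; the real logic lives in ``disk_objectstore.backup_utils``):

```python
from datetime import datetime
from pathlib import Path
from typing import List, Optional


def build_snapshot_cmd(source: Path, dest_root: Path, prev_backup: Optional[Path] = None) -> List[str]:
    """Assemble an rsync command for one incremental snapshot.

    Hypothetical sketch of a hard-link-based backup scheme, not AiiDA's
    exact rsync calls.
    """
    # Each snapshot lives in its own timestamped subfolder of the destination.
    target = dest_root / datetime.now().strftime('backup_%Y%m%d%H%M%S')
    cmd = ['rsync', '-az', '--delete']
    if prev_backup is not None:
        # Files identical to the previous snapshot become hard links into it,
        # so a full tree is materialized at the cost of only the changed files.
        cmd.append(f'--link-dest={prev_backup}')
    # The trailing slash on the source copies its contents, not the folder itself.
    cmd += [f'{source}/', str(target)]
    return cmd
```

A `last-backup` symlink pointing at the newest snapshot, as created by the command and checked in the tests above, is then enough to find the `prev_backup` argument for the next run.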