-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrate unified backends #416
Conversation
25726e7
to
c94d0f4
Compare
2db0f3f
to
43adbfd
Compare
62dd60e
to
2058395
Compare
4e2de6b
to
678b0a6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a partial review (up to statements.py
)
205036d
to
ccc73fa
Compare
ec0cac2
to
7428a7e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is neat ! Looking forward to this being merged.
4752d8d
to
06cc6bb
Compare
2058395
to
e6c29ff
Compare
We add the OpenStack Swift data backend implementation. With the `data` parameter changed to an Iterable, we cannot use high level SwiftService API to upload files anymore (it needs a file object source, not an iterable). Changing to Connection lower-level API, which is more flexible.
We add the LDP data backend implementation that is mostly taken from the existing LDPStorage backend.
Add S3 backend under the new common `data` interface
Add ClickHouse backend under the new common 'data' interface. With ClickHouse under the new data interface, tests are updated as well. Storage and Database backends had similar interfaces and usage, so a new Data Backend interface has been created.
We want to simplify our tests that are mocking the request package. Therefore we choose to use the `request_mock` library.
We add the ES data backend implementation that is mostly taken from the existing ESDatabase backend.
We want to provide an LRS backend that could be added to Ralph's LRS without any additional dependencies.
Methods `read_raw` and `parse_bytes_to_dict` are generic and used by multiple backends. Moving them to file `utils.py`.
Add asynchronous base interface for async backends such as async_es or async_mongo
We add the Async elasticsearch data backend mostly taken from the sync backend using the async elasticsearch methods.
We want to improve the current mongo data backend implementation to align it more with other unified data backends.
We want to provide an async version of our MongoDatabase backend.
- `get_query` method for Elasticsearch would be better namespaced under the ESLRSBackend. Changing it to a static method instead of a global function. - At initialization, data backends can either take settings or None. Setting `settings_class` to Optional to anticipate mypy warning when mypy will be added. - Piping x|None is preferred since Python 3.10, changing from Optional to Union[x|None] for backends as it would be easier to switch to pipes. - Changes to backend methods docstrings - Rename variable `new_documents` to be more explicit
Synchronous backends such as Elasticsearch or Mongo need their connection to be closed when finished. This commits adds an abstract method close to the BaseDataBackend interface, and implements it for backends that need it.
The update to a recent version of `motor` highlighted a bug on our side when listing collections. Now asynchronously iterate over collections list.
With the new data backend interface, settings are now close to each backend and not under general conf.py. Unifying stream backend WS to have the same architecture as data backends.
With the new data backend interface, settings are now close to each backend and not under general conf.py. Unifying HTTP backends to have the same architecture as data backends.
After unifying database and storage backends under a common interface, backends settings are now handled directly alongside the backends classes. Modifying the CLI to support new settings and new backends interfaces.
With the addition of new asynchronous backends, it could be useful to be able to use them in the CLI.
Adding a default value for ClickHouse client option `allow_experimental_object_type` highlights a pydantic validation error with type `Literal[0,1]`. Switching to `coint`.
With addition of unified backends and changes to the conf files, API router needs some changes to be able to get the backends instance.
With addition of unified backends, API router needs some changes to be able to use asynchronous backends.
Tests using filesystem failed with pyfakefs in the CI as pyfakefs does not succeed on creating requesting files in the default directory path. The latter is then defined specifically for these tests and forced to be used in the ralph command.
Environment variables `RALPH_BACKENDS__DATABASE__ES__*` have been renamed to `RALPH_BACKENDS__DATA__ES__*`. Changing them in the `tray`.
Changed - Refactor `database` and `storage` backends under the unified `data` backend interface [BC] - Refactor LRS `query_statements` and `query_statements_by_ids` backends methods under the unified `lrs` backend interface [BC]
Following `pylint` upgrade to version > 3.0 , a false negative was corrected resulting in many warnings about methods having too many arguments. Escaping these warnings.
96beb9e
to
27143be
Compare
Purpose
After unifying
database
andstorage
backends under a commondata
interface, backends settings and methods have changed.
Proposal
Stream
backends settings alongside backend classHTTP
backends settings alongside backend classdatabase
andstorage
backendsconf.py
tobackends/conf.py
to avoid acircular import (e.g.
conf.py
->es.py
->conf.py
)_class_path
attributes from backends settingsutils.py
methods to get backend type and backend instancecli.py
to use new unified backendscli.py
to be able to use async backendsunify-data-backends
utils
functions