feat(synchronizer): add synchronizer query to fetch all validation needed data #599
Conversation
🦋 Changeset detected (latest commit: be5334a). The changes in this PR will be included in the next version bump; this PR includes changesets to release 1 package.
```ts
provider: 'GITHUB',
owner: 'kubeshop',
name: 'monokle-demo',
suppressions: [{
```
I apologise, since we talked about one big query beforehand, but it just came to mind that this large query would both have high complexity and be queried frequently. Besides, the suppression array is unbounded in length. That makes this an idea that does not scale well. Furthermore, policies, repositories and suppressions rarely change.

I wonder if we should move towards an offline-first approach with some files / an SQLite database. Iirc we already do that for the policy. At that point, the frequently polled query simply contains `updatedAt` / content-hash / `lastCursor` for policy / repositories / suppressions. That makes the frequently polled query lightweight, after which heavier synchronisations can happen when desynced.

Given that there is just one repository and the policy is quite small, we could get away with taking this road only for suppressions. It could also just be a `repository.suppressions.json` file instead of SQLite for now if we want to keep it simple, stupid.
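A minimal sketch of the lightweight polling query described above, assuming hypothetical names (`getSyncState`, `contentHash`, `suppressionsLastCursor`); only `updatedAt` is confirmed by the discussion, and the real schema may differ:

```ts
// Hedged sketch of the cheap polling query; field names are assumptions.
const SYNC_STATE_QUERY = `
  query getSyncState($slug: String!) {
    getProject(slug: $slug) {
      policy { updatedAt }        # timestamp instead of full policy content
      repository { contentHash }  # hypothetical hash to detect repository changes
      suppressionsLastCursor      # hypothetical cursor for incremental suppression sync
    }
  }
`;
```

The poller would compare these values with the locally persisted ones (e.g. a `repository.suppressions.json` file) and trigger the heavier synchronisation only when they diverge.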
We could also use the `getSuppressions` query (instead of the `suppressions` field on the repository model), which accepts `.from` and `.to` arguments. This would work well with the cache implementation described by @WitoDelnat. We should also implement proper pagination (with GraphQL connections) so that we can paginate the first query. The basis for implementing pagination with connections is already in place.
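For illustration, an incremental fetch along those lines might look like the sketch below; everything other than the `getSuppressions` name and the `.from`/`.to` arguments is an assumption about the schema:

```ts
// Hedged sketch: fetch only suppressions changed since the last sync.
// Argument and result field names beyond getSuppressions/from/to are assumptions.
const SUPPRESSIONS_DELTA_QUERY = `
  query getSuppressions($repositoryId: ID!, $from: DateTime!) {
    getSuppressions(repositoryId: $repositoryId, from: $from) {
      id
      fingerprint   # assumed field identifying the suppressed finding
      status
      updatedAt
    }
  }
`;
```

The caller would pass the timestamp of the most recent successful sync as `$from` and merge the returned items into the locally stored suppressions.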
> which accepts `.from` and `.to` arguments.

This is interesting 👍 So with this, in fact, we could have incremental updates to stored suppressions, right? That is, fetch changes to suppressions only since the most recent fetch.

> We should also implement proper pagination (with GraphQL connections) so that we can paginate the first query. The basis for implementing pagination with connections is already in place.

Not sure how using pagination could help here? 🤔 We still need the entire suppressions list for validation. Could you expand on that?
From what I checked, looking into the current db schema we also have an `updatedAt` field for the policy (`ProjectPolicy`), so this can be used as @WitoDelnat suggested to avoid pulling the entire policy contents every time.
It will use `policy.updatedAt` to determine whether the policy content needs to be refetched. For suppressions, the `.from` field is used to fetch only new/modified suppressions. Suppressions and timestamps are stored in the fs and so are persisted.
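A rough sketch of that flow, assuming hypothetical API method names and an assumed on-disk layout; only the `policy.updatedAt` check, the `.from` argument, and the fs persistence are taken from the description above:

```ts
import { promises as fs } from 'fs';

// Hypothetical API surface; method names are assumptions for illustration.
interface Api {
  getProjectMeta(): Promise<{ policy: { updatedAt: string } }>;
  fetchFullPolicy(): Promise<unknown>;
  getSuppressions(args: { from?: string }): Promise<unknown[]>;
}

// Assumed shape of the persisted sync state.
interface SyncState {
  policyUpdatedAt?: string;
  suppressionsLastFetched?: string;
  suppressions: unknown[];
}

async function synchronize(statePath: string, api: Api): Promise<SyncState> {
  let state: SyncState = { suppressions: [] };
  try {
    state = JSON.parse(await fs.readFile(statePath, 'utf8'));
  } catch {
    // First run: no state persisted yet.
  }

  // Refetch full policy content only when policy.updatedAt has changed.
  const meta = await api.getProjectMeta();
  if (meta.policy.updatedAt !== state.policyUpdatedAt) {
    await api.fetchFullPolicy();
    state.policyUpdatedAt = meta.policy.updatedAt;
  }

  // Fetch only suppressions created/modified since the last sync.
  const fresh = await api.getSuppressions({ from: state.suppressionsLastFetched });
  state.suppressions = state.suppressions.concat(fresh); // naive merge; dedup omitted
  state.suppressionsLastFetched = new Date().toISOString();

  // Persist data and timestamps so they survive restarts.
  await fs.writeFile(statePath, JSON.stringify(state, null, 2));
  return state;
}
```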
Tests are commented out; maybe just remove them, since they are in git anyway? Or maybe you still wanted to do them before finalising this PR?

Yes, I wanted to at least add one test which covers the synchronization logic; that's why I left it like that. I will clean it up when adding it 👍
This PR is a part of kubeshop/vscode-monokle#93.

The main goal is to allow fetching all the data required for integrations at once, and to allow the fetched data to be used in a synchronous manner.

I kept the previous `synchronizer` without changes (but deprecated it), since it is used by multiple integrations, and introduced `ProjectSynchronizer`, since the way it works is quite different from the previous one. Putting everything in the same class would be a mess.

The main idea is that it exposes a `synchronize` method (the same as the previous one) which refetches data whenever called (so it is the integrator's responsibility to know when to refetch). The entire response is cached and then used when the `get*` methods are called. This way, fetching is decoupled from getting/using data.
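As a usage sketch of that contract (the import path, constructor arguments, and exact getter names here are assumptions for illustration, not the final API):

```ts
// Hedged usage sketch of the synchronize/get* split described above.
import { ProjectSynchronizer } from '@monokle/synchronizer'; // assumed import path

declare const apiHandler: any; // whatever ApiHandler instance the integration holds

const synchronizer = new ProjectSynchronizer(apiHandler, '/tmp/monokle-sync'); // assumed ctor args

// The integrator decides when to refetch; synchronize() caches the whole response.
await synchronizer.synchronize({ slug: 'kubeshop/monokle-demo' });

// Afterwards the get* methods read synchronously from the cached response.
const policy = synchronizer.getPolicy();             // assumed getter name
const suppressions = synchronizer.getSuppressions(); // assumed getter name
```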
Changes

- `ProjectSynchronizer`.
- `getPermissions` query in `ApiHandler`, which may be useful for other integrations too.

Fixes
Checklist