Support "--disallow-inconsistent-metadata" flag in cli-migrations docker image #10599
…ow use of the --disallow-inconsistent-metadata flag when applying metadata (fixes hasura#10599)
I ended up putting together an approach for this in PR #10602 - not sure if that's the approach the maintenance team wants to take, or if this is a feature you all actually want to pick up or not. If not, no worries, just wanted to take a stab at it because I was curious if I could get it working. Let me know either way!
Please link to #8095
Thanks @jflambert, and good call! When I opened this, that issue hadn't had any updates for 2 years, and it seemed like the flags/env var names had changed in that time, so I wasn't positive if this was related or not. Glad to have confirmation though. I'll reference your issue from my pull request as well!
Let it be known that I've actually stopped using the …
Is your proposal related to a problem?
My team runs the community edition of the Hasura graphql-engine on Kubernetes in a horizontally replicated configuration. We use rolling updates to deploy, use the `cli-migrations` (v2) docker image as our base image in production (as described here), and use the `/healthz?strict=true` endpoint as our container's readiness/liveness check. This allows us to deploy Hasura in prod without exposing the console or metadata API, while still allowing for git-driven continuous delivery of metadata changes: we version-control our metadata files and package them into new docker images that get deployed to our Kube cluster on each commit to main. This works great most of the time, allowing us to roll out changes to our GraphQL API just by merging metadata changes to main.

However, this configuration has one major drawback. The `cli-migrations` entrypoint script (ref) runs `hasura-cli metadata apply` without the `--disallow-inconsistent-metadata` flag. This means that if we accidentally merge inconsistent metadata to main (e.g., by applying metadata changes before a production DB migration has completed), the new pods for our updated Kubernetes deployment will apply that metadata and persist it to the database regardless of any issues. This metadata update in turn gets picked up by the existing/old pods, which listen for changes to the metadata DB, and "leaks" into them even though they were (in theory) built from an older, functioning revision of our code.

In effect, this defeats the purpose of using a deployment tool like Kubernetes, because what should have been a failed deploy ends up taking down the existing pods rather than simply failing to spin up new ones. This breaks the standard mental model of rolling updates, in which a pre-existing stable deployment stays available until new pods are ready to receive traffic. This scenario befell us earlier this week, leading to a full API outage instead of what we'd expect from any other API that can't become ready: a failed rollout of the new pods without any effect on the old ones.
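To make the failure mode concrete, here's a minimal shell sketch of the sequence described above. It's a paraphrase, not the literal entrypoint script, and `localhost:8080` stands in for the pod's server address:

```sh
# What the cli-migrations entrypoint effectively runs today (paraphrased):
# the apply succeeds and persists the metadata even when it is inconsistent.
hasura-cli metadata apply

# Every replica (old pods included) then picks the new metadata up from
# the shared metadata DB, so a strict health check starts failing
# cluster-wide rather than only on the new pods:
curl -fsS "http://localhost:8080/healthz?strict=true" \
  || echo "metadata inconsistent: pod reported unhealthy"
```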
Related to #8095
Describe the solution you'd like
I'd love to have some way to configure the `cli-migrations` image to disallow the application of inconsistent metadata, the same way I know is possible using the CLI's `hasura metadata apply --disallow-inconsistent-metadata` command. I'm open to whether that should happen via a global Hasura configuration (i.e., a setting on the server itself) or just a flag/env-var that can be passed to the `cli-migrations` image, similar to the `HASURA_GRAPHQL_METADATA_DIR` variable; I expect the latter would be simpler to implement.

Doing this would allow us to have a fully safe CI/CD setup in Kubernetes by preventing pods which contain inconsistent metadata from poisoning the shared metadata DB and taking down the entire API.
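For illustration, the env-var approach could be as small as a gated flag in the entrypoint. This is just a sketch; the variable name `HASURA_GRAPHQL_DISALLOW_INCONSISTENT_METADATA` is my own invention for this example, not an existing setting:

```sh
# Hypothetical addition to the cli-migrations entrypoint script.
# The env var name below is an assumption, used for illustration only.
APPLY_FLAGS=""
if [ -n "$HASURA_GRAPHQL_DISALLOW_INCONSISTENT_METADATA" ]; then
  APPLY_FLAGS="--disallow-inconsistent-metadata"
fi
hasura-cli metadata apply $APPLY_FLAGS
```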
Just noting - I would be very happy to work on this feature myself, in particular if we'd prefer to scope it to a change to the `cli-migrations` image only. This would be really helpful for my team, and I'd love a chance to help contribute back to Hasura given how much use we've gotten out of the graphql-engine over the last year or two!

Describe alternatives you've considered
I see a few different ways of working around this, which I'll explain my POV on inline:

- Stop using the `cli-migrations` image, and use the CLI to apply metadata in production directly. I'd really prefer not to do this for my team, because it would mean exposing the metadata API in prod, which would then make it accessible to anyone who has the admin key. That would be a pretty major step up in the risk/impact of leaking that key, which I'd rather not take on.
- Use the plain `/healthz` endpoint for our readiness and liveness checks, allow inconsistent metadata to be applied to prod, and monitor for it some other way. I'd also prefer to avoid this - given that it is possible to determine prior to deployment whether a batch of metadata files will be inconsistent (and thus lead to some amount of API degradation; see the CI sketch after this list), I see no good reason for our prod service to declare itself ready for traffic in those scenarios.
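On that second point, the pre-deployment check could be approximated in CI today by applying the metadata to a disposable or staging Hasura instance with the strict flag and failing the pipeline on inconsistency. A sketch, assuming CI-provided `HASURA_STAGING_ENDPOINT` and `HASURA_ADMIN_SECRET` variables (names are mine, not from the issue):

```sh
# CI pre-check sketch: fail the build before any production rollout if
# the metadata in this commit is inconsistent against a staging instance.
hasura metadata apply \
  --endpoint "$HASURA_STAGING_ENDPOINT" \
  --admin-secret "$HASURA_ADMIN_SECRET" \
  --disallow-inconsistent-metadata \
  || { echo "inconsistent metadata; aborting deploy" >&2; exit 1; }
```

If the apply does fail, `hasura metadata inconsistency list` can then be run against the same instance to see which objects are broken.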