✨ Update thoughts on APIBindings and roadmap options #241
Conversation
Force-pushed from 2cc2b92 to 54e80d5.
/cc @yuji-watanabe-jp
> ### Denaturing/renaturing
>
> We could start by not doing this. For some resources, the effect of
> leaving these resources natured in the center is only to add
> authorizations in the center that are not needed and are undesired in
> a well-secured environment but tolerable in early demonstrations ---
> provided that there is not a conflict with an object of the same name
> that is positively desired in the center. In particular, these are:
> `ClusterRole`, `ClusterRoleBinding`, `Role`, `RoleBinding`,
> `ServiceAccount`.
>
> For some kinds of object, lack of denaturing/renaturing means that
> edge-mc will simply not be able to support workloads that contain such
> objects. These are: `MutatingWebhookConfiguration`,
> `ValidatingWebhookConfiguration`, `LimitRange`, `ResourceQuota`.
>
> For some resources, the need to denature is only a matter of
> anticipation. `FlowSchema` and `PriorityLevelConfiguration` currently
> are not interpreted by kcp servers and so are effectively already
> denatured there. Hopefully they will be given interpretations in the
> future, and then those resources will join the previous category.
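As a concrete illustration of why some of these objects are problematic when left natured in the center (a hypothetical sketch; all names here are invented, not from the PoC), an RBAC object such as the following is enforced by whatever apiserver stores it, so keeping it un-denatured in a center workspace grants the listed authorizations there as well as in the edge cluster:

```yaml
# Hypothetical example of a workload RBAC object. Any apiserver that
# stores this object will enforce it, so if it sits un-denatured in a
# center workspace it grants these permissions in the center too.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: example-agent-binding   # invented name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: example-agent           # invented name
  namespace: example-system     # invented namespace
```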
This is a really important feature; the edge syncer needs to support these resources:
- replace each with some alternative kind on the workspace side;
- the edge syncer pulls the alternative kind and replaces it with the native one.

I think this should be one of the priority items.
Denaturing and renaturing are coupled. The denaturing is done by a view; the renaturing is done by the placement translator while the TMC syncer is used, and by the EMC syncer once it is developed.
The question I am trying to ask is not whether denaturing and renaturing are important; I think there is no question that they are part of the design that we are working towards. The question I am asking is whether we can do some interesting demonstrations before denaturing and renaturing are implemented.
I see now that #242 does not yet ask the relevant questions here clearly enough, so let me ask them here. First I should note that I hope kcp will be rebased onto Kubernetes release 1.26 soon, which will make https://kubernetes.io/docs/reference/kubernetes-api/extend-resources/validating-admission-policy-v1alpha1/ usable; I wonder whether (as suggested at the end of that review PDF you shared) C2P could use `ValidatingAdmissionPolicy` objects. In that future, does C2P need `MutatingWebhookConfiguration`, `ValidatingWebhookConfiguration`, `LimitRange`, or `ResourceQuota`? More precisely, could we demo an interesting subset of C2P without using those resources?
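For reference, a minimal sketch of a `ValidatingAdmissionPolicy` in the Kubernetes 1.26 alpha API (the names and the rule are invented here; note that v1alpha1 also requires a separate `ValidatingAdmissionPolicyBinding` to put a policy into effect):

```yaml
# Hypothetical sketch of the 1.26 alpha API. Unlike a webhook, the
# rule is a declarative CEL expression evaluated in-process by the
# apiserver, so no in-cluster Service is involved.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: example-replica-limit      # invented name
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 5"   # invented rule
```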
> could we demo an interesting subset of C2P without using those resources?

Kyverno does not require syncing those resources for the C2P demo. `MutatingWebhookConfiguration` and/or `ValidatingWebhookConfiguration` are generated after deploying the Kyverno runtime on the p-cluster.
@yuji-watanabe-jp : thank you, I just want to follow up to make sure I understand correctly. When the Kyverno controller generates a `MutatingWebhookConfiguration` or `ValidatingWebhookConfiguration` object in the edge cluster, we have to be careful about which apiserver this object is written to. It needs to be written to the apiserver that it is intended to affect. My concern is what happens while using the TMC syncer. If the webhook configuration object is written to the mailbox workspace but is intended to configure calls to a container running in the edge cluster, then this will be a broken configuration. Is the Kyverno controller described by a Deployment object that will be downsynced, with the TMC syncer rewiring its apiserver client config to point back to the mailbox workspace?
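For context on why the write target matters: a webhook configuration's `clientConfig` typically names an in-cluster `Service`, as in this hypothetical sketch (all names invented), and that reference only resolves in the cluster that actually runs the service:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-policy-webhook     # invented name
webhooks:
- name: validate.example.io        # invented name
  clientConfig:
    service:                       # resolved inside whichever cluster
      name: example-webhook-svc    # stores this object, so the object
      namespace: example-system    # is broken if it lands in an
      path: /validate              # apiserver without that Service
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["pods"]
  admissionReviewVersions: ["v1"]
  sideEffects: None
```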
@MikeSpreitzer If I am understanding your questions correctly, I believe those two webhook configurations do not affect the TMC syncer and the mailbox workspace. Here are the steps to deploy Kyverno via the syncer:
- The Kyverno deployment is synced from the mailbox workspace to the edge cluster.
- The Kyverno service (webhook API backend) runs on the edge cluster.
- The Kyverno service creates the `MutatingWebhookConfiguration` or `ValidatingWebhookConfiguration` object on the edge cluster.
- The webhook is enabled on the edge cluster.
@pdettori you are right. We have been deploying Kyverno to the p-cluster with the TMC syncer + OLM, and we confirmed that works.
Without OLM, we need to see how to install cluster-scoped objects (`ClusterRoleBinding` and `ClusterRole`) via the TMC syncer.
So we can keep the choice of OLM for installation via the TMC syncer. The issue of TMC upsync of the policy reports still remains.
The edge syncer will not apply such mutations, so this will never happen there, and that resolves the upsync issue.
@yuji-watanabe-jp Cluster-scope deployment and the webhook issue are different. Just to emphasize (and maybe close :-) ) this discussion: if you plan to deploy Kyverno through KCP (i.e., into a KCP workspace), then due to the mutation that TMC performs on the deployed pods your webhooks probably won't work correctly. If you can (for the first stage) deploy the controllers directly on the pCluster(s), then you avoid the TMC mutation and your pCluster webhooks will probably work without an issue.
The upsync seems not related to this, as in both cases (deploying through KCP and externally to KCP) we can do the same: (1) define the criteria of what to upsync through KCP; (2) a controller on the pCluster (part of the Kyverno syncer) will need to support this upsync (e.g., set the labels/annotations, etc.).
Is the controller that creates the policy objects something that we control? Can we make it put needed labels and annotations on the policy objects, at least in the TMC interim?
We control it; we can add them.
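A sketch of what that could look like (purely hypothetical; the label key and value below are placeholders, not the actual TMC syncer upsync contract):

```yaml
# Hypothetical sketch: the controller stamps each generated policy
# report with whatever marker the syncer uses to select objects for
# upsync. The label key/value here are invented placeholders.
apiVersion: wgpolicyk8s.io/v1alpha2
kind: PolicyReport
metadata:
  name: example-report             # invented name
  namespace: example-ns            # invented namespace
  labels:
    example.io/upsync: "true"      # placeholder upsync marker
```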
> APIExport, and EdgePlacement objects that (1) select those
> APIBinding objects for downsync and (2) select objects of kinds
> defined through those APIBindings for either downsync or upsync.
> - Those resources are built into the edge clusters by pre-deploying
Probably only the resources for upsync need to be pre-deployed, right? Won't the ones we downsync be deployed by the downsync itself?
Downsync cannot propagate an APIBinding all the way to the edge cluster because the edge cluster is not a kcp workspace.
Also, an APIExport, and thus an APIBinding, is for a whole API group; one cannot use that mechanism to import a subset of what was exported. It would be unnatural for C2P to define their policies in one API group and their reports in a different one.
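For illustration, an `APIBinding` references an `APIExport` as a whole, with no field for selecting a subset of its schemas (a hypothetical sketch; the group/version and the shape of the reference vary across kcp releases, and all names here are invented):

```yaml
# Hypothetical sketch: a binding imports everything the named export
# provides; there is no per-resource subsetting.
apiVersion: apis.kcp.io/v1alpha1   # older releases used apis.kcp.dev
kind: APIBinding
metadata:
  name: example-policies           # invented name
spec:
  reference:
    export:
      path: root:example-provider            # invented workspace path
      name: example-policies.example.io      # invented export name
```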
I was just referring to the "by pre-deploying" part. Why do we need to pre-deploy into edge if we downsync them as well?
The point of this particular scenario is for the mailbox workspaces to have APIBindings to workload APIExports, so that the workload APIExport view holds the workload objects as they sit in the mailbox workspaces. This matters particularly for upsynced resources, since this is something that the C2P people expected.
> view includes all of the objects (in workload management workspaces,
> in mailbox workspaces, and in any other workspaces where they
> appear) whose kind is defined through those APIBindings.
Is there a diagram that describes those relations? It may help to understand the mappings of bindings, exports, and resources between the workload WS, mailbox WS, and pCluster.
I have not drawn a picture yet.
Is there information missing from the textual description?
I also think a diagram would help explain the flow here. I get the need for pre-deployment, but is there a possibility for these resources to get out-of-sync?
(1) Drawings almost always help. I hope the lack of one does not hold up merging this.
(2) The "sync" done by our processes is eventually consistent, which explicitly admits there may be interim periods of inconsistency.
(3) I suspect your concern is consistency that our automated processes do not even attempt to maintain (e.g., between a resource's schema built into edge clusters vs. the schema given for the same resource in the center). That can be lacking from the start as well as crop up later! This is an issue I tried to explicitly defer for later consideration.
(4) The scenario outlined here is no longer the one that we intend to use for the Compliance-to-Policy work. For C2P we intend to drive CRDs from workload management workspace through mailbox workspaces to edge clusters.
I am ok with that answer.
/lgtm
In the latest kcp community meeting, @davidfestal gave an outline of the answer to my question (kcp-dev/contrib-tmc#34) about the
/approve
/hold
/unapprove
/unhold
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: MikeSpreitzer. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Summary
This PR updates the 2023q1 PoC outline to be more correct and expansive regarding APIBinding objects and also expands the options in the roadmap section.
This includes calling out a scenario in which workload objects in mailbox workspaces (as well as all other workspaces) are visible in an APIExport view.
This PR replaces kcp-dev/kcp#238 with a more extensive revision.
Related issue(s)
Fixes #