
✨ Update thoughts on APIBindings and roadmap options #241

Merged

Conversation

MikeSpreitzer (Contributor)

Summary

This PR updates the 2023q1 PoC outline to be more correct and expansive regarding APIBinding objects, and also expands the options in the roadmap section.

This includes calling out a scenario in which workload objects in mailbox workspaces (as well as all other workspaces) are visible in an APIExport view.

This PR replaces kcp-dev/kcp#238 with a more extensive revision.

Related issue(s)

Fixes #

@MikeSpreitzer (Contributor Author)

/cc @yuji-watanabe-jp
/cc @yana1205
/cc @clubanderson

Comment on lines +131 to +151
### Denaturing/renaturing

We could start by not doing this. For some resources, the effect of
leaving these resources natured in the center is only to add
authorizations in the center that are not needed and are undesired in
a well-secured environment but tolerable in early demonstrations ---
provided that there is not a conflict with an object of the same name
that is positively desired in the center. In particular, these are:
`ClusterRole`, `ClusterRoleBinding`, `Role`, `RoleBinding`,
`ServiceAccount`.

For some kinds of object, lack of denaturing/renaturing means that
edge-mc will simply not be able to support workloads that contain such
objects. These are: `MutatingWebhookConfiguration`,
`ValidatingWebhookConfiguration`, `LimitRange`, `ResourceQuota`.

For some resources, the need to denature is only a matter of
anticipation. `FlowSchema` and `PriorityLevelConfiguration` currently
are not interpreted by kcp servers and so are effectively already
denatured there. Hopefully they will be given interpretations in the
future, and then those resources will join the previous category.
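
As a concrete illustration of the "unneeded authorizations" point above, here is a minimal sketch; the ClusterRole, ServiceAccount, and namespace names are invented for illustration. Absent denaturing, a kcp server in the center interprets a workload RBAC object like this and actually grants the named ServiceAccount whatever the referenced ClusterRole allows in the center workspace, not only in the edge cluster where it is ultimately meant to take effect.

```yaml
# Illustrative workload RBAC object; all names are hypothetical.
# Left "natured" in a center workspace, the apiserver there honors it,
# which is the unneeded (and in a well-secured environment undesired)
# authorization described above.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: example-workload-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: example-workload-role
subjects:
- kind: ServiceAccount
  name: example-workload-sa
  namespace: example-workload-ns
```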
Contributor

This is a really important feature for the edge syncer to support these resources:

  • replace them with some alternative kind on the workspace side;
  • the edge syncer pulls the alternative kind and replaces it with the native one.

I think this should be one of the priority items.
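
A purely hypothetical sketch of that "alternative kind" idea, assuming an invented wrapper CRD (`DenaturedObject` in an invented `edge.example.io` group) that neither kcp nor edge-mc defines today: the workspace side stores the wrapper, and the edge syncer would unwrap `spec.manifest` and apply the embedded native object on the p-cluster (the renaturing step).

```yaml
# Hypothetical wrapper kind on the workspace side; nothing here is an
# existing kcp or edge-mc API. The edge syncer would read spec.manifest
# and create the embedded native object on the edge cluster.
apiVersion: edge.example.io/v1alpha1
kind: DenaturedObject
metadata:
  name: example-webhook-config
spec:
  manifest:
    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: example-policy-webhook
    webhooks: []   # elided; supplied by the workload author
```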

@MikeSpreitzer (Contributor Author), Mar 17, 2023

Denaturing and renaturing are coupled. The denaturing is done by a view; the renaturing is done by the placement translator while the TMC syncer is in use, and by the EMC syncer once it is developed.

The question I am trying to ask is not whether denaturing and renaturing are important; I think there is no question that they are part of the design that we are working towards. The question I am asking is whether we can do some interesting demonstrations before denaturing and renaturing are implemented.

I see now that #242 does not yet clearly enough ask the relevant questions here. So let me ask them here. First I should note that I hope that kcp will be rebased onto Kubernetes release 1.26 soon, which will make https://kubernetes.io/docs/reference/kubernetes-api/extend-resources/validating-admission-policy-v1alpha1/ usable; I wonder whether (as suggested at the end of that review PDF you shared) C2P could use ValidatingAdmissionPolicy objects. In that future, does C2P need MutatingWebhookConfiguration, ValidatingWebhookConfiguration, LimitRange, or ResourceQuota? More precisely, could we demo an interesting subset of C2P without using those resources?
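
For reference, a minimal sketch of what such a policy could look like under the `admissionregistration.k8s.io/v1alpha1` API that Kubernetes 1.26 ships behind the ValidatingAdmissionPolicy feature gate; the policy name, match rule, and CEL expression below are invented, not anything C2P has defined, and a ValidatingAdmissionPolicyBinding (omitted here) is also needed to put the policy into effect.

```yaml
# Illustrative ValidatingAdmissionPolicy (Kubernetes 1.26, v1alpha1, behind a
# feature gate). Names and the CEL expression are examples only.
apiVersion: admissionregistration.k8s.io/v1alpha1
kind: ValidatingAdmissionPolicy
metadata:
  name: example-replica-limit
spec:
  failurePolicy: Fail
  matchConstraints:
    resourceRules:
    - apiGroups: ["apps"]
      apiVersions: ["v1"]
      operations: ["CREATE", "UPDATE"]
      resources: ["deployments"]
  validations:
  - expression: "object.spec.replicas <= 5"
    message: "replicas must be no more than 5"
```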

Contributor

> could we demo an interesting subset of C2P without using those resources?

Kyverno does not require syncing those resources for the C2P demo. MutatingWebhookConfiguration and/or ValidatingWebhookConfiguration are generated after the Kyverno runtime is deployed on the p-cluster.

@MikeSpreitzer (Contributor Author), Mar 17, 2023

@yuji-watanabe-jp : thank you, I just want to follow up to make sure I understand correctly. When the Kyverno controller generates a MutatingWebhookConfiguration or ValidatingWebhookConfiguration object in the edge cluster, we have to be careful about which apiserver this object is written to. It needs to be written to the apiserver that it is intended to affect. My concern is what happens while using the TMC syncer. If the webhook configuration object is written to the mailbox workspace but is intended to configure calls to a container running in the edge cluster then this will be a broken configuration. Is the Kyverno controller described by a Deployment object that will be downsynced with the TMC syncer rewiring its apiserver client config to point back to the mailbox workspace?
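
To make the concern concrete, here is a minimal, hypothetical webhook registration (names invented, though Kyverno's real service is similarly referenced by name and namespace). The apiserver that stores this object resolves the `service` reference in its own cluster when it makes admission calls, so registering it in a mailbox workspace while the backing Service runs only in the edge cluster would leave those calls with nothing to reach.

```yaml
# Hypothetical webhook registration; names are illustrative only.
# The apiserver holding this object calls kyverno-svc in namespace kyverno
# *in its own cluster* during admission, which is why it must be written to
# the apiserver it is intended to affect.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: example-kyverno-resource-validation
webhooks:
- name: validate.example.kyverno.io
  admissionReviewVersions: ["v1"]
  sideEffects: None
  clientConfig:
    service:
      name: kyverno-svc
      namespace: kyverno
      path: /validate
  rules:
  - apiGroups: ["*"]
    apiVersions: ["*"]
    operations: ["CREATE", "UPDATE"]
    resources: ["*"]
```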

Contributor

@MikeSpreitzer If I am understanding your questions correctly, I believe those two webhook configurations do not affect the TMC syncer & mailbox workspace. Here are the steps to deploy Kyverno via the syncer.

  1. The Kyverno deployment is synced from the mailbox workspace to the edge cluster.
  2. The Kyverno service (webhook API backend) runs on the edge cluster.
  3. The Kyverno service creates the MutatingWebhookConfiguration or ValidatingWebhookConfiguration object on the edge cluster.
  4. The webhook is enabled on the edge cluster.

Contributor

@pdettori you are right. We have been deploying Kyverno to the p-cluster with the TMC syncer + OLM and confirmed that it works.
Without OLM, we need to see how to install cluster-scoped resources (ClusterRoleBinding and ClusterRole) via the TMC syncer.
So we can keep OLM as the choice for installation via the TMC syncer. The issue of TMC upsync of the policy reports still remains.

The edge syncer will not apply such mutation, so this problem will never happen there, and that resolves the upsync issue.

Contributor

@yuji-watanabe-jp Cluster-scoped deployment and the webhook issue are different. Just to emphasize (and maybe close :-) ) this discussion: if you plan to deploy Kyverno through KCP (i.e., into a KCP workspace), then due to the mutation that TMC performs on the deployed pods your webhooks probably won't work correctly. If you can (for the first stage) deploy the controllers directly on the pCluster(s), then you avoid the TMC mutation and your pCluster webhooks will probably work without an issue.

Contributor

The upsync seems not related to this, as in both cases (deployed through KCP or externally to KCP) we can do the same: (1) define through KCP the criteria for what to upsync; (2) a controller on the pCluster (part of the Kyverno syncer) will need to support this upsync (e.g., set the labels/annotations, etc.).

Contributor Author

Is the controller that creates the policy objects something that we control? Can we make it put needed labels and annotations on the policy objects, at least in the TMC interim?

Contributor

We control it; we can add them.

APIExport, and EdgePlacement objects that (1) select those
APIBinding objects for downsync and (2) select objects of kinds
defined through those APIBindings for either downsync or upsync.
- Those resources are built into the edge clusters by pre-deploying
Contributor

Probably only the resources for upsync need to be pre-deployed, right? Won't the ones we downsync be deployed by the downsync itself?

@MikeSpreitzer (Contributor Author), Mar 19, 2023

Downsync cannot propagate an APIBinding all the way to the edge cluster because the edge cluster is not a kcp workspace.
Also, an APIExport, and thus an APIBinding, is for a whole API group; one cannot use that mechanism to import a subset of what was exported. It would be unnatural for C2P to define their policies in one API group and their reports in a different one.
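
A hedged sketch of that whole-group granularity, using the `apis.kcp.io/v1alpha1` shapes (the exact group/version and reference fields vary by kcp release, and the export name, workspace path, and resource schema names here are invented). The APIBinding can only reference the export as a whole, so both the policies and the reports schemas come along with it.

```yaml
# Illustrative only; field shapes follow apis.kcp.io/v1alpha1 and may differ
# by kcp release. The binding pulls in everything the export lists, never a
# subset of it.
apiVersion: apis.kcp.io/v1alpha1
kind: APIExport
metadata:
  name: example-policy-apis
spec:
  latestResourceSchemas:
  - v1.policies.example.io
  - v1.policyreports.example.io
---
apiVersion: apis.kcp.io/v1alpha1
kind: APIBinding
metadata:
  name: example-policy-apis
spec:
  reference:
    export:
      path: root:example-provider
      name: example-policy-apis
```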

@ezrasilvera (Contributor), Mar 19, 2023

I was just referring to the "by pre-deploying" part. Why do we need to pre-deploy into the edge cluster if we downsync them as well?

Contributor Author

The point of this particular scenario is for the mailbox workspaces to have APIBindings to workload APIExports, so that the workload APIExport view holds the workload objects as they sit in the mailbox workspaces, particularly the upsynced resources, since this is something that the C2P people expected.

view includes all of the objects (in workload management workspaces,
in mailbox workspaces, and in any other workspaces where they
appear) whose kind is defined through those APIBindings.

Contributor

Is there a diagram that describes those relations? It may help to understand the mappings of bindings, exports, and resources between the workload WS, mailbox WS, and pCluster.

Contributor Author

I have not drawn a picture yet.
Is there information missing from the textual description?

Collaborator

I also think a diagram would help explain the flow here. I get the need for pre-deployment, but is there a possibility for these resources to get out-of-sync?

Contributor Author

(1) Drawings almost always help. I hope the lack of one does not hold up merging this.

(2) The "sync" done by our processes is eventually consistent, which explicitly admits there may be interim periods of inconsistency.

(3) I suspect your concern is consistency that our automated processes do not even attempt to maintain (e.g., between a resource's schema built into edge clusters vs. the schema given for the same resource in the center). That can be lacking from the start as well as crop up later! This is an issue I tried to explicitly defer for later consideration.

(4) The scenario outlined here is no longer the one that we intend to use for the Compliance-to-Policy work. For C2P we intend to drive CRDs from the workload management workspace through mailbox workspaces to edge clusters.

Collaborator

I am ok with that answer.
/lgtm

@MikeSpreitzer (Contributor Author)

In the latest kcp community meeting, @davidfestal gave an outline of the answer to my question (kcp-dev/contrib-tmc#34) about the NamespaceLocator mentioned wrt upsync in kcp-dev/kcp#1980 and said he would document the answer.

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 24, 2023
@MikeSpreitzer (Contributor Author)

/approve

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2023
@MikeSpreitzer (Contributor Author)

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 24, 2023
@MikeSpreitzer (Contributor Author)

/unapprove

@MikeSpreitzer MikeSpreitzer removed the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2023
@MikeSpreitzer (Contributor Author)

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 24, 2023
@MikeSpreitzer (Contributor Author)

/approve

@openshift-ci bot commented Mar 24, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: MikeSpreitzer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 24, 2023
@openshift-merge-robot openshift-merge-robot merged commit 5cc524e into kubestellar:main Mar 24, 2023
@MikeSpreitzer MikeSpreitzer deleted the roadmap-and-binding branch September 26, 2023 05:15
@kcp-ci-bot kcp-ci-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Nov 23, 2023