-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Graduate downwardMetrics to feature KV lifecycle #12650
Conversation
Skipping CI for Draft Pull Request. |
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
b664fc7
to
712d686
Compare
6faa6bd
to
f3cbf37
Compare
f3cbf37
to
a91adee
Compare
f5868b7
to
065ce62
Compare
/retest-required |
Could @EdDev take a look at this, please? |
065ce62
to
7824733
Compare
7824733
to
305209e
Compare
7a699a9
to
d7c0ff3
Compare
v2: VM do not start, i.e., it does not create a VMI object, if an invalid configuration is detected. |
pkg/virt-config/virt-config.go
Outdated
@@ -107,6 +107,10 @@ func IsS390X(arch string) bool { | |||
return arch == "s390x" | |||
} | |||
|
|||
func (c *ClusterConfig) IsDownwardMetricsEnabled() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IsDownwardMetricsEnabled
and DownwardMetricsEnabled
are very similar names that can easily be confused. Can you give it another name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to IsDownwardMetricsFeatureEnabled
.
pkg/virt-controller/watch/vm/vm.go
Outdated
@@ -1351,6 +1352,13 @@ func (c *Controller) startVMI(vm *virtv1.VirtualMachine) (*virtv1.VirtualMachine | |||
return vm, fmt.Errorf("failed create validation: %v", validateErr) | |||
} | |||
|
|||
if downwardmetrics.IsDownwardMetricsConfigurationInvalid(c.clusterConfig, &vmi.Spec) { | |||
err = fmt.Errorf("DownwardMetrics feature is not enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errors.New
is sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/virt-controller/watch/vm/vm.go
Outdated
@@ -3255,6 +3263,12 @@ func (c *Controller) sync(vm *virtv1.VirtualMachine, vmi *virtv1.VirtualMachineI | |||
vm = vmCopy | |||
} | |||
|
|||
if vmi == nil { | |||
if downwardmetrics.IsDownwardMetricsConfigurationInvalid(c.clusterConfig, &vm.Spec.Template.Spec) { | |||
return vm, common.NewSyncError(fmt.Errorf("DownwardMetrics feature is not enabled"), controller.FeatureNotEnabled), nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fmt.Errorf("DownwardMetrics feature is not enabled")
Make this a const or a custom error in the downwardmetrics pkg?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I've also replaced all occurrences of this string with the custom error variable.
}) | ||
|
||
AfterEach(func() { | ||
tests.EnableDownwardMetrics(virtClient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be reset to original value, i.e. reset the KV object after? I think we have helpers for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would be the benefit of reset the entire KV object? IMHO, if we are only changing one specific value, it makes sense to just reset that value back.
@@ -75,6 +75,11 @@ func AdjustKubeVirtResource() { | |||
}, | |||
}} | |||
|
|||
// Add the DownwardMetrics configuration, it will avoid to make some test to run on serial | |||
if kv.Spec.DownwardMetrics == nil { | |||
kv.Spec.DownwardMetrics = &v1.DownwardMetricsConfiguration{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This enables the feature which should be off by default for everyone else? I don't think this is good practice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't enable it by default, live migration downwardMetric tests will need to run on serial because we are updating the KubeVirt CR. In the past, in order to allow these test to run on parallel, we enabled the downwardMetrics FG by default in the tests. This is why I've included this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is marked as Serial already? I'd rather not enable something globally to avoid Serial just for a single test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the particular test I'm referring to: https://github.com/kubevirt/kubevirt/pull/12650/files#diff-5034f7c82b377dd518dd147554ad7df560b0cc667f1f0aa08e7a632d1436237aR471
It's just two tests, but if we disabled the feature globally, we need to make them Serial. I'm not against it, whatever we consider best.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the tests require it to be enabled but it is not by default, then they should be Serial.
kv := libkubevirt.GetCurrentKv(client) | ||
kv.Spec.DownwardMetrics = &v1.DownwardMetricsConfiguration{} | ||
|
||
updateKubevirtSpec(kv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use libkubevirt.UpdateKubeVirtConfigValueAndWait
and drop updateKubevirtSpec
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I was able to understand, UpdateKubeVirtConfigValueAndWait
will just update the kv.Spec.Configuration
, while the downwardMetrics configurable is under kv.Spec
. Therefore, if I use UpdateKubeVirtConfigValueAndWait
the change won't be reflected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, can libkubevirt
be updated to do what we want?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are updating the kv.Spec
, that's what updateKubevirtSpec
is already doing. Sorry, maybe I'm missing something here. Could you please elaborate a bit more?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still, can libkubevirt be used instead? And the Disable/Enable helpers should live closer to where they are used. We should not grow tests/utils.go
at all.
@@ -661,6 +662,9 @@ func (c *Controller) updateStatus(vmi *virtv1.VirtualMachineInstance, pod *k8sv1 | |||
c.syncVolumesUpdate(vmiCopy) | |||
} | |||
|
|||
// Requires some action in the VM to trigger this check, is this ok? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a good question.
@xpivarc @fossedihelm Can you help us shed some light on this? Will the VMs/VMIs be resynced at some point?
Also, can you please highlight a bit more that this is a breaking change? The feature moves from being enabled to being disabled by default. |
Yet another piece of feedback: Could you please split deprecation of the FG and adding the new configurable into separate PRs? |
It deprecates the `DownwardMetrics` feature gate, since this feature already reached GA. This feature gate is not longer required to use the downwardMetrics feature. Signed-off-by: Javier Cano Cano <[email protected]>
Currently, this function only accepts VMI objects to determine if a downwardMetrics volume is being used. It generalizes the function to accept a list of Volumes instead. Signed-off-by: Javier Cano Cano <[email protected]>
It add a new configurable field to enable/disable the downwardMetrics feature cluster-wide. Signed-off-by: Javier Cano Cano <[email protected]>
d7c0ff3
to
2da9406
Compare
Currently, the feature is disabled by default. If the user does not enable the FG, the feature keeps disabled. However, it is true that users with the feature already enabled need some manual action or if they reboot a VM, it would fail to start. |
I would like to keep it the same PR if possible. If we deprecate the FG and then add the new configurable in a separate PR, it would create a gap in which the FG will be enabled by default. I would like to avoid this scenario by all means. This is because having this feature enabled by default could be considered a security issue. |
Enables the field: `spec.downwardMetrics: {}`. The configurable is disabled by default and it brings the option to the user to enable it cluster-wide if desired. Moreover, it adds checks to block a VM starting if the feature is out of sync and runtime checks to inform the user when the feature is out of sync too. Signed-off-by: Javier Cano Cano <[email protected]>
2da9406
to
bceced8
Compare
@jcanocan: The following tests failed, say
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@@ -1026,6 +1030,10 @@ func (c *Controller) sync(vmi *virtv1.VirtualMachineInstance, pod *k8sv1.Pod, da | |||
if validateErr := errors.Join(validateErrors...); validateErrors != nil { | |||
return common.NewSyncError(fmt.Errorf("failed create validation: %v", validateErr), "FailedCreateValidation") | |||
} | |||
if downwardmetrics.IsDownwardMetricsConfigurationInvalid(c.clusterConfig, &vmi.Spec) { | |||
c.recorder.Eventf(vmi, k8sv1.EventTypeWarning, controller.FeatureNotEnabled, downwardmetrics.DownwardMetricsNotEnabledError.Error()) | |||
return common.NewSyncError(fmt.Errorf("virtual machine is requesting a disabled feature: %s", "DownwardMetrics"), controller.FeatureNotEnabled) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use `downwardmetrics.DownwardMetricsNotEnabledError here too?
The use of fmt.Error with static strings looks weird.
@@ -17,6 +19,8 @@ const ( | |||
DownwardMetricsChannelSocket = DownwardMetricsChannelDir + "/downwardmetrics.sock" | |||
) | |||
|
|||
var DownwardMetricsNotEnabledError = errors.New("DownwardMetrics feature is not enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
var DownwardMetricsNotEnabledError = errors.New("DownwardMetrics feature is not enabled") | |
var NotEnabledError = errors.New("DownwardMetrics feature is not enabled") |
When used as downwardmetrics.NotEnabledError
this avoids repetition.
@@ -580,6 +580,9 @@ const ( | |||
|
|||
// Indicates whether the VMI is live migratable | |||
VirtualMachineInstanceIsStorageLiveMigratable VirtualMachineInstanceConditionType = "StorageLiveMigratable" | |||
|
|||
// Indiates that the VMI has a configuration out of sync with the cluster-wide configuration | |||
VirtualMachineInstanceConfigurationOutOfSync VirtualMachineInstanceConditionType = "FeatureConfigurationOutOfSync" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VirtualMachineInstanceConfigurationOutOfSync VirtualMachineInstanceConditionType = "FeatureConfigurationOutOfSync" | |
VirtualMachineInstanceConfigurationOutOfSync VirtualMachineInstanceConditionType = "ConfigurationOutOfSync" |
BeforeEach(func() { | ||
virtClient = kubevirt.Client() | ||
tests.EnableDownwardMetrics(virtClient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Still it looks strange to me that we enable DownwardMetrics globally in the test setup and then again in the BeforeEach of these tests.
Consistently(func() bool { | ||
_, err := virtClient.VirtualMachineInstance(vm.Namespace).Get(context.Background(), vm.Name, metav1.GetOptions{}) | ||
return errors.IsNotFound(err) | ||
}, 60*time.Second, 5*time.Second).Should(BeTrue()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert the condition on the VM instead and try once to get the VMI expecting an IsNotFound error as not to prolong the test unnecessarily?
var vmi *v1.VirtualMachineInstance | ||
|
||
BeforeEach(func() { | ||
tests.EnableDownwardMetrics(virtClient) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, it looks strange to me that we call this in BeforeEach/AfterEach and in the global test setup.
@@ -75,6 +75,11 @@ func AdjustKubeVirtResource() { | |||
}, | |||
}} | |||
|
|||
// Add the DownwardMetrics configuration, it will avoid to make some test to run on serial | |||
if kv.Spec.DownwardMetrics == nil { | |||
kv.Spec.DownwardMetrics = &v1.DownwardMetricsConfiguration{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is marked as Serial already? I'd rather not enable something globally to avoid Serial just for a single test.
kv := libkubevirt.GetCurrentKv(client) | ||
kv.Spec.DownwardMetrics = &v1.DownwardMetricsConfiguration{} | ||
|
||
updateKubevirtSpec(kv) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, can libkubevirt
be updated to do what we want?
Since the potential concerns about the breaking change that this implies. Let's split the graduation in multiple releases/PRs. |
/close |
@jcanocan: Closed this PR. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
What this PR does
It deprecates the
DownwardMetrics
feature gate in favor of a configurable field:spec.downwardMetrics: {}
. The configurable is disabled by default, and it brings the option to the user to enable it cluster-wide if desired. Moreover, it adds checks to block a VM starting if the feature is out of sync and runtime checks to inform the user when the feature is out of sync too.Fixes # CNV-43919
Why we need it and why it was done in this way
The following tradeoffs were made:
The following alternatives were considered:
Links to places where the discussion took place:
Special notes for your reviewer
You can use the following steps to try it yourself:
Pending
state.$ ./cluster-up/kubectl.sh patch kubevirt kubevirt -n kubevirt --type json -p '[{"op":"add", "path":"/spec/downwardMetrics", "value": {}}]'
$ ./cluster-up/kubectl.sh patch kubevirt kubevirt -n kubevirt --type json -p '[{"op":"remove", "path":"/spec/configuration/downwardMetrics", "value": {}}]'
Checklist
This checklist is not enforcing, but it's a reminder of items that could be relevant to every PR.
Approvers are expected to review this list.
Release note