WIP: Make node ready only after static pods are registered #2078

Open
wants to merge 2 commits into
base: master
from

Conversation

haircommander
Member

What type of PR is this?

/kind bug

What this PR does / why we need it:

for testing kubernetes#126870

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@openshift-ci-robot added the backports/unvalidated-commits label (Indicates that not all commits come to merged upstream PRs.) on Sep 6, 2024
@openshift-ci-robot

@haircommander: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@openshift-ci bot added the do-not-merge/work-in-progress (Indicates that a PR should not merge because it is a work in progress.) and kind/bug (Categorizes issue or PR as related to a bug.) labels on Sep 6, 2024

openshift-ci bot commented Sep 6, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: haircommander
Once this PR has been reviewed and has the lgtm label, please assign mrunalp for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@haircommander changed the title from "WIP: UPSTREAM: carry: Make node ready only after static pods are registered" to "UPSTREAM: carry: Make node ready only after static pods are registered" on Sep 6, 2024
@openshift-ci bot removed the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Sep 6, 2024
@haircommander changed the title from "UPSTREAM: carry: Make node ready only after static pods are registered" to "test Make node ready only after static pods are registered" on Sep 6, 2024
@haircommander
Member Author

/payload-job


openshift-ci bot commented Sep 6, 2024

@haircommander: it appears that you have attempted to use some version of the payload command, but your comment was incorrectly formatted and cannot be acted upon. See the docs for usage info.

@haircommander
Member Author

/payload 4.18 nightly blocking


openshift-ci bot commented Sep 6, 2024

@haircommander: trigger 9 job(s) of type blocking for the nightly release of OCP 4.18

  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.18-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-fips-payload-scan
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5c23b740-6c76-11ef-884f-b3f32e2bdbf7-0

Node registration and pod syncs run in separate goroutines, which
creates a potential race condition. Static pods might not get registered
because the kubelet has not yet registered the node, so the scheduler,
unaware of the static pods' resource usage, overcommits the node. The
kubelet then rejects pods due to insufficient resources.

The initial fix made the node schedulable only after static pod
registration, but this introduced a latency of 1 to 1.5 minutes because
of the kubelet's pod resync interval.

To remove this latency, static pods are now resynced immediately upon
node registration, so the node becomes ready without additional delay.
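
The ordering described in the commit message above can be illustrated with a small, self-contained Go sketch. This is not the kubelet's code: the `node` type, `podSyncLoop`, `readyBeforeResync`, and the one-hour resync interval are all hypothetical names and values chosen for illustration. It only shows the general pattern of a sync loop that normally waits on a ticker but is also woken by a "node registered" signal, so static pod registration does not have to wait for the next resync tick.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// node is a hypothetical stand-in for kubelet state; none of these names
// are taken from the kubelet source.
type node struct {
	mu              sync.Mutex
	registered      bool // node object has been registered
	staticPodsKnown bool // static pods have been registered
}

// ready reports whether the node may be advertised as ready: it must be
// registered and its static pods must have been registered.
func (n *node) ready() bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	return n.registered && n.staticPodsKnown
}

// registerNode marks the node as registered and wakes the pod sync loop
// by closing registeredCh.
func (n *node) registerNode(registeredCh chan<- struct{}) {
	n.mu.Lock()
	n.registered = true
	n.mu.Unlock()
	close(registeredCh)
}

// syncStaticPods registers static pods, which can only succeed once the
// node itself is registered. It returns true on success.
func (n *node) syncStaticPods() bool {
	n.mu.Lock()
	defer n.mu.Unlock()
	if !n.registered {
		return false
	}
	n.staticPodsKnown = true
	return true
}

// podSyncLoop retries static pod registration on every resync tick and,
// additionally, as soon as registeredCh is closed. It closes done once
// the static pods are registered.
func (n *node) podSyncLoop(resync time.Duration, registeredCh <-chan struct{}, done chan<- struct{}) {
	ticker := time.NewTicker(resync)
	defer ticker.Stop()
	for {
		if n.syncStaticPods() {
			close(done)
			return
		}
		select {
		case <-registeredCh:
			// A nil channel blocks forever, so this case fires only once.
			registeredCh = nil
		case <-ticker.C:
		}
	}
}

// readyBeforeResync runs registration and pod sync in separate goroutines
// and reports whether the node became ready before the first resync tick.
func readyBeforeResync() bool {
	const resync = time.Hour // deliberately long; assumed value for the demo
	n := &node{}
	registeredCh := make(chan struct{})
	done := make(chan struct{})
	start := time.Now()

	go n.podSyncLoop(resync, registeredCh, done)
	go func() {
		time.Sleep(10 * time.Millisecond) // simulated registration delay
		n.registerNode(registeredCh)
	}()

	<-done
	return n.ready() && time.Since(start) < resync
}

func main() {
	fmt.Println("ready before first resync tick:", readyBeforeResync())
}
```

Without the `registeredCh` case in the `select`, the loop in this sketch would only notice the registration on the next ticker tick, which mirrors the resync-interval latency the commit message describes.
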
@openshift-ci-robot

@haircommander: the contents of this pull request could not be automatically validated.

The following commits could not be validated and must be approved by a top-level approver:

Comment /validate-backports to re-evaluate validity of the upstream PRs, for example when they are merged upstream.

@haircommander changed the title from "test Make node ready only after static pods are registered" to "WIP: Make node ready only after static pods are registered" on Oct 3, 2024
@openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Oct 3, 2024
@haircommander
Member Author

/payload 4.18 nightly blocking


openshift-ci bot commented Oct 3, 2024

@haircommander: trigger 9 job(s) of type blocking for the nightly release of OCP 4.18

  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-rt-upgrade
  • periodic-ci-openshift-hypershift-release-4.18-periodics-e2e-aws-ovn-conformance
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-fips-payload-scan
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-bm
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-ipv6

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8e7bc7a0-819d-11ef-9569-8f6afbeb2dae-0

@kannon92

kannon92 commented Oct 3, 2024

/payload 4.18 nightly informing


openshift-ci bot commented Oct 3, 2024

@kannon92: trigger 73 job(s) of type informing for the nightly release of OCP 4.18

  • periodic-ci-openshift-release-master-nightly-4.18-e2e-agent-compact-ipv4-conformance
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-agent-ha-dualstack-conformance
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-agent-single-node-ipv6
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.18-console-aws
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.18-periodics-e2e-aws
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-csi
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-cgroupsv2
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-fips
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-csi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-serial
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-single-node-techpreview-serial
  • periodic-ci-openshift-release-master-nightly-4.18-upgrade-from-stable-4.17-e2e-aws-upgrade-ovn-single-node
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-aws-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-upi
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.18-periodics-e2e-azure
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-azure-csi
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-azure-ovn-upgrade-out-of-change
  • periodic-ci-openshift-release-master-cnv-nightly-4.18-e2e-azure-deploy-cnv
  • periodic-ci-openshift-release-master-cnv-nightly-4.18-e2e-azure-upgrade-cnv
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-driver-toolkit
  • periodic-ci-openshift-cluster-control-plane-machine-set-operator-release-4.18-periodics-e2e-gcp
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-gcp-ovn-csi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-gcp-ovn-rt
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-gcp-ovn-serial
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn-techpreview
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-master-ci-4.18-e2e-gcp-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-bm-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-dualstack
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-dualstack-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-ipv6-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-serial-ipv4
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-serial-virtualmedia
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-ovn-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-upgrade-from-stable-4.17-e2e-metal-ipi-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-serial-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-serial-ovn-dualstack
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-upgrade-from-stable-4.17-e2e-metal-ipi-upgrade-ovn-ipv6
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ovn-assisted
  • periodic-ci-openshift-release-master-nightly-4.18-metal-ovn-single-node-recert-cluster-rename
  • periodic-ci-openshift-microshift-release-4.18-periodics-e2e-aws-ovn-ocp-conformance
  • periodic-ci-openshift-microshift-release-4.18-periodics-e2e-aws-ovn-ocp-conformance-serial
  • periodic-ci-openshift-osde2e-main-nightly-4.18-osd-aws
  • periodic-ci-openshift-osde2e-main-nightly-4.18-conformance-osd-aws
  • periodic-ci-openshift-osde2e-main-nightly-4.18-osd-gcp
  • periodic-ci-openshift-osde2e-main-nightly-4.18-conformance-osd-gcp
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-aws-ovn-proxy
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-metal-ovn-single-node-live-iso
  • periodic-ci-openshift-osde2e-main-nightly-4.18-rosa-classic-sts
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-telco5g
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-aws-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-csi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-serial
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-techpreview
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-techpreview-serial
  • periodic-ci-openshift-release-master-ci-4.18-upgrade-from-stable-4.17-e2e-vsphere-ovn-upgrade
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-ovn-upi-serial
  • periodic-ci-openshift-release-master-nightly-4.18-e2e-vsphere-static-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1bfa5e50-81c8-11ef-96d6-03534b196db9-0

@rphillips

/test e2e-aws-ovn-serial

Labels

  • backports/unvalidated-commits: Indicates that not all commits come to merged upstream PRs.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • kind/bug: Categorizes issue or PR as related to a bug.

5 participants