Race Condition with installed extras (metrics-server, custom role bindings) #946

Open
tcrowder-koerber opened this issue Sep 10, 2024 · 1 comment
Labels
bug Something isn't working

tcrowder-koerber commented Sep 10, 2024

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version and Provider Version

Terraform 1.7.5
OCI provider 6.9.0

Affected Resource(s)

metrics server, autoscaler and custom role bindings

Terraform Configuration Files

OKE Module 5.1.8

Attributes:
cluster_type = "enhanced"
create_service_account = true
service_accounts = merge({
  kubeconfig-sa = {
    sa_namespace = "kube-system"
    sa_name = "kubeconfig-sa"
    sa_cluster_role = "cluster-admin"
    sa_cluster_role_binding = "sa-crb"
  }
}, var.cluster_service_accounts)
create_iam_resources = true
create_operator = true
operator_install_helm = true
cluster_autoscaler_install = true
metrics_server_install = true

...
nodepool-1 = {
  allow_autoscaler = true
}

Debug Output

No related debug output. Generally the autoscaler and metrics-server install OK, but creation of the CRB fails while the apply still reports success. Before I added the CRB, metrics-server was the resource that had issues. The clusters are created fine and kubectl works, but either metrics-server or the CRB is missing. Because this fails so often, I now run post-installation validation checks on these resources.

My workaround is to set the values for the affected resources to false and apply, then set them back to true and apply again.
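The post-installation validation mentioned above could look roughly like the sketch below. It is not the reporter's actual script. It assumes metrics-server runs as a Deployment named metrics-server in kube-system (an assumption, not confirmed in this issue) and takes the CRB name sa-crb from the configuration above. The function names and the KUBECTL override are hypothetical helpers introduced for illustration.

```shell
#!/bin/sh
# Post-apply validation sketch. KUBECTL can be overridden (e.g. with a stub).
KUBECTL="${KUBECTL:-kubectl}"

# check_resource KIND NAME [NAMESPACE]
# Returns 0 if `kubectl get` finds the resource, non-zero otherwise.
check_resource() {
  kind=$1
  name=$2
  namespace=${3:-}
  if [ -n "$namespace" ]; then
    "$KUBECTL" get "$kind" "$name" --namespace "$namespace" >/dev/null 2>&1
  else
    "$KUBECTL" get "$kind" "$name" >/dev/null 2>&1
  fi
}

# validate_extras
# Checks every expected resource, reports each missing one on stderr,
# and returns 1 if anything is missing.
validate_extras() {
  missing=0
  # Assumed Deployment name and namespace for metrics-server.
  check_resource deployment metrics-server kube-system ||
    { echo "MISSING: deployment/metrics-server" >&2; missing=1; }
  # CRB name taken from sa_cluster_role_binding in the configuration.
  check_resource clusterrolebinding sa-crb ||
    { echo "MISSING: clusterrolebinding/sa-crb" >&2; missing=1; }
  return "$missing"
}
```

Calling validate_extras right after terraform apply and failing the pipeline on a non-zero status would surface the missing resources that the apply itself does not report.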

Panic Output

Expected Behavior

The apply should either create the resources or fail with an error.

Actual Behavior

The apply succeeds but some resources such as metrics-server or custom role bindings are not present.

Steps to Reproduce

  1. terraform apply

Important Factoids

My workaround is to set the values for the affected resources to false and apply, then set them back to true and apply again.
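The two-pass workaround can be scripted along these lines. This is a sketch under a strong assumption: that the root configuration exposes metrics_server_install and create_service_account as input variables and passes them through to the OKE module. If the values are hard-coded in the module block, they have to be edited there instead. The function names and the TERRAFORM override are hypothetical.

```shell
#!/bin/sh
# Two-pass workaround sketch. TERRAFORM can be overridden (e.g. with a stub).
TERRAFORM="${TERRAFORM:-terraform}"

# apply_with_extras VALUE
# Runs one apply with both extras set to VALUE (true or false).
# Assumes root-level variables with these names exist.
apply_with_extras() {
  "$TERRAFORM" apply -auto-approve \
    -var "metrics_server_install=$1" \
    -var "create_service_account=$1"
}

# reapply_extras
# First pass disables the extras, second pass enables them again.
# Stops after the first pass if that apply fails.
reapply_extras() {
  apply_with_extras false && apply_with_extras true
}
```

Note that -auto-approve skips the interactive plan review, so it is only appropriate where that is acceptable.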

References

Member

robo-cap commented Sep 20, 2024

Regarding the creation of the RB/CRB, some of the commands may be failing silently. Please check the note in the Terraform documentation for remote-exec.

Please add set -o errexit as the first command in the list of inline commands executed to create the CRB, here, and attach any log that may explain why the resource creation fails.
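As I understand the remote-exec note, inline commands are concatenated into a single script, so only the exit status of the last command decides whether the provisioner fails. The sketch below reproduces that effect locally without Terraform. Here `false` stands in for a failing kubectl command, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Two stand-in scripts: `false` plays the role of a failing kubectl command.
without_errexit='false
echo "later command ran"'

with_errexit='set -o errexit
false
echo "later command ran"'

# Run each script in a child shell and capture its exit status.
status_without=0
sh -c "$without_errexit" >/dev/null 2>&1 || status_without=$?

status_with=0
sh -c "$with_errexit" >/dev/null 2>&1 || status_with=$?

# Without errexit the failure is masked by the final echo;
# with errexit the script stops at the failing command.
echo "without errexit: exit status $status_without"
echo "with errexit:    exit status $status_with"
```

The first script exits with status 0 even though its first command failed, which matches the symptom in this issue: the apply succeeds while the resource is never created.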
