fix(Azure DNS): external_dns_registry_errors_total metrics counter value #4563

Open
wants to merge 5 commits into base: master

Conversation

dongjiang1989
Contributor

@dongjiang1989 dongjiang1989 commented Jun 19, 2024

Description

The Azure provider should increment external_dns_registry_errors_total when changes cannot be applied, that is, when records cannot be found, updated, deleted, or created.

Fixes #4510

Checklist

  • Unit tests updated
  • End user documentation updated

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign mloiseleur for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jun 19, 2024
@k8s-ci-robot
Contributor

Hi @dongjiang1989. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 19, 2024
@dongjiang1989
Contributor Author

@mloiseleur PTAL

@mloiseleur
Contributor

Thanks for this PR.
Do you think you can add a test to ensure it won't happen again?

@dongjiang1989
Contributor Author

dongjiang1989 commented Jun 19, 2024

> Thanks for this PR. Do you think you can add a test to ensure it won't happen again?

Hmm, it seems difficult to add a unit test case for this.

records, err := c.Registry.Records(ctx)
if err != nil {
    registryErrorsTotal.Inc()
    deprecatedRegistryErrors.Inc()
    return err
}

err = c.Registry.ApplyChanges(ctx, plan.Changes)
if err != nil {
    registryErrorsTotal.Inc()
    deprecatedRegistryErrors.Inc()
    return err
}

Is there a better way to add a unit test case? 🤔️
@mloiseleur I just return the error cases. PTAL
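
For illustration, one way to unit test this kind of logic is to put a fake registry behind a small interface and check the counter after a single run. The sketch below is not the project's code: `counter`, `registry`, `fakeRegistry`, `runOnce`, and `errorsAfterRun` are hypothetical names, and the counter is a plain stand-in rather than a real Prometheus metric.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errBoom is an arbitrary failure used to force the error paths.
var errBoom = errors.New("boom")

// counter is a stand-in for a metrics counter (hypothetical, illustration only).
type counter struct{ n int }

func (c *counter) Inc() { c.n++ }

// registry mirrors only the two calls that appear in the snippet above.
type registry interface {
	Records(ctx context.Context) ([]string, error)
	ApplyChanges(ctx context.Context, changes []string) error
}

// fakeRegistry returns preconfigured errors so a test can pick the failing call.
type fakeRegistry struct {
	recordsErr error
	applyErr   error
}

func (f fakeRegistry) Records(context.Context) ([]string, error) { return nil, f.recordsErr }

func (f fakeRegistry) ApplyChanges(context.Context, []string) error { return f.applyErr }

// runOnce is a simplified version of the control-loop logic quoted above.
func runOnce(ctx context.Context, r registry, errs *counter) error {
	records, err := r.Records(ctx)
	if err != nil {
		errs.Inc()
		return err
	}
	if err := r.ApplyChanges(ctx, records); err != nil {
		errs.Inc()
		return err
	}
	return nil
}

// errorsAfterRun performs one run against r and returns the counter value.
func errorsAfterRun(r registry) int {
	c := &counter{}
	_ = runOnce(context.Background(), r, c)
	return c.n
}

func main() {
	fmt.Println(errorsAfterRun(fakeRegistry{}))                    // 0
	fmt.Println(errorsAfterRun(fakeRegistry{recordsErr: errBoom})) // 1
	fmt.Println(errorsAfterRun(fakeRegistry{applyErr: errBoom}))   // 1
}
```

With a real Prometheus counter, the same assertion could presumably be made by reading the metric value through the client library's test helpers instead of the `n` field.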

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 27, 2024
@dongjiang1989
Contributor Author

@mloiseleur @szuecs PTAL re-check

dongjiang1989 and others added 2 commits July 4, 2024 21:24
Co-authored-by: Michel Loiseleur <[email protected]>
Co-authored-by: Michel Loiseleur <[email protected]>
@mloiseleur
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Jul 5, 2024
@dongjiang1989
Contributor Author

Thanks. @mloiseleur @szuecs Please re-check

@mloiseleur
Contributor

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 10, 2024
@dongjiang1989
Contributor Author

@mloiseleur Are you OK with merging it? Thanks

@mloiseleur
Contributor

@dongjiang1989 Another maintainer will do the final review when they have time. Thanks for your understanding.

@dongjiang1989
Contributor Author

cc @agimenoapinity

@dongjiang1989
Contributor Author

@johngmyers @szuecs PTAL when you have time.
Thanks.

-    return nil
+    err1 := p.deleteRecords(ctx, deleted)
+    err2 := p.updateRecords(ctx, updated)
+    return errors.Join(err1, err2)
Contributor

@Raffo Raffo Sep 17, 2024

Are we generally fine with the change in semantics of how we deal with the error? Bubbling this back up has obvious consequences. Specifically, do we want a soft error or not?
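
To make the soft-versus-hard distinction concrete, here is a minimal sketch. It assumes that "soft error" means an error the caller treats as retryable rather than fatal. The sentinel `errSoft`, the failure `errThrottled`, and the helpers `soft` and `handle` are hypothetical and are not taken from this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// errSoft is a hypothetical sentinel that marks an error as retryable.
var errSoft = errors.New("soft error")

// errThrottled is a hypothetical provider failure used for the demo.
var errThrottled = errors.New("request throttled")

// soft wraps err so that errors.Is(result, errSoft) reports true
// while the original error stays discoverable.
func soft(err error) error {
	return errors.Join(errSoft, err)
}

// handle shows one possible caller policy: retry on a soft error,
// treat every other non-nil error as fatal.
func handle(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, errSoft):
		return "retry on next sync"
	default:
		return "fatal"
	}
}

func main() {
	fmt.Println(handle(nil))                // ok
	fmt.Println(handle(soft(errThrottled))) // retry on next sync
	fmt.Println(handle(errThrottled))       // fatal
}
```

Under this assumption, whether the provider returns a soft or a plain error decides how the caller reacts, which is why returning errors where `nil` was returned before is a behavior change and not only a metrics change.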

Contributor Author

Using errors.Join makes it easier to see where an error occurred.
WDYT? @mloiseleur @johngmyers @szuecs
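
As a reference point for this discussion, the sketch below shows how the standard library's `errors.Join` (Go 1.20 and later) behaves: it returns `nil` when every input is `nil`, and otherwise keeps each non-nil error discoverable through `errors.Is`. The error values and helper names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	errDelete = errors.New("delete failed")
	errUpdate = errors.New("update failed")
)

// joinedIsNil reports whether errors.Join returns nil for the given inputs.
func joinedIsNil(errs ...error) bool {
	return errors.Join(errs...) == nil
}

// contains reports whether target can be found inside the joined error.
func contains(target error, errs ...error) bool {
	return errors.Is(errors.Join(errs...), target)
}

// message returns the joined error text, or "" when there is no error.
func message(errs ...error) string {
	if err := errors.Join(errs...); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	fmt.Println(joinedIsNil(nil, nil))              // true
	fmt.Println(contains(errUpdate, nil, errUpdate)) // true
	fmt.Println(message(errDelete, errUpdate))       // both messages, one per line
}
```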

Contributor

This doesn’t answer my question. This new error handling will definitely change the way metrics are reported, but also the semantics on what happens when we reach an error. I am not so sure this won’t be a surprise for some users, although we can say that erroring “the hard way” is most likely always the right thing.

Contributor Author

> This doesn’t answer my question. This new error handling will definitely change the way metrics are reported, but also the semantics on what happens when we reach an error. I am not so sure this won’t be a surprise for some users, although we can say that erroring “the hard way” is most likely always the right thing.

Thanks @Raffo, WDYT? Something like:

if err := p.deleteRecords(ctx, deleted); err != nil {
    return err
}

if err := p.updateRecords(ctx, updated); err != nil {
    return err
}

return nil

Contributor

This is a different change in a different direction. You won't execute updateRecords if deleteRecords fails, and I am not sure this is the intended change. I think we should not proceed with this change as is unless you can provide a clear picture of what the error handling from the provider should be and what the consequences could be for the users of the provider.
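
The difference between the two proposals can be sketched as follows. This is an illustration with hypothetical stand-ins (`step`, `applyJoined`, `applyEarlyReturn`), not the provider's code: each variant records which steps actually ran when the delete step fails.

```go
package main

import (
	"errors"
	"fmt"
)

var errDelete = errors.New("delete failed")

// step records that it ran, then returns the given error.
func step(name string, err error, ran *[]string) error {
	*ran = append(*ran, name)
	return err
}

// applyJoined always runs both steps and reports every failure.
func applyJoined(deleteErr, updateErr error) ([]string, error) {
	var ran []string
	err1 := step("delete", deleteErr, &ran)
	err2 := step("update", updateErr, &ran)
	return ran, errors.Join(err1, err2)
}

// applyEarlyReturn stops at the first failure, so later steps are skipped.
func applyEarlyReturn(deleteErr, updateErr error) ([]string, error) {
	var ran []string
	if err := step("delete", deleteErr, &ran); err != nil {
		return ran, err
	}
	if err := step("update", updateErr, &ran); err != nil {
		return ran, err
	}
	return ran, nil
}

func main() {
	ran, err := applyJoined(errDelete, nil)
	fmt.Println(ran, err) // [delete update] delete failed

	ran, err = applyEarlyReturn(errDelete, nil)
	fmt.Println(ran, err) // [delete] delete failed
}
```

Both variants surface the failure to the caller, but only the joined variant still attempts the update step after a failed delete.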

@mloiseleur
Contributor

/retitle fix(Azure DNS): external_dns_registry_errors_total metrics counter value

@k8s-ci-robot k8s-ci-robot changed the title fix(azure): fix external_dns_registry_errors_total metrics counter value 0 fix(Azure DNS): external_dns_registry_errors_total metrics counter value Nov 20, 2024
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Azure Provider - external_dns_registry_errors_total metrics showing 0
4 participants