HTTP service discovery #16

Merged
merged 5 commits into from
Nov 27, 2023

errm (Contributor) commented Nov 23, 2023

In our environment we use the Prometheus Operator to configure Prometheus for us.

Whilst it is possible to use file_sd in our environment, it's a bit more of a pain to configure, as we would need to add a sidecar to our Prometheus and a shared volume so that the file can be read.

With http_sd we can just run a Deployment of prometheus-msk-discovery and drop a ScrapeConfig resource into our manifests.

The shape of the API is exactly the same; the only difference is that the output needs to be JSON rather than YAML.
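
For illustration, here is a minimal Go sketch of the JSON shape that Prometheus http_sd consumes: a list of target groups, each with `targets` and optional `labels`. The `targetGroup` type, the `renderJSON` helper, and the host names, ports and label values are all hypothetical and are not taken from this PR's code.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// targetGroup is a hypothetical type mirroring one entry of the list that
// Prometheus reads from file_sd or http_sd: targets plus optional labels.
type targetGroup struct {
	Targets []string          `json:"targets"`
	Labels  map[string]string `json:"labels,omitempty"`
}

// renderJSON serialises target groups as a JSON array.
// A nil slice is rendered as [] rather than null, so the body is always a list.
func renderJSON(groups []targetGroup) ([]byte, error) {
	if groups == nil {
		groups = []targetGroup{}
	}
	return json.Marshal(groups)
}

func main() {
	// Made-up broker addresses and labels, purely for demonstration.
	groups := []targetGroup{{
		Targets: []string{"b-1.example.internal:11001", "b-1.example.internal:11002"},
		Labels:  map[string]string{"job": "msk", "cluster_name": "example"},
	}}
	out, err := renderJSON(groups)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

The same struct could carry YAML tags for the file_sd output, which is why only the encoder needs to differ between the two modes.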

I added a flag, -http-sd, to enable this second mode. Without it, everything should behave exactly as it did before, so as not to cause any issues for existing users.
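
As a rough sketch of how an opt-in flag like this can leave the default behaviour untouched, the Go snippet below parses `-http-sd` with the standard `flag` package and falls back to file_sd when it is absent. Only the flag name comes from this PR; the `config` type, `parseArgs` and `mode` are hypothetical names for the sake of the example.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds the single option this sketch cares about (hypothetical type).
type config struct {
	httpSD bool // true when -http-sd was passed
}

// parseArgs parses command-line arguments into a config.
func parseArgs(args []string) (config, error) {
	fs := flag.NewFlagSet("prometheus-msk-discovery", flag.ContinueOnError)
	var cfg config
	fs.BoolVar(&cfg.httpSD, "http-sd", false, "serve targets over HTTP instead of writing a file")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return cfg, nil
}

// mode names the discovery mechanism selected by the config.
// file_sd stays the default, so existing invocations are unaffected.
func mode(cfg config) string {
	if cfg.httpSD {
		return "http_sd"
	}
	return "file_sd"
}

func main() {
	for _, args := range [][]string{{}, {"-http-sd"}} {
		cfg, err := parseArgs(args)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%v -> %s\n", args, mode(cfg))
	}
}
```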

joshm91 (Contributor) commented Nov 24, 2023

Hey, thanks for this!

The only thing I'm slightly unsure about is that the MSK clusters get scraped for every request to the HTTP endpoint you've added, especially given that the existing interval flag is meant to control that scraping frequency.

What do you think about changing it so that the current infinite for loop runs in both modes, scraping for clusters at the set interval and maintaining an internal state for the latest fetch? With file_sd mode enabled, this state gets written to a file at the same frequency; with http_sd mode enabled, the handler just reads and returns this internal state rather than initiating its own scrape.

errm (Contributor, Author) commented Nov 27, 2023

Hi @joshm91, with http_sd you can set the interval at which Prometheus scrapes the endpoint using the refresh_interval configured in the Prometheus config.

I am not 100% sure about service discovery endpoints, but for exporters it is usually best practice not to do any calculations or calls until the endpoint is actually scraped.

If the discovered endpoints are refreshed at a different rate from the rate at which Prometheus calls the endpoint, one of two things could happen:

  1. Prometheus is calling the endpoint, but the data is stale by some unknown amount of time.
  2. Or, more likely, Prometheus is configured to scrape the endpoint less often than the refresh interval, so we end up calling the AWS API more often than required (but the data at the point when we scrape the endpoint may well still be stale).

With file_sd, because Prometheus can monitor the file, we (prometheus-msk-discovery) are essentially pushing any updates to Prometheus. With http_sd, Prometheus itself is in control of how much polling to do, so I think it makes more sense to have Prometheus control the interval via the refresh_interval attribute in its config.

joshm91 (Contributor) commented Nov 27, 2023

Thanks for the clarification - that seems totally reasonable.

I'll merge and release shortly.

@joshm91 joshm91 merged commit c686e6d into statsbomb:main Nov 27, 2023
1 check passed