Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DO NOT MERGE] Ringbuf redux #1687

Draft
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

dave-tucker
Copy link
Collaborator

Brings back the ringbuf work, just to see if it's something we want to continue with or not.

dave-tucker and others added 3 commits August 9, 2024 13:02
This uses a ring buffer to send events from eBPF back to userspace.
Doing so allows for our eBPF probes to complete quickly, and pushes
all delta calculation and summary back into userspace.

Moves the logic that resolves comm from procfs into its own pkg.
In addition, a cache is added to avoid hitting procfs every time
we update process metrics.

Removes the bpftest pkg and includes the test programs in the
main kepler.bpf.c file. This prevents drift in the go generate
flags and generally makes it easier to write tests.

Signed-off-by: Dave Tucker <[email protected]>
The resolves pid 0 to system_processes without (throwing) any error
since there is no command associated with it. This commit fixes the
following log

I0802 03:24:03.647656  733341 process_bpf_collector.go:100] failed to resolve comm for PID 0: process not running: stat /proc/0: no such file or directory, set comm=system_processes
I0802 03:24:03.648079  733341 process_bpf_collector.go:100] failed to resolve comm for PID 0: process not running, set comm=system_processes
0802 process_bpf_collector.go:100] failed to resolve comm for PID 0: process not running: stat /proc/0: no such file or directory, set comm=system_processes
...
I0802 process_bpf_collector.go:100] failed to resolve comm for PID 0: process not running, set comm=system_processes

Signed-off-by: Sunil Thaha <[email protected]>
…#1671)

Processing events in the same goroutine as the ring buffer reader
requires acquiring a mutex, which blocks ringbuf event processing
causing a backlog. To avoid this, send events via a buffered
channel to a dedicated event processing goroutine to ensure
that the ringbuf remains unblocked. This has decreased CPU
load from 1-3% on my machine to 0-1% CPU load.

Signed-off-by: Dave Tucker <[email protected]>
Copy link

github-actions bot commented Aug 9, 2024

🤖 SeineSailor

Here is a concise summary of the pull request changes:

The "[DO NOT MERGE] Ringbuf redux" pull request reintroduces the ring buffer functionality to the codebase, affecting the build process, testing procedures, and internal implementation. Key modifications include:

  1. Build and testing changes: Modifications to the Makefile, go.mod, and go.sum files, as well as changes to the testing procedures, with the SUDO_TEST_PKGS variable now listing pkg/bpf instead of pkg/bpftest.
  2. Ring buffer functionality: Addition of a ring buffer map, perf event arrays, and related functions for getting hardware event counters. New functions for handling IRQ, page cache hits, and process free events have been added.
  3. eBPF exporter changes: Introduction of a new flag, bpfDebugMetricsEnabled, which enables debug metrics for eBPF. A new goroutine is started to handle errors from the eBPF exporter's Start method.
  4. Process metrics and communication resolution changes: Updates to the updateSWCounters function, processesData structure, and comm variable replacement with processComm from the commResolver.
  5. Vendor package updates: Upgrades to the github.com/cilium/ebpf package from v0.15.0 to v0.16.0, which may involve changes in the underlying implementation, affecting the behavior of the code.
  6. Testing script changes: Updates to the run-tests.sh script to include a quote_env function and modifications to the preserved_env array.

Observations and suggestions for improvement:

  • The pull request includes a mix of build process changes, new functionality, and vendor package updates, which may make it challenging to review and test.
  • It would be beneficial to break down the changes into smaller, more focused pull requests to facilitate easier review and testing.
  • The impact of the vendor package updates on the codebase's behavior should be thoroughly tested and documented.
  • The new ring buffer functionality and eBPF exporter changes may require additional testing and validation to ensure they do not introduce any regressions or issues.

@vimalk78
Copy link
Collaborator

should we start a new branch for this change?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants