Stream events directly to GCS #560

Draft
wants to merge 1 commit into
base: main

mattmoor
Member

This takes a fairly different approach to how we emit our logs to GCS. Previously we received them in one container and wrote them out to the local filesystem, and a sidecar would periodically enumerate the files emitted by that process, concatenate them, and send them up to GCS in a single upload.

When we shifted to Cloud Run, this approach became problematic because the filesystem is backed by memory. Under heavy load the event handler could see a lot of memory pressure between rotations, and because the events sit both in the filesystem and in the buffer concatenated for the upload, they end up in memory twice.

By collapsing the two processes together and uploading directly, we can initiate a new file write, trickle events to that writer, and then flush the active writers. In the worst case the client is dumb and buffers things once. In a perfect world this would initiate an upload of unknown size and stream events as they come in, which would dramatically reduce our memory pressure to roughly O(active events).
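The write/trickle/flush cycle described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `Streamer`, `OpenFunc`, `memSink`, and the `events-%06d.json` naming are all hypothetical, and it tracks a single active writer where the real change may keep several. The sketch only depends on `io.WriteCloser`; with the real client, the open function could presumably wrap `bucket.Object(name).NewWriter(ctx)` from `cloud.google.com/go/storage`, whose `*storage.Writer` satisfies that interface and only finalizes the object on `Close`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"sync"
)

// OpenFunc starts a new upload and returns its writer (hypothetical type).
// Assumption: in production this would wrap the GCS client's object writer.
type OpenFunc func(name string) (io.WriteCloser, error)

// Streamer trickles events into one active upload and rotates on Flush.
type Streamer struct {
	mu     sync.Mutex
	open   OpenFunc
	active io.WriteCloser
	seq    int
}

func NewStreamer(open OpenFunc) *Streamer { return &Streamer{open: open} }

// Write appends one newline-delimited event, lazily opening an upload so
// that idle periods do not produce empty objects.
func (s *Streamer) Write(event []byte) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.active == nil {
		// Object naming scheme is illustrative only.
		w, err := s.open(fmt.Sprintf("events-%06d.json", s.seq))
		if err != nil {
			return err
		}
		s.seq++
		s.active = w
	}
	if _, err := s.active.Write(event); err != nil {
		return err
	}
	_, err := s.active.Write([]byte("\n"))
	return err
}

// Flush closes the active upload, if any. The next Write opens a new one.
func (s *Streamer) Flush() error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if s.active == nil {
		return nil
	}
	w := s.active
	s.active = nil
	return w.Close()
}

// memSink is an in-memory stand-in for a bucket, for local experiments.
// Like a GCS object, an entry only becomes visible once its writer is closed.
type memSink struct {
	mu      sync.Mutex
	objects map[string]string
}

type memObject struct {
	bytes.Buffer
	name string
	sink *memSink
}

func (o *memObject) Close() error {
	o.sink.mu.Lock()
	defer o.sink.mu.Unlock()
	o.sink.objects[o.name] = o.String()
	return nil
}

func (m *memSink) Open(name string) (io.WriteCloser, error) {
	return &memObject{name: name, sink: m}, nil
}
```

A caller would invoke `Write` from the event handler and `Flush` from a periodic ticker, which takes over the role the sidecar's rotation used to play. How much the underlying client buffers before sending is outside this sketch's control, which is the "worst case" caveat above.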

@mattmoor mattmoor marked this pull request as draft September 18, 2024 22:51
Signed-off-by: Matt Moore <[email protected]>