Stream events directly to GCS #560

mattmoor · 2024-09-18T22:51:40Z

This takes a fairly different approach to how we emit our logs to GCS. Previously, one container received them and wrote them to the local filesystem, and a sidecar periodically enumerated the files that process emitted, concatenated them, and sent them up to GCS in a single upload.

When we shifted to Cloud Run, this approach became problematic because the filesystem is backed by memory. Under heavy load the event handler could see a lot of memory pressure between rotations, and because the events sit both in the in-memory filesystem and in the concatenation buffer for the upload, they end up in memory twice.

By collapsing the two processes together and uploading things directly, we can initiate a new file write, trickle events to that writer, and then flush the active writers. In the worst case the client is dumb and buffers things once; in a perfect world it would initiate an upload of unknown size and we would stream events as they come in, which would dramatically reduce our memory pressure to roughly O(active events).
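As a rough illustration of the "initiate a write, trickle events, flush" flow described above, here is a minimal Python sketch. It is not the code in this PR: `EventStreamer`, `open_writer`, `CapturingBuffer`, `demo`, and the `events-NNNNNN.json` naming are all hypothetical, and the in-memory buffer only stands in for a real object writer. With the `google-cloud-storage` client, a file-like writer such as the one returned by `Blob.open("wb")` could plausibly be plugged in as `open_writer`, but that wiring is an assumption and is not shown here.

```python
import io
from typing import BinaryIO, Callable, Dict, Optional


class EventStreamer:
    """Streams newline-delimited events to one active writer at a time."""

    def __init__(self, open_writer: Callable[[str], BinaryIO]) -> None:
        # open_writer is a hypothetical factory that starts an upload for the
        # given object name and returns a file-like writer for it.
        self._open_writer = open_writer
        self._active: Optional[BinaryIO] = None
        self._sequence = 0

    def write(self, event: bytes) -> None:
        # Lazily initiate a new object write when the first event arrives.
        if self._active is None:
            name = f"events-{self._sequence:06d}.json"
            self._sequence += 1
            self._active = self._open_writer(name)
        # Trickle the event straight to the writer; nothing is kept here.
        self._active.write(event + b"\n")

    def flush(self) -> None:
        # Closing the writer finalizes the object; the next event opens a new one.
        if self._active is not None:
            self._active.close()
            self._active = None


class CapturingBuffer(io.BytesIO):
    """In-memory stand-in for an object writer; records its bytes on close."""

    def __init__(self, store: Dict[str, bytes], name: str) -> None:
        super().__init__()
        self._store = store
        self._name = name

    def close(self) -> None:
        if not self.closed:
            self._store[self._name] = self.getvalue()
        super().close()


def demo() -> Dict[str, bytes]:
    uploaded: Dict[str, bytes] = {}
    streamer = EventStreamer(lambda name: CapturingBuffer(uploaded, name))
    streamer.write(b'{"id": 1}')
    streamer.write(b'{"id": 2}')
    streamer.flush()  # finalizes the first object
    streamer.write(b'{"id": 3}')
    streamer.flush()  # finalizes the second object
    streamer.flush()  # no active writer, so this is a no-op
    return uploaded
```

The streamer itself holds no event data between calls, so whatever buffering happens is left to the writer. That is the point of the design: if the writer truly streams, memory use tracks the events in flight rather than everything accumulated since the last rotation.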

Signed-off-by: Matt Moore <[email protected]>

mattmoor marked this pull request as draft September 18, 2024 22:51

mattmoor force-pushed the stream-to-gcs branch from 7c7fc51 to 751aef9 Compare September 18, 2024 23:06

mattmoor force-pushed the stream-to-gcs branch from 751aef9 to 3b6c915 Compare September 18, 2024 23:45
