Skip to content
This repository has been archived by the owner on Jan 8, 2024. It is now read-only.

Parsing csv log. #9

Open
jalexandre0 opened this issue Jun 30, 2021 · 3 comments
Open

Parsing csv log. #9

jalexandre0 opened this issue Jun 30, 2021 · 3 comments

Comments

@jalexandre0
Copy link

Hi there.

Theres a way to parse and replay from csv logs for people who can't change their log_line_prefix (aka RDS customers).

@mvasilenko
Copy link

mvasilenko commented Sep 20, 2021

hit this issue too
@jalexandre0 did you managed to replay RDS logs by pgreplay-go?

so far, I was able to convert RDS csvlog to pgreplay-go compatible log by writing a simple parser, which is converting one csvlog into two lines like this:

csvlog:

2021-09-09 17:00:00.006 UTC,"user","database",27752,"172.30.1.2:34106",613a286d.6c68,13992,
"SELECT",2021-09-09 15:29:49 UTC,229/3866470,0,LOG,00000,
"execute <unnamed>: SELECT ""jobs"".* FROM ""jobs"" WHERE ""jobs"".""deleted_at"" IS NULL
AND ""jobs"".""user_id"" = $1","parameters: $1 = '124765'",,,,,,,,"bin/rails"

pgreplay-go log

2021-09-09 17:00:00.006 UTC |user|database|613a286d.6c68|LOG:  execute <unnamed>: SELECT "jobs".* FROM "jobs" WHERE "jobs"."deleted_at" IS NULL AND "jobs"."user_id" = $1
2021-09-09 17:00:00.006 UTC |user|database|613a286d.6c68|DETAIL:  parameters: $1 = '124765'

@lawrencejones does such an approach looks valid to you or am I missing something?

@benwh
Copy link
Contributor

benwh commented Jul 22, 2022

Hey @jalexandre0 - Unfortunately we don't have any native support for CSV logs at the moment, correct.

In theory adding another parser to do this would be pretty straight-forward, given that the logs contain all of the required data, in a roughly similar format. Contributions are welcome!

In the meantime, you could possibly craft some sed/awk commands to meld the log into the correct format. It's also worth noting that pgreplay-go has a (less documented) JSON input format. You can get a feel for what this format looks like by converting an errlog-format log into this: pgreplay-go --debug filter --errlog-input=errlog.log --output=logs.json

@ghunti
Copy link

ghunti commented Oct 3, 2022

I've created another simple parser that converts an RDS CSV into a pgreplay-go JSON file.
You can specify the number of lines and filter them out for a specific use. I've used this to parse CSVs with 10-60GB, and although it was not improved for speed, it could parse the bigger files within 30/45min.
https://gist.github.com/ghunti/66f32304b7d7ed9f9b5714ba41a725dc

@noqcks noqcks mentioned this issue Jun 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants