Quotes or backslashes in fields break API Gateway logs JSON #4348
Comments
Switching to WAF logging may be a feasible workaround. See my comment on the blocker. We'd need to establish whether the WAF logs contain sufficient information, such as response status and latency. I suspect they don't, but we should make sure in a spike after the blocker lands.
@hannes-ucsc: "There may be a solution: there is now |
Assignee to consider next steps.
Assignee to create a reproduction.
To reproduce, make a request to a service endpoint which forces the
Then, after waiting a few minutes for the associated logs to become available, use the following CloudWatch Logs Insights query to find the malformed (not parsable as JSON) API Gateway logs:
CloudWatch Logs Insights
In the CloudWatch Logs Insights console, the result looks as follows:
Alternatively, to find this type of malformed log, one may use the following message filter hooks (verbatim):
Both seemed equally effective in returning the desired log events.
Assignee to expand screenshot to show that CloudWatch isn't able to parse the JSON in |
The screenshot in #4348 (comment) has been updated to include the requested details. A correctly parsed API Gateway log entry in CloudWatch looks like the first entry in the following screenshot:
The field that breaks the JSON in the above reproduction is populated by API Gateway. While this is a valid reproduction, and one we need to fix, we should also try a reproduction in which the field value originates in the client. The severity of this issue would be elevated by the possibility to reproduce the issue with a field originating from the client, because that scenario could represent a potential string injection vulnerability. Looking at the list of log entry fields, the |
Spike to add a reproduction as outlined above.
The following cURL requests with an altered User-Agent header did not cause malformed entries in the API Gateway logs, as seen from the console.
cURL requests:
curl -H 'User-Agent: ""This is a special user-agent""' 'https://service.dev.singlecell.gi.ucsc.edu/index/projects'
curl -H 'User-Agent: "This is a special user-agent"' 'https://service.dev.singlecell.gi.ucsc.edu/index/projects'
CloudWatch Logs Insights
All the API Gateway logs corresponding to the cURL requests above show a properly parsed structure in the console.
Assignee to consider next steps.
It looks like the JSON escaping works properly for all fields except |
For demo, attempt to reproduce.
We've determined that there is an issue with emitting JSON in API Gateway logs. The documentation suggests that one can use a JSON template to do so, but that suggestion is naive: the approach fails as soon as a field value contains a quote or a backslash. Something would need to escape those characters, but there is currently no mechanism to do so.
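The failure mode can be illustrated outside of API Gateway. The following is a minimal sketch, assuming that the log template substitutes field values verbatim. The template, the field names `status` and `error`, and the helpers `render` and `is_valid_json` are hypothetical and only stand in for the real access log format.

```python
import json

# Hypothetical, simplified stand-in for a JSON access log template.
# Values are substituted verbatim, without any JSON escaping.
TEMPLATE = '{"status": "$status", "error": "$error"}'


def render(template: str, values: dict) -> str:
    """Substitute $name placeholders verbatim, as a naive template would."""
    out = template
    for name, value in values.items():
        out = out.replace("$" + name, value)
    return out


def is_valid_json(text: str) -> bool:
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A harmless value yields parsable JSON.
ok = render(TEMPLATE, {"status": "200", "error": "-"})

# A value containing double quotes terminates the JSON string early.
with_quotes = render(TEMPLATE, {"status": "400", "error": 'Invalid value "foo"'})

# A value containing a backslash produces an invalid escape sequence (\d).
with_backslash = render(TEMPLATE, {"status": "400", "error": "bad path C:\\dir"})

for line in (ok, with_quotes, with_backslash):
    print(is_valid_json(line), line)
```

Only the first rendered line parses. The other two are the kind of entry that CloudWatch fails to display as structured JSON.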
We will whittle down the list of logged fields in #4073 and then split the log template into two blocks: one JSON and one custom, like so:
A safe field is one whose value is guaranteed not (or extremely unlikely) to contain quotes or backslashes. The CloudWatch heuristic that detects and parses JSON in log messages will then be able to parse the JSON from the first block and ignore the second block.
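A consumer of such two-block log lines could separate the blocks along the following lines. This is only a sketch of the idea, not of CloudWatch's actual heuristic, whose behavior is assumed here. The function `split_log_line`, the sample line, and its field names are hypothetical.

```python
import json


def split_log_line(line: str) -> tuple:
    """Parse the leading JSON object and return it with the unparsed remainder."""
    text = line.lstrip()
    # raw_decode parses one JSON value at the start of the string and
    # returns the value together with the index at which it ended.
    obj, end = json.JSONDecoder().raw_decode(text)
    return obj, text[end:].strip()


# Hypothetical log line: safe fields in the JSON block, an unsafe field
# (containing quotes and a backslash) in the custom block that follows.
line = '{"status": "200", "latency": "12"} error=Invalid value "foo" \\ bar'

safe_fields, custom_block = split_log_line(line)
print(safe_fields)
print(custom_block)
```

The quotes and the backslash in the second block never reach the JSON parser, so they cannot break the first block.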
Originally posted by @hannes-ucsc in #4009 (comment)