Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Content-Length header leaks original length of placeholder values #137

Open
smallnamespace opened this issue May 8, 2017 · 5 comments
Open

Comments

@smallnamespace
Copy link

smallnamespace commented May 8, 2017

Since we record all the headers, even when we replace sensitive tokens with their placeholders, the original total length of all tokens is preserved.

A few possible options:

  1. Fake the content-length by incrementing by placeholder length - original length when using placeholders bidirectionally
  2. Simply drop Content-Length when placeholders are used?
  3. Keep current behavior but note the risk in the docs

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

@sigmavirus24
Copy link
Collaborator

Thanks for reporting this!

Sorry for the delay. The last few months have been hectic for me. This gets tricky but can be done. Would you be willing to work on a PR for this?

@hroncok
Copy link
Member

hroncok commented Oct 28, 2017

Rather than trying to be too clever, I think 3. is the correct approach here. We could mention that in the docs and point the reader to a way to implement a protection against this (such as 2.) in a hook.

@smallnamespace Still interested?

@smallnamespace
Copy link
Author

@hroncok Sorry for dropping this earlier -- I can take a look, but will take me awhile to get around to.

@hroncok
Copy link
Member

hroncok commented Oct 31, 2017

Thanks! This has been sitting here for a couple of months, so don't worry if it's not fixed this week :)

@sigmavirus24
Copy link
Collaborator

Yeah, I'm honestly a bit torn on this, and I should have explained sooner.

The main reason why it's tricky (which is likely solvable):

Betamax does its best to preserve the Content-Encoding if it is present. If we receive a compressed response, we try to preserve that, the headers, and that includes Content-Length. Perhaps we should always decrypt it and alter those headers. If we change that behaviour, though, we probably want a way to re-enable it for more advanced users who may be checking those attributes.

After we figure that out, we need to figure out what to store in Content-Length. Do we always re-calculate it on replay? Do we never store the real content-length in the cassette? Do we only use a placeholder for it when we've replaced something else?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants