Escaping netloc #4
Comments
That's interesting -- though URLs are described over like 5 different RFCs, I'm pretty sure that
As such, the hostname portion should never need escaping. Punycoding, in some cases, but not escaping. In these cases I like to check other implementations:
If I get a chance today, I might take a look at the source for
I figured I should check some of the other RFCs, and it turns out that RFC 3986 is much more permissive on the matter than RFC 2396:
If we want to adhere to that, then it seems that yes, any netloc must be escaped.
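For illustration, here is a minimal sketch of the two treatments of a hostname discussed above: Punycoding versus percent-escaping. It uses only the Python standard library, not url.py, and the helper names `encode_host` and `escape_host` are hypothetical. The `safe` set assumes the RFC 3986 `reg-name` rule (unreserved characters, percent-encoded octets, and sub-delims).

```python
from urllib.parse import quote

# Sub-delims from RFC 3986; unreserved characters are always left alone by quote().
SUB_DELIMS = "!$&'()*+,;="


def encode_host(host: str) -> str:
    """Punycode (IDNA) a hostname using Python's built-in 'idna' codec."""
    return host.encode("idna").decode("ascii")


def escape_host(host: str) -> str:
    """Percent-encode every character that reg-name does not allow literally."""
    return quote(host, safe=SUB_DELIMS)


print(encode_host("bücher.example"))   # Punycoded form
print(escape_host("bücher.example"))   # percent-escaped UTF-8 form
print(escape_host('exa"mple.com'))     # the double quote is escaped
```

Which of the two forms a library should emit for non-ASCII hosts is a design decision this sketch does not settle; it only shows that both are easy to produce.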
I'm trying to use url.py to sanitize URLs entered by users so I can put them in HTML as links. I came across this behavior, which I consider a bug.
I'm working around it by escaping everything aggressively (for my use case, it's fine to accept only "nice-looking" URLs, since they're expected to be gateway URLs). I'm not sure exactly what escaping needs to be done in order to submit a patch or pull request (assuming you agree it's a bug).
Thanks for the nice library.
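A rough sketch of the kind of aggressive workaround described above, written against the Python standard library only. This is not url.py's API: `sanitize_link` and `ALLOWED_SCHEMES` are hypothetical names, and restricting input to http/https is an assumption made for the example.

```python
from html import escape
from urllib.parse import quote, urlsplit, urlunsplit

# Assumption: only plain web links are acceptable for this use case.
ALLOWED_SCHEMES = {"http", "https"}


def sanitize_link(raw: str) -> str:
    """Return a URL that is safe to place inside an HTML href attribute."""
    parts = urlsplit(raw.strip())
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        raise ValueError("unsupported or malformed URL")
    # Escape each component, including the netloc; '%' stays safe so that
    # existing percent-escapes are not double-encoded.
    netloc = quote(parts.netloc, safe=":")
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    fragment = quote(parts.fragment, safe="%")
    url = urlunsplit((parts.scheme, netloc, path, query, fragment))
    # Finally escape for the HTML attribute context (handles '&' and quotes).
    return escape(url, quote=True)


print(sanitize_link('http://exa"mple.com/a b?q=<x>'))
```

This deliberately rejects or rewrites anything unusual, which suits the "nice-looking URLs only" case but would be too strict for a general-purpose library.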