whitespace truncated #4

Closed
smeevil opened this issue Feb 24, 2016 · 8 comments

@smeevil

smeevil commented Feb 24, 2016

Hi,

I noticed the following behaviour:

iex(1)> HtmlSanitizeEx.basic_html("This <b>is</b> <b>an</b> <i>example</i> of <u>space</u> eating.")
"This <b>is</b><b>an</b><i>example</i> of <u>space</u> eating."

The sanitiser eats whitespace that sits directly between a closing tag and another whitelisted tag. I would expect the result to be:

"This <b>is</b> <b>an</b> <i>example</i> of <u>space</u> eating."

Gerard.

rrrene added a commit that referenced this issue Feb 24, 2016
Reproduces #4

Thx to @smeevil for both the bug report and the term "white-space eating" :)
@rrrene rrrene self-assigned this Feb 24, 2016
@rrrene
Owner

rrrene commented Feb 24, 2016

Hi Gerard,

this is definitely a bug! Thx for reporting! 👍

@smeevil
Author

smeevil commented Feb 26, 2016

I'm trying to track down the origin.
It seems it's related to mochiweb_html.

For example:

test "It should conserve spaces between tags" do
  input = "<test>hello <b>world</b> <b>how</b> are you doing ?</test>"
  assert [{"test", [], ["hello ", {"b", [], ["world"]}, " ", {"b", [], ["how"]}, " are you doing ?"]}] = :mochiweb_html.parse(input)
end

returns

match (=) failed
     code: [{"test", [], ["hello ", {"b", [], ["world"]}, " ", {"b", [], ["how"]}, " are you doing ?"]}] = :mochiweb_html.parse(input)
     rhs:  {"test", [], ["hello ", {"b", [], ["world"]}, {"b", [], ["how"]}," are you doing ?"]}

Here you can see it misses the " " between the b tags.

@smeevil
Author

smeevil commented Feb 26, 2016

I think this is the point of origin: https://github.com/mochi/mochiweb/blob/d024b4a5804fe4e0061c4ed2d1c52bdd168995e9/src/mochiweb_html.erl#L81

I'm really out of my comfort zone here, but maybe we should forward this case to them. What do you think?

@rrrene
Owner

rrrene commented Feb 26, 2016

I investigated up to that point as well and then ... fell asleep. I concluded that, since the parse function does not take a second parameter, it must be either (a) intended behaviour or (b) a bug.

But I am really outside my "comfort zone" here as well. My Erlang-Fu is weak. 😞

@smeevil
Author

smeevil commented Feb 26, 2016

OK, something I just found:

iex(1)> :mochiweb_html.parse("<html>just <b>an</b> <b>other</b> test</html>")
{"html", [], ["just ", {"b", [], ["an"]}, {"b", [], ["other"]}, " test"]}
iex(2)> :mochiweb_html.parse("<html>just <b>an</b>&nbsp;<b>other</b> test</html>")
{"html", [], ["just ", {"b", [], ["an"]}, " ", {"b", [], ["other"]}, " test"]}

The first example has a regular space between </b> and <b>.
The second one has &nbsp; there, so ...</b>&nbsp;<b>...
As you can see, that does solve the problem.

Maybe regex-replace the spaces between a closing and an opening tag with &nbsp;'s?
It's ugly... but it gets the job done, right? :D
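
A minimal sketch of that regex idea, for illustration only. The module name `WhitespaceWorkaround` and the function `protect_spaces/1` are hypothetical and not part of HtmlSanitizeEx or mochiweb; the pattern is deliberately naive and assumes simple tags without `>` inside attribute values.

```elixir
defmodule WhitespaceWorkaround do
  # Hypothetical helper: swaps whitespace-only gaps between a closing tag
  # and the next opening tag for "&nbsp;" before the HTML reaches the parser.
  def protect_spaces(html) when is_binary(html) do
    # Group 1: a closing tag. Group 2: the following opening tag.
    between_tags = ~r{(</[^>]+>)\s+(<[^/!][^>]*>)}

    Regex.replace(between_tags, html, fn _match, closing, opening ->
      closing <> "&nbsp;" <> opening
    end)
  end
end

input = "This <b>is</b> <b>an</b> <i>example</i> of <u>space</u> eating."
IO.puts(WhitespaceWorkaround.protect_spaces(input))
```

Only gaps that consist purely of whitespace between two tags are touched, so ordinary spaces next to text stay as they are. Since the parser appears to decode the entity, the sanitised output would likely contain a non-breaking space rather than a plain one.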

@smeevil
Author

smeevil commented Feb 26, 2016

How about this:
smeevil@b702c11

The only thing I do not understand is why the test fails even though both sides look the same.

(screenshot: example)

Do you have an explanation for that?
If I change the left == right to left = right, it does pass.
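
One possible cause, not confirmed anywhere in this thread: if the parser decodes `&nbsp;` to the non-breaking space character U+00A0, the two strings print alike but are not equal. The sketch below uses made-up strings to show that, and also that `=` with a bare variable on the left simply binds the variable and therefore cannot fail.

```elixir
# Both strings print alike, but the second uses U+00A0 instead of a plain space.
regular = "an other"
non_breaking = "an\u00A0other"

# The code points differ, so the strings are not equal.
IO.inspect(regular == non_breaking)

# Normalising the non-breaking space makes them equal again.
normalised = String.replace(non_breaking, "\u00A0", " ")
IO.inspect(normalised == regular)

# A bare variable on the left of `=` is simply bound to the right-hand side,
# so this "match" succeeds whatever the value is.
result = non_breaking
IO.inspect(result == non_breaking)
```

If this is what is happening, comparing against a string that contains "\u00A0", or normalising before comparing, would let the `==` assertion pass.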

@rrrene
Owner

rrrene commented Feb 26, 2016

I filed an issue with mochiweb: mochi/mochiweb#166

I'd suggest we wait until we know whether or not this is a bug. If it is not, we will have to find a solution 👍

@rrrene rrrene closed this as completed in 69912a2 May 17, 2016
@rrrene
Owner

rrrene commented May 17, 2016

@smeevil fixed it. Sorry this took so long, I somehow lost track of this issue. 😢

Version 1.0.0 contains the fix.
