Support for 8-bit ASCII and unescaped UTF8 characters #28

Open
wants to merge 3 commits into base: master

dbnski commented Oct 14, 2019

The original code uses the signed char type for variables that hold the character being processed. This breaks the 8-bit extended-ASCII characters (byte values above 127) used in many European languages, and it also prevents the library from correctly handling unescaped UTF-8 characters. The attached patch addresses both problems and further improves support for UTF-8-encoded strings. I am aware that updating every method wasn't strictly necessary to make this work, but I thought it was still a good idea to do so for consistency.
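
To illustrate the problem described above, here is a minimal sketch in C. It is not taken from the patch: every function name below is hypothetical, and it assumes the usual two's-complement conversion of out-of-range values to signed char. It shows how a byte above 127 turns negative when held in a signed char, and how reading bytes as unsigned char makes the UTF-8 lead-byte checks work.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: a byte held in a signed char, then widened to int
   (for example to index a lookup table). Bytes 0x80..0xFF come out negative. */
static int byte_as_index_signed(signed char c)
{
    return (int)c;
}

/* Same widening through unsigned char: the result is always 0..255. */
static int byte_as_index_unsigned(unsigned char c)
{
    return (int)c;
}

/* Hypothetical helper: length of a UTF-8 sequence judged by its lead byte.
   Returns 0 for a continuation byte or an invalid lead byte. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* plain 7-bit ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}

/* Hypothetical helper: count characters in a buffer of len bytes.
   A malformed or truncated sequence is counted as one single-byte character. */
static size_t count_chars(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t n = 0, i = 0;
    while (i < len) {
        /* The cast is the key step: without it, a signed char byte
           above 0x7F would be negative before any range check runs. */
        size_t step = utf8_seq_len((unsigned char)s[i]);
        if (step == 0 || step > len - i)
            step = 1;
        i += step;
        n++;
    }
    return n;
}
```

With these definitions, count_chars("caf\xC3\xA9", 5) counts the two-byte sequence for "é" as a single character, while byte_as_index_signed yields a negative value for the same bytes.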

dbnski commented Oct 14, 2019

It was just something quick I put together while trying to use your library in a PoC, but I see it needs some polishing before it can pass the tests. If you are interested in merging this patch, I can update the PR when I have some spare time.
