Fix: No more crashing/skipped changes for certain changes #1714

Merged: 3 commits into develop from fix/crashing-on-surrogate-pair-changes on Nov 14, 2019

@dmsnell dmsnell commented Nov 13, 2019

See: Simperium/node-simperium#91
See: google/diff-match-patch#80

Possibly fixes:

Maybe related but likely not:

Other repos:

In certain cases, consecutive surrogate pairs crash diff-match-patch
while it builds the delta string. The diffing algorithm compares
UTF-16 code units, so when three consecutive code points share the
same high surrogate, the common prefix it finds can end on that high
surrogate, halfway through a surrogate pair.

// before - \ud83d\ude4b\ud83d\ude4b
// after  - \ud83d\ude4b\ud83d\ude4c\ud83d\ude4b
// deltas - eq \ud83d\ude4b\ud83d -> add \ude4c\ud83d -> eq \ude4b
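
A minimal sketch of how the split arises, assuming the inputs are the emoji U+1F64B and U+1F64C written as UTF-16 escapes. `commonPrefixLength` and `isHighSurrogate` are hypothetical helpers that only imitate a code-unit comparison; they are not part of diff-match-patch's API.

```javascript
// Hypothetical helper: length of the shared prefix, counted in UTF-16 code units.
function commonPrefixLength(a, b) {
  const limit = Math.min(a.length, b.length);
  let i = 0;
  while (i < limit && a.charCodeAt(i) === b.charCodeAt(i)) {
    i++;
  }
  return i;
}

// Hypothetical helper: true for code units in the high-surrogate range.
function isHighSurrogate(unit) {
  return unit >= 0xd800 && unit <= 0xdbff;
}

const before = '\ud83d\ude4b\ud83d\ude4b';
const after = '\ud83d\ude4b\ud83d\ude4c\ud83d\ude4b';

// The strings agree on three code units: one whole pair plus a lone high surrogate.
const prefixLength = commonPrefixLength(before, after);

// The last shared unit is a high surrogate, so the prefix ends mid-pair.
const prefixEndsMidPair = isHighSurrogate(before.charCodeAt(prefixLength - 1));

console.log(prefixLength, prefixEndsMidPair);
```

Because the comparison never looks at code points, nothing stops the prefix from ending between the two halves of a pair.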

Building the delta then crashes because the invalid sequence of UTF-16
code units is passed to encodeURI, which throws a URIError unless its
input is a well-formed Unicode string.
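
The failure can be reproduced in isolation with the standard `encodeURI` function. This is a small sketch; `tryEncode` is a hypothetical wrapper added only to capture the error instead of letting it propagate.

```javascript
// Hypothetical wrapper: report whether encodeURI accepts the text.
function tryEncode(text) {
  try {
    return { ok: true, value: encodeURI(text) };
  } catch (error) {
    return { ok: false, name: error.name };
  }
}

// A complete surrogate pair encodes as the UTF-8 bytes of U+1F64B.
const wholePair = tryEncode('\ud83d\ude4b');

// A lone high surrogate is not well-formed, so encodeURI throws.
const loneHigh = tryEncode('\ud83d');

console.log(wholePair, loneHigh);
```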

After the fix the delta groups are normalized to prevent this situation
and the node-simperium library should be able to process those
problematic changesets.
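
One way such a normalization could work is sketched below. This is not the patch that landed in node-simperium or diff-match-patch; `normalizeSurrogates` and its helpers are hypothetical, the `[operation, text]` tuples only mirror the shape diff-match-patch uses for diffs, and the sketch assumes both source texts are well-formed. Whenever an equality ends on a high surrogate or begins on a low surrogate, that code unit is moved into both the deletion and the insertion next to it, so no group starts or ends mid-pair.

```javascript
const DELETE = -1;
const INSERT = 1;
const EQUAL = 0;

const endsWithHigh = (s) => {
  const unit = s.charCodeAt(s.length - 1); // NaN for an empty string
  return unit >= 0xd800 && unit <= 0xdbff;
};

const startsWithLow = (s) => {
  const unit = s.charCodeAt(0); // NaN for an empty string
  return unit >= 0xdc00 && unit <= 0xdfff;
};

// Hypothetical normalizer: returns new diffs whose texts never split a pair.
function normalizeSurrogates(diffs) {
  // Collapse into groups: an equality followed by its deletions and insertions.
  const groups = [{ eq: '', del: '', ins: '' }];
  for (const [op, text] of diffs) {
    let last = groups[groups.length - 1];
    if (op === EQUAL) {
      if (last.del || last.ins) {
        last = { eq: '', del: '', ins: '' };
        groups.push(last);
      }
      last.eq += text;
    } else if (op === DELETE) {
      last.del += text;
    } else {
      last.ins += text;
    }
  }

  for (let i = 0; i < groups.length; i++) {
    const group = groups[i];
    if (!group.del && !group.ins) continue; // nothing to absorb stray units

    // A trailing high surrogate belongs with whatever follows in both texts.
    if (endsWithHigh(group.eq)) {
      const high = group.eq.slice(-1);
      group.eq = group.eq.slice(0, -1);
      group.del = high + group.del;
      group.ins = high + group.ins;
    }

    // A leading low surrogate belongs with whatever precedes in both texts.
    const next = groups[i + 1];
    if (next && startsWithLow(next.eq)) {
      const low = next.eq[0];
      next.eq = next.eq.slice(1);
      group.del += low;
      group.ins += low;
    }
  }

  const result = [];
  for (const { eq, del, ins } of groups) {
    if (eq) result.push([EQUAL, eq]);
    if (del) result.push([DELETE, del]);
    if (ins) result.push([INSERT, ins]);
  }
  return result;
}

// The split diff for the example strings, with pairs broken across groups.
const normalized = normalizeSurrogates([
  [EQUAL, '\ud83d\ude4b\ud83d'],
  [INSERT, '\ude4c\ud83d'],
  [EQUAL, '\ude4b'],
]);

console.log(normalized);
```

The result still reconstructs both texts, and every group can be passed to `encodeURI`. It is not necessarily the smallest possible diff, since the affected pair is deleted and re-inserted rather than kept as an equality.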

This patch updates the dependency to the patched version of
node-simperium.

Testing

Follow the sequence described in Automattic/simplenote-macos#397

Add a new note
Enter the following emojis: ☺️🖖🏿
Insert the following emoji in between: 😃

On develop you should see an exception in the dev tools showing that the update failed. Updates to that note will keep failing until the emoji sequence is removed. If you clear the browser cache or sign out and back in, your changes, including those made after the emoji, will disappear.

In this branch the update should work as expected. If you clear the browser cache or sign out and back in, the changes will remain.

Before
BrokenSurrogates mov

After
WorkingSurrogates mov

@dmsnell dmsnell requested a review from a team November 13, 2019 00:04
@dmsnell dmsnell added this to the 1.12 milestone Nov 14, 2019
@belcherj belcherj left a comment
Let's get this merged and test heavily on Develop.

@dmsnell dmsnell merged commit 7efcf19 into develop Nov 14, 2019
@dmsnell dmsnell deleted the fix/crashing-on-surrogate-pair-changes branch November 14, 2019 20:14