Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconvenient emoji ascii replacement #96

Open
felpsio opened this issue Aug 1, 2022 · 8 comments
Open

Inconvenient emoji ascii replacement #96

felpsio opened this issue Aug 1, 2022 · 8 comments

Comments

@felpsio
Copy link

felpsio commented Aug 1, 2022

The regex that identifies ascii emojis and replaces them by the actual emojis should check for whitespace before and after the match. Or it may add inconvenient emojis in some cases. Example:

"chrome://settings/content/notifications" is translated to "chrome😕/settings/content/notifications"

Suggestion:

only apply the ascii conversion when:

  • the emoji is between "start of line" and "end of line"
  • the emoji is between whitespace
@felpsio
Copy link
Author

felpsio commented Aug 2, 2022

To solve my user case I updated the asciiRegex to:

(?: |^)(${edgeCases})?(${names}|:)([${allowedAliasCharacters}]*:)?(?: |$)

I basically added (?: |^) and (?: |$)

This forces the regex to make a emoji substitution only when the ascii match is separated by a whitespace or start/end of line

@enzoferey
Copy link
Collaborator

Hey @felpsio, chrome:// URLs are indeed edge cases. I prepared the regex to support those kind of issues via the edgeCases variable that be found here:

const startOfURL = "https?\\S*";
const names = flatten(
Object.keys(asciiAliases).map(name => {
return asciiAliases[name].map(escapeStringToBeUsedInRegExp);
})
).sort().reverse().join("|"); // reverse sort for most specific match
const edgeCases = [startOfURL].join("|");

In the same way we have startOfURL, we could have startOfChromeURL and cover that.

@felpsio
Copy link
Author

felpsio commented Aug 2, 2022

hey @enzoferey, but this can help on many other cases.

For example, instead of chrome: safari, brave, firefox, file, base64 etc.

So it's probably safer require a whitespace or start/end of line to avoid all these cases.

Still going to work on these cases: :), :), :). But it won't work on <any string without whitespace>:)<any string without whitespace>

@enzoferey
Copy link
Collaborator

Right, the design decisions for this library was to support emojis without required whitespaces, hence going for this solution. I'm happy to add as many edge cases as needed.

Note that you can always escape certain parts of a string, transform the rest of the string with this library, and then concatenate back.

@felpsio
Copy link
Author

felpsio commented Aug 4, 2022

Thanks @enzoferey.

You're right. My use case may not suit the goal of this lib. So for now, I'll probably keep my forked version and bring the updates from the main version manually as they update.

But I suggest then adding these edge cases: brave://, chrome:// and file://.

Feel free to close my issue then :).

If somebody else needs to add whitespaces as a requirement, you can use my adapted version: https://github.com/felpsio/react-emoji-render or just fork and apply the same regex change

@enzoferey
Copy link
Collaborator

As you prefer @felpsio, happy to accept PRs for such edge cases 👍🏻

@nicolasvidelac
Copy link

nicolasvidelac commented Nov 21, 2023

Hi @felpsio, I encountered an issue really close to this one so i didn't feel like opening a new issue was necessary.
If i have the text: "https://somexample.com/tommoor/react-emoji-render/issues/96", it will not add an emoji to it.

But, if i have "Https" instead of "https", it will, so "Https://somexample.com/issues/96" becomes "Https😕/somexample.com/issues/96".

So, the case should not matter when checking if it's an exception or not.

@felpsio
Copy link
Author

felpsio commented Nov 24, 2023

@nicolasvidelac maybe for your case you can just transform the string in lower case. Https is not an usual way for URLs, so change the regex to accept the upper case H may cause some undesired side effects or other edge cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants