Optimize single-character alternatives #58
Labels: C-optimize (issue or feature request for an optimization), enhancement (new feature or request), good first issue (good for newcomers)
Is your feature request related to a problem? Please describe.
BNF grammars (including dialects) commonly denote large character sets like this
(More examples on Wikipedia)
While there are of course better ways to denote character ranges in pomsky, the union of character sets and lists of characters (as in `character` in the above example) is quite common. However, pomsky currently produces quite suboptimal regexes for this pattern.
Example
Describe the solution you'd like
Please merge adjacent single-character alternatives into one character set, e.g. `a|b|c` -> `[abc]`. This optimization is particularly useful because it enables further optimizations within character sets.
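To make the request concrete, here is a minimal sketch of the merging step in Python. It is not pomsky's implementation. It assumes a simplified, hypothetical model in which an alternation is a flat list of literal strings, and the helper names (`merge_single_char_alternatives`, `_escape_in_class`) are invented for illustration. Only runs of adjacent single-character literals are merged, so the order of the remaining alternatives is preserved.

```python
import re
from typing import List

# Characters that must be escaped inside a regex character class.
_CLASS_SPECIAL = set("\\]^-")


def _escape_in_class(ch: str) -> str:
    """Escape a single character for use inside [...]."""
    return "\\" + ch if ch in _CLASS_SPECIAL else ch


def merge_single_char_alternatives(literals: List[str]) -> str:
    """Build a regex alternation from literal strings, merging each run of
    adjacent single-character literals into one character class.

    Hypothetical simplified model: every item is a plain literal,
    not an arbitrary sub-expression.
    """
    parts: List[str] = []
    run: List[str] = []  # current run of adjacent single characters

    def flush() -> None:
        if len(run) >= 2:
            # dict.fromkeys removes duplicates while keeping first-seen order.
            unique = dict.fromkeys(run)
            parts.append("[" + "".join(_escape_in_class(c) for c in unique) + "]")
        elif run:
            # A lone single character stays a plain (escaped) literal.
            parts.append(re.escape(run[0]))
        run.clear()

    for lit in literals:
        if len(lit) == 1:
            run.append(lit)
        else:
            flush()
            parts.append(re.escape(lit))
    flush()
    return "|".join(parts)


print(merge_single_char_alternatives(["a", "b", "c"]))
print(merge_single_char_alternatives(["a", "b", "foo", "c"]))
```

In this sketch the second call leaves `c` unmerged because `foo` sits between it and the earlier run. Merging across that gap would require checking that reordering does not change which alternative matches first.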
Additional context
For a reference implementation of this optimization, check out the `regexp/prefer-character-class` rule. Note that this rule also does some interesting analysis to merge non-adjacent single-character alternatives.