Generate a string that matches a given regular expression. This is useful when you want to mock strings from a regexp rule. It strictly follows the standard JavaScript regex rules, but you still need to pay attention to the special cases.
- Supports named capture groups, e.g. `(?<named>\w)\k<named>`, and allows overriding them through the `namedGroupConf` config field.
- Supports unicode property classes such as `\p{Lu}` by setting the static `UPCFactory` handler; see the example for more details.
- Supports the `u` flag, so you can use unicode ranges.
- Lets you get the capture group values.
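Since a generated string is meant to match its source pattern, one way to sanity-check an output is to test it against the native regex. The sketch below uses only the built-in `RegExp`. `matchesFully` is a hypothetical helper, not part of reregexp, and the sample strings merely stand in for values a generator might return.

```javascript
// Hypothetical helper (not part of reregexp): checks that `text` matches
// `re` from start to end by wrapping the source in ^(?:...)$.
function matchesFully(re, text) {
  const anchored = new RegExp(`^(?:${re.source})$`, re.flags);
  return anchored.test(text);
}

// Named capture group with a named backreference.
const named = /(?<named>\w)\k<named>/;
console.log(matchesFully(named, 'aa')); // true
console.log(matchesFully(named, 'ab')); // false

// Unicode property class, which requires the `u` flag.
const upper = /\p{Lu}/u;
console.log(matchesFully(upper, 'A')); // true
console.log(matchesFully(upper, 'a')); // false
```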
# npm
npm install --save reregexp
# or yarn
yarn add reregexp
// Commonjs module
const ReRegExp = require('reregexp').default;
// ESM module
// since v1.6.1
import ReRegExp from 'reregexp';
// before v1.6.1
import re from 'reregexp';
const ReRegExp = re.default;
// For the first parameter of the constructor,
// you can use a regex literal or a RegExp string.
// If you need features that are not well supported by all browsers,
// such as named groups, you should always choose a RegExp string.
// Example 1: use group reference
const r1 = new ReRegExp(/([a-z0-9]{3})_\1/);
r1.build(); // => 'a2z_a2z' '13d_13d'
// Example 2: use named group
const r2 = new ReRegExp(/(?<named>\w{1,2})_\1_\k<named>/);
r2.build(); // => 'b5_b5_b5' '9_9_9'
// Example 3: use a named group together with the `namedGroupConf` config
// it will use the strings in the config instead of the string that the named group would generate
// of course, it will throw an error if a string in the config does not match the rule of the named group.
const r3 = new ReRegExp('/(a)\\1(?<named>b)\\k<named>(?<override>\\w+)/', {
namedGroupConf: {
override: ['cc', 'dd'],
},
});
r3.build(); // => "aabbcc" "aabbdd"
// Example 4: use a character set
const r4 = new ReRegExp(/[^\w\W]+/);
r4.build(); // throws an error, because [^\w\W] matches nothing.
// Example 5: also a character set with the negation operator
const r5 = new ReRegExp(/[^a-zA-Z0-9_\W]/);
r5.build(); // throws an error, this is the same as [^\w\W]
// Example 6: with the `i` flag, ignore the case.
const r6 = new ReRegExp(/[a-z]{3}/i);
r6.build(); // => 'bZD' 'Poe'
// Example 7: with the `u` flag, e.g. generate some Chinese characters.
const r7 = new ReRegExp('/[\\u{4e00}-\\u{9fcc}]{5,10}/u');
r7.build(); // => '偤豄酌菵呑', '孜垟与醽奚衜踆猠'
// Example 8: set a global `maxRepeat` for quantifiers such as '*' and '+'.
ReRegExp.maxRepeat = 10;
const r8 = new ReRegExp(/a*/);
r8.build(); // => 'aaaaaaa', 'a' will be repeated at most 10 times.
// Example 9: set `maxRepeat` in the constructor config; it overrides the global `maxRepeat`.
const r9 = new ReRegExp(/a*/, {
  maxRepeat: 20,
});
r9.build(); // => 'aaaaaaaaaaaaaa', 'a' will be repeated at most 20 times.
// Example 10: use the `extractSetAverage` config for character sets.
const r10 = new ReRegExp(/[\Wa-z]/, {
  // \W is expanded into all the characters it matches, so a-z no longer has the same chance as \W
  extractSetAverage: true,
});
// Example 11: use the `capture` config if you care about the capture data
const r11 = new ReRegExp(/(aa?)b(?<named>\w)/, {
  capture: true, // if you care about the group capture data, set the `capture` config to true
});
r11.build(); // => 'abc'
console.log(r11.$1); // => 'a'
console.log(r11.$2); // => 'c'
console.log(r11.groups); // => {named: 'c'}
// Example 12: use the unicode property class by setting the `UPCFactory`
ReRegExp.UPCFactory = (data: UPCData) => {
/*
UPCData: {
negate: boolean; // if the symbol is 'P'
short: boolean; // take '\pL' as a short for '\p{Letter}'
key?: string; // if has a unicode property name, such as `Script`
value: string; // unicode property value, binary or non-binary
}
*/
return {
generate(){
return 'x'; // return an object that has a `generate` method.
}
}
};
const r12 = new ReRegExp('/\\p{Lu}/u');
console.log(r12.build()); // => 'x', the value is produced by the `UPCFactory` handler.
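A factory that returns a fixed `'x'` is only a placeholder. The following is a minimal sketch of a handler that picks a random character for a few property values. It assumes `data.value` holds the short property name (for example `'Lu'`); the name `sampleUPCFactory` and the ASCII-only tables are illustrative assumptions, and a real handler would need full Unicode data.

```javascript
// Illustrative ASCII-only samples per property value (an assumption,
// not data shipped with reregexp).
const SAMPLE_CHARS = new Map([
  ['Lu', 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'],
  ['Ll', 'abcdefghijklmnopqrstuvwxyz'],
  ['Nd', '0123456789'],
]);

// Hypothetical factory with the same shape as the `UPCFactory` example:
// it receives the parsed property data and returns an object that has
// a `generate` method.
function sampleUPCFactory(data) {
  const chars = SAMPLE_CHARS.get(data.value);
  if (data.negate || chars === undefined) {
    // Negated classes and unknown values are out of scope for this sketch.
    throw new Error(`unsupported unicode property: ${data.value}`);
  }
  return {
    generate() {
      const index = Math.floor(Math.random() * chars.length);
      return chars[index];
    },
  };
}

const handler = sampleUPCFactory({ negate: false, short: false, value: 'Lu' });
console.log(/^\p{Lu}$/u.test(handler.generate())); // true
```

Following the pattern of Example 12, such a function would be assigned with `ReRegExp.UPCFactory = sampleUPCFactory`.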
// The meaning of the config fields can be seen in the examples.
{
maxRepeat?: number;
namedGroupConf?: {
[index: string]: string[]|boolean;
};
extractSetAverage?: boolean;
capture?: boolean;
}
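Example 3 notes that an override string which does not match the rule of its named group triggers an error. One way to catch this early is to pre-check the candidates with the native `RegExp`. The helper `findInvalidOverrides` below is hypothetical and is not part of reregexp; how the library validates overrides internally is not shown here.

```javascript
// Hypothetical pre-check (not part of reregexp): returns the override
// strings that do not fully match the named group's sub-pattern.
function findInvalidOverrides(groupSource, overrides) {
  const rule = new RegExp(`^(?:${groupSource})$`);
  return overrides.filter((text) => !rule.test(text));
}

// The `override` group from Example 3 is (?<override>\w+).
const ok = findInvalidOverrides('\\w+', ['cc', 'dd']);
const bad = findInvalidOverrides('\\w+', ['cc', 'd-d']);
console.log(ok.length); // 0, every candidate matches \w+
console.log(bad);       // only 'd-d' is reported
```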
- `i` ignore case, `/[a-z]/i` is the same as `/[a-zA-Z]/`
- `u` unicode flag
- `s` dot all flag

The flags `g`, `m`, and `y` will be ignored.
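The effect of the three supported flags can be observed with native JavaScript regexes. The snippet below is a small illustration that uses only the built-in `RegExp` and does not call reregexp.

```javascript
// `i`: /[a-z]/i accepts the same ASCII characters as /[a-zA-Z]/.
const samples = ['a', 'Z', 'q', 'M', '5', '_'];
const sameResult = samples.every(
  (c) => /^[a-z]$/i.test(c) === /^[a-zA-Z]$/.test(c)
);
console.log(sameResult); // true

// `s`: with the dot all flag, `.` also matches line terminators.
console.log(/^.$/.test('\n'));  // false
console.log(/^.$/s.test('\n')); // true

// `u`: enables \u{...} code point escapes, so unicode ranges work.
console.log(/^[\u{4e00}-\u{9fcc}]$/u.test('中')); // true
```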
`.build()`
builds a string that matches the regexp.
`.info()`
returns the regexp's parsed queues, flags, and lastRule (the rule after named captures are removed).
{
rule: '',
context: '',
flags: [],
lastRule: '',
queues: [],
}
- `^` `$` the start and end anchors will be ignored.
- `(?=)` `(?!)` `(?<=)` `(?<!)` the regexp lookahead and lookbehind assertions will throw an error when `build()` runs.
- `\b` `\B` will be ignored.
- `/\1(o)/` the capture group `\1` will match null, so `build()` will just output `o`, and `/^\1(o)$/.test('o') === true`
- `/(o)\1\2/` the capture group `\2` will be treated as a unicode code point, so `build()` will output `oo\u0002`. `/^(o)\1\2$/.test('oo\u0002') === true`
- `/(o\1)/` the capture group `\1` will match null, `build()` will output `o`, `/^(o\1)$/.test('o') === true`
- `/[]/` empty character class, the `build()` method will throw an error, because no character will match it.
- `/[^]/` negated empty character class, the `build()` method will output any character.
- `/[^\w\W]/` for negated character sets, if all the characters are eliminated, `build()` will throw an error. The same applies to `/[^a-zA-Z0-9_\W]/`, `/[^\s\S]/`, ...
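The native behavior behind these special cases can be checked directly in a JavaScript engine. The snippet below uses only built-in regex literals without the `u` flag and does not call reregexp.

```javascript
// Forward reference: \1 appears before group 1 has captured anything,
// so it matches the empty string.
console.log(/^\1(o)$/.test('o')); // true

// Only one group exists, so \2 is read as an escape for U+0002.
console.log(/^(o)\1\2$/.test('oo\u0002')); // true

// A reference to a group from inside that same group also matches empty.
console.log(/^(o\1)$/.test('o')); // true

// [] matches no character; [^] matches any character, including newlines.
console.log(/[]/.test('a'));   // false
console.log(/[^]/.test('\n')); // true

// Negated sets that exclude every character can never match.
console.log(/[^\w\W]/.test('a_ \n')); // false
console.log(/[^\s\S]/.test('a_ \n')); // false
```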
If you run into any question or bug, you are welcome to report it by opening an issue.