regex <[ ]>
Documentation for regex <[ ]> assembled from the following types:
Sometimes the pre-existing wildcards and character classes are not enough. Fortunately, defining your own is fairly simple. Within <[ ]>, you can put any number of single characters and ranges of characters (expressed with two dots between the end points), with or without whitespace.
"abacabadabacaba" ~~ / <[ a .. c 1 2 3 ]>* /;
# Unicode hex codepoint range
"ÀÁÂÃÄÅÆ" ~~ / <[ \x[00C0] .. \x[00C6] ]>* /;
# Unicode named codepoint range
"αβγ" ~~ /<[\c[GREEK SMALL LETTER ALPHA]..\c[GREEK SMALL LETTER GAMMA]]>*/;
Within the < > you can use + and - to add or remove multiple range definitions and even mix in some of the Unicode categories above. You can also write the backslashed forms for character classes between the [ ].
/ <[\d] - [13579]> /;
# starts with \d and removes odd ASCII digits, but not quite the same as
/ <[02468]> /;
# because the first one also contains "weird" unicodey digits
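The difference between the two classes above only shows up with a digit outside ASCII. The following is a minimal sketch, assuming the patterns are <[\d] - [13579]> and <[02468]>; the variable names are illustrative and not part of the original documentation.

```raku
# U+0664 is a decimal digit in the Unicode sense, so \d matches it.
my $arabic-four = "\c[ARABIC-INDIC DIGIT FOUR]";

# \d minus the odd ASCII digits still contains the non-ASCII digit.
my $with-difference = so $arabic-four ~~ / <[\d] - [13579]> /;

# The explicit list holds only the five even ASCII digits.
my $ascii-only = so $arabic-four ~~ / <[02468]> /;

# An odd ASCII digit is removed by the set difference.
my $odd-digit = so '7' ~~ / <[\d] - [13579]> /;

say $with-difference;  # True
say $ascii-only;       # False
say $odd-digit;        # False
```

If only the ASCII digits are wanted, the explicit list is the safer choice.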
To negate a character class, put a - after the opening angle bracket:
say 'no quotes' ~~ / <-[ " ]> + /; # matches characters except "
A common pattern for parsing quote-delimited strings involves negated character classes:
say '"in quotes"' ~~ / '"' <-[ " ]> * '"'/;
This regex first matches a quote, then any characters that aren't quotes, and then a quote again. The meaning of * and + in the examples above is explained in the section on quantifiers.
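The quote-matching pattern becomes more useful when the text between the quotes is captured. This is a minimal sketch under stated assumptions: the input line and the variable names are made up for illustration, and the parentheses are an ordinary positional capture around the negated class.

```raku
# Hypothetical input: a quoted value embedded in a longer line.
my $line = 'name = "Ada Lovelace" # comment';

my $quoted;
# Opening quote, zero or more non-quote characters (captured), closing quote.
if $line ~~ / '"' ( <-[ " ]>* ) '"' / {
    $quoted = ~$0;   # stringify the first positional capture
}

say $quoted;  # Ada Lovelace
```

Whitespace inside the class is not significant, so the class excludes only the quote character and the space inside the quoted value still matches.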
Just as you can use the - for both set difference and negation of a single value, you can also explicitly put a + in front:
/ <+[123]> / # same as <[123]>