regex <:property>

The character classes mentioned so far are mostly for convenience; another approach is to use Unicode character properties. These come in the form <:property>, where property can be a short or long Unicode General Category name. Properties that take a value, such as Script or Block, use pair syntax.

To match against a Unicode Property:

"a".uniprop('Script');                 # OUTPUT: «Latin␤» 
"a" ~~ / <:Script<Latin>> /;           # OUTPUT: «「a」␤» 
"a".uniprop('Block');                  # OUTPUT: «Basic Latin␤» 
"a" ~~ / <:Block('Basic Latin')> /;    # OUTPUT: «「a」␤» 

Unicode General Categories:

Short  Long
L      Letter
LC     Cased_Letter
Lu     Uppercase_Letter
Ll     Lowercase_Letter
Lt     Titlecase_Letter
Lm     Modifier_Letter
Lo     Other_Letter
M      Mark
Mn     Nonspacing_Mark
Mc     Spacing_Mark
Me     Enclosing_Mark
N      Number
Nd     Decimal_Number (also Digit)
Nl     Letter_Number
No     Other_Number
P      Punctuation (also punct)
Pc     Connector_Punctuation
Pd     Dash_Punctuation
Ps     Open_Punctuation
Pe     Close_Punctuation
Pi     Initial_Punctuation (may behave like Ps or Pe depending on usage)
Pf     Final_Punctuation (may behave like Ps or Pe depending on usage)
Po     Other_Punctuation
S      Symbol
Sm     Math_Symbol
Sc     Currency_Symbol
Sk     Modifier_Symbol
So     Other_Symbol
Z      Separator
Zs     Space_Separator
Zl     Line_Separator
Zp     Paragraph_Separator
C      Other
Cc     Control (also cntrl)
Cf     Format
Cs     Surrogate
Co     Private_Use
Cn     Unassigned

For example, <:Lu> matches a single, upper-case letter.

Its negation is this: <:!property>. So, <:!Lu> matches a single character that isn't an upper-case letter.
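
As a minimal sketch of the two forms above (the sample strings are made up for illustration), a short name and its long equivalent should select the same characters, and the negated form matches anything outside the category:

```raku
# Short and long General Category names select the same characters.
say so 'A' ~~ / ^ <:Lu> $ /;                # True
say so 'A' ~~ / ^ <:Uppercase_Letter> $ /;  # True
say so 'a' ~~ / ^ <:Lu> $ /;                # False

# <:!Lu> matches one character that is not an upper-case letter,
# so the first match in 'Hello' starts after the 'H'.
say ~('Hello' ~~ / <:!Lu>+ /);              # ello
```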

Categories can be used together, with an infix operator:

Operator  Meaning
+         set union
|         set union
&         set intersection
-         set difference (first minus second)
^         symmetric set difference (XOR)

To match either a lower-case letter or a number, write <:Ll+:N> or <:Ll+:Number> or <+ :Lowercase_Letter + :Number>.
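
The following sketch shows how such combined classes behave on a made-up sample string. It exercises only the + and - operators; the string 'Ab3_' and the use of comb to list every match are illustrative choices, not part of the original text.

```raku
# Union with +: lower-case letters or numbers.
say 'Ab3_'.comb(/ <:Ll+:N> /);                         # (b 3)

# The long names spell the same union.
say 'Ab3_'.comb(/ <+ :Lowercase_Letter + :Number> /);  # (b 3)

# Difference with -: any letter except upper-case ones.
say 'Ab3_'.comb(/ <:L - :Lu> /);                       # (b)
```

Here 'A' is excluded from the union because it is Lu, and '_' is excluded everywhere because it belongs to Pc (Connector_Punctuation).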

A combined class like this can also be wrapped in parentheses to capture the character it matches; for example:

'perl6' ~~ m{\w+(<:Ll+:N>)}  # OUTPUT: «0 => 「6」␤»