regex :

Documentation for regex : assembled from the following types:

language documentation Regexes

You can prevent backtracking in regexes by attaching a : modifier to the quantifier:

my $str = "ACG GCT ACT An interesting chain";
say $str ~~ /<[ACGT\s]>+ \s+ (<[A..Z a..z \s]>+)/;
# OUTPUT: «「ACG GCT ACT An interesting chain」␤ 0 => 「An interesting chain」␤» 
say $str ~~ /<[ACGT\s]>+: \s+ (<[A..Z a..z \s]>+)/;
# OUTPUT: «Nil␤» 

In the second case, the A in An had already been "absorbed" by the first quantifier and, because the : forbids backtracking, could not be given back. The part of the pattern that starts at \s+ therefore has nothing left to match. This failure is usually not what we are after; more often, backtracking is prevented in order to match precisely what we are looking for.
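To see what was absorbed, it can help to run the first quantified atom on its own. This is only an illustrative sketch; the variable name $consumed is an assumption and not part of the original example.

```raku
my $str = "ACG GCT ACT An interesting chain";

# Match only the first quantified character class, with nothing after it.
my $consumed = $str ~~ /<[ACGT\s]>+/;

say $consumed.Str;
# OUTPUT: «ACG GCT ACT A␤»
```

The greedy match stops right before the n, so \s+ can only succeed if the quantifier is allowed to give back the A and the space before it.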

In most cases, you will want to prevent backtracking for efficiency reasons, for instance here:

    say $str ~~ m:g/[(<[ACGT]> **: 3) \s*]+ \s+ (<[A..Z a..z \s]>+)/;
    # OUTPUT: 
    # (「ACG GCT ACT An interesting chain」 
    # 0 => 「ACG」 
    # 0 => 「GCT」 
    # 0 => 「ACT」 
    # 1 => 「An interesting chain」) 

In this case, however, removing the : after ** would not change the behavior, since ** 3 must match exactly three characters and leaves nothing to backtrack over. The best use is to create tokens that will not be backtracked:

$_ = "ACG GCT ACT IDAQT";
say m:g/[(\w+:) \s*]+ (\w+) $$/;
# OUTPUT: 
# (「ACG GCT ACT IDAQT」 
# 0 => 「ACG」 
# 0 => 「GCT」 
# 0 => 「ACT」 
# 1 => 「IDAQT」) 

Without the : following \w+, the final capture would have been just T: the pattern would first match everything greedily and then backtrack only far enough to leave a single letter for the \w+ expression at the end of the line.
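The difference can be checked side by side. The following is a minimal sketch, not part of the original example: it drops the :g adverb for simplicity, and the variable names $line, $ratcheted and $backtracking are assumptions.

```raku
my $line = "ACG GCT ACT IDAQT";

# With `:` after \w+, each word is matched whole and never given back.
my $ratcheted = $line ~~ / [(\w+:) \s*]+ (\w+) $$ /;

# Without `:`, the last repetition of \w+ may give characters back.
my $backtracking = $line ~~ / [(\w+) \s*]+ (\w+) $$ /;

say $ratcheted[1].Str;     # OUTPUT: «IDAQT␤»
say $backtracking[1].Str;  # OUTPUT: «T␤»
```

In both cases the match as a whole covers the entire line; only the way the words are divided between the two captures differs.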