Traps to avoid

Traps to avoid when getting started with Perl 6

When learning a programming language, possibly with the background of being familiar with another programming language, there are always some things that can surprise you and might cost valuable time in debugging and discovery.

This document aims to show common misconceptions.

During the making of Perl 6 great pains were taken to get rid of warts in the syntax. When you whack one wart, though, sometimes another pops up. So a lot of time was spent finding the minimum number of warts or trying to put them where they would rarely be seen. Because of this, Perl 6's warts are in different places than you may expect them to be when coming from another language.

Variables and Constants

Constants are Compile Time

Constants are computed at compile time, so if you use them in modules keep in mind their values will be frozen due to pre-compilation:

# WRONG (most likely): 
unit module Something::Or::Other;
constant $config-file = "config.txt".IO.slurp;

The $config-file will be slurped during precompilation, and changes to the config.txt file won't be re-loaded when you start the script again; only when the module is re-compiled.

Binding a value to a variable, rather than assigning it to a container, offers behaviour similar to a constant, so use that instead if you want the value to get updated:

# Good; file gets updated from 'config.txt' file on each script run: 
unit module Something::Or::Other;
my $config-file := "config.txt".IO.slurp;

Objects

Assigning to attributes

Newcomers often think that, because attributes with accessors are declared as has $.x, they can assign to $.x inside the class. That's not the case.

For example

use v6.c;
class Point {
    has $.x;
    has $.y;
    method double {
        $.x *= 2;   # WRONG 
        $.y *= 2;   # WRONG 
        self;
    }
}
 
say Point.new(x => 1, y => -2).double.x

dies with

Cannot modify an immutable Int

in the first line marked with # WRONG, because $.x, short for $( self.x ), is a call to a read-only accessor.

The syntax has $.x is short for something like has $!x; method x() { $!x }, so the actual attribute is called $!x, and a read-only accessor method is automatically generated.

Thus the correct way to write method double is

method double {
    $!x *= 2;
    $!y *= 2;
    self;
}

which operates on the attributes directly.
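For reference, here is a minimal complete sketch of the class with the corrected method, reusing the Point example from above:

class Point {
    has $.x;
    has $.y;
    method double {
        $!x *= 2;   # modify the attributes directly 
        $!y *= 2;
        self;
    }
}

say Point.new(x => 1, y => -2).double.x;   # OUTPUT: «2␤» 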

BUILD prevents automatic attribute initialization from constructor arguments

When you define your own BUILD submethod, you must take care of initializing all attributes yourself. For example

use v6.c;
class A {
    has $.x;
    has $.y;
    submethod BUILD {
        $!y = 18;
    }
}
 
say A.new(x => 42).x;       # OUTPUT: «Any␤» 

leaves $!x uninitialized, because the custom BUILD doesn't initialize it.

One possible remedy is to explicitly initialize the attribute in BUILD:

submethod BUILD(:$x) {
    $!y = 18;
    $!x := $x;
}

which can be shortened to:

submethod BUILD(:$!x) {
    $!y = 18;
}

Another, more general approach is to leave BUILD alone, and hook into the BUILDALL mechanism instead:

use v6.c;
class A {
    has $.x;
    has $.y;
    method BUILDALL(|c) {
        callsame;
        $!y = 18;
        self
    }
}
 
say A.new(x => 42).x;       # OUTPUT: «42␤» 

(Note that BUILDALL is a method, not a submethod. That's because by default, there is only one such method per class hierarchy, whereas BUILD is explicitly called per class. This also explains why you need the callsame inside BUILDALL, but not inside BUILD.)

Regexes

Whitespace in Regexes does not match literally

$ perl6 -e "say 'a b' ~~ /a b/"
False

Whitespace in regexes is, by default, considered an optional filler without semantics, just like in the rest of the Perl 6 language.

Ways to match whitespace:
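The following sketch shows a few common options. It is not an exhaustive list, and the test string 'a b' is simply the one from the example above. The so prefix is used here only to reduce each match result to a Bool for printing:

say so 'a b' ~~ /a \s b/;      # \s matches any single whitespace character 
say so 'a b' ~~ /a \s+ b/;     # \s+ matches one or more whitespace characters 
say so 'a b' ~~ /a ' ' b/;     # a quoted blank matches a literal blank 
say so 'a b' ~~ /a \h b/;      # \h matches horizontal whitespace 
say so 'a b' ~~ /a <.ws> b/;   # the built-in ws rule 
say so 'a b' ~~ m:s/a b/;      # with :s (:sigspace), blanks in the regex are significant 
# each line above prints «True␤» 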

Captures

Containers versus values in a Capture

Beginners might expect a variable in a Capture to supply its current value when that Capture is later used. For example:

$ perl6 -e 'my $a = 2; say join ",", ($a, ++$a)'
3,3

Here the Capture contained the container pointed to by $a and the value of the result of the expression ++$a. Since the Capture must be reified before &say can use it, the ++$a may happen before &say looks inside the container in $a and so it may already be incremented.

Instead, use an expression that produces a value when you want a value.

$ perl6 -e 'my $a = 2; say join ",", (+$a, ++$a)'
2,3

Arrays

Referencing the last element of an array

In Perl 5 one could reference the last element of an array by asking for the "-1th" element of the array, e.g.:

my @array = qw{victor alice bob charlie eve};
say @array[-1];    # OUTPUT: «eve␤» 

In Perl 6 it is not possible to use negative subscripts; however, the same is achieved by using a function, namely *-1. Thus accessing the last element of an array becomes:

my @array = qw{victor alice bob charlie eve};
say @array[*-1];   # OUTPUT: «eve␤» 

Yet another way is to utilize the array's tail method:

my @array = qw{victor alice bob charlie eve};
say @array.tail;      # OUTPUT: «eve␤» 
say @array.tail(2);   # OUTPUT: «(charlie eve)␤» 

Typed Array parameters

Quite often new users will happen to write something like:

sub foo(Array @a) { ... }

...before they have gotten far enough in the documentation to realize that this is asking for an Array of Arrays. To say that @a should only accept Arrays, use instead:

sub foo(@a where Array) { ... }

It is also common to expect this to work, when it does not:

sub bar(Int @a) { 42.say };
bar([1, 2, 3]);             # expected Positional[Int] but got Array 

The problem here is that [1, 2, 3] is not an Array[Int], it is a plain old Array that just happens to have Ints in it. To get it to work, the argument must also be an Array[Int].

my Int @b = 1, 2, 3;
bar(@b);                    # OUTPUT: «42␤» 
bar(Array[Int].new(1, 2, 3));

This may seem inconvenient, but on the upside it moves the type-check on what is assigned to @b to where the assignment happens, rather than requiring every element to be checked on every call.

Strings

Quotes and interpolation

Interpolation in string literals can be too clever for your own good.

"$foo<html></html>" # Perl 6 understands that as: 
$foo<html> ~ '</html>'
 
"$foo(" ~ @args ~ ")" # Perl 6 understands that as: 
$foo(" ~ @args ~ ")   # and will produce very confusing error messages 

You can avoid those problems by surrounding all variables with {}, with or without using Q:c as your quoting construct.

my $a = 1;
say Q:c«{$a}()$b()»;
# OUTPUT: «1()$b()␤» 

Strings are not iterable

There are methods that Str inherits from Any that work on iterables like lists. Iterators on Strings contain one element that is the whole string. To use sort, reverse and friends, you need to split the string into a list first.

say "cba".sort;              # OUTPUT: «(cba)␤» (probably not what you wanted) 
say "cba".comb.sort.join;    # OUTPUT: «abc␤» 

Operators

Some operators commonly shared among other languages were repurposed in Perl 6 for other, more common, things:

Junctions

The ^, |, and & are not bitwise operators, they create Junctions. The corresponding bitwise operators in Perl 6 are: +^, +|, +& for integers and ?^, ?|, ?& for booleans.
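As a small illustration of the difference, using arbitrary example values:

say 5 +& 3;    # bitwise AND; OUTPUT: «1␤» 
say 5 +| 3;    # bitwise OR;  OUTPUT: «7␤» 
say 5 +^ 3;    # bitwise XOR; OUTPUT: «6␤» 
say 5 | 3;     # a Junction;  OUTPUT: «any(5, 3)␤» 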

String Ranges/Sequences

In some languages, using strings as range end points considers the entire string when figuring out what the next string should be, loosely treating the strings as numbers in a large base. Here's the Perl 5 version:

say join ", ", "az".."bc"
az, ba, bb, bc

Such a range in Perl 6 will produce a different result, where each letter will be ranged to a corresponding letter in the end point, producing more complex sequences:

say join ", ", "az".."bc"
az, ay, ax, aw, av, au, at, as, ar, aq, ap, ao, an, am, al, ak, aj, ai, ah,
ag, af, ae, ad, ac, bz, by, bx, bw, bv, bu, bt, bs, br, bq, bp, bo, bn, bm,
bl, bk, bj, bi, bh, bg, bf, be, bd, bc
 
say join ", ", "r2".."t3"
r2, r3, s2, s3, t2, t3

To achieve simpler behaviour, similar to the Perl 5 example above, use a sequence operator that calls the .succ method on the starting string:

say join ", ", ("az", *.succ ... "bc")
az, ba, bb, bc

Topicalizing Operators

The smart match operator ~~ and andthen set the topic $_ to their LHS. In conjunction with implicit method calls on the topic this can lead to surprising results.

my &method = { note $_; $_ };
$_ = 'object';
say .&method;
# OUTPUT: «object␤object␤» 
say 'topic' ~~ .&method;
# OUTPUT: «topic␤True␤» 

In many cases flipping the method call to the LHS will work.

my &method = { note $_; $_ };
$_ = 'object';
say .&method;
# OUTPUT: «object␤object␤» 
say .&method ~~ 'topic';
# OUTPUT: «object␤False␤» 

Fat Arrow and Constants

The fat arrow operator => will turn words on its left hand side to Str without checking the scope for constants or \-sigiled variables. Use explicit scoping to get what you mean.

constant V = 'x';
my %h = V => 'oi‽', ::V => 42;
say %h.perl
# OUTPUT: «{:V("oi‽"), :x(42)}␤» 

Common Precedence Mistakes

Adverbs and Precedence

Adverbs do have a precedence that may not follow the order of operators that is displayed on your screen. If two operators of equal precedence are followed by an adverb it will pick the first operator it finds in the abstract syntax tree. Use parentheses to help Perl 6 understand what you mean or use operators with looser precedence.

my %x = a => 42;
say !%x<b>:exists;            # dies with X::AdHoc 
say %x<b>:!exists;            # this works 
say !(%x<b>:exists);          # works too 
say not %x<b>:exists;         # works as well 
say True unless %x<b>:exists; # avoid negation altogether 

Ranges and Precedence

The loose precedence of .. can lead to some errors. It is usually best to parenthesize ranges when you want to operate on the entire range.

1..3.say    # says "3" (and warns about useless "..") 
(1..3).say  # says "1..3" 

Loose boolean operators

The precedence of and, or, etc. is looser than that of routine calls. This can have surprising results for calls to routines that would be operators or statements in other languages, like return, last and many others.

sub f {
    return True and False;
    # this is actually 
    # (return True) and False; 
}
say f; # OUTPUT: «True␤» 

Exponentiation Operator and Prefix Minus

say -1²;   # OUTPUT: «-1␤» 
say -1**2; # OUTPUT: «-1␤» 

When performing a regular mathematical calculation, the power takes precedence over the minus; so -1² can be written as -(1²). Perl 6 matches these rules of mathematics and the precedence of ** operator is tighter than that of the prefix -. If you wish to raise a negative number to a power, use parentheses:

say (-1)²;   # OUTPUT: «1␤» 
say (-1)**2; # OUTPUT: «1␤» 

The operator infix:<but> binds tighter than the list constructor. When providing a list of roles to mix in, always use parentheses.

role R1 { method m {} }
role R2 { method n {} }
my $a = 1 but R1,R2; # R2 is in sink context 
say $a.^name;
# OUTPUT: «Int+{R1}␤» 

Subroutine and method calls

Subroutine and method calls can be made using one of two forms:

foo(...); # function call form, where ... represent the required arguments 
foo ...;  # list op form, where ... represent the required arguments 

The function call form can cause problems for the unwary when whitespace is added after the function or method name and before the opening parenthesis.

First we consider functions with zero or one parameter:

sub foo() { say 'no arg' }
sub bar($a) { say "one arg: $a" }

Then execute each with and without a space after the name:

foo();    # okay: no arg 
foo ();   # FAIL: Too many positionals passed; expected 0 arguments but got 1 
bar($a);  # okay: one arg: 1 
bar ($a); # okay: one arg: 1 

Now declare a function of two parameters:

sub foo($a, $b) { say "two args: $a, $b" }

Execute it with and without the space after the name:

foo($a, $b);  # okay: two args: 1, 2 
foo ($a, $b); # FAIL: Too few positionals passed; expected 2 arguments but got 1 

The lesson is: "be careful with spaces following sub and method names when using the function call format." As a general rule, good practice might be to avoid the space after a function name when using the function call format.

Note that there are clever ways to eliminate the error with the function call format and the space, but that is bordering on hackery and will not be mentioned here. For more information, consult Functions.

Finally, note that, currently, when declaring functions, whitespace may be used between a function or method name and the parentheses surrounding the parameter list without problems.

Named Parameters

Many built-in subroutines and method calls accept named parameters and your own code may accept them as well, but be sure the arguments you pass when calling your routines are actually named parameters:

sub foo($a, :$b) { ... }
foo(1, 'b' => 2); # FAIL: Too many positionals passed; expected 1 argument but got 2 

What happened? That second argument is not a named parameter argument, but a Pair passed as a positional argument. If you want a named parameter it has to look like a name to Perl:

foo(1, b => 2); # okay 
foo(1, :b(2));  # okay 
foo(1, :b<it>); # okay 
 
my $b = 2;
foo(1, :b($b)); # okay, but redundant 
foo(1, :$b);    # okay 
 
# Or even... 
my %arg = 'b' => 2;
foo(1, |%arg);  # okay too 

That last one may be confusing, but since it uses the | prefix on a Hash, it means "treat this hash as holding named arguments."

If you really do want to pass them as pairs you should use a List or Capture instead:

my $list = ('b' => 2); # this is a List containing a single Pair 
foo(|$list, :$b); # okay: we passed the pair 'b' => 2 to the first argument 
foo(1, |$list);   # FAIL: Too many positionals passed; expected 1 argument but got 2 
 
my $cap = \('b' => 2); # a Capture with a single positional value 
foo(|$cap, :$b); # okay: we passed the pair 'b' => 2 to the first argument 
foo(1, |$cap);   # FAIL: Too many positionals passed; expected 1 argument but got 2 

A Capture is usually the best option for this as it works exactly like the usual capturing of routine arguments during a regular call.

The nice thing about the distinction here is that it gives the developer the option of passing pairs as either named or positional arguments, which can be handy in various instances.

Exception Handling

Sunk Proc

Some methods return a Proc object. If it represents a failed process, Proc itself won't be exception-like, but sinking it will cause an X::Proc::Unsuccessful exception to be thrown. That means this construct will throw, despite the try in place:

try run("perl6", "-e", "exit 42");
say "still alive";
# OUTPUT: «The spawned process exited unsuccessfully (exit code: 42)␤» 

This is because try receives a Proc and returns it, at which point it sinks and throws. Explicitly sinking it inside the try avoids the issue and ensures the exception is thrown inside the try:

try sink run("perl6", "-e", "exit 42");
say "still alive";
# OUTPUT: «still alive␤» 

If you're not interested in catching any exceptions, then use an anonymous variable to keep the returned Proc in; this way it'll never sink:

$ = run("perl6", "-e", "exit 42");
say "still alive";
# OUTPUT: «still alive␤» 

Using Shortcuts

The ^ twigil

Using the ^ twigil can save a fair amount of time and space when writing out small blocks of code. As an example:

for 1..8 -> $a, $b { say $a + $b }

can be shortened to just

for 1..8 { say $^a + $^b}

The trouble arises when a person wants to use more complex names for the variables, instead of just one letter. The ^ twigil is able to have the positional variables be out of order and named whatever you want, but assigns values based on the variable's Unicode ordering. In the above example, we can have $^a and $^b switch places, and those variables will keep their positional values. This is because the Unicode character 'a' comes before the character 'b'. For example:

#In order 
sub f1 { say "$^first $^second"}
f1 "Hello", "there";    # OUTPUT: «Hello there␤» 
 
#Out of order 
sub f2 { say "$^second $^first"}
f2 "Hello", "there";    # OUTPUT: «there Hello␤» 

Because the variables can be called anything, this can cause some problems if you are not accustomed to how Perl 6 handles these variables.

for 1..4 { say "$^one $^two $^three $^four"}    # OUTPUT: «2 4 3 1␤» 
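To see which position each name will get, sorting the names shows the order in which the arguments are assigned (a small sketch using the names from the example above):

say <one two three four>.sort;    # OUTPUT: «(four one three two)␤» 

So $^four receives 1, $^one receives 2, $^three receives 3 and $^two receives 4.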

Scope

Using a once block

The once block is a block of code that will only run once when its parent block is run. As an example:

my $var = 0;
for 1..10 {
    once { $var++}
}
say "Variable = $var";    # OUTPUT: «Variable = 1␤» 

This functionality also applies to other code blocks like sub and while, not just for loops. Problems arise though, when trying to nest once blocks inside of other code blocks:

my $var = 0;
for 1..10 {
    do { once { $var++} }
}
say "Variable = $var";    # OUTPUT: «Variable = 10␤» 

In the above example, the once block was nested inside of a code block which was inside of a for loop code block. This causes the once block to run multiple times, because the once block uses state variables to determine whether it has run previously. This means that if the parent code block goes out of scope, then the state variable the once block uses to keep track of whether it has run previously goes out of scope as well. This is why once blocks and state variables can cause some unwanted behaviour when buried within more than one code block.

If you want something that emulates the functionality of a once block, but still works when buried a few code blocks deep, you can build that functionality manually. Using the above example, you can change it so that the code only runs once, even when inside the do block, by changing the scope of the state variable.

my $var = 0;
for 1..10 {
    state $run-code = True;
    do { if ($run-code) { $run-code = False; $var++ } }
}
say "Variable = $var";    # OUTPUT: «Variable = 1␤» 

In this example, we essentially manually build a once block by making a state variable called $run-code at the highest level that will be run more than once, then checking to see if $run-code is True using a regular if. If the variable $run-code is True, then make the variable False and continue with the code that should only be completed once.

The main difference between using a state variable like the above example and using a regular once block is what scope the state variable is in. The scope for the state variable created by the once block is the same as where you put the block (imagine that the word 'once' is replaced with a state variable and an if to look at the variable). The example above using state variables works because the variable is at the highest scope that will be repeated, whereas the example that has a once block inside of a do made the variable within the do block, which is not the highest scope that is repeated.

Using a once block inside a class method will cause the once state to carry across all instances of that class. For example:

class A {
    method sayit() { once say 'hi' }
}
my $a = A.new;
$a.sayit;      # OUTPUT: «hi␤» 
my $b = A.new;
$b.sayit;      # nothing