Traps to avoid

Traps to avoid when getting started with Perl 6

When learning a programming language, possibly with the background of being familiar with another programming language, there are always some things that can surprise you and might cost valuable time in debugging and discovery.

This document aims to show common misconceptions.

During the making of Perl 6 great pains were taken to get rid of warts in the syntax. When you whack one wart, though, sometimes another pops up. So a lot of time was spent finding the minimum number of warts or trying to put them where they would rarely be seen. Because of this, Perl 6's warts are in different places than you may expect them to be when coming from another language.

Variables and Constants

Constants are Compile Time

Constants are computed at compile time, so if you use them in modules keep in mind their values will be frozen due to pre-compilation:

    # WRONG (most likely): 
    unit module Something::Or::Other;
    constant $config-file = "config.txt".IO.slurp;

The $config-file will be slurped during precompilation and changes to config.txt file won't be re-loaded when you start the script again; only when the module is re-compiled.

Avoiding using a container and binding a value to a variable can offer behaviour similar to constants, so use that instead, if you want the value to get updated:

    # Good; file gets updated from 'config.txt' file on each script run: 
    unit module Something::Or::Other;
    my $config-file := "config.txt".IO.slurp;

Objects

Assigning to attributes

Newcomers often think that, because attributes with accessors are declared as has $.x, they can assign to $.x inside the class. That's not the case.

For example

    use v6.c;
    class Point {
        has $.x;
        has $.y;
        method double {
            $.x *= 2;   # WRONG 
            $.y *= 2;   # WRONG 
            self;
        }
    }
 
    say Point.new(x => 1=> -2).double.x
    # OUTPUT: «Cannot assign to an immutable value␤» 

in the first line marked with # WRONG, because $.x, short for $( self.x ), is a call to a read-only accessor.

The syntax has $.x is short for something like has $!x; method x() { $!x }, so the actual attribute is called $!x, and a read-only accessor method is automatically generated.

Thus the correct way to write method double is

        method double {
            $!x *= 2;
            $!y *= 2;
            self;
        }

which operates on the attributes directly.

BUILD prevents automatic attribute initialization from constructor arguments

When you define your own BUILD submethod, you must take care of initializing all attributes yourself. For example

    use v6.c;
    class A {
        has $.x;
        has $.y;
        submethod BUILD {
            $!y = 18;
        }
    }
 
    say A.new(x => 42).x;       # OUTPUT: «Any␤» 

leaves $!x uninitialized, because the custom BUILD doesn't initialize it.

One possible remedy is to explicitly initialize the attribute in BUILD:

        submethod BUILD(:$x{
            $!y = 18;
            $!x := $x;
        }

which can be shortened to:

        submethod BUILD(:$!x{
            $!y = 18;
        }

Another, more general approach is to leave BUILD alone, and hook into the BUILDALL mechanism instead:

    use v6.c;
    class A {
        has $.x;
        has $.y;
        method BUILDALL(|c{
            callsame;
            $!y = 18;
            self
        }
    }
 
    say A.new(x => 42).x;       # OUTPUT: «42␤» 

(Note that BUILDALL is a method, not a submethod. That's because by default, there is only one such method per class hierarchy, whereas BUILD is explicitly called per class. Which also explains why you need the callsame inside BUILDALL, but not inside BUILD).

Whitespace

Whitespace in Regexes does not match literally

    say 'a b' ~~ /a b/# OUTPUT: «False␤» 

Whitespace in regexes is, by default, considered an optional filler without semantics, just like in the rest of the Perl 6 language.

Ways to match whitespace:

Ambiguities in Parsing

While some languages will let you get away with removing as much whitespace between tokens as possible, Perl 6 is less forgiving. The overarching mantra is we discourage code golf, so don't scrimp on whitespace (the more serious underlying reason behind these restrictions is single-pass parsing and ability to parse Perl 6 programs with virtually no backtracking).

The common areas you should watch out for are:

Block vs. Hash slice ambiguity

    # WRONG; trying to hash-slice a Bool: 
    while ($++ > 5){ .say }
    # RIGHT: 
    while ($++ > 5{ .say }
 
    # EVEN BETTER; Perl 6 does not require parentheses there: 
    while $++ > 5 { .say }

Reduction vs. Array constructor ambiguity

    # WRONG; ambiguity with `[<]` meta op: 
    my @a = [[<foo>],];
    # RIGHT; reductions cannot have spaces in them, so put one in: 
    my @a = [[ <foo>],];
 
    # No ambiguity here, natural spaces between items suffice to resolve it: 
    my @a = [[<foo bar ber>],];

Captures

Containers versus values in a Capture

Beginners might expect a variable in a Capture to supply its current value when that Capture is later used. For example:

    my $a = 2say join ",", ($a++$a):  # OUTPUT: «3,3␤» 

Here the Capture contained the container pointed to by $a and the value of the result of the expression ++$a. Since the Capture must be reified before &say can use it, the ++$a may happen before &say looks inside the container in $a and so it may already be incremented.

Instead, use an expression that produces a value when you want a value.

    my $a = 2say join ",", (+$a++$a); # OUTPUT: «2,3␤» 

Cool tricks

Perl6 includes a Cool class, which provides some of the DWIM behaviors we got used to in Perl5 by coercing arguments when necessary. However, DWIM is never perfect.

Strings are not Lists, so beware indexing

In Perl6, strings are not lists of characters. One cannot iterate over them or index into them as you can with lists, despite the name of the .index routine.

Lists become strings, so beware .index()ing

List inherits from Cool, which provides access to .index. Because of the way .index coerces a List into a Str, this can sometimes appear to be returning the index of an element in the list, but that is not how the behavior is defined.

    my @a = <a b c d>;
    say @a.index(a);    # 0 
    say @a.index('c');    # 4 -- not 2! 
    say @a.index('b c');  # 2 -- not undefined! 
    say @a.index(<a b>);  # 0 -- not undefined! 

These same caveats apply to .rindex.

Lists become strings, so beware .contains

Similarly, .contains does not look for elements in the list.

    my @menu = <hamburger fries milkshake>;
    say @menu.contains('hamburger');            # True 
    say @menu.contains('hot dog');              # False 
    say @menu.contains('milk');                 # True! 
    say @menu.contains('er fr');                # True! 
    say @menu.contains(<es mi>);                # True! 

If you actually want to check for the presence of an element, use the (cont) operator for single elements, and the superset and strict superset operators for multiple elements.

    my @menu = <hamburger fries milkshake>;
    say @menu (cont) 'fries';                   # True 
    say @menu (cont) 'milk';                    # False 
    say @menu (>) <hamburger fries>;            # True 
    say @menu (>) <milkshake fries>;            # True (! NB: order doesn't matter) 

If you are doing a lot of element testing, you may be better off using a Set.

Numeric literals are parsed before coercion

Experienced programmers will probably not be surprised by this, but Numeric literals will be parsed into their numeric value before being coerced into a string, which may create nonintuitive results.

    say 0xff.contains(55);      # True 
    say 0xff.contains(0xf);     # False 
    say 12_345.contains("23");  # True 
    say 12_345.contains("2_");  # False 

Getting a random item from a List

A common task is to retrieve one or more random elements from a collection, but List.rand isn't the way to do that. Cool provides rand, but that first coerces the List into the number of items in the list, and returns a random real number between 0 and that value. To get random elements, see pick and roll.

    my @colors = <red orange yellow green blue indigo violet>;
    say @colors.rand;       # 2.21921955680514 
    say @colors.pick;       # orange 
    say @colors.roll;       # blue 
    say @colors.pick(2);    # yellow violet  (cannot repeat) 
    say @colors.roll(3);    # red green red  (can repeat) 

Arrays

Referencing the last element of an array

In Perl 5 one could reference the last element of an array by asking for the "-1th" element of the array, e.g.:

    my @array = qw{victor alice bob charlie eve};
    say @array[-1];    # OUTPUT: «eve␤» 

In Perl 6 it is not possible to use negative subscripts, however the same is achieved by actually using a function, namely *-1. Thus accessing the last element of an array becomes:

    my @array = qw{victor alice bob charlie eve};
    say @array[*-1];   # OUTPUT: «eve␤» 

Yet another way is to utilize the array's tail method:

    my @array = qw{victor alice bob charlie eve};
    say @array.tail;      # OUTPUT: «eve␤» 
    say @array.tail(2);   # OUTPUT: «(charlie eve)␤» 

Typed Array parameters

Quite often new users will happen to write something like:

    sub foo(Array @a{ ... }

...before they have gotten far enough in the documentation to realize that this is asking for an Array of Arrays. To say that @a should only accept Arrays, use instead:

    sub foo(@a where Array{ ... }

It is also common to expect this to work, when it does not:

    sub bar(Int @a{ 42.say };
    bar([123]);             # expected Positional[Int] but got Array 

The problem here is that [1, 2, 3] is not an Array[Int], it is a plain old Array that just happens to have Ints in it. To get it to work, the argument must also be an Array[Int].

    my Int @b = 123;
    bar(@b);                    # OUTPUT: «42␤» 
    bar(Array[Int].new(123));

This may seem inconvenient, but on the upside it moves the type-check on what is assigned to @b to where the assignment happens, rather than requiring every element to be checked on every call.

Strings

Quotes and interpolation

Interpolation in string literals can be too clever for your own good.

    "$foo<html></html>" # Perl 6 understands that as: 
    "$foo{'html'}{'/html'}"
    "$foo(" ~ @args ~ ")" # Perl 6 understands that as: 
    "$foo(' ~ @args ~ ')"  

You can avoid those problems using non-interpolating single quotes and switching to more liberal interpolation with \qq[] escape sequence:

    my $a = 1;
    say '\qq[$a]()$b()';
    # OUTPUT: «1()$b()␤» 

Another alternative is to use Q:c quoter, and use code blocks {} for all interpolation:

    my $a = 1;
    say Q:c«{$a}()$b()»;
    # OUTPUT: «1()$b()␤» 

Strings are not iterable

There are methods that Str inherits from Any that work on iterables like lists. Iterators on strings contain one element that is the whole string. To use list-based methods like sort, reverse, you need to convert the string into a list first.

    say "cba".sort;              # OUTPUT: «(cba)␤» 
    say "cba".comb.sort.join;    # OUTPUT: «abc␤» 

Allomorphs Generally Follow Numeric Semantics

Str "0" is True, while Numeric is False. So what's the Bool value of allomorph <0>?

In general, allomorphs follow Numeric semantics, so the ones that numerically evaluate to zero are False:

say so   <0># OUTPUT: «False␤» 
say so <0e0># OUTPUT: «False␤» 
say so <0.0># OUTPUT: «False␤» 

To force comparison being done for the Stringy part of the allomorph, use prefix ~ operator or the Str method to coerce the allomorph to Str, or use the chars routine to test whether the allomorph has any length:

say so      ~<0>;     # OUTPUT: «True␤» 
say so       <0>.Str# OUTPUT: «True␤» 
say so chars <0>;     # OUTPUT: «True␤» 

Operators

Some operators commonly shared among other languages were repurposed in Perl 6 for other, more common, things:

Junctions

The ^, |, and & are not bitwise operators, they create Junctions. The corresponding bitwise operators in Perl 6 are: +^, +|, +& for integers and ?^, ?|, ?& for booleans.

String Ranges/Sequences

In some languages, using strings as range end points, considers the entire string when figuring out what the next string should be; loosely treating the strings as numbers in a large base. Here's Perl 5 version:

    say join """az".."bc";
    # OUTPUT: «az, ba, bb, bc␤» 

Such a range in Perl 6 will produce a different result, where each letter will be ranged to a corresponding letter in the end point, producing more complex sequences:

    say join """az".."bc";
    #`{ OUTPUT: «
        az, ay, ax, aw, av, au, at, as, ar, aq, ap, ao, an, am, al, ak, aj, ai, ah,
        ag, af, ae, ad, ac, bz, by, bx, bw, bv, bu, bt, bs, br, bq, bp, bo, bn, bm,
        bl, bk, bj, bi, bh, bg, bf, be, bd, bc
    ␤»}
    say join """r2".."t3";
    # OUTPUT: «r2, r3, s2, s3, t2, t3␤» 

To achieve simpler behaviour, similar to the Perl 5 example above, use a sequence operator that calls .succ method on the starting string:

    say join "", ("az"*.succ ... "bc");
    # OUTPUT: «az, ba, bb, bc␤» 

Topicalizing Operators

The smart match operator ~~ and andthen set the topic $_ to their LHS. In conjunction with implicit method calls on the topic this can lead to surprising results.

    my &method = { note $_$_ };
    $_ = 'object';
    say .&method;
    # OUTPUT: «object␤object␤» 
    say 'topic' ~~ .&method;
    # OUTPUT: «topic␤True␤» 

In many cases flipping the method call to the LHS will work.

    my &method = { note $_$_ };
    $_ = 'object';
    say .&method;
    # OUTPUT: «object␤object␤» 
    say .&method ~~ 'topic';
    # OUTPUT: «object␤False␤» 

Fat Arrow and Constants

The fat arrow operator => will turn words on its left hand side to Str without checking the scope for constants or \-sigiled variables. Use explicit scoping to get what you mean.

    constant V = 'x';
    my %h = => 'oi‽', ::=> 42;
    say %h.perl
    # OUTPUT: «{:V("oi‽"), :x(42)}␤» 

Common Precedence Mistakes

Adverbs and Precedence

Adverbs do have a precedence that may not follow the order of operators that is displayed on your screen. If two operators of equal precedence are followed by an adverb it will pick the first operator it finds in the abstract syntax tree. Use parentheses to help Perl 6 understand what you mean or use operators with looser precedence.

    my %x = => 42;
    say !%x<b>:exists;            # dies with X::AdHoc 
    say %x<b>:!exists;            # this works 
    say !(%x<b>:exists);          # works too 
    say not %x<b>:exists;         # works as well 
    say True unless %x<b>:exists# avoid negation altogether 

Ranges and Precedence

The loose precedence of .. can lead to some errors. It is usually best to parenthesize ranges when you want to operate on the entire range.

    1..3.say;    # says "3" (and warns about useless "..") 
    (1..3).say;  # says "1..3" 

Loose boolean operators

The precedence of and, or, etc. is looser than routine calls. This can have surprising results for calls to routines that would be operators or statements in other languages like return, last and many others.

    sub f {
        return True and False;
        # this is actually 
        # (return True) and False; 
    }
    say f# OUTPUT: «True␤» 

Exponentiation Operator and Prefix Minus

    say -1²;   # OUTPUT: «-1␤» 
    say -1**2# OUTPUT: «-1␤» 

When performing a regular mathematical calculation, the power takes precedence over the minus; so -1² can be written as -(1²). Perl 6 matches these rules of mathematics and the precedence of ** operator is tighter than that of the prefix -. If you wish to raise a negative number to a power, use parentheses:

    say (-1)²;   # OUTPUT: «1␤» 
    say (-1)**2# OUTPUT: «1␤» 

The operator infix:<but> is narrower than the list constructor. When providing a list of roles to mix in, always use parentheses.

    role R1 { method m {} }
    role R2 { method n {} }
    my $a = 1 but R1,R2# R2 is in sink context 
    say $a.^name;
    # OUTPUT: «Int+{R1}␤» 

Subroutine and method calls

Subroutine and method calls can be made using one of two forms:

  foo(...); # function call form, where ... represent the required arguments 
  foo ...;  # list op form, where ... represent the required arguments 

The function call form can cause problems for the unwary when whitespace is added after the function or method name and before the opening parenthesis.

First we consider functions with zero or one parameter:

  sub foo() { say 'no arg' }
  sub bar($a{ say "one arg: $a" }

Then execute each with and without a space after the name:

  foo();    # okay: no arg 
  foo ();   # FAIL: Too many positionals passed; expected 0 arguments but got 1 
  bar($a);  # okay: one arg: 1 
  bar ($a); # okay: one arg: 1 

Now declare a function of two parameters:

  sub foo($a$b{ say "two args: $a$b" }

Execute it with and without the space after the name:

  foo($a$b);  # okay: two args: 1, 2 
  foo ($a$b); # FAIL: Too few positionals passed; expected 2 arguments but got 1 

The lesson is: "be careful with spaces following sub and method names when using the function call format." As a general rule, good practice might be to avoid the space after a function name when using the function call format.

Note that there are clever ways to eliminate the error with the function call format and the space, but that is bordering on hackery and will not be mentioned here. For more information, consult Functions.

Finally, note that, currently, when declaring the functions whitespace may be used between a function or method name and the parentheses surrounding the parameter list without problems.

Named Parameters

Many built-in subroutines and method calls accept named parameters and your own code may accept them as well, but be sure the arguments you pass when calling your routines are actually named parameters:

    sub foo($a:$b{ ... }
    foo(1'b' => 2); # FAIL: Too many positionals passed; expected 1 argument but got 2 

What happened? That second argument is not a named parameter argument, but a Pair passed as a positional argument. If you want a named parameter it has to look like a name to Perl:

    foo(1=> 2); # okay 
    foo(1:b(2));  # okay 
    foo(1:b<it>); # okay 
 
    my $b = 2;
    foo(1:b($b)); # okay, but redundant 
    foo(1:$b);    # okay 
 
    # Or even... 
    my %arg = 'b' => 2;
    foo(1|%arg);  # okay too 

That last one may be confusing, but since it uses the | prefix on a Hash, it means "treat this hash as holding named arguments."

If you really do want to pass them as pairs you should use a List or Capture instead:

    my $list = ('b' => 2); # this is a List containing a single Pair 
    foo(|$list:$b); # okay: we passed the pair 'b' => 2 to the first argument 
    foo(1|$list);   # FAIL: Too many positionals passed; expected 1 argument but got 2 
    my $cap = \('b' => 2); # a Capture with a single positional value 
    foo(|$cap:$b); # okay: we passed the pair 'b' => 2 to the first argument 
    foo(1|$cap);   # FAIL: Too many positionals passed; expected 1 argument but got 2 

A Capture is usually the best option for this as it works exactly like the usual capturing of routine arguments during a regular call.

The nice thing about the distinction here is that it gives the developer the option of passing pairs as either named or positional arguments, which can be handy in various instances.

Input and Output

Closing Open File Handles and Pipes

Unlike some other languages, Perl 6 does not use reference counting, and so the file handles are NOT closed when they go out of scope. You have to explicitly close them either by using close routine or using the :close argument several of IO::Handle's methods accept. See IO::Handle.close for details.

The same rules apply to IO::Handle's subclass IO::Pipe, which is what you operate on when reading from a Proc you get with routines run and shell.

The caveat applies to IO::CatHandle type as well, though not as severely. See IO::CatHandle.close for details.

IO::Path Stringification

Partly for historical reasons and partly by design, an IO::Path object stringifies without considering its CWD attribute, which means if you chdir and then stringify an IO::Path, the resultant string won't reference the original filesystem object:

    with 'foo'.IO {
        .Str.say;       # OUTPUT: «foo␤» 
        .relative.say;  # OUTPUT: «foo␤» 
 
        chdir "/tmp";
        .Str.say;       # OUTPUT: «foo␤» 
        .relative.say   # OUTPUT: «../home/camelia/foo␤» 
    }

The easy way to avoid this issue is to not stringify an IO::Path object at all. Core routines that work with paths can take an IO::Path object, so you don't need to stringify the paths.

If you do have a case where you need a stringified version of an IO::Path, use absolute or relative methods to stringify it into an absolute or relative path, respectively.

Exception Handling

Sunk Proc

Some methods return a Proc object. If it represents a failed process, Proc itself won't be exception-like, but sinking it will cause an X::Proc::Unsuccessful exception to be thrown. That means this construct will throw, despite the try in place:

    try run("perl6""-e""exit 42");
    say "still alive";
    # OUTPUT: «The spawned process exited unsuccessfully (exit code: 42)␤» 

This is because try receives a Proc and returns it, at which point it sinks and throws. Explicitly sinking it inside the try avoids the issue and ensures the exception is thrown inside the try:

    try sink run("perl6""-e""exit 42");
    say "still alive";
    # OUTPUT: «still alive␤» 

If you're not interested in catching any exceptions, then use an anonymous variable to keep the returned Proc in; this way it'll never sink:

    $ = run("perl6""-e""exit 42");
    say "still alive";
    # OUTPUT: «still alive␤» 

Using Shortcuts

The ^ twigil

Using the ^ twigil can save a fair amount of time and space when writing out small blocks of code. As an example:

    for 1..8 -> $a$b { say $a + $b}

can be shortened to just

    for 1..8 { say $^a + $^b}

The trouble arises when a person wants to use more complex names for the variables, instead of just one letter. The ^ twigil is able to have the positional variables be out of order and named whatever you want, but assigns values based on the variable's Unicode ordering. In the above example, we can have $^a and $^b switch places, and those variables will keep their positional values. This is because the Unicode character 'a' comes before the character 'b'. For example:

    #In order 
    sub f1 { say "$^first $^second"}
    f1 "Hello""there";    # OUTPUT: «Hello There␤» 
    #Out of order 
    sub f2 { say "$^second $^first"}
    f2 "Hello""there";    # OUTPUT: «there Hello␤» 

Due to the variables allowed to be called anything, this can cause some problems if you are not accustomed to how Perl 6 handles these variables.

    for 1..4 { say "$^one $^two $^three $^four"}    # OUTPUT: «2 4 3 1␤» 

Scope

Using a once block

The once block is a block of code that will only run once when its parent block is run. As an example:

    my $var = 0;
    for 1..10 {
        once { $var++}
    }
    say "Variable = $var";    # OUTPUT: «Variable = 1␤» 

This functionality also applies to other code blocks like sub and while, not just for loops. Problems arise though, when trying to nest once blocks inside of other code blocks:

    my $var = 0;
    for 1..10 {
        do { once { $var++} }
    }
    say "Variable = $var";    # OUTPUT: «Variable = 10␤» 

In the above example, the once block was nested inside of a code block which was inside of a for loop code block. This causes the once block to run multiple times, because the once block uses state variables to determine whether it has run previously. This means that if the parent code block goes out of scope, then the state variable the once block uses to keep track of if it has run previously, goes out of scope as well. This is why once blocks and state variables can cause some unwanted behaviour when buried within more than one code block.

If you want to have something that will emulate the functionality of a once block, but still work when buried a few code blocks deep, we can manually build the functionality of a once block. Using the above example, we can change it so that it will only run once, even when inside the do block by changing the scope of the state variable.

    my $var = 0;
    for 1..10 {
        state $run-code = True;
        do { if ($run-code{ $run-code = False$var++} }
    }
    say "Variable = $var";    # OUTPUT: «Variable = 1␤» 

In this example, we essentially manually build a once block by making a state variable called $run-code at the highest level that will be run more than once, then checking to see if $run-code is True using a regular if. If the variable $run-code is True, then make the variable False and continue with the code that should only be completed once.

The main difference between using a state variable like the above example and using a regular once block is what scope the state variable is in. The scope for the state variable created by the once block, is the same as where you put the block (imagine that the word 'once' is replaced with a state variable and an if to look at the variable). The example above using state variables works because the variable is at the highest scope that will be repeated; whereas the example that has a once block inside of a do, made the variable within the do block which is not the highest scope that is repeated.

Using a once block inside a class method will cause the once state to carry across all instances of that class. For example:

    class A {
        method sayit() { once say 'hi' }
    }
    my $a = A.new;
    $a.sayit;      # OUTPUT: «hi␤» 
    my $b = A.new;
    $b.sayit;      # nothing