A low-level explanation of Perl 6 containers
This article started as a conversation on IRC explaining the difference between the Array and the List type in Perl 6. It explains the levels of indirection involved in dealing with variables and container elements.
Some people like to say "everything is an object", but in fact a variable is not a user-exposed object in Perl 6.
When the compiler encounters a variable declaration like my $x, it registers it in some internal symbol table. This internal symbol table is used to detect undeclared variables and to tie the code generation for the variable to the correct scope.
At run time, a variable appears as an entry in a lexical pad, or lexpad for short. This is a per-scope data structure that stores a pointer for each variable.
In the case of my $x, the lexpad entry for the variable $x is a pointer to an object of type Scalar, usually just called the container.
Although objects of type Scalar are everywhere in Perl 6, you rarely see them directly as objects, because most operations decontainerize, which means they act on the Scalar container's contents instead of the container itself.
In code like

    my $x = 42;
    say $x;

the assignment $x = 42 stores a pointer to the Int object 42 in the scalar container to which the lexpad entry for $x points.
The assignment operator asks the container on the left to store the value on its right. What exactly that means is up to the container type. For Scalar it means "replace the previously stored value with the new one".
Note that subroutine signatures allow passing around of containers:

    sub f($a is rw) {
        $a = 23;
    }
    my $x = 42;
    f($x);
    say $x;         # OUTPUT: «23␤»

Inside the subroutine, the lexpad entry for $a points to the same container that $x points to outside the subroutine. This is why assignment to $a also modifies the contents of $x.
Likewise, a routine can return a container if it is marked as is rw:

    my $x = 23;
    sub f() is rw { $x };
    f() = 42;
    say $x;         # OUTPUT: «42␤»

For explicit returns, return-rw instead of return must be used.
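As a minimal sketch of an explicit container return (the routine name get-x is hypothetical and chosen only for illustration), return-rw hands back the Scalar itself, so the call can appear on the left of an assignment:

```perl6
my $x = 23;

# get-x is an illustrative name, not part of any API.
# return-rw returns the container of $x rather than a copy of its value.
sub get-x() is rw {
    return-rw $x;
}

get-x() = 42;   # assigns through the returned container
say $x;         # OUTPUT: «42␤»
```

With a plain return, the value would be decontainerized and the assignment to the call result would fail.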
Returning a container is how is rw attribute accessors work. So

    class A {
        has $.attr is rw;
    }

is equivalent to

    class A {
        has $!attr;
        method attr() is rw { $!attr }
    }

Scalar containers are transparent to type checks and most kinds of read-only accesses. A .VAR makes them visible:

    my $x = 42;
    say $x.^name;       # OUTPUT: «Int␤»
    say $x.VAR.^name;   # OUTPUT: «Scalar␤»

is rw on a parameter requires the presence of a writable Scalar container:

    sub f($x is rw) { say $x };
    f 42;
    CATCH { default { say .^name, ': ', .Str } };
    # OUTPUT: «X::Parameter::RW: Parameter '$x' expected a writable container, but got Int value␤»

Callable containers provide a bridge between the syntax of a Routine call and the actual call of the method CALL-ME of the object that is stored in the container. The sigil & is required when declaring the container and has to be omitted when executing the Callable. The default type constraint is Callable.

    my &callable = -> $ν { say "$ν is ", $ν ~~ Int ?? "whole" !! "not whole" }
    callable( ⅓ );
    callable( 3 );

The sigil has to be provided when referring to the value stored in the container. This in turn allows Routines to be used as arguments to calls.

    sub f() {}
    my &g = sub {}
    sub caller(&c1, &c2){ c1, c2 }
    caller(&f, &g);

Next to assignment, Perl 6 also supports binding with the := operator. When binding a value or a container to a variable, the lexpad entry of the variable is modified (and not just the container it points to). If you write

    my $x := 42;

then the lexpad entry for $x directly points to the Int 42. This means that you cannot assign to it anymore:

    my $x := 42;
    $x = 23;
    CATCH { default { say .^name, ': ', .Str } };
    # OUTPUT: «X::AdHoc: Cannot assign to an immutable value␤»

You can also bind variables to other variables:

    my $a = 0;
    my $b = 0;
    $a := $b;
    $b = 42;
    say $a;         # OUTPUT: «42␤»

Here, after the initial binding, the lexpad entries for $a and $b both point to the same scalar container, so assigning to one variable also changes the contents of the other.
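You've seen this situation before: it is exactly what happened with the signature parameter marked as is rw.
Sigilless variables and parameters with the trait is raw always bind, whether = or := is used:

    my $a = 42;
    my \b = $a;
    b++;
    say $a;         # OUTPUT: «43␤»

    sub f($c is raw) { $c++ }
    f($a);
    say $a;         # OUTPUT: «44␤»
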
There are a number of positional container types with slightly different semantics in Perl 6. The most basic one is List; it is created by the comma operator.

    say (1, 2, 3).^name;    # OUTPUT: «List␤»

A list is immutable, which means you cannot change the number of elements in a list. But if one of the elements happens to be a scalar container, you can still assign to it:

    my $x = 42;
    ($x, 1, 2)[0] = 23;
    say $x;                 # OUTPUT: «23␤»
    ($x, 1, 2)[1] = 23;     # Cannot modify an immutable value
    CATCH { default { say .^name, ': ', .Str } };
    # OUTPUT: «X::Assignment::RO: Cannot modify an immutable Int␤»

So lists don't care whether their elements are values or containers; they just store and retrieve whatever was given to them.
Lists can also be lazy; in that case, elements at the end are generated on demand from an iterator.
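As a small illustration of on-demand generation (a sketch that assumes an infinite Range turned into a List; the variable name is arbitrary), only the elements actually requested are produced:

```perl6
# A List backed by an infinite Range; nothing is computed up front.
my $lazy = (1..Inf).list;

say $lazy.is-lazy;   # reports whether the list still has a lazy tail
say $lazy[4];        # OUTPUT: «5␤» (only the first five elements are reified)
```

Asking such a list for .elems would never finish on its own, which is why Perl 6 refuses to count the elements of a lazy list.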
Array is just like a list, except that it forces all its elements to be containers, which means that you can always assign to elements:

    my @a = 1, 2, 3;
    @a[0] = 42;
    say @a;         # OUTPUT: «[42 2 3]␤»

@a actually stores three scalar containers. @a[0] returns one of them, and the assignment operator replaces the integer value stored in that container with the new one, 42.
Assignment to a scalar variable and to an array variable both do the same thing: discard the old value(s), and enter some new value(s).
Nevertheless, it's easy to observe how different they are:

    my $x = 42; say $x.^name;   # OUTPUT: «Int␤»
    my @a = 42; say @a.^name;   # OUTPUT: «Array␤»

This is because the Scalar container type hides itself well, but Array makes no such effort. Also, assignment to an array variable is coercive, so you can assign a non-array value to an array variable.
To place a non-Array into an array variable, binding works:

    my @a := (1, 2, 3);
    say @a.^name;               # OUTPUT: «List␤»

As a curious side note, Perl 6 supports binding to array elements:

    my @a = (1, 2, 3);
    @a[0] := my $x;
    $x = 42;
    say @a;                     # OUTPUT: «[42 2 3]␤»

If you've read and understood the previous explanations, it is now time to wonder how this can possibly work. After all, binding to a variable requires a lexpad entry for that variable, and while there is one for an array, there aren't lexpad entries for each array element, because you cannot expand the lexpad at run time.
The answer is that binding to array elements is recognized at the syntax level and instead of emitting code for a normal binding operation, a special method (called BIND-POS for positional containers such as arrays, and BIND-KEY for associative ones) is called on the array. This method handles binding to array elements.
Note that, while supported, one should generally avoid directly binding uncontainerized things into array elements. Doing so may produce counter-intuitive results when the array is used later.

    my @a = (1, 2, 3);
    @a[0] := 42;         # This is not recommended, use assignment instead.
    my $b := 42;
    @a[1] := $b;         # Nor is this.
    @a[2] = $b;          # ...but this is fine.
    @a[1, 2] := 1, 2;    # runtime error: X::Bind::Slice
    CATCH { default { say .^name, ': ', .Str } };
    # OUTPUT: «X::Bind::Slice: Cannot bind to Array slice␤»

Operations that mix Lists and Arrays generally protect against such a thing happening accidentally.
The % and @ sigils in Perl 6 generally indicate multiple values to an iteration construct, whereas the $ sigil indicates only one value.

    my @a = 1, 2, 3;
    for @a { };         # 3 iterations
    my $a = (1, 2, 3);
    for $a { };         # 1 iteration

@-sigiled variables do not flatten in list context:

    my @a = 1, 2, 3;
    my @b = @a, 4, 5;
    say @b.elems;               # OUTPUT: «3␤»

There are operations that flatten out sublists that are not inside a scalar container: slurpy parameters (*@a) and explicit calls to flat:

    my @a = 1, 2, 3;
    say (flat @a, 4, 5).elems;  # OUTPUT: «5␤»

    sub f(*@x) { @x.elems };
    say f @a, 4, 5;             # OUTPUT: «5␤»

As hinted above, scalar containers prevent that flattening:

    sub f(*@x) { @x.elems };
    my @a = 1, 2, 3;
    say f $@a, 4, 5;            # OUTPUT: «3␤»

The @ character can also be used as a prefix to coerce the argument to a list, thus removing a scalar container:

    my $x = (1, 2, 3);
    .say for @$x;               # 3 iterations

However, the decont operator <> is more appropriate for decontainerizing items that aren't lists:

    my $x = ^Inf .grep: *.is-prime;
    say "$_ is prime" for @$x;  # WRONG! List keeps values, thus leaking memory
    say "$_ is prime" for $x<>; # RIGHT. Simply decontainerize the Seq

    my $y := ^Inf .grep: *.is-prime; # Even better; no Scalars involved at all

Methods generally don't care whether their invocant is in a scalar, so

    my $x = (1, 2, 3);
    $x.map(*.say);              # 3 iterations

maps over a list of three elements, not of one.
Container types, including Array and Hash, allow you to create self-referential structures.

    my @a;
    @a[0] = @a;
    put @a.perl;
    # OUTPUT: «((my @Array_75093712) = [@Array_75093712,])␤»

Although Perl 6 does not prevent you from creating and using self-referential data, by doing so you may end up in a loop trying to dump the data. As a last resort, you can use Promises to handle timeouts.
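One way the last-resort timeout could look is sketched below. It is only an illustration: the 5-second limit is arbitrary, and the dump is run in the background so that the main program can stop waiting for it.

```perl6
# A self-referential array, as in the text above.
my @a;
@a[0] = @a;

# Run the potentially long-running dump in the background.
my $dump = start { @a.perl };

# Wait for whichever finishes first: the dump or a 5-second timer
# (the limit is an arbitrary choice for this sketch).
await Promise.anyof($dump, Promise.in(5));

if $dump.status ~~ Kept {
    put $dump.result;
}
else {
    note "dump did not finish in time";
}
```

Note that giving up on the wait does not cancel the background work; it only lets the rest of the program continue.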
Any container can have a type constraint in the form of a type object or a subset. Both can be placed between a declarator and the variable name or after the trait of. The constraint is a property of the container, not the variable.
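A minimal sketch of a subset used as a type constraint follows; the names Three-letter and $acronym are illustrative only.

```perl6
# Three-letter and $acronym are names made up for this example.
subset Three-letter of Str where .chars == 3;
my Three-letter $acronym = "ÞFL";

# A value that fails the constraint is rejected on assignment.
try { $acronym = "too long" };
say $!.^name;       # OUTPUT: «X::TypeCheck::Assignment␤»
say $acronym;       # OUTPUT: «ÞFL␤»
```

The failed assignment leaves the previous value in place, because the check happens before anything is stored.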
Variables may have no container in them, yet still offer the ability to re-bind and typecheck that rebind. The reason for that is in such cases the binding operator := performs the typecheck:

    my Int \z = 42;
    z := 100; # OK
    z := "x"; # Typecheck failure

The same isn't the case when, say, binding to a Hash key, as there the binding is handled by a method call (even though the syntax remains the same, using the := operator).
The default type constraint of a Scalar container is Mu. Introspection of type constraints on containers is provided by the .VAR.of method, which for @ and % sigiled variables gives the constraint for values:

    my Str $x;
    say $x.VAR.of;  # OUTPUT: «(Str)␤»
    my Num @a;
    say @a.VAR.of;  # OUTPUT: «(Num)␤»
    my Int %h;
    say %h.VAR.of;  # OUTPUT: «(Int)␤»

To provide custom containers, Perl 6 provides the class Proxy. It takes two methods that are called when values are stored in or fetched from the container. Type checks are not done by the container itself, and other restrictions like readonlyness can be broken. The returned value must therefore be of the same type as the type of the variable it is bound to. We can use type captures to work with types in Perl 6.
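
    sub lucky(::T $type) {
        my T $c-value; # closure variable
        return Proxy.new(
            FETCH => method () { $c-value },
            STORE => method (T $new-value) {
                X::OutOfRange.new(what => 'number', got => '13', range => '-∞..12, 14..∞').throw
                    if $new-value == 13;
                $c-value = $new-value;
            }
        );
    }

    my Int $a := lucky(Int);
    say $a = 12;    # OUTPUT: «12␤»
    say $a = 'FOO'; # X::TypeCheck::Binding
    say $a = 13;    # X::OutOfRange
    CATCH { default { say .^name, ': ', .Str } };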