Lists, Sequences, and Arrays
Positional data constructs
Lists have been a central part of computing since before there were computers, during which time many devils have taken up residence in their details. They were actually one of the hardest parts of Perl 6 to design, but through persistence and patience, Perl 6 has arrived with an elegant system for handling them.
Lists are created with commas and semicolons, not with parentheses, so:
1, 2; # This is a two-element list
our $list = (1, 2); # This is also a List, in parentheses
$list = (1; 2); # same List (see below)
$list = (1); # This is not a List, just a 1 in parentheses
$list = (1,); # This is a one-element List
There is one exception: empty lists are created with just a pair of parentheses:
(); # This is an empty List
(,); # This is a syntax error
Note that hanging commas are just fine as long as the beginning and end of a list are clear, so feel free to use them for easy code editing.
Parentheses can be used to mark the beginning and end of a List, so:
(1, 2), (1, 2); # This is a list of two lists.
Lists can also be created by combining comma and semicolon. This is also called multi-dimensional syntax, because it is most often used to index multidimensional arrays.
say so (1,2; 3,4) eqv ((1,2), (3,4));
# OUTPUT: «True␤»
say so (1,2; 3,4;) eqv ((1,2), (3,4));
# OUTPUT: «True␤»
say so ("foo";) eqv ("foo") eqv (("foo")); # not a list
# OUTPUT: «True␤»
Unlike a comma, a hanging semicolon does not create a multidimensional list in a literal. However, be aware that this behavior changes in most argument lists, where the exact behavior depends on the function... but will usually be:
say('foo';); # a list with one element and the empty list
# OUTPUT: «(foo)()␤»
say(('foo';)); # no list, just the string "foo"
# OUTPUT: «foo␤»
Because the semicolon doubles as a statement terminator, it will end a literal list when used at the top level, instead creating a statement list. If you want to create a statement list inside parentheses, use a sigil before the parentheses:
say so (42) eqv $(my $a = 42; $a;);
# OUTPUT: «True␤»
say so (42,42) eqv (my $a = 42; $a;);
# OUTPUT: «True␤»
Individual elements can be pulled out of a list using a subscript. The first element of a list is at index number zero:
say (1, 2)[0]; # says 1
say (1, 2)[1]; # says 2
say (1, 2)[2]; # says Nil
say (1, 2)[-1]; # Error
say ((<a b>,<c d>),(<e f>,<g h>))[1;0;1]; # says "f"
Variables in Perl 6 whose names bear the @ sigil are expected to contain some sort of list-like object. Of course, other variables may also contain these objects, but @-sigiled variables always do, and are expected to act the part.
By default, when you assign a List to an @-sigiled variable, you create an Array. Those are described below. If, instead, you want to put an actual List into an @-sigiled variable, you can use binding with := instead:
my @a := 1, 2, 3;
Anything bound to an @-sigiled variable must be Positional, so binding a single value fails:
my @a := 1; # Type check failed in binding; expected Positional but got Int
To remove all elements from a Positional container, assign Empty, the empty list () or a Slip of the empty list to the container.
my @a = 1, 2, 3;
@a = ();
@a = Empty;
@a = |();
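As a quick check of the three reset forms, the following minimal sketch (the variable name @a is only illustrative) refills the container before each assignment and confirms that it ends up empty every time:

```raku
my @a = 1, 2, 3;
@a = ();          # assign the empty list
say @a.elems;     # OUTPUT: «0␤»

@a = 1, 2, 3;
@a = Empty;       # assign Empty
say @a.elems;     # OUTPUT: «0␤»

@a = 1, 2, 3;
@a = |();         # assign a Slip of the empty list
say @a.elems;     # OUTPUT: «0␤»
```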
All lists may be iterated, which means taking each element from the list in order and stopping after the last element:
for 1, 2, 3 { .say } # OUTPUT: «1␤2␤3␤»
The single argument rule is the rule by which the set of parameters passed to an iterator such as for is treated as a single argument, instead of several arguments; that is, some-iterator( a, b, c, ...) will always be treated as some-iterator( list-or-array(a, b, c)) and never as (some-iterator(a))(b)..., that is, iterative application of the iterator to the first argument, then the result to the next argument, and so on. In this example
my @list = [ (1, 2, 3),
             (1, 2, ),
             [<a b c>, <d e f>],
             [[1]] ];
for @list -> @element {
    say "{@element} → {@element.^name}";
    for @element -> $sub-element {
        say $sub-element;
    }
}
# OUTPUT
#1 2 3 → List
#1
#2
#3
#1 2 → List
#1
#2
#a b c d e f → Array
#(a b c)
#(d e f)
#1 → Array
#1
since what for receives is a single argument, it will be treated as a list of elements to iterate over. The rule of thumb is that if there's a comma, anything preceding it is an element and the list thus created becomes the single element. That happens in the case of the two arrays separated by a comma which are the third element in the Array we are iterating. In general, quoting the article linked above, the single argument rule ... makes for behave as the programmer would expect.
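One way to see the single argument rule at work is to count how often a loop body runs. The sketch below is only illustrative (the counter names are made up for this example): a bare List is iterated element by element, while a List that is itself the only element of another list, or that is itemized, is visited once.

```raku
my $plain = 0;
$plain++ for (1, 2, 3);        # the single argument is a List: iterate over it
say $plain;                    # OUTPUT: «3␤»

my $nested = 0;
$nested++ for ((1, 2, 3),);    # trailing comma: a one-element list holding a List
say $nested;                   # OUTPUT: «1␤»

my $itemized = 0;
$itemized++ for $(1, 2, 3);    # an itemized List counts as a single element
say $itemized;                 # OUTPUT: «1␤»
```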
my @a = <foo bar buzz>;
say @a.Set<bar buzz>; # OUTPUT: «(True True)␤»
say so 'bar' ∈ @a; # OUTPUT: «True␤»
Not all lists are born full of elements. Some only create as many elements as they are asked for. These are called sequences, which are of type Seq. As it so happens, loops return Seqs.
(loop { 42.say })[2] # OUTPUT: «42␤42␤42␤»
So, it is fine to have infinite lists in Perl 6, just so long as you never ask them for all their elements. In some cases, you may want to avoid asking them how long they are too – Perl 6 will try to return Inf if it knows a sequence is infinite, but it cannot always know.
Although the Seq class does provide some positional subscripting, it does not provide the full interface of Positional, so an @-sigiled variable may not be bound to a Seq.
my @s := Seq.new(<a b c>); CATCH { default { .say } }
# OUTPUT «Type check failed in binding to $iter; expected Iterator but got List ($("a", "b", "c"))␤ in block <unit> at <tmp> line 1␤␤»
This is because the Seq does not keep values around after you have used them. This is useful behavior if you have a very long sequence, as you may want to throw values away after using them, so that your program does not fill up memory. For example, when processing a file of a million lines:
for 'filename'.IO.lines -> $line {
    do-something-with($line);
}
You can be confident that the entire content of the file will not stay around in memory, unless you are explicitly storing the lines somewhere.
On the other hand, you may want to keep old values around in some cases. It is possible to hide a Seq inside a List, which will still be lazy, but will remember old values. This is done by calling the .list method. Since this List fully supports Positional, you may bind it directly to an @-sigiled variable.
my @s := (loop { 42.say }).list;
@s[2]; # says 42 three times
@s[1]; # does not say anything
@s[4]; # says 42 two more times
You may also use the .cache method instead of .list, depending on how you want the references handled. See the page on Seq for details.
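The effect of .list can be sketched with a small, finite sequence. The example below is an illustration only (the names $doubled and @kept are hypothetical): map returns a Seq, and the List obtained from it remembers values, so earlier elements can be read again after later ones.

```raku
my $doubled = (1, 2, 3).map(* * 2);   # map returns a Seq
say $doubled.^name;                   # OUTPUT: «Seq␤»

my @kept := $doubled.list;            # hide the Seq inside a List
say @kept[2];                         # OUTPUT: «6␤»
say @kept[0];                         # OUTPUT: «2␤»
```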
Sometimes you want to insert the elements of a list into another list. This can be done with a special type of list called a Slip.
say (1, (2, 3), 4) eqv (1, 2, 3, 4); # OUTPUT: «False␤»
say (1, Slip.new(2, 3), 4) eqv (1, 2, 3, 4); # OUTPUT: «True␤»
say (1, slip(2, 3), 4) eqv (1, 2, 3, 4); # OUTPUT: «True␤»
Another way to make a Slip is with the | prefix operator. Note that this has a tighter precedence than the comma, so it only affects a single value, but unlike the above options, it will break Scalars.
say (1, |(2, 3), 4) eqv (1, 2, 3, 4); # OUTPUT: «True␤»
say (1, |$(2, 3), 4) eqv (1, 2, 3, 4); # OUTPUT: «True␤»
say (1, slip($(2, 3)), 4) eqv (1, 2, 3, 4); # OUTPUT: «False␤»
Lists can be lazy, which means that their values are computed on demand and stored for later use. To create a lazy list use gather/take or the sequence operator. You can also write a class that implements the role Iterable and returns True on a call to is-lazy. Please note that some methods like elems cannot be called on a lazy List and will result in a thrown Exception.
# This list is lazy and elements will not be available
# until explicitly requested.
my @l = lazy 1, 11, 121 ... 10**100;
say @l.is-lazy; # OUTPUT: «True␤»
say @l[]; # OUTPUT: «[...]␤»
# Once all elements have been retrieved, the List
# is no longer considered lazy.
my @no-longer-lazy = eager @l; # Forcing eager evaluation
say @no-longer-lazy.is-lazy; # OUTPUT: «False␤»
say @no-longer-lazy[];
# OUTPUT: (sequence starting with «[1 11 121» ending with a 300 digit number)
A common use case for lazy Lists is the processing of infinite sequences of numbers, whose values have not been computed yet and cannot be computed in their entirety. Specific values in the List will only be computed when they are needed.
my @l = 1, 2, 4, 8 ... Inf;
say @l[0..16];
# OUTPUT: «(1 2 4 8 16 32 64 128 256 512 1024 2048 4096 8192 16384 32768 65536)␤»
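The gather/take route to a lazy list can be sketched in the same spirit. This is a minimal illustration, not the only way to write it; the name @squares is hypothetical, and the lazy prefix is used here so that the assignment does not try to exhaust the infinite loop.

```raku
my @squares = lazy gather {
    for 1..Inf -> $n {
        take $n * $n;       # each take hands one value to the list on demand
    }
};
say @squares.is-lazy;       # OUTPUT: «True␤»
say @squares[^5];           # OUTPUT: «(1 4 9 16 25)␤»
```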
The lists we have talked about so far (List, Seq and Slip) are all immutable. This means you cannot remove elements from them, or re-bind existing elements:
(1, 2, 3)[0]:delete; # Error Can not remove elements from a List
(1, 2, 3)[0] := 0; # Error Cannot use bind operator with this left-hand side
(1, 2, 3)[0] = 0; # Error Cannot modify an immutable Int
However, if any of the elements is wrapped in a Scalar you can still change the value which that Scalar points to:
my $a = 2;
(1, $a, 3)[1] = 42;
$a.say; # OUTPUT: «42␤»
...that is, it is only the list structure itself – how many elements there are and each element's identity – that is immutable. The immutability is not contagious past the identity of the element.
So far we have mostly dealt with lists in neutral contexts. Lists are actually very context sensitive on a syntactical level.
When a list appears on the right hand side of an assignment into an @-sigiled variable, it is "eagerly" evaluated. This means that a Seq will be iterated until it can produce no more elements. This is one of the places you do not want to put an infinite list, lest your program hang and, eventually, run out of memory:
my $i = 3;
my @a = (loop { $i.say; last unless --$i }); # OUTPUT: «3␤2␤1␤»
say "take off!";
When you have a list that contains sub-lists, but you only want one flat list, you may flatten the list to produce a sequence of values as if all parentheses were removed. This works no matter how many levels deep the parentheses are nested.
say (1, (2, (3, 4)), 5).flat eqv (1, 2, 3, 4, 5) # OUTPUT: «True␤»
This is not really a syntactical "context" as much as it is a process of iteration, but it has the appearance of a context.
Scalars around a list will make it immune to flattening:
for (1, (2, $(3, 4)), 5).flat { .say } # OUTPUT: «1␤2␤(3 4)␤5␤»
An @-sigiled variable, however, will spill its elements.
my @l := 2, (3, 4);
for (1, @l, 5).flat { .say }; # OUTPUT: «1␤2␤3␤4␤5␤»
my @a = 2, (3, 4); # Arrays are special, see below
for (1, @a, 5).flat { .say }; # OUTPUT: «1␤2␤(3 4)␤5␤»
When a list appears as arguments to a function or method call, special syntax rules are at play: the list is immediately converted into a Capture. A Capture itself has a List (.list) and a Hash (.hash). Any Pair literals whose keys are not quoted, or which are not parenthesized, never make it into .list. Instead, they are considered to be named arguments and squashed into .hash. See the page on Capture for the details of this processing.
Consider the following ways to make a new Array from a List. These ways place the List in an argument list context and because of that, the Array only contains 1 and 2 but not the Pair :c(3), which is ignored.
Array.new(1, 2, :c(3));
Array.new: 1, 2, :c(3);
new Array: 1, 2, :c(3);
In contrast, these ways do not place the List in argument list context, so all the elements, even the Pair :c(3), are placed in the Array.
Array.new((1, 2, :c(3)));
(1, 2, :c(3)).Array;
my @a = 1, 2, :c(3); Array.new(@a);
my @a = 1, 2, :c(3); Array.new: @a;
my @a = 1, 2, :c(3); new Array: @a;
In argument list context the | prefix operator applied to a Positional will always slip list elements as positional arguments to the Capture, while a | prefix operator applied to an Associative will slip pairs in as named parameters:
my @a := 2, "c" => 3;
Array.new(1, |@a, 4); # Array contains 1, 2, :c(3), 4
my %a = "c" => 3;
Array.new(1, |%a, 4); # Array contains 1, 4
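The split between positional and named arguments can be inspected directly on a Capture literal. The following sketch is illustrative only (the variable names are made up); it builds three Captures with \(...) and checks where the Pair ends up in each case.

```raku
my $unquoted = \(1, 2, :c(3));       # unquoted key: :c(3) becomes a named argument
say $unquoted.list.elems;            # OUTPUT: «2␤»
say $unquoted.hash<c>;               # OUTPUT: «3␤»

my $quoted = \(1, 2, 'c' => 3);      # quoted key: the Pair stays positional
say $quoted.list.elems;              # OUTPUT: «3␤»

my $wrapped = \(1, 2, (:c(3)));      # parenthesized: the Pair stays positional
say $wrapped.list.elems;             # OUTPUT: «3␤»
```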
A List inside a slice subscript is only remarkable in that it is unremarkable: because adverbs to a slice are attached after the ], the inside of a slice is not an argument list, and no special processing of pair forms happens.
Positional types will enforce an integer coercion on each element of a slice index, so pairs appearing there will generate an error, anyway:
(1, 2, 3)[1, 2, :c(3)] # OUTPUT: «Method 'Int' not found for invocant of class 'Pair'␤»
...however this is entirely up to the type; if it defines an order for pairs, it could consider :c(3) a valid index.
Indices inside a slice are usually not automatically flattened, but neither are sublists usually coerced to Int. Instead, the list structure is kept intact, causing a nested slice operation that replicates the structure in the result:
say ("a", "b", "c")[(1, 2), (0, 1)] eqv (("b", "c"), ("a", "b")) # OUTPUT: «True␤»
A Range is a container for a lower and an upper boundary. Generating a slice with a Range will include any index between those bounds, including the bounds. For infinite upper boundaries we agree with mathematicians that Inf equals Inf-1.
my @a = 1..5;
say @a[0..2]; # OUTPUT: «(1 2 3)␤»
say @a[0..^2]; # OUTPUT: «(1 2)␤»
say @a[0..*]; # OUTPUT: «(1 2 3 4 5)␤»
say @a[0..^*]; # OUTPUT: «(1 2 3 4 5)␤»
say @a[0..Inf-1]; # OUTPUT: «(1 2 3 4 5)␤»
Inside an Array Literal, the list of initialization values is not in capture context and is just a normal list. It is, however, eagerly evaluated just as in assignment.
say so [ 1, 2, :c(3) ] eqv Array.new((1, 2, :c(3))); # OUTPUT: «True␤»
[while $++ < 2 { 42.say; 43 }].map: *.say; # OUTPUT: «42␤42␤43␤43␤»
(while $++ < 2 { 42.say; 43 }).map: *.say; # OUTPUT: «42␤43␤42␤43␤»
Which brings us to Arrays...
Arrays differ from lists in three major ways: Their elements may be typed, they automatically itemize their elements, and they are mutable. Otherwise they are Lists and are accepted wherever lists are.
say Array ~~ List # OUTPUT: «True␤»
A fourth, more subtle, way they differ is that when working with Arrays, it can sometimes be harder to maintain laziness or work with infinite sequences.
Arrays may be typed such that their slots perform a typecheck whenever they are assigned to. An Array that only allows Int values to be assigned is of type Array[Int] and one can create one with Array[Int].new. If you intend to use an @-sigiled variable only for this purpose, you may change its type by specifying the type of the elements when declaring it:
my Int @a = 1, 2, 3; # An Array that contains only Ints
my @b := Array[Int].new(1, 2, 3); # Same thing, but the variable is not typed
say @b eqv @a; # says True.
my @c = 1, 2, 3; # An Array that can contain anything
say @b eqv @c; # says False because types do not match
say @c eqv (1, 2, 3); # says False because one is a List
say @b eq @c; # says True, because eq only checks values
say @b eq (1, 2, 3); # says True, because eq only checks values
@a[0] = 42; # fine
@a[0] = "foo"; # error: Type check failed in assignment
In the above example we bound a typed Array object to an @-sigiled variable for which no type had been specified. The other way around does not work: you may not bind an Array that has the wrong type to a typed @-sigiled variable:
my @a := Array[Int].new(1, 2, 3); # fine
@a := Array[Str].new("a", "b"); # fine, can be re-bound
my Int @b := Array[Int].new(1, 2, 3); # fine
@b := Array.new(1, 2, 3); # error: Type check failed in binding
When working with typed arrays, it is important to remember that they are nominally typed. This means the declared type of an array is what matters. Given the following sub declaration:
sub mean(Int @a) {
    @a.sum / @a.elems
}
Calls that pass an Array[Int] will be successful:
my Int @b = 1, 3, 5;
say mean(@b); # @b is Array[Int]
say mean(Array[Int].new(1, 3, 5)); # Anonymous Array[Int]
say mean(my Int @ = 1, 3, 5); # Another anonymous Array[Int]
However, the following calls will all fail, due to passing an untyped array, even if the array just happens to contain Int values at the point it is passed:
my @c = 1, 3, 5;
say mean(@c); # Fails, passing untyped Array
say mean([1, 3, 5]); # Same
say mean(Array.new(1, 3, 5)); # Same again
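The nominal nature of the check can be made visible by asking each array for its type name and matching it against Positional[Int]. This is a small sketch with hypothetical variable names; both arrays hold the same Int values, yet only the declared element type decides the match.

```raku
my Int @typed   = 1, 3, 5;
my     @untyped = 1, 3, 5;

say @typed.^name;                  # OUTPUT: «Array[Int]␤»
say @untyped.^name;                # OUTPUT: «Array␤»
say @typed   ~~ Positional[Int];   # OUTPUT: «True␤»
say @untyped ~~ Positional[Int];   # OUTPUT: «False␤»
```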
Note that in any given compiler, there may be fancy, under-the-hood, ways to bypass the type check on arrays, so when handling untrusted input, it can be good practice to perform additional type checks, where it matters:
for @a -> Int $i { $i.say };
However, as long as you stick to normal assignment operations inside a trusted area of code, this will not be a problem, and typecheck errors will happen promptly during assignment to the array, if they cannot be caught at compile time. None of the core functions provided in Perl 6 for operating on lists should ever produce a wonky typed Array.
Nonexistent elements (when indexed), or elements to which Nil has been assigned, will assume a default value. This default may be adjusted on a variable-by-variable basis with the is default trait. Note that an untyped @-sigiled variable has an element type of Mu, however its default value is an undefined Any:
my @a;
@a.of.perl.say; # OUTPUT: «Mu␤»
@a.default.perl.say; # OUTPUT: «Any␤»
@a[0].say; # OUTPUT: «(Any)␤»
my Numeric @n is default(Real);
@n.of.perl.say; # OUTPUT: «Numeric␤»
@n.default.perl.say; # OUTPUT: «Real␤»
@n[0].say; # OUTPUT: «(Real)␤»
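A defined default makes the two cases easier to tell apart. In this sketch (the array name and the default of 0 are assumptions chosen for illustration), assigning Nil to an existing slot and reading a slot that was never set both yield the default, and reading a missing slot does not grow the array.

```raku
my @scores is default(0) = 10, 20, 30;
@scores[1] = Nil;      # assigning Nil restores the default value
say @scores[1];        # OUTPUT: «0␤»
say @scores[10];       # OUTPUT: «0␤»
say @scores.elems;     # OUTPUT: «3␤»
```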
To limit the dimensions of an Array, provide the dimensions separated by , or ; in brackets after the name of the array container. The values of such an Array will default to Any. The shape can be accessed at runtime via the shape method.
my @a[2,2];
say @a.perl;
# OUTPUT: «Array.new(:shape(2, 2), [Any, Any], [Any, Any])␤»
say @a.shape;
# OUTPUT: «(2 2)␤»
Assignment to a fixed size Array will promote a List of Lists to an Array of Arrays.
my @a[2;2] = (1,2; 3,4);
@a[1;1] = 42;
say @a.perl;
# OUTPUT: «Array.new(:shape(2, 2), [1, 2], [3, 42])␤»
For most uses, Arrays consist of a number of slots each containing a Scalar of the correct type. Each such Scalar, in turn, contains a value of that type. Perl 6 will automatically type-check values and create Scalars to contain them when Arrays are initialized, assigned to, or constructed.
This is actually one of the trickiest parts of Perl 6 list handling to get a firm understanding of.
First, be aware that because itemization in Arrays is assumed, it essentially means that $(…)s are being put around everything that you assign to an array, if you do not put them there yourself. On the other side, Array.perl does not put $ to explicitly show scalars, unlike List.perl:
((1, 2), $(3, 4)).perl.say; # says "((1, 2), $(3, 4))"
[(1, 2), $(3, 4)].perl.say; # says "[(1, 2), (3, 4)]"
                            # ...but actually means: "[$(1, 2), $(3, 4)]"
It was decided all those extra dollar signs and parentheses were more of an eye sore than a benefit to the user. Basically, when you see a square bracket, remember the invisible dollar signs.
Second, remember that these invisible dollar signs also protect against flattening, so you cannot really flatten the elements inside of an Array with a normal call to flat.
((1, 2), $(3, 4)).flat.perl.say; # OUTPUT: «(1, 2, $(3, 4)).Seq␤»
[(1, 2), $(3, 4)].flat.perl.say; # OUTPUT: «($(1, 2), $(3, 4)).Seq␤»
Since the square brackets do not themselves protect against flattening, you can still spill the elements out of an Array into a surrounding list using flat:
(0, [(1, 2), $(3, 4)], 5).flat.perl.say; # OUTPUT: «(0, $(1, 2), $(3, 4), 5).Seq␤»
...the elements themselves, however, stay in one piece.
This can irk users of data you provide if you have deeply nested Arrays where they want flat data. Currently they have to deeply map the structure by hand to undo the nesting:
say gather [0, [(1, 2), [3, 4]], $(5, 6)].deepmap: *.take; # OUTPUT: «(0 1 2 3 4 5 6)␤»
...future versions of Perl 6 might find a way to make this easier. However, not returning Arrays or itemized lists from functions, when non-itemized lists are sufficient, is something that one should consider as a courtesy to their users:
use Slips when you want to always merge with surrounding lists
use non-itemized lists when you want to make it easy for the user to flatten
use itemized lists to protect things the user probably will not want flattened
use Arrays as non-itemized lists of itemized lists, if appropriate,
use Arrays if the user is going to want to mutate the result without copying it first.
The fact that all elements of an array are itemized (in Scalar containers) is more a gentleman's agreement than a universally enforced rule, and it is less well enforced than typechecks in typed arrays. See the section below on binding to Array slots.
Literal Arrays are constructed with a List inside square brackets. The List is eagerly iterated (at compile time if possible) and values in the list are each type-checked and itemized. The square brackets themselves will spill elements into surrounding lists when flattened, but the elements themselves will not spill due to the itemization.
Unlike lists, Arrays are mutable. Elements may be deleted, added, or changed.
my @a = "a", "b", "c";
@a.say; # OUTPUT: «[a b c]␤»
@a.pop.say; # OUTPUT: «c␤»
@a.say; # OUTPUT: «[a b]␤»
@a.push("d");
@a.say; # OUTPUT: «[a b d]␤»
@a[1, 3] = "c", "c";
@a.say; # OUTPUT: «[a c d c]␤»
Assignment of a list to an Array is eager. The list will be entirely evaluated, and should not be infinite or the program may hang. Assignment to a slice of an Array is, likewise, eager, but only up to the requested number of elements, which may be finite:
my @a;
@a[0, 1, 2] = (loop { 42 });
@a.say; # OUTPUT: «[42 42 42]␤»
During assignment, each value will be typechecked to ensure it is a permitted type for the Array. Any Scalar will be stripped from each value and a new Scalar will be wrapped around it.
Individual Array slots may be bound the same way $-sigiled variables are:
my $b = "foo";
my @a = 1, 2, 3;
@a[2] := $b;
@a.say; # OUTPUT: «[1 2 "foo"]␤»
$b = "bar";
@a.say; # OUTPUT: «[1 2 "bar"]␤»
...but binding Array slots directly to values is strongly discouraged. If you do, expect surprises with built-in functions. The only time this would be done is if a mutable container that knows the difference between values and Scalar-wrapped values is needed, or for very large Arrays where a native-typed array cannot be used. Such a creature should never be passed back to unsuspecting users.