Concurrency and Asynchronous Programming
In common with most modern programming languages, Perl 6 is designed to support concurrency (allowing more than one thing to happen at the same time) and asynchronous programming (sometimes called event-driven or reactive programming: an event or change in one part of a program may lead to an event or change in some other part of the program, asynchronously to the main program flow).
The aim of the Perl 6 concurrency design is to provide a high-level, composable and consistent interface, regardless of how a virtual machine may implement it for a particular operating system, through layers of facilities as described below.
Additionally, certain Perl 6 features may implicitly operate in an asynchronous fashion, so in order to ensure predictable interoperation with these features, user code should, where possible, avoid the lower level concurrency APIs (i.e. Thread and Scheduler) and use the higher-level interfaces.
A Promise (also called a future in other programming environments) encapsulates the result of a computation that may not have completed or even started at the time the promise is obtained. A Promise starts in the Planned status and can end up in either the Kept status, meaning the promise has been successfully completed, or the Broken status, meaning that the promise has failed. Usually this is much of the functionality that user code needs to operate in a concurrent or asynchronous manner.
    my $p1 = Promise.new;
    say $p1.status;         # Planned
    $p1.keep('result');
    say $p1.status;         # Kept
    say $p1.result;         # result
                            # (since it has been kept, a result is available!)

    my $p2 = Promise.new;
    $p2.break('oh no');
    say $p2.status;         # Broken
    say $p2.result;         # dies with "oh no", because the promise has been broken
Promises gain much of their power by being composable, for example by chaining, usually by the then method:
    my $promise1 = Promise.new();
    my $promise2 = $promise1.then(
        -> $v { say $v.result; "Second Result" }
    );
    $promise1.keep("First Result");
    say $promise2.result;   # First Result
                            # Second Result
Here the then method schedules code to be executed when the first Promise is kept or broken, itself returning a new Promise which will be kept with the result of the code when it is executed (or broken if the code fails).
The method keep changes the status of the promise to Kept, setting the result to its positional argument. The method result blocks the current thread of execution until the promise is kept or broken; if it was kept it returns the result (that is, the value passed to keep), otherwise it throws an exception based on the value passed to break. The latter behavior is illustrated with:
    my $promise1 = Promise.new();
    my $promise2 = $promise1.then(-> $v { say "Handled but : "; say $v.result });
    $promise1.break("First Result");
    try $promise2.result;
    say $promise2.cause;
    # Output:
    # Handled but :
    # First Result
Calling break causes the code block of the then to throw an exception when it calls the result method on the original promise that was passed as an argument; this in turn causes the second promise to be broken, raising an exception when its result is taken. The actual Exception object will then be available from cause. If the promise had not been broken, cause would instead raise an X::Promise::CauseOnlyValidOnBroken exception.
A Promise can also be scheduled to be automatically kept at a future time:
    my $promise1 = Promise.in(5);
    my $promise2 = $promise1.then(-> $v { say $v.status; 'Second Result' });
    say $promise2.result;
A very frequent use of promises is to run a piece of code, and keep the promise once it returns successfully, or break it when the code dies. The start method provides a shortcut for that:
    my $promise = Promise.start({ my $i = 0; for 1 .. 10 { $i += $_ }; $i });
    say $promise.result;    # 55
The result of the promise returned is the value returned from the code. Similarly, if the code fails (and the promise is thus broken), then cause will be the Exception object that was thrown:
    my $promise = Promise.start({ die "Broken Promise" });
    try $promise.result;
    say $promise.cause;
This is considered to be such a commonly required pattern that it is also provided as subroutines:
    my $promise = start { my $i = 0; for 1 .. 10 { $i += $_ }; $i };
    my $result = await $promise;
    say $result;            # 55
await is almost equivalent to calling result on the promise object returned by start, but it will also take a list of promises and return the result of each:
    my $p1 = start { my $i = 0; for 1 .. 10 { $i += $_ }; $i };
    my $p2 = start { my $i = 0; for 1 .. 10 { $i -= $_ }; $i };
    my @result = await $p1, $p2;
    say @result;            # 55 -55
In addition to await, two class methods combine several Promise objects into a new promise: allof returns a promise that is kept when all the original promises are kept or broken:
    my $promise = Promise.allof(
        Promise.in(2),
        Promise.in(3)
    );
    await $promise;
    say "All done";  # Should be not much more than three seconds later
anyof returns a new promise that will be kept when any of the original promises is kept or broken:
    my $promise = Promise.anyof(
        Promise.in(3),
        Promise.in(8600)
    );
    await $promise;
    say "All done";  # Should be about 3 seconds later
Unlike await, however, the results of the original kept promises are not available without referring to the original promises, so these methods are more useful when the completion (or otherwise) of the tasks matters more to the consumer than the actual results, or when the results have been collected by other means. You may, for example, want to create a dependent Promise that will examine each of the original promises:
    my @promises;
    for 1..5 -> $t {
        @promises.push(start {
            sleep $t;
            Bool.pick;
        });
    }
    say await Promise.allof(@promises).then({ so all(@promises>>.result) });
This will give True if all of the promises were kept with True, and False otherwise.
If you are creating a promise that you intend to keep or break yourself, then you probably don't want any code that might receive the promise to inadvertently (or otherwise) keep or break it before you do. For this purpose there is the method vow, which returns a Vow object that becomes the only mechanism by which the promise can be kept or broken. If an attempt to keep or break the Promise is made directly, then the exception X::Promise::Vowed will be thrown; as long as the vow object is kept private, the status of the promise is safe:
    sub get_promise {
        my $promise = Promise.new;
        my $vow = $promise.vow;
        Promise.in(10).then({ $vow.keep });
        $promise;
    }

    my $promise = get_promise();

    # Will throw an exception
    # "Access denied to keep/break this Promise; already vowed"
    $promise.keep;
The methods that return a promise that will be kept or broken automatically, such as start, do this internally, so it is not necessary to do it for them.
A Supply is an asynchronous data streaming mechanism that can be consumed by one or more consumers simultaneously, in a manner similar to "events" in other programming languages; it can be seen as enabling event-driven or reactive designs.
At its simplest, a Supply is a message stream that can have multiple subscribers, created with the method tap, on to which data items can be placed with emit.

The Supply can either be live or on-demand. A live supply is like a TV broadcast: those who tune in don't get previously emitted values. An on-demand broadcast is like Netflix: everyone who starts streaming a movie (taps a supply) always starts it from the beginning (gets all the values), regardless of how many people are watching it right now. Note that no history is kept for on-demand supplies; instead, the supply block is run for each tap of the supply.
A live Supply is created by the Supplier factory; each emitted value is passed to all the active tappers as they are added:

    my $supplier = Supplier.new;
    my $supply   = $supplier.Supply;

    $supply.tap( -> $v { say $v });

    for 1 .. 10 {
        $supplier.emit($_);
    }
Or alternatively as an on-demand Supply created by the supply keyword:

    my $supply = supply {
        for 1 .. 10 {
            emit $_;
        }
    }
    $supply.tap( -> $v { say $v });
In this case the code in the supply block is executed every time the Supply returned by
supply is tapped, as demonstrated by:
    my $supply = supply {
        for 1 .. 10 {
            emit $_;
        }
    }
    $supply.tap( -> $v { say "First : $v" });
    $supply.tap( -> $v { say "Second : $v" });
The tap method returns a Tap object which can be used to obtain information about the tap and also to turn it off when we are no longer interested in the events:
    my $supplier = Supplier.new;
    my $supply   = $supplier.Supply;
    my $tap      = $supply.tap( -> $v { say $v });
    $supplier.emit("OK");
    $tap.close;
    $supplier.emit("Won't trigger the tap");
Calling done on the supplier object calls the done callback that may be specified for any taps, but does not prevent any further events being emitted to the stream, or taps receiving them.
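As a sketch of that behavior (the done callback and the message strings here are illustrative, not from the original text):

    my $supplier = Supplier.new;
    my $supply   = $supplier.Supply;

    $supply.tap(-> $v { say $v }, done => { say "supply is done" });

    $supplier.emit("before done");
    $supplier.done;                  # runs the done callback of the tap
    $supplier.emit("after done");    # still emitted, and still received by the tap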
The method interval returns a new on-demand supply which periodically emits a new event at the specified interval. The data that is emitted is an integer starting at 0 that is incremented for each event. The following code outputs 0 .. 5:
    my $supply = Supply.interval(2);
    $supply.tap(-> $v { say $v });
    sleep 10;
A second argument can be supplied to
interval which specifies a delay in seconds before the first event is fired. Each tap of a supply created by
interval has its own sequence starting from 0, as illustrated by the following:
    my $supply = Supply.interval(2);
    $supply.tap(-> $v { say "First $v" });
    sleep 6;
    $supply.tap(-> $v { say "Second $v" });
    sleep 10;
An on-demand Supply can also be created from a list of values that will be emitted in turn; thus the first on-demand example could be written as:

    my $supply = Supply.from-list(1 .. 10);
    $supply.tap(-> $v { say $v });
An existing supply object can be filtered or transformed, using the methods grep and map respectively, to create a new supply in a manner like the similarly named list methods: grep returns a supply such that only those events emitted on the source stream for which the grep condition is true are emitted on the second supply:
    my $supplier = Supplier.new;
    my $supply   = $supplier.Supply;
    $supply.tap(-> $v { say "Original : $v" });
    my $odd_supply = $supply.grep({ $_ % 2 });
    $odd_supply.tap(-> $v { say "Odd : $v" });
    my $even_supply = $supply.grep({ not $_ % 2 });
    $even_supply.tap(-> $v { say "Even : $v" });
    for 0 .. 10 {
        $supplier.emit($_);
    }
map returns a new supply such that, for each item emitted to the original supply, a new item which is the result of the expression passed to map is emitted:
    my $supplier = Supplier.new;
    my $supply   = $supplier.Supply;
    $supply.tap(-> $v { say "Original : $v" });
    my $half_supply = $supply.map({ $_ / 2 });
    $half_supply.tap(-> $v { say "Half : $v" });
    for 0 .. 10 {
        $supplier.emit($_);
    }
If you need to have an action that runs when the supply finishes, you can do so by setting the done and quit options in the call to tap:

    $supply.tap: { ... },
        done => { say "no more answers" },
        quit => {
            when X::MyApp::NoAnswer { say "cannot find answer" }
        };
The quit block works very similarly to a CATCH. If the exception is marked as seen by a when or default block, the exception is caught and handled. Otherwise, the exception continues up the call tree (i.e., the same behavior as when quit is not set).
When using the react or supply block syntax with whenever, the LAST and QUIT phasers inside the whenever block give the same behavior as setting done and quit on tap.
A Channel is a thread-safe queue that can have multiple readers and writers, and could be considered similar in operation to a "fifo" or named pipe, except that it does not enable inter-process communication. It should be noted that, being a true queue, each value sent to the Channel will only be available to a single reader on a first read, first served basis: if you want multiple readers to be able to receive every item sent, you probably want to consider a Supply.
    my $channel = Channel.new;
    $channel.send('Channel One');
    say $channel.receive;  # 'Channel One'
If the channel has been closed with the method close, then any subsequent send will cause the exception X::Channel::SendOnClosed to be thrown, and a receive will throw an X::Channel::ReceiveOnClosed if there are no more items on the queue.
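A minimal illustration of these two exceptions (the item values here are arbitrary):

    my $channel = Channel.new;
    $channel.send('first');
    $channel.close;

    # Items queued before the close can still be received
    say $channel.receive;            # first

    # Sending after close throws X::Channel::SendOnClosed
    try $channel.send('too late');
    say $!.^name;                    # X::Channel::SendOnClosed

    # Receiving from a closed, empty channel throws X::Channel::ReceiveOnClosed
    try $channel.receive;
    say $!.^name;                    # X::Channel::ReceiveOnClosed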
    my $channel = Channel.new;
    await (^10).map: -> $r {
        start {
            sleep $r;
            $channel.send($r);
        }
    }
    $channel.close;

    for $channel.list -> $r {
        say $r;
    }
There is also the non-blocking method poll, which returns an available item from the channel, or Nil if there is no item or the channel is closed. This does of course mean that the channel must be checked separately to determine whether it is closed:
    my $c = Channel.new;

    # Start three Promises that sleep for 1..3 seconds, and then
    # send a value to our Channel
    ^3 .map: -> $v {
        start {
            sleep 3 - $v;
            $c.send: "$v from thread {$*THREAD.id}";
        }
    }

    # Wait 3 seconds before closing the channel
    Promise.in(3).then: { $c.close }

    # Continuously loop and poll the channel, until it's closed
    my $is-closed = $c.closed;
    loop {
        if $c.poll -> $item {
            say "$item received after {now - INIT now} seconds";
        }
        elsif $is-closed {
            last;
        }

        say 'Doing some unrelated things...';
        sleep .6;
    }

    # OUTPUT:
    # Doing some unrelated things...
    # Doing some unrelated things...
    # 2 from thread 5 received after 1.2063182 seconds
    # Doing some unrelated things...
    # Doing some unrelated things...
    # 1 from thread 4 received after 2.41117376 seconds
    # Doing some unrelated things...
    # 0 from thread 3 received after 3.01364461 seconds
    # Doing some unrelated things...
The .poll method can be used in combination with the .receive method as a caching mechanism, where a lack of value returned by `.poll` is a signal that more values need to be fetched and loaded into the channel:
    sub get-value {
        return $cache.poll // do { start replenish-cache; $cache.receive };
    }

    sub replenish-cache {
        for ^20 {
            $cache.send: $_ for slowly-fetch-a-thing();
        }
    }
Channels can be used in place of the Supply in the
whenever of a
react block described earlier:
    my $channel = Channel.new;

    my $p = start {
        react {
            whenever $channel {
                say "via Channel: $_";
            }
        }
    }

    await (^10).map: -> $r {
        start {
            sleep $r;
            $channel.send($r);
        }
    }

    $channel.close;
    await $p;
It is also possible to obtain a Channel from a Supply using its Channel method:

    my $supplier = Supplier.new;
    my $supply   = $supplier.Supply;
    my $channel  = $supply.Channel;

    my $p = start {
        react {
            whenever $channel -> $item {
                say "via Channel: $item";
            }
        }
    }

    await (^10).map: -> $r {
        start {
            sleep $r;
            $supplier.emit($r);
        }
    }

    $supplier.done;
    await $p;

Channel will return a different Channel fed with the same data each time it is called. This could be used, for instance, to fan out a Supply to one or more Channels to provide for different interfaces in a program.
Proc::Async builds on the facilities described to run and interact with an external program asynchronously:
    my $proc = Proc::Async.new('echo', 'foo', 'bar');

    $proc.stdout.tap(-> $v { print "Output: $v" });
    $proc.stderr.tap(-> $v { print "Error:  $v" });

    say "Starting...";
    my $promise = $proc.start;

    await $promise;
    say "Done.";

    # Output:
    # Starting...
    # Output: foo bar
    # Done.
The path to the command as well as any arguments to the command are supplied to the constructor. The command will not be executed until start is called, which will return a Promise that will be kept when the program exits. The standard output and standard error of the program are available as Supply objects from the methods stdout and stderr respectively which can be tapped as required.
If you want to write to the standard input of the program you can supply the
:w adverb to the constructor and use the methods write, print or say to write to the opened pipe once the program has been started:
    my $proc = Proc::Async.new(:w, 'grep', 'foo');

    $proc.stdout.tap(-> $v { print "Output: $v" });

    say "Starting...";
    my $promise = $proc.start;

    $proc.say("this line has foo");
    $proc.say("this one doesn't");

    $proc.close-stdin;
    await $promise;
    say "Done.";

    # Output:
    # Starting...
    # Output: this line has foo
    # Done.
Some programs (such as grep without a file argument, as in this example) won't exit until their standard input is closed, so close-stdin can be called when you are finished writing, to allow the Promise returned by start to be kept.
The lowest level interface for concurrency is provided by Thread. A thread can be thought of as a piece of code that may eventually be run on a processor, the arrangement for which is made almost entirely by the virtual machine and/or operating system. Threads should be considered, for all intents and purposes, largely unmanaged, and their direct use should be avoided in user code.
A thread can either be created and then actually run later:
    my $thread = Thread.new(code => { for 1 .. 10 -> $v { say $v } });
    # ...
    $thread.run;
Or can be created and run at a single invocation:
    my $thread = Thread.start({ for 1 .. 10 -> $v { say $v } });
In both cases the completion of the code encapsulated by the Thread object can be waited on with the finish method, which will block until the thread completes:

    $thread.finish;
Beyond that, there are no further facilities for synchronization or resource sharing, which is largely why it should be emphasized that threads are unlikely to be useful directly in user code.
The next level of the concurrency API is supplied by classes that implement the interface defined by the role Scheduler. The intent of the scheduler interface is to provide a mechanism to determine which resources to use to run a particular task and when to run it. The majority of the higher level concurrency APIs are built upon a scheduler and it may not be necessary for user code to use them at all, although some methods such as those found in Proc::Async, Promise and Supply allow you to explicitly supply a scheduler.
The current default global scheduler is available in the variable $*SCHEDULER.
The primary interface of a scheduler (indeed, the only method required by the Scheduler interface) is the cue method:

    method cue(:&code, Instant :$at, :$in, :$every, :$times = 1; :&catch)

This will schedule the Callable in &code to be executed in the manner determined by the adverbs, using the execution scheme as implemented by the scheduler. For example:
    my $i = 0;
    my $cancellation = $*SCHEDULER.cue({ say $i++ }, every => 2);
    sleep 20;
Assuming that the $*SCHEDULER hasn't been changed from the default, this will print the numbers 0 to 10 approximately (i.e. with operating system scheduling tolerances) every two seconds. In this case the code will be scheduled to run until the program ends normally; however the method returns a Cancellation object which can be used to cancel the scheduled execution before normal completion:
    my $i = 0;
    my $cancellation = $*SCHEDULER.cue({ say $i++ }, every => 2);
    sleep 10;
    $cancellation.cancel;
    sleep 10;

This should only output 0 to 5.
Despite the apparent advantage the Scheduler interface provides over that of Thread, all of the functionality is available through higher level interfaces and it shouldn't be necessary to use a scheduler directly, except perhaps in the cases mentioned above where a scheduler can be supplied explicitly to certain methods.
A library may wish to provide an alternative scheduler implementation if it has special requirements; for instance, a UI library may want all code to be run within a single UI thread, or some custom priority mechanism may be required. However, the implementations provided as standard and described below should suffice for most user code.
The ThreadPoolScheduler is the default scheduler. It maintains a pool of threads that are allocated on demand, creating new ones as necessary up to the maximum number given as a parameter when the scheduler object was created (the default is 16). If the maximum is exceeded then cue may queue the code until such time as a thread becomes available.
Rakudo allows the maximum number of threads allowed in the default scheduler to be set by the environment variable
RAKUDO_MAX_THREADS at the time the program is started.
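For example (the script name here is hypothetical):

    RAKUDO_MAX_THREADS=32 raku my-script.raku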
The CurrentThreadScheduler is a very simple scheduler that will always schedule code to be run straight away on the current thread. The implication is that
cue on this scheduler will block until the code finishes execution, limiting its utility to certain special cases such as testing.
The class Lock provides the low-level mechanism that protects shared data in a concurrent environment, and is thus key to supporting thread-safety in the high-level API; this is sometimes known as a "Mutex" in other programming languages. Because the higher level classes (Promise, Supply and Channel) use a Lock where required, it is unlikely that user code will need to use a Lock directly.
The primary method of Lock is protect, which ensures that a block of code (commonly called a "critical section") is only executed in one thread at a time:

    my $lock = Lock.new;

    my $a = 0;

    await (^10).map: {
        start {
            $lock.protect({
                my $r = rand;
                sleep $r;
                $a++;
            });
        }
    }

    say $a;  # 10
protect returns whatever the code block returns. Because protect will block any threads that are waiting to execute the critical section, the code should be as quick as possible.
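A small sketch of the return value of protect (the values here are arbitrary):

    my $lock = Lock.new;

    my $result = $lock.protect({
        # the last value of the block is what protect returns
        6 * 7;
    });

    say $result;  # 42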
Some shared data concurrency issues are less obvious than others. For a good general write-up on this subject see this blog post.
One particular issue of note is when container autovivification or extension takes place. When an Array or a Hash entry is initially assigned, the underlying structure is altered and that operation is not async-safe. For example, in this code:
    my @array;
    my $slot := @array[20];
    $slot = 'foo';
The third line is the critical section, as that is when the array is extended. The simplest fix is to use a Lock to protect the critical section. A possibly better fix would be to refactor the code so that sharing a container is not necessary.
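A sketch of the Lock-based fix, assuming the array is being extended from several threads at once (the values and the thread count here are illustrative):

    my @array;
    my $lock = Lock.new;

    await (^10).map: -> $i {
        start {
            $lock.protect({
                @array[$i] = "foo-$i";   # the extension now happens in one thread at a time
            });
        }
    }

    say @array.elems;  # 10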