Rules of Miou!
In our previous article, we explained the essence of Miou: a round-robin scheduler and how it was implemented in OCaml. We also clarified preemption and suspension and how we wanted to make our application "available" to system events. In short, we recommend you read this article before this one.
Here, we're going to talk about Miou's rules! Once again, an API isn't just a set of types and functions, it's a "design" that enforces a certain use. So Miou imposes rules that must be respected, otherwise it fails.
The benefits of rules
This is where we venture to formulate a certain vision of how to develop an application. These rules can be quite strict (as the OCaml compiler can be when it comes to types), but they ensure that we don't fall into pitfalls that our future selves would hate us for!
As you know, Robur does have some experience of libraries and services. As proof, this site (which is a unikernel) is still alive! And while questions of memory consumption or application longevity are essential (otherwise this site wouldn't work), so are questions of maintainance (there aren't (unfortunately?) 3 dinosaurs in existence... I don't even think there are any...). In short, if we can have tools that help us in development and enable us to solve certain problems upstream, then I'm all for it.
Miou imposes rules that are directly derived from our experience:
- avoid memory leaks
- more generally, resource leaks
- always check for errors
- always have control over what our tasks do
More generally, Miou lets you play with a whole host of objects (file-descriptors, domains, exceptions, etc.), but requires you to have everything tidied up by the end!
Rule number 1: don't forget your children
This rule is perhaps the most restrictive for the OCaml community. Indeed, when
developing applications with lwt
or async
, we were used to using:
Lwt.async
. This little function is literally the small coin you give Charon to
cross the river Lethe: you literally forget the task exists.
The problem is that a task may never finish (just as a program may never
finish). So what happens if a task launched with Lwt.async
doesn't finish?
Well, you'll have a task (which takes up a certain amount of memory) that will
exist forever! You've got a leak.
It can be radical to say that you have a leak. Indeed, it's possible to imagine that you've taken all the necessary steps to confirm (contrary to Turing) that your task will end well. Unfortunately, our experience has shown us that the confidence we had in ourselves to believe we could write good code turns out to be a little presumptuous and that, in truth, we do leak.
As such, this code with Miou fails:
# let _ = Miou.run @@ fun () ->
Miou.call_cc (Fun.const ()) (* we create a new task *)
;;
Exception: Miou.Still_has_children.
We forgot our child! In this code, we try to create a concurrently running task
(Miou.call_cc
) that does nothing. Miou.call_cc
returns what we usually call
a promise: in other words, the fulfillment of a promise that, in the future,
you'll get a value.
What Miou requires here is that you commit yourself to obtaining this value (just as you should commit yourself to ensuring that the promises made to you are kept). If you don't, Miou considers that you've forgotten "a child" (the task that does nothing) which, in itself, is a resource!
Implications
The implication here is that a task has a beginning, but more importantly, always an end! A task can be complex, even very complex. It can be the umbrella for a multitude of subtasks, which themselves have resources. The final idea is that Miou forces you to put all this in order: it forces you to wait for the results of ALL your tasks.
Above all, it forces you to know in what state these tasks have stopped. Indeed, they can:
- stop normally and release all the resources they were using
- throw an exception - and you'll know.
The correct code would be:
# let _ = Miou.run @@ fun () ->
match Miou.await (Miou.call_cc (Fun.const ())) with
| Ok () -> ()
| Error exn -> raise exn
;;
- : unit = ()
Background tasks
There is, however, one (and only one) legitimate case where we would like to "forget" a task: launching a task in the background where, precisely, we would like it to run forever!
In truth, a task (or program) never runs forever. They can run for a long time (even a very long time), but pragmatically, there will always be the moment of an update or an unpaid server bill that brings your program to a halt.
As such, we maintain that: yes, you must imagine the end of your application. However, Miou can help you set tasks in the background: we're not going to forget our children, we're just going to put them in an orphanage:
let server ~handler file_descr =
let rec go orphans =
let peer, _ = Miou_unix.accept file_descr in
let _ = Miou.call ~orphans @@ fun () -> handler peer in
go orphans in
go (Miou.orphans ())
An orphanage is just a set of tasks that we keep. It can be manipulated so that
it returns completed tasks. This way, using Miou.await_exn
on these tasks
won't block (because, basically, Miou.await_exn
awaits the result of the
task). Thus, the use of Miou.orphans
is often associated with a simple
"cleanup" function:
let rec cleanup orphans = match Miou.care orphans with
| None -> () (* The orphanage has no tasks. *)
| Some None -> (* The orphanage has tasks, but none of them are finished. *)
| Some (Some prm) ->
(* A task has just been completed, and can be "consumed". *)
Miou.await_exn prm;
cleanup oprhans
It's a matter of periodically "cleaning up" the tasks we may have created previously (to manage our clients) and which have been completed (concurrently or in parallel).
let server ~handler file_descr =
let rec go orphans =
cleanup orphans; (* Cleaning up old clients. *)
let peer, _ = Miou_unix.accept file_descr in
let _ = Miou.call ~orphans @@ fun () -> handler peer in
go orphans in
go (Miou.orphans ())
The kinship of this design
This design has already been tried out with awa-ssh and lwt
, where
the use of Lwt.async
has been prohibited - in place of a list that keeps track
of all necessary tasks and is periodically cleaned up.
It's at this point that we need to explain a very important point: it would be
difficult for us to have near-equivalence between lwt
applications and Miou
applications. Some aim to offer a quasi-transparent transition between an lwt
application and their schedulers. This is not the case with Miou, which forces
you to rethink your implementations, in particular with this rule.
The advantage of this rule is that you can be sure that the application will
soon release all its resources. This was the case with httpcats. We
wanted to go back to what we'd already done with paf-le-chien
or http-lwt-client
, but we were content to remake a new
implementation which ensured that at the end of an HTTP request, all resources
(tasks, file-descriptor, buffers, etc.) were released. Another aspect is error
handling: Miou, by forcing us to commit the result of a task, forces us to
consider the execution path in the event of an abnormal termination.
These questions are often relegated to the back burner when it comes to implementing software that meets our objectives. Miou forces you to take the lead (to ask yourself these upstream questions) that aren't absolutely necessary for your objective (you can leak but still make an HTTP request) so you can sleep soundly.
Rule number 2: no one can get your children's results
This rule is interesting because it's a simplification of our scheduler in order to relegate an essential question to the user: how to share information between tasks. Let's go straight to the wrong code:
# Miou.run @@ fun () ->
let prm0 = Miou.call (Fun.const 1) in
let prm1 = Miou.call @@ fun () ->
Miou.await_exn prm0 + 1 in
Miou.await_all [ prm0; prm1 ]
|> List.map (function Ok v -> v | Error exn -> raise exn)
|> List.fold_left (+) 0
|> print_int ;;
Exception: Miou.Not_a_child
This example shows that we want to share the information calculated by prm0
in
prm1
. The problem is that it wasn't prm1
that launched prm0
(Miou.run
directly launched prm0
). So Miou will throw a Not_a_child
exception
to inform you that the Miou.await_exn
in prm1
is illegal.
The idea is simple: with Miou, the only way to transmit information between tasks is between children and their parents (a child can only have one parent).
Above all, this reveals an essential point when it comes to concurrent and/or parallel programming: how to share information between tasks. There are a multitude of solutions to this question:
- using
Atomic
- protect information with a
Mutex
orSemaphore
- take advantage of concurrent programming, where only one task can run at a time
- implement an "atomic" structure
- transmit via a system resource (FIFO, socket, files, etc.)
Point 3 is interesting because this technique was often used with lwt
/async
(a Stdlib.Queue
can be shared between several tasks), but its limits are
details specific to the scheduler's behavior. But what we really need to
understand here is that, if we have to answer the question of how to share
information between tasks, the only correct answer at this stage is: it depends.
In this respect, Miou's restriction to sharing the result of a task only with
its direct relative opens up a space for you to find the right solution for your
context. Indeed, the use of Atomic
can help, but the use of a FIFO can just as
well be interesting: for the example, Miou
and Miou_unix
use both in
specific cases, which we'll detail in another article.
The semantics of synchronization points & the scheduler
Beyond the solutions, there's also the subtlety of their implementation. OCaml's
Condition
module is a good example, where the documentation is very clear and
specifies a behavior, known as "spurious wakeup", that the
user must take into account. A fairly common behavior, but one that can be
avoided by using other mechanisms. Another example concerns Mutex
. In a
concurrent programming context, you may well come across the Mutex exception
Sys_error "Mutex.lock: Resource deadlock avoided"
where one task has locked
the Mutex, suspends itself and another task tries to lock the same Mutex... The
use of a "Nano_mutex" or a Semaphore may then be more
interesting.
In truth, the conclusion remains fatal if we're intellectually honest: information transfer between tasks is a complicated matter. We need to look at both the implementations and the behavior of the scheduler itself!
We could offer a whole panoply of solutions with their advantages and disadvantages as integrated parts of Miou to make life easier for users. However, we have chosen to ensure only one very specific type of transfer, and leave this question open to the community: collective intelligence is always more valuable than the pretension of supposed omniscience on these issues.
Simpler scheduler implementation
From these rules, we can extract some fairly simple assertions about the handling of our promises in our scheduler. In particular, we can suggest that only the domain executing a task can modify the associated promise. This simple assertion avoids complex synchronization points between domains when it comes to promise manipulation.
But more on these aspects in another article.
Rule number 3: Ownership
The two previous rules allow for an equally interesting extension: attaching "resources" to tasks. Indeed, the basic idea behind these two fundamental rules is that a task is a resource in itself. So, when it comes to manipulating "resources" (the definition of which would be more conventional, like a file-descriptor), all you have to do is "attach" them to the task in order to extend the above checks:
- am I the owner of this file-descriptor?
- have I released (
Unix.close
) this file-descriptor as soon as my task ends?
The first check derives from our second rule: am I the parent of the task I'm
waiting for? Just as Not_a_child
can occur if you're waiting for the child of
another parent, there's also the Not_owner
exception whenever you manipulate a
resource of which you are not the owner. Let's take the example of a
file-descriptor:
type fd = Unix.file_descr * Miou.Ownership.t
let socket ?cloexec domain ty proto =
let socket = Unix.socket ?cloexec domain ty proto in
let finally = Unix.close in
let owner = Miou.Ownership.own ~finally socket in
socket, owner
let read (socket, owner) buf off len =
Miou.Ownership.check owner;
Unix.read socket buf off len
let () = Miou_unix.run @@ fun () ->
let fd = socket domain Unix.SOCK_STREAM 0 in
let buf = Bytes.create 0x1000 in
let prm = Miou.call @@ fun () -> read fd buf 0 (Bytes.length buf) in
Format.printf "%S" (Bytes.sub_string buf 0 (Miou.await_exn prm))
This code raises the Not_owner
exception because the prm
task is trying to
manipulate the file-descriptor (using read
) while the resource is attached to
the main task (using Miou.Ownership.own
). The only way to make this code work
is to "pass" ownership to our prm
task:
- let prm = Miou.call @@ fun () -> read fd buf 0 (Bytes.length buf) in
+ let prm = Miou.call ~give:[ snd fd ] @@ fun () ->
+ read fd buf 0 (Bytes.length buf) in
However, if this code is executed, the Resource_leak
exception appears. In
fact, we need a way to free our file-descriptor.
let close (socket, owner) =
Unix.close socket;
Miou.Ownership.disown socket
Finally, our main code becomes:
let () = Miou_unix.run @@ fun () ->
let fd = socket domain Unix.SOCK_STREAM 0 in
let buf = Bytes.create 0x1000 in
let prm = Miou.call ~give:[ snd fd ] @@ fun () ->
let len = read fd buf 0 (Bytes.length buf) in
Miou.Ownership.disown (snd fd) in
Format.printf "%S" (Bytes.sub_string buf 0 (Miou.await_exn prm));
close fd
Disgression on static and dynamic analysis
At this point, some may have noticed that the idea of ownership is very similar to that found in the Rust language. As such, and even though promising experiments have been made to extend OCaml's type system in order to oxidize it, we don't want to embark on experiments (using OCaml 5, given its current status, is already an experiment for us).
So, yes, Miou offers a "poor" version of a (dynamic) resource utilization check.
A poor version which, in many respects, is subtly different from what one might
imagine Rust can offer (notably, ownership can appear for several tasks as in
the example above - for both the main task and the prm
task). But this poor
version has the merit of existing!
Implication and conclusion
Finally, there's one thing to recognize and understand about Miou: the
methodology for creating applications with Miou is different from that of
lwt
and async
. The result of these rules is that for some (if not all)
implementations with lwt
, we have to rethink the whole design in order to
achieve a result that is correct with regard to these rules.
Some might consider that a quasi-transparent transition from one lwt
application to another scheduler would be an advantage. This is, of course, a
productivist advantage when it comes to producing this transition as quickly as
possible. But what about quality? What new issues does multicore and/or its
effects bring? In this respect, we think we need to rethink.
In this respect, Miou is a synthesis of our experience. A synthesis whose
choices may be inconvenient to our readers in many respects, but - and this is
Miou's objective - which is a satisfactory response to the problems we've had to
face over a decade of maintaining software with lwt
.
I'd also like to warn the user that these rules, while our various applications have shown some validity to their "real-world" applications, are subject to change. At the very least, we've made it clear that Miou is experimental. Discussion forums exist, notably on the notion of ownership. The answers we provide (and we would never claim to) are not absolute. We therefore invite our readers to give Miou a try, and to take part in this reflection on what a good scheduler could be, in the light of the history of the OCaml community.
At least we've managed to develop an HTTP client/server with TLS support httpcats, and we invite you to have fun (as we did) with it.