How to work efficiently with type inference in OCaml

in ocaml •  4 years ago  (edited)

OCaml is a statically typed language with strong type inference. For those who want more detail, the OCaml type checker is based on the Hindley-Milner type system.

The goal of this article is to find a way to use inference efficiently, for more comfort, without sacrificing the quality of the error messages produced by the compiler.

What is type inference?

Type inference lets you omit the type declarations of the variables you use; the compiler deduces them from how the values are used.

Consider the following code:

type coordinate = { x : float; y : float }
let square x = x *. x
let distance a b = square (b.x -. a.x) +. square (b.y -. a.y) |> sqrt

We define a coordinate type whose fields are floats. The interesting parts are the next two functions, square and distance.

The square function doesn't specify the type of the parameter x or the type of the returned value. Both are automatically inferred as float, because the *. operator only works on floats.

For the distance function, the parameters a and b are inferred to be of the coordinate type we defined earlier, because they are accessed through the fields x and y. The type of the returned value is float.
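To make the inferred types visible, here is a sketch of the same code with every type written out by hand. The annotations are my addition for illustration; they spell out what the compiler already deduces for the unannotated version.

```ocaml
type coordinate = { x : float; y : float }

(* Same function as before, with the inferred type written explicitly:
   float -> float *)
let square (x : float) : float = x *. x

(* Same function as before, with the inferred type written explicitly:
   coordinate -> coordinate -> float *)
let distance (a : coordinate) (b : coordinate) : float =
  square (b.x -. a.x) +. square (b.y -. a.y) |> sqrt
```

Both versions compile to the same thing; the annotated one is just longer to write and to read.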

How type inference can influence the way you code

The first immediate benefit is that you write less code. Type discipline is very important when you are working on the business entities of your project, but that kind of code isn't what you write every day.

Being able to focus on the logic of your algorithm without being distracted too much by the type checker is important. That's why statically typed languages usually aren't appreciated for prototyping.

But perfection isn't part of our world, and the compiler can disagree with what you are trying to do. So why does this happen, and how do you handle it?

What to do when the compiler disagrees about your types?

[Image: Phoenix Wright video game illustration]

Sometimes there will be a difference between your intent and what you actually wrote in your code. This difference can lead to two kinds of situations:

You operate on an incompatible type

The easier situation is when you try to operate on the wrong type. The error message will be something like:
This expression has type A but an expression was expected of type B.

You immediately understand where you made a mistake, and you feel quite grateful to the compiler.
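As a small illustration of this easy case, here is a sketch with a hypothetical half_of function (not part of the project shown later). The failing call is left in a comment so that the snippet still compiles.

```ocaml
(* Inferred as float -> float because of the /. operator. *)
let half_of n = n /. 2.0

(* Uncommenting the next line makes compilation fail on the argument 10:
   Error: This expression has type int but an expression was expected of type float *)
(* let broken = half_of 10 *)

(* The fix is local and obvious: convert the int to a float. *)
let fixed = half_of (float_of_int 10)
```

Here the error points at the exact expression that is wrong, so no annotation is needed to find the mistake.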

OCaml makes a false assumption

Here is an extract of real code, with a small bug I introduced to illustrate a situation where the compiler error message can be quite confusing.

Don't try to understand all of the code; we will focus on the error triggered at compilation.

module Light : sig
  type t

  type color = Green | Red

  val of_string : string -> t

  val color_for_speed : t -> [ `Speed ] Number.t -> color
end = struct
  type t = {
    position : [ `Distance ] Number.t;
    frequency : [ `Duration ] Number.t;
  }

  type color = Green | Red

  let of_string s =
    Scanf.sscanf s "%d %d" (fun a b ->
        {
          position = Number.distance_from_int a;
          frequency = Number.duration_from_int b;
        })

  let color_at_second l used_seconds =
    let open Number in
    let light_seconds = seconds_from_duration l.frequency in
    if used_seconds < l.frequency then Green
    else
      let step = used_seconds / light_seconds in
      if step mod seconds_from_int 2 = zero then Green else Red

  let time_needed speed position =
    Number.(
      map (( * ) 3600) position / distance_from_speed speed
      |> to_int |> seconds_from_int)

  let color_for_speed l sp = time_needed sp l.position |> color_at_second l
end

When I compile the file, here is the result:

 % ocamlc test.ml    
File "test.ml", line 129, characters 32-45:
xxx |       let step = used_seconds / light_seconds in
                                      ^^^^^^^^^^^^
Error: This expression has type [ `Second ] Number.t
       but an expression was expected of type [ `Duration ] Number.t
       These two types have no intersection

OCaml explains here that we are mixing a value of the duration type with a value of the second type. Judging by the names of the variables, the intent is to divide two values of the second type. The more you analyze the line shown in the error message, the less you understand.

What is important here is to read the error message as:
If the compiler has correctly guessed your intent, this line is impossible.

So if we focus on the function where the problem occurs, we have to write our intent explicitly so that OCaml can give a more precise static analysis.

The new version of the code is:

  let color_at_second (l:t) (used_seconds:[`Second] Number.t) : color =
    let open Number in
    let light_seconds = seconds_from_duration l.frequency in
    if used_seconds < l.frequency then Green
    else
      let step = used_seconds / light_seconds in
      if step mod seconds_from_int 2 = zero then Green else Red

What I have changed here is adding the types of the parameters and of the return value:

  • the parameter l is of type t. t is the conventional name used when you create a dedicated module to implement a new data type with its associated functions. Here the module Light defines the data type of a light as a type t, which is referred to outside the module as Light.t
  • the parameter used_seconds is a number of seconds, so OCaml no longer needs to guess what this parameter is
  • the return value is a color

Now that the type of this function is written by hand, we are sure that OCaml's understanding and our intent are aligned.

If we compile the file, the error message now shows precisely where we made the mistake:

xxx |     if used_seconds < l.frequency  then Green
                            ^^^^^^^^^^
                                                
Error: This expression has type [ `Duration ] Number.t
       but an expression was expected of type [ `Second ] Number.t
       These two variant types have no intersection

And this is where the real problem is: we compare used_seconds with the wrong variable. We can fix the line with:

if used_seconds < light_seconds then Green

Why did OCaml make a false assumption?

The problem for OCaml is that it had to guess the type of used_seconds from the if condition, the first place where it is used. The bad assumption is that this line is correct by default, as long as OCaml can find a type that makes it valid.

For the if condition to be valid, used_seconds and l.frequency must be of the same type. OCaml knows l.frequency is of type duration, so used_seconds must also be a duration. That's why the division later on was impossible.
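The mechanism can be reproduced in a self-contained sketch. The Number module below is a hypothetical, heavily simplified stand-in for the one used in the project (an int tagged with a phantom unit), and the light record is reduced to the single field we need; the real definitions are not shown in this article. The function is the annotated, fixed version, with the buggy comparison kept as a comment.

```ocaml
(* Hypothetical stand-in for the project's Number module:
   an int carrying a phantom unit tag. *)
module Number : sig
  type 'a t
  val duration_from_int : int -> [ `Duration ] t
  val seconds_from_int : int -> [ `Second ] t
  val seconds_from_duration : [ `Duration ] t -> [ `Second ] t
  val zero : 'a t
  val ( < ) : 'a t -> 'a t -> bool
  val ( / ) : 'a t -> 'a t -> 'a t
  val ( mod ) : 'a t -> 'a t -> 'a t
end = struct
  type 'a t = int
  let duration_from_int n = n
  let seconds_from_int n = n
  let seconds_from_duration d = d
  let zero = 0
  let ( < ) (a : int) (b : int) = a < b
  let ( / ) (a : int) (b : int) = a / b
  let ( mod ) (a : int) (b : int) = a mod b
end

(* Simplified light: only the field used below. *)
type light = { frequency : [ `Duration ] Number.t }
type color = Green | Red

let color_at_second (l : light) (used_seconds : [ `Second ] Number.t) : color =
  let open Number in
  let light_seconds = seconds_from_duration l.frequency in
  (* Buggy version: `if used_seconds < l.frequency then Green`.
     Without the annotation, that comparison is the first use of
     used_seconds, so it gets typed as a duration and the error
     only shows up later, on the division.
     With the annotation, the comparison itself is rejected. *)
  if used_seconds < light_seconds then Green
  else
    let step = used_seconds / light_seconds in
    if step mod seconds_from_int 2 = zero then Green else Red
```

The two operands of < must share the same unit tag, which is exactly the constraint that misled the inference in the unannotated version.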

A way of handling type inference

My advice

Letting the compiler infer types is very valuable: it saves time and improves readability by keeping noise out of the code. So I suggest the following:

  1. Don't specify types for private functions
  2. If you face an error:
    1. Fix it if you easily understand where you made the mistake
    2. Add types to the parameters and the return value if you are not sure where the mistake is. This way your intent and OCaml's assumptions will be aligned.
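The first point can be sketched with the coordinate example from the beginning of the article. The Geometry module name is hypothetical and only there for illustration: the public function gets its type from the module signature, while the private helper stays unannotated.

```ocaml
module Geometry : sig
  type coordinate = { x : float; y : float }

  (* Public function: its type is stated once, in the signature. *)
  val distance : coordinate -> coordinate -> float
end = struct
  type coordinate = { x : float; y : float }

  (* Private helper, not exported by the signature:
     no annotation, inferred as float -> float. *)
  let square x = x *. x

  let distance a b = square (b.x -. a.x) +. square (b.y -. a.y) |> sqrt
end

(* Usage from outside the module. *)
let d = Geometry.distance Geometry.{ x = 0.; y = 0. } Geometry.{ x = 3.; y = 4. }
```

If an error message inside such a module becomes confusing, annotating the private helper temporarily, as described in the second point, is usually enough to move the error to the right line.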

To conclude

OCaml's type system is very powerful. Despite this power, it can't work miracles. To get help from the type system, you need to learn how to work with it. The goal is to save effort, and to spell out types only when you face a more complex situation.

I hope this article will help you. The example can be quite intimidating because it is an extract from a real personal project.