A typecast with many uses

by: Matthieu Lemerre  tags: l  published: 22 June 2012

I just finished implementing a new version of a typecast operator for the L programming language I'm working on.

The problem of the regular cast implementation

Typecast is an operator or function, that allows conversion from any type to any type. From the point of view of evaluation, it acts as the identity function; it is useful only for typechecking and type inference. It allows writing things that the type system does not (yet) support: increasing the expressivity of the type system means reducing the number of casts.

Note: in some languages (such as C), cast also perform coercion, i.e. convert the value to match the requested type. This is not the case here.

My first implementation of cast consisted in a cast function which did nothing but return its argument, and had for type: forall<a,b> a -> b. This means that this function can take arguments of any type, and does not constrain the return type. For instance, cast( 'a') + 1 converted 'a' to type Int: cast( 'a') converts 'a' to some type, and + requires a Int argument.

Other languages have this approach, for instance the OCaml cast function, named magic is defined by:

external magic : 'a -> 'b = "%identity"

With identity being a C function that returns its argument.

The problem with this definition is that it is not possible to express the result type according to the argument type directly.

For instance, suppose that I want to create a function make_triple: forall<t> (t,t,t) -> Triple<t>. The (t,t,t) and Triple<t> have the same machine representation, but they must appear different to the type system (to implement some nominal typing).

The problem is that this function is polymorphic: make_triple(2,3,4) has type Triple<Int>, while make_triple("a","b","c") has type Triple<String>. Without polymorphism, the problem would be easy to solve. For instance a make_triple_int function that works only with Int is easy to write with a regular cast (here I used the C notation for cast, (to_type) expr :

def make_triple(a:Int,b:Int,c:Int) = { (Triple<Int>) (a,b,c) }

But for a polymorphic function, the result type Triple<t> depends on the source type (t,t,t).

Cast is an identity function with a given type

The solution I found is to have a cast operator in the language, that acts as an identity function wrt. evaluation, but whose type is given has an argument. The syntax is cast( exp, type ), and does as if exp was applied to a function, whose type is type, which does nothing but returns its argument.

This cast operator solves the problem:

def make_triple(a,b,c) = cast( (a,b,c), forall<t> (t,t,t) -> Triple<t>)

Notice that the classical cast can still be obtained:

def my_cast_function(x) = cast( x, forall<a,b> a -> b)

Even more interesting is that this new function also allows to implement type annotation (that allows to enforce that a given expression has a certain type, or reduce its generality). For instance, cast( 3+2, Int -> Int) allows to check that 3+2 has type Int, without changing its type. In L, type annotation is written exp:t (for instance: (3+2):Int), and is now implemented using this cast operator.

The implementation is straightforward, being just a special case of function application.

Alternative implementations

Using standard cast + type annotation

On the contrary, it is also possible to implement this cast operator using standard cast + type annotation. For instance in OCaml:

# type 'a triple;;
# let make_triple a b c =
    let my_cast:('a * 'a * 'a) -> 'a triple = Obj.magic (fun x -> x) in
    my_cast (a,b,c);;
- : val make_triple : 'a -> 'a -> 'a -> 'a triple = <fun>
# make_triple 1 2 3;;
- : int triple = <abstr>
# make_triple "toto" "tata" "tutu";;
- : string triple = <abstr>

Still, I found that it was simpler to implement this cast operator than annotation + the standard cast. Plus, cast being a special form in the language, it can be removed at compile time, which is not the case if it has to call an external identity function.

Using a special "cast" variable

Another possibility would be to implement cast as a special variable. Instead of writing cast( exp, t1 -> t2), one would write:

(cast( t))( exp) (i.e. apply the "cast variable" to exp).

cast( t) would be an identity function of type t (t should thus be a function type).

This implementation would then be even simpler, being just a special case of variable (e.g. type variable instantation should be performed here).

Using this cast operator for implementing isorecursive sum types

Implementing isorecursive data types need such type casts (called "roll" and "unroll" in standard terminology).

For instance, if I define the List type as follows:

type List<a> = { Nil | Cons(a, List<a>) }

The internal representation of Nil is (0, ()), the one of Cons(hd,tl) is (1, (hd,tl)). The first element is a tag allowing to identify which variant is a given value. The roll and unroll functions are used for type conversion between these internal representations and the List<a> type.

So here is how the type constructors can be implemented using my new cast operator:

def Nil = cast((0,()), forall<a> (Int,())->List<a>)
def Cons = { (hd,tl) -> cast( (1, (hd,tl)), forall<a> (Int,(a,List<a>))->List<a>) }

The destructors can be implemented by reversing the cast:

def ifisNil = { (list, then_, else_) -> 
  let (num, _) = cast( list, forall<a> List<a> -> (Int, ()));
  if( num == 0) then_() else else_() 
}

def ifisCons = { (list, then_, else_) -> 
  let (num, _) = cast( list, forall<a> List<a> -> (Int, (a, List<a>)));
  if( num == 1) 
   {
      let (_, (hd,tl)) = cast( list, forall<a> List<a> -> (Int, (a, List<a>)));
      then_(hd,tl)       
   }
 else else_() 
}

This tests that it works correctly:

ifisCons( Nil, { (hd,tl) -> hd }, { () -> 0 }) // returns 0
ifisCons( Cons( 3,Nil), { (hd,tl) -> hd }, { () -> 0 }) // returns 3
ifisCons( Cons( 5, Cons( 3, Nil)), { (hd,tl) -> hd }, { () -> 0 }) // returns 5

This illustrates the purpose of cast: it allows implementing things not yet available in the type system. Actually the real implementation of the type constructors/destructor in L will likely rely on this new cast function.