I just finished implementing a new version of a typecast operator for the L programming language I'm working on.
Typecast is an operator or function, that allows conversion from any type to any type. From the point of view of evaluation, it acts as the identity function; it is useful only for typechecking and type inference. It allows writing things that the type system does not (yet) support: increasing the expressivity of the type system means reducing the number of casts.
Note: in some languages (such as C), cast also perform coercion, i.e. convert the value to match the requested type. This is not the case here.
My first implementation of cast consisted in a cast
function which
did nothing but return its argument, and had for type: forall<a,b> a
-> b
. This means that this function can take arguments of any type,
and does not constrain the return type. For instance, cast( 'a') + 1
converted 'a' to type Int
: cast( 'a')
converts 'a' to some type,
and +
requires a Int
argument.
Other languages have this approach, for instance the OCaml cast
function, named magic
is defined by:
external magic : 'a -> 'b = "%identity"
With identity
being a C function that returns its argument.
The problem with this definition is that it is not possible to express the result type according to the argument type directly.
For instance, suppose that I want to create a function make_triple:
forall<t> (t,t,t) -> Triple<t>
. The (t,t,t)
and Triple<t>
have
the same machine representation, but they must appear different to the
type system (to implement some nominal typing).
The problem is that this function is polymorphic:
make_triple(2,3,4)
has type Triple<Int>
, while
make_triple("a","b","c")
has type Triple<String>
. Without
polymorphism, the problem would be easy to solve. For instance a
make_triple_int
function that works only with Int
is easy to write
with a regular cast (here I used the C notation for cast, (to_type)
expr :
def make_triple(a:Int,b:Int,c:Int) = { (Triple<Int>) (a,b,c) }
But for a polymorphic function, the result type Triple<t>
depends on
the source type (t,t,t)
.
The solution I found is to have a cast operator in the language, that
acts as an identity function wrt. evaluation, but whose type is given
has an argument. The syntax is cast( exp, type )
, and does as if
exp
was applied to a function, whose type is type
, which does
nothing but returns its argument.
This cast operator solves the problem:
def make_triple(a,b,c) = cast( (a,b,c), forall<t> (t,t,t) -> Triple<t>)
Notice that the classical cast can still be obtained:
def my_cast_function(x) = cast( x, forall<a,b> a -> b)
Even more interesting is that this new function also allows to
implement type annotation (that allows to enforce that a given
expression has a certain type, or reduce its generality). For
instance, cast( 3+2, Int -> Int)
allows to check that 3+2
has type
Int, without changing its type. In L, type annotation is written
exp:t
(for instance: (3+2):Int
), and is now implemented using this
cast operator.
The implementation is straightforward, being just a special case of function application.
On the contrary, it is also possible to implement this cast operator using standard cast + type annotation. For instance in OCaml:
# type 'a triple;; # let make_triple a b c = let my_cast:('a * 'a * 'a) -> 'a triple = Obj.magic (fun x -> x) in my_cast (a,b,c);; - : val make_triple : 'a -> 'a -> 'a -> 'a triple = <fun> # make_triple 1 2 3;; - : int triple = <abstr> # make_triple "toto" "tata" "tutu";; - : string triple = <abstr>
Still, I found that it was simpler to implement this cast operator
than annotation + the standard cast. Plus, cast
being a special form
in the language, it can be removed at compile time, which is not the
case if it has to call an external identity function.
Another possibility would be to implement cast as a special variable.
Instead of writing cast( exp, t1 -> t2)
, one would write:
(cast( t))( exp)
(i.e. apply the "cast variable" to exp).
cast( t)
would be an identity function of type t
(t
should thus
be a function type).
This implementation would then be even simpler, being just a special case of variable (e.g. type variable instantation should be performed here).
Implementing isorecursive data types need such type casts (called "roll" and "unroll" in standard terminology).
For instance, if I define the List
type as follows:
type List<a> = { Nil | Cons(a, List<a>) }
The internal representation of Nil
is (0, ())
, the one of
Cons(hd,tl)
is (1, (hd,tl))
. The first element is a tag allowing
to identify which variant is a given value. The roll and unroll
functions are used for type conversion between these internal
representations and the List<a> type.
So here is how the type constructors can be implemented using my new cast operator:
def Nil = cast((0,()), forall<a> (Int,())->List<a>) def Cons = { (hd,tl) -> cast( (1, (hd,tl)), forall<a> (Int,(a,List<a>))->List<a>) }
The destructors can be implemented by reversing the cast:
def ifisNil = { (list, then_, else_) -> let (num, _) = cast( list, forall<a> List<a> -> (Int, ())); if( num == 0) then_() else else_() } def ifisCons = { (list, then_, else_) -> let (num, _) = cast( list, forall<a> List<a> -> (Int, (a, List<a>))); if( num == 1) { let (_, (hd,tl)) = cast( list, forall<a> List<a> -> (Int, (a, List<a>))); then_(hd,tl) } else else_() }
This tests that it works correctly:
ifisCons( Nil, { (hd,tl) -> hd }, { () -> 0 }) // returns 0 ifisCons( Cons( 3,Nil), { (hd,tl) -> hd }, { () -> 0 }) // returns 3 ifisCons( Cons( 5, Cons( 3, Nil)), { (hd,tl) -> hd }, { () -> 0 }) // returns 5
This illustrates the purpose of cast: it allows implementing things not yet available in the type system. Actually the real implementation of the type constructors/destructor in L will likely rely on this new cast function.