Saturday, October 4, 2008

Functional Purity in Scala

An important concept in functional programming is pure functions. The properties of pure functions means that they are easily verifiable and allows concurrent execution (an important advantage on today's multi-core machines). You can write pure functions in pretty much all programming languages, but AFAIK there are only two popular programming languages that verifies the purity at compile time: Haskell and D. In Haskell you can't use side effects or externally visible variables if you don't explicitly indicate so using monads in your function signature. D takes a different approach where the default is that you are allowed to use side effects and variables in your functions, and you must explicitly mark them as pure, see this presentation for more information. The difference in the implementations are surely because Haskell was designed from the start to be pure, but in D pure functions was added in version 2.0 of the language.

Pure Functions in Scala

I've been thinking about the best approach to implement pure function verification in the Scala compiler. An approach similar to the one in D would fit a lot better than the one used in Haskell (which would break all existing code and cause some problems due to strict evaluation). A solution using annotations would be quite simple to implement:

class Pure extends Annotation

In practice you want to define this as a runtime annotation in Java, but let's stick to Scala here. Now you can mark a function/method as pure:

@Pure def pureFunc(x : Int, y : Int) = x + y

There are some rules that the compiler must verify for a pure method/function:
  • Only calls to pure functions are allowed. This requires that a large number of functions in the Scala library must be marked using the pure annotation otherwise it will be impossible to write new pure functions.

  • Non-local vars can't be read or written. Local vars can be used in a pure function (as an accumulator for example), but you can't access static variables or variables reachable from the arguments.

  • A pure method can only be overridden by a pure method, and an interface method defined as pure can only be implemented as a pure method.

So far things are simple, but the restrictions imposed are quite severe, for example we can't create an array inside a pure function and use locally as this would result in calls to the non-pure array apply/update methods. Clearly this has to be allowed, which leads to the concept of "semi-pure" functions.

Semi-Pure Functions

Let's define the concept of a semi-pure function/method:

class SemiPure extends Annotation
class Pure extends SemiPure

As you can see a Pure function is a subtype of SemiPure, so anywhere a SemiPure function is required a Pure function can be used, for example a method marked as SemiPure can be overridden by a Pure method (but not the other way around). Here's an example of a semi-pure method:

case class Var(var value : Int) {
@SemiPure def inc = {
value += 1 // Ok, we can modify fields in this object
value
}
}

The following compilation rules applies to a semi-pure function:
  • It can call pure and semi-pure functions.

  • It can use local variables.

  • It can read and write variables reachable from its arguments. This includes the implicit this argument passed for class methods, so a class method can read/write fields of a class instance.

  • It can only be overridden by/implemented as a semi-pure or pure method.

So what use do we have for semi-pure functions? Well, they allow us to loosen the restrictions placed on pure functions somewhat: a pure function may call a semi-pure function if, and only if, all the argument values passed are created locally or are "invariant". "Invariant" is a term I borrowed from D, it's basically a deeply immutable value, i.e. it's an immutable value that only contain invariant values :). For example, List[Int] is invariant, but List[Array[Int]] is not invariant even though List[Array[Int]] is an immutable type.

So with this loosened restriction you can for example define a function that creates an array and updates it in a loop, and it can still be a pure function. A quite powerful concept that blends the imperative and purely functional programming styles.

Here's an example of a legal pure function that calls a semi-pure function (let's assume the method List[T].head is defined as pure):

case class Var(var value : Int) {
@SemiPure def inc = {
value += 1
value
}
}

@Pure def pureFunc(l : List[Var]) = {
val x = l.head // Ok, pure method called
val v2 = Var(10) // Object created locally
v2.inc // Ok, call of semi-pure function with locally created object
}

However, this is not allowed:

@Pure def pureFunc2(l : List[Var]) = {
val x = l.head // Ok, pure method called
x.inc // Error: semi-pure method called on variant external object
}

as it would result in an externally visible modification.

How to verify that a type is invariant is a problem that needs to be explored further. It should be doable with an addition of an immutable/invariant annotation, but there are some complications with subtyping and type parameters.

Function Objects

One more thing needs to be solved: how to declare a pure function parameter for a higher-order function. A quite simple solution is to create traits for pure and semi-pure functions and mix them with the function types. For example if you want to define a pure map function:

trait TSemiPure
trait TPure extends TSemiPure

@Pure def map[T, U](l : List[T], fn : (T => U) with TPure) : List[U] =
if (l.isEmpty) Nil else fn(l.head) :: map(l.tail, fn)

The TPure type would simply mean that the function objects apply method is considered pure by the compiler. There would be some restrictions on how the (semi-)pure traits would be allowed to be mixed in by the programmer. Another, maybe simpler, option is to create a new set of (Semi)PureFunction0-22 traits that extends the existing Function0-22 traits.

For lambda expressions the compiler could automatically infer if the function is pure, semi-pure or impure, and use the correct trait. During eta expansion the correct trait can be used depending on the annotation of the method/function expanded.

Final Words

Using the constructs I've presented in this post I think it would be feasible to implement checking of functional purity in the Scala compiler without too much effort. Hopefully this will result in a SIP in the near future, so that Scala hackers can utilize the powerful tool of statically checked pure functions.

I'm sure I've made some error or missed something along the way, so I'm grateful for any comments/corrections you might have.

Sunday, September 7, 2008

HList in Scala Revisited (or Scala Metaprogramming Works!)

The Scala compiler team listened to my request for supporting recursive type projections (ticket #1291) and has added experimental support for it. If you have a new build of the compiler (latest nightly or Eclipse plugin will do) you can enable it by add the compiler option "-Yrecursion x", where x is the maximum recursion depth allowed. Thanks to the EPFL team for the help, and particularly Geoffrey for implementing the fix.

The HList code I presented in my previous post now compiles without errors. I've setup a project at Assembla with the HList code and other metaprogramming constructs like booleans, natural numbers, integers, units etc. The library is in pre-alpha stage and there's lots of work to be done, for example the HList type is missing many methods found in the Scala List type.

Basically anything you can do with a normal List you can do with a HList, but all operations are type safe on an element by element basis. Here is some example code using HLists:

object HListExample {
import HLists._
import Nats._
import Utils._

// Create a HList of an Int, Boolean and a pair
val l1 = 10 :: true :: (10.1, "Hello") :: HNil

// Extract the second element, note that the element type
// information is preserved and we can safely perform a
// boolean and operation
l1.nth[_1] && false

// Create another HList, note the use of an operator in
// the type expression
val l2 : Double :: String :: Boolean :: HNil = 1.1 :: "string" :: false :: HNil

// Replace the second element in the list, it used to
// be a String, but now there's an Int inserted
val l3 = l2.remove[_1].insert(_1, 14)

// Type information is preserved, we can use an Int operation
// on the element
l3.nth[_1] * 10

// Append l2 to l1
val l4 = l1 ::: l2

// Statically check that the length of l4 is 6
type T = Equal[_6, l4.type#Length]
}

So, what are HLists useful for, besides being a fancy replacement for tuple types? Well, they can for example be used to implement a completely type safe object oriented programming language with support for delegation, subtyping etc. inside Scala. I will get back to that in a later blog entry.

The project code also contains a simple example of a type safe units library. Here's a usage example:

object UnitsTest {
import Units._
import Integers._

val dist : Length = measure(2.3) * m
val time : Time = measure(1.7) * s
val speed : Speed = dist / time
}

All units are checked at compile time, so you can't assign a length to a time variable for example.

There are lots of other use cases for metaprogramming and the library will continue to grow over time. However, there are some issues with metaprogramming in Scala, one being the speed of the compiler. Compiling types which expands to long paths can sometimes take a very long time. I have submitted a ticket about this too. Hopefully some optimizations are made to alleviate the problem. Meanwhile, I think certain applications of metaprogramming, for example compile time checked bounded integers, will put to much stress on the compiler and should probably be avoided.

Sunday, August 31, 2008

HList in Scala

This is my first blog post about Scala, so be gentle in your criticism :).

I've been trying to implement a linked list of heterogeneously typed elements in Scala, similar to HList for Haskell. This data type can for example be used as a replacement for all the TupleX types in the Scala standard library. A simple implementation of the HList data type in Scala might look like this:

object HListTest {
sealed trait HList

final class HNil extends HList {
def ::[T](v : T) = HCons(v, this)
}

val HNil = new HNil()

final case class HCons[H, T <: HList](head : H, tail : T) extends HList {
def ::[T](v : T) = HCons(v, this)
}

type ::[H, T <: HList] = HCons[H, T]
}

A list of arbitrary length can now be created:

val list : Int :: String :: Boolean :: HNil = 10 :: "hello" :: true :: HNil

Fine, now let's try to implement something more advanced like the append function:

case class Appender[L1 <: HList, L2 <: HList, Result <: HList](fn : (L1, L2) => Result)
{
def apply(l1 : L1, l2 : L2) = fn(l1, l2)
}

def append[L1 <: HList, L2 <: HList, Result <: HList](l1 : L1, l2 : L2)
(implicit fn : Appender[L1, L2, Result]) : Result = fn(l1, l2)

Note that the return type of the append function depends on the argument types. To use the append function we must provide two implicit functions that create instances of the Appender class for appending to the empty list and to a cons list:

implicit def nilAppender[L <: HList] : Appender[HNil, L, L] =
Appender((v : HNil, l : L) => l)

implicit def consAppender[H, T <: HList, L2 <: HList, R <: HList]
(implicit fn : Appender[T, L2, R]) : Appender[HCons[H, T], L2, HCons[H, R]] =
Appender[HCons[H, T], L2, HCons[H, R]]((l1 : HCons[H, T], l2 : L2) => HCons(l1.head, fn(l1.tail, l2)))

A simple test case:

append(HNil, HNil)

What? Compiler error, no implicit argument found? Closer examination of the error message indicates that the Scala compiler has bound the Result type parameter of the append function to Nothing, and then it tried to find an implicit matching this type. This will obviously not work. Explicitly passing the nilAppender works:

append(HNil, HNil)(nilAppender)

It seems the return type of a function can't be inferred from implicit arguments, we must be able to specify the return type of append only with the help of the non-implicit parameters. This is unfortunately an example where the Scala type inferer is not as powerful as the one in Haskell.

Game over? No, we have one more ace up our sleeve: abstract types. Let's declare the return type of the append function as an abstract type in the HList trait, and modify the append and implicit functions to use type projections:

sealed trait HList {
type Append[L <: HList] <: HList
}

final class HNil extends HList {
type Append[L <: HList] = L

def ::[T](v : T) = HCons(v, this)
}

final case class HCons[H, T <: HList](head : H, tail : T) extends HList {
type Append[L <: HList] = HCons[H, T#Append[L]]

def ::[T](v : T) = HCons(v, this)
}

def append[L1 <: HList, L2 <: HList](l1 : L1, l2 : L2)
(implicit fn : Appender[L1, L2, L1#Append[L2]]) : L1#Append[L2] = fn(l1, l2)

implicit def nilAppender[L <: HList] : Appender[HNil, L, L] =
Appender((v : HNil, l : L) => l)

implicit def consAppender[H, T <: HList, L2 <: HList, R <: HList]
(implicit fn : Appender[T, L2, T#Append[L2]]) : Appender[HCons[H, T], L2, HCons[H, T]#Append[L2]] =
Appender[HCons[H, T], L2, HCons[H, T]#Append[L2]]((l1 : HCons[H, T], l2 : L2) => HCons(l1.head, fn(l1.tail, l2)))

Now the simple test case works:

append(HNil, HNil)

A more advanced test works as well:

append(true :: HNil, 10 :: HNil).tail.head * 3

Nice! Have we implemented a type safe append function for heterogeneous lists of arbitrary length? Unfortunately not:

append(true :: "hello" :: HNil, 10 :: HNil)

This will cause an "illegal cyclic reference" compiler error. A type path can't contain the same type declaration twice, even though it or the enclosing type has different type arguments. In this case the HCons#Append type will be recursively evaluated twice for the first list type.

I've filed an enhancement ticket to remove this restriction. Hopefully it will be implemented some time in the near future, so the full power of abstract types and type projections can be utilized. In fact, there are many more things that can be implemented if this restriction is removed, see C++ MPL and Boost for some examples.

So, this is the current state of my exploration of Scala's wonderful type system. Unfortunately, at the moment it seems like HList can't be fully implemented in Scala. I'm grateful for comments proving me wrong though :)