Friday, May 28, 2010

My Misperception of Scala's Ordered Trait

A brief tale of a little misperception that I had, and how it was corrected.

The Ordered trait in Scala is typically used when defining a class of objects that know how to order themselves by comparing against other instances of that class. The typical class definition looks something like this:

case class Thing(val n:Int) extends Ordered[Thing] {
    def compare(that: Thing): Int = { this.n - that.n }  //[1]
}
When I first started looking at these definitions, they just looked wrong to me: it seemed like Ordered[Thing] depended on the definition of Thing, and Thing in turn was defined in terms of Ordered[Thing]; a circular definition!

It is of course not a circular definition. The appropriate interpretation was made obvious to me recently while reviewing some Scala code with a co-worker, who was using the Ordered trait in a non-canonical way. Instead of the usual use of Ordered as in the above class definition, his class definition looked similar to this:
case class Thing(val n:Int) extends Ordered[Any] {
    def compare(that: Any): Int = that match {
        case i:Int => this.n - i
        case x:Thing => this.n - x.n
        case _ => throw new IllegalArgumentException("bad type")
    }
}
The reason he did this was because he wanted to be able to compare a Thing - which has an Int value as part of its definition - against either a Thing or an Int. The smallest common superclass of Thing and Int is Any, so his compare method had to accept an argument of type Any, which in turn meant the Ordered trait must be Ordered[Any].

This somewhat unusual use of Ordered led us to another small problem. He had an Array[Thing], sorted according to Thing.compare, on which he wanted to do a binary search. We couldn't find a binary search built in to Scala, but it was simple enough to find one on the web. We grabbed the Scala implementation of binary search from RosettaCode.org (and changed the type of the argument from IndexedSeq[A] to Array[A] since we were using Scala 2.7):
def binarySearch[A <% Ordered[A]](a: Array[A], v: A) = {
  def recurse(low: Int, high: Int): Option[Int] = (low + high) / 2 match {
    case _ if high < low => None
    case mid if a(mid) > v => recurse(low, mid - 1)
    case mid if a(mid) < v => recurse(mid + 1, high)
    case mid => Some(mid)
  }
  recurse(0, a.size - 1)
}
Our code to call the binarySearch method looked something like this:
val a:Array[Thing] = Array(Thing(1), Thing(3), Thing(5), Thing(6))
val x = binarySearch(a,3)
It failed to compile, with an error like this REPL output:
<console>:7: error: type mismatch;
 found   : Array[Thing]
 required: Array[Any]
       val x = binarySearch(a,3)
                            ^
<console>:7: error: no implicit argument matching parameter type (Any) => Ordered[Any] was found.
       val x = binarySearch(a,3)
               ^
The cause of this pair of error messages may be obvious to some people, but we scratched our heads a while trying to figure out what we had done wrong. Eventually we realized that the binarySearch method was making the assumption that the element type of the array was same as the type of the sort; in other words, the binarySearch method only worked on arrays of elements where each element, of type A, implemented Ordered[A]. Since our Thing class did not do that, we were getting a type mismatch when trying to pass it to binarySearch.

The solution was to modify binarySearch to explicitly have two type parameters, one for the element type of the array (A) and one for the ordering type of those elements (B), i.e. the type of the values against which we can compare the array elements:
def binarySearch[A,B](a: Array[A with Ordered[B]], v: B) = {
  def recurse(low: Int, high: Int): Option[Int] = (low + high) / 2 match {
    case _ if high < low => None
    case mid if a(mid) > v => recurse(low, mid - 1)
    case mid if a(mid) < v => recurse(mid + 1, high)
    case mid => Some(mid)
  }
  recurse(0, a.size - 1)
}
Going through this exercise helped me clearly see the distinction between the element type and the ordering type, and more generally to firmly excise the mistaken perception of circular dependency I mentioned at the start of this post. Now when I see class Foo extends Bar[Foo], it's easy for me to remember that just means that class Foo implements methods, as declared in Bar, that happen to take arguments of type Foo [2].

Footnotes

[1] Using a straight subtraction like this for the comparison will fail in extreme cases, such as if the first value is MAX_VALUE and the second value is negative; however, in the interest of keeping the code example simple, I have used this very short and understandable code snippet.

[2] Yes, I know this is not a very precise statement: Bar[Foo] might only define variables rather than methods, or the methods might take arguments of type List[Foo] rather than Foo, or any of a number of other variations.

Updated 2010-07-10 to escape angle brackets as pointed out by tolund.