Sunday, October 11, 2009

Scala Case Statements As Partial Functions

A Scala case statement can be either a Function1 or a PartialFunction depending on the context.

In my previous post I presented a simple Publisher that I used to decouple my Swing actors from their targets. Reader nairb774 pointed out that the standard Scala library includes a Publisher class. In fact, there are two Publisher classes in Scala, scala.collection.mutable.Publisher and scala.swing.Publisher. Although I like my publisher class better, the swing publisher did have one feature that I thought was useful: it accepted as a callback a PartialFunction rather than, as mine did, a Function1. That would mean, I thought, that I could pass in a case statement as a callback.

For example, continuing the Mimprint example from my previous post, if I were only interested in Enabled events published by a particular publisher, rather than explicitly checking this in my callback with an isInstanceOf or a match statement that includes a case _ => clause, I could just use a one-line case statement:
    showSingleViewerPublisher.subscribe { case Enabled => doSomething() }
My calling code in Publisher would call apply on the PartialFunction callback only if a call to its isDefinedAt method returned true, thus avoiding the MatchError that would occur if I treated it like a Function1 and called its apply method when the value was not Enabled. This seemed like useful functionality, so I decided to add it. I thought it would be easy, but unfortunately it was not.
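Here is a minimal sketch of the dispatch described above. The PartialPublisher trait is a hypothetical name used only for illustration; it is neither my Publisher class nor scala.swing.Publisher, and it omits unsubscribe and locking. The Abled hierarchy is redefined here only so the block stands alone.

```scala
object PartialPublisherSketch {
  // Hypothetical trait for illustration only.
  trait PartialPublisher[E] {
    type S = PartialFunction[E, Unit]
    private var subscribers: List[S] = Nil

    def subscribe(subscriber: S): Unit = {
      subscribers = subscriber :: subscribers
    }

    // Call a subscriber only if it is defined for this event,
    // so an event it does not cover never triggers a MatchError.
    def publish(event: E): Unit =
      subscribers.foreach { s => if (s.isDefinedAt(event)) s(event) }
  }

  sealed abstract class Abled
  case object Enabled extends Abled
  case object Disabled extends Abled

  val publisher = new PartialPublisher[Abled] {}
  var enabledCount = 0
  publisher.subscribe { case Enabled => enabledCount += 1 }

  publisher.publish(Enabled)   // handled by the subscriber
  publisher.publish(Disabled)  // skipped: isDefinedAt returns false
}
```

Because subscribe declares its parameter as a PartialFunction, the case statement passed to it is compiled with an isDefinedAt method that publish can consult.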

Consider the following three definitions that assign a case statement to a partial function, full function, or no explicit function type, respectively:
    val pfv:PartialFunction[String,Unit] = { case "x" => println("Got x") }
    val ffv:Function1[String,Unit] = { case "x" => println("Got x") }
    val nfv = { case "x" => println("Got x") }
For the first line, the variable pfv gets assigned a value which is a PartialFunction representing the case statement. For the second line, you might think that, since PartialFunction extends Function1 and we are assigning the same value to ffv as we did to pfv, the variable ffv would be assigned a value which is a PartialFunction, just as for the value pfv. This is not the case.

The Scala Language Specification (SLS) explicitly states, in section 8.5, that the type of an anonymous function composed of one or more case statements must be specified as either a FunctionK or a PartialFunction, and that the value generated by the compiler is different depending on that specified target type. So the value that gets assigned to ffv is a Function1, and ffv.isInstanceOf[PartialFunction[_,_]] evaluates to false. Note that we could assign the value pfv to the variable ffv, in which case ffv would have a value which is a PartialFunction and ffv.isInstanceOf[PartialFunction[_,_]] would evaluate to true.
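The difference can be checked directly. The following sketch wraps the definitions in an object (the object name and the extra val names are mine, added for illustration) and records what each value reports at run time.

```scala
object CaseTargetTypeDemo {
  val pfv: PartialFunction[String, Unit] = { case "x" => println("Got x") }
  val ffv: Function1[String, Unit] = { case "x" => println("Got x") }

  // Same source text, different compiled values.
  val pfvIsPartial = pfv.isInstanceOf[PartialFunction[_, _]]  // true
  val ffvIsPartial = ffv.isInstanceOf[PartialFunction[_, _]]  // false

  // A PartialFunction can report that it does not cover a value...
  val pfvCoversY = pfv.isDefinedAt("y")  // false

  // ...while the Function1 version can only fail when applied.
  val ffvThrowsOnY =
    try { ffv("y"); false } catch { case _: MatchError => true }

  // Assigning the existing PartialFunction value keeps its runtime type.
  val ffv2: Function1[String, Unit] = pfv
  val ffv2IsPartial = ffv2.isInstanceOf[PartialFunction[_, _]]  // true
}
```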

What happens if you don't specify the type, as in the third line above where we assign the same value to nfv? You might think the compiler could infer the type of the resulting value, but since, as specified in the SLS, the type must be explicitly specified as either a FunctionK or a PartialFunction, our assignment to nfv is actually not a valid statement, and it fails to compile. It would be nice if the error message said something like "You must explicitly specify either a FunctionK or a PartialFunction for a case statement", but instead it gives this relatively unhelpful message:
    <console>:4: error: missing parameter type for expanded function ((x0$1) => x0$1 match {
      case "x" => println("Got x")
    })
           val nfv = { case "x" => println("Got x") }
                     ^
In my case, the situation in which I encountered this message was a little different. Here is an example showing the problem I ran into:
    class PF[T] {   //partial function type
        def sub(x:PartialFunction[T,Unit]) = x
    }
    class FF[T] {   //full function type
        def sub(x:Function[T,Unit]) = x
    }
    class NF[T] {   //no unique function type
        def sub(x:PartialFunction[T,Unit]) = x
        def sub(x:Function[T,Unit]) = x
    }
    val pf = new PF[String]
    val ff = new FF[String]
    val nf = new NF[String]
    pf.sub{ case "x" => println("x") }   //works, result is PartialFunction
    ff.sub{ case "x" => println("x") }   //works, result is Function1
    nf.sub{ case "x" => println("x") }   //fails with compiler error msg
Calling the above method sub with a case statement works when there is only one method of that name, whether it takes a Function1 or a PartialFunction. The compiler has no problem compiling the overloaded pair of methods, but once they both exist it can no longer unambiguously determine the target type for the case statement, so it delivers that same error message: "missing parameter type for expanded function".

In my case I was trying to modify the subscribe method in my Publisher class so that I could pass in either a regular function, such as println(_), or a PartialFunction, in particular an in-line case statement. The three options I tried are essentially classes PF, FF and NF listed above. When I used approach NF I was unable to directly pass in a case statement, but instead would get the compiler error mentioned above. When I used approach PF I could pass in a case statement as a PartialFunction, but I could not pass in a regular function. When I used approach FF I could pass in a regular function, and could pass in and properly deal with a PartialFunction, since it extends Function1. But an in-line case statement would get compiled as a Function1 rather than a PartialFunction, so execution would fail with a MatchError when a value was passed to that case statement that it did not cover, since there was no isDefinedAt method to call first.

I don't like option FF because it would allow code (specifically, an in-line case statement) to compile but then not execute as expected. Options PF and NF are not very useful as is, since neither directly supports both case statements and full functions.

In a mailing list response to someone who was attempting to use option NF in his application, Paul Phillips suggested using option FF with a helper function pf that accepts a PartialFunction and returns the same value, then wrapping any case statements inside a call to that helper function; or, alternatively, assigning the case statement to a val declared as a PartialFunction before passing it to method sub. Unfortunately, if the user forgets to use either of these techniques on a case statement and just passes it directly to method sub in option FF, it will be handled as a Function1 rather than a PartialFunction, so it will compile but not behave as expected.
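A rough sketch of that suggestion follows. The deliver method is my own stand-in for what a publisher would do with each callback, and the type arguments to pf are written out explicitly so that the example does not depend on how well the compiler infers them; neither detail comes from the mailing list post.

```scala
object PfHelperSketch {
  // Identity helper: its parameter type forces the compiler to build a
  // PartialFunction from a case statement passed to it.
  def pf[A, B](f: PartialFunction[A, B]): PartialFunction[A, B] = f

  // Option FF: accepts any Function1. Returns true if the callback was
  // invoked for this value. (deliver is a hypothetical method name.)
  class FF[T] {
    def deliver(callback: T => Unit, value: T): Boolean = callback match {
      case p: PartialFunction[_, _] =>
        val typed = p.asInstanceOf[PartialFunction[T, Unit]]
        if (typed.isDefinedAt(value)) { typed(value); true } else false
      case f =>
        f(value)
        true
    }
  }

  val ff = new FF[String]

  // Wrapped in pf: compiled as a PartialFunction, so "y" is skipped.
  val viaHelper = ff.deliver(pf[String, Unit] { case "x" => () }, "y")  // false

  // Bare case statement: compiled as a Function1, so "y" blows up.
  val bareCaseFails =
    try { ff.deliver({ case "x" => () }, "y"); false }
    catch { case _: MatchError => true }  // true
}
```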

Paul's suggestion would also work in option NF (and in option PF, although in that case it would be redundant), which would behave much the same as option FF from the user's perspective except that passing a bare case statement to the overloaded method sub would not compile, so we would no longer have the undesirable situation of something that compiles but behaves unexpectedly.

As an alternative to Paul's pf helper function, I could write a helper function ff that takes a Function1 and turns it into a PartialFunction with an isDefinedAt method that always returns true. I would then use this with option PF. This would allow me to directly pass in case statements, but I would have to wrap all regular functions in a call to ff.
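A sketch of what such an ff helper might look like, again with a hypothetical deliver method standing in for the publisher's dispatch:

```scala
object FfHelperSketch {
  // Wrap a regular function as a PartialFunction defined for every input.
  def ff[A, B](f: A => B): PartialFunction[A, B] = new PartialFunction[A, B] {
    def isDefinedAt(x: A): Boolean = true
    def apply(x: A): B = f(x)
  }

  // Option PF: accepts only a PartialFunction. Returns true if the
  // callback was invoked for this value.
  class PF[T] {
    def deliver(callback: PartialFunction[T, Unit], value: T): Boolean =
      if (callback.isDefinedAt(value)) { callback(value); true } else false
  }

  val pfOpt = new PF[String]
  var seen = List.empty[String]
  val record: String => Unit = s => { seen = s :: seen }

  // A bare case statement is a PartialFunction here, so "y" is skipped.
  val caseDelivered = pfOpt.deliver({ case "x" => () }, "y")   // false

  // A regular function must be wrapped before it can be passed in.
  val wrappedDelivered = pfOpt.deliver(ff(record), "y")        // true
}
```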

I have not yet made any changes to my Publisher class, since I don't particularly like either of the options and I don't currently really need the ability to use in-line case statements. Meanwhile, if I get the compiler error "missing parameter type for expanded function" while trying to use an in-line case statement, at least I now know one more thing to check for.

Wednesday, October 7, 2009

A Simple Publish/Subscribe Example in Scala

Here is an example where using a simple publish/subscribe mechanism allowed me to clean up some of my early Scala code.

My Mimprint program (now also on github) was originally written in Java, then ported to Scala soon after I first started learning that language. As such, much of that original ported code was "Java written in Scala". As I have continued to internalize the Scala approach I have gone back and modified various parts of the program to make it cleaner.

In one part of the program I set up a collection of menu checkboxes to allow the user to enable or disable various features. As those features are enabled or disabled, the states of other screen components change; sometimes a component is enabled or disabled, sometimes a component is hidden or made visible.

My original Java-ish Scala code to do this looked something like this (with irrelevant parts omitted):
    class ViewListGroup ... {
        ...
        private var singleComp:Component = _
        private var mShowFileInfo:SCheckBoxMenuItem = _
        private var mShowFileIcons:SCheckBoxMenuItem = _
        private var mShowDirDates:SCheckBoxMenuItem = _
        private var mShowSingleViewer:SCheckBoxMenuItem = _

        def getComponent():Component = {
            ...
            singleComp = playViewSingle.getComponent()
            ...
            //Add our menu items
            mShowFileInfo = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowFileInfo")(
                    showFileInfo(mShowFileInfo.getState))
            mShowFileInfo.setState(true)
            m.add(mShowFileInfo)

            mShowFileIcons = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowFileIcons")(
                    showFileIcons(mShowFileIcons.getState))
            mShowFileIcons.setState(false)
            m.add(mShowFileIcons)

            mShowDirDates = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowDirDates")(
                    showDirDates(mShowDirDates.getState))
            mShowDirDates.setState(playViewList.includeDirectoryDates)
            ...
            m.add(mShowDirDates)

            mShowSingleViewer = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowSingleViewer")(
                    showSingleViewer(mShowSingleViewer.getState))
            mShowSingleViewer.setState(true)
            m.add(mShowSingleViewer)
            showSingleViewer(mShowSingleViewer.getState)
                //make sure window state is in sync with menu item state
            ...
        }
        ...
        def showFileInfo(b:Boolean) {
            playViewList.showFileInfo(b)
            mShowFileInfo.setState(b)
            mShowFileIcons.setEnabled(b)
            mShowDirDates.setEnabled(b)
        }

        def showFileIcons(b:Boolean) {
            playViewList.showFileIcons(b)
            playViewList.redisplayList()
        }

        def showDirDates(b:Boolean) {
            playViewList.includeDirectoryDates = b
            playViewList.redisplayList()
        }

        def showSingleViewer(b:Boolean) {
            singleComp.setVisible(b)
            singleComp.getParent.asInstanceOf[JSplitPane].resetToPreferredSizes()
            mShowSingleViewer.setState(b)
            playViewList.requestSelect
        }
        ...
    }
There were two things about this code that I didn't like:
  1. Mutable instance variables using var, particularly since they were not really variable. These values were being assigned once, not at construction time, but had to be available to other methods.
  2. The close binding between the different UI components, since the action method called by one component directly modified attributes of possibly a number of other components.
After a recent conversation with a friend I realized that I could probably improve this code by using a publish/subscribe mechanism to loosen the coupling between the components. Mimprint already had an ActorPublisher class, where each subscriber is an Actor that accepts messages of the published object type, but in this case I wanted a lighter weight implementation, since I knew the subscriber actions would be quick. Also, this being Swing, the subscriber actions that update screen state should run in the Swing event thread, and the events being published are also coming from the event thread, so the simple thing to do is to run the subscriber actions directly from the publish method.

Writing a publish/subscribe handler in Scala is pretty easy, and for me it was even simpler, as I already had one. I grabbed my ListenerManager and modified it to use the publish/subscribe terminology. I also added synchronization to make it multi-thread safe, although for this app I don't really need it. It now looks like this:
    package net.jimmc.util

    /** Manage a subscriber list.
     * There are no guarantees on the order of subscribers in the list.
     * This code is a slightly modified version of ListenerManager
     * as published to my blog in April 2009.
     */
    trait Publisher[E] {
        type S = (E) => Unit
        private var subscribers: List[S] = Nil
        private object lock
            //By using lock.synchronized rather than this.synchronized we reduce
            //the scope of our lock from the extending object (which might be
            //mixing us in with other classes) to just this trait.

        /** True if the subscriber is already in our list. */
        def isSubscribed(subscriber:S) = {
            val subs = lock.synchronized { subscribers }
            subs.exists(_==subscriber)
        }

        /** Add a subscriber to our list if it is not already there. */
        def subscribe(subscriber:S) = lock.synchronized {
            if (!isSubscribed(subscriber))
                subscribers = subscriber :: subscribers
        }

        /** Remove a subscriber from our list.  If not in the list, ignored. */
        def unsubscribe(subscriber:S):Unit = lock.synchronized {
            subscribers = subscribers.filter(_!=subscriber)
        }

        /** Publish an event to all subscribers on the list. */
        def publish(event:E) = {
            val subs = lock.synchronized { subscribers }
            subs.foreach(_.apply(event))
        }
    }
For each menu checkbox I would like to set up a publisher. In every case, I just need to publish whether that checkbox has just been enabled or disabled. I defined a simple case class hierarchy to represent the Enabled and Disabled messages:
    sealed abstract class Abled
    case object Enabled extends Abled
    case object Disabled extends Abled
I then created a publisher class that uses that event type:
    class AbledPublisher extends Publisher[Abled]
I want to easily publish the Enabled or Disabled object based on the current state of a checkbox, so I added an AbledPublisher companion object containing an Abled object with an apply method to do that:
    object AbledPublisher {
        object Abled {
            def apply(b:Boolean) = if (b) Enabled else Disabled
        }
    }
Conversely, upon receiving an Abled event in a subscriber for a UI component I want to be able to enable or disable that component. I could use a match statement with cases for Enabled and Disabled, but a simpler way is to modify the Abled case class hierarchy to encode a boolean state value into the Abled case object to allow easy translation from an Abled object back to a state:
    sealed abstract class Abled { val state:Boolean }
    case object Enabled extends Abled { override val state = true }
    case object Disabled extends Abled { override val state = false }
Finally, I packaged up the case class hierarchy inside the AbledPublisher object to control scoping. The final AbledPublisher file looks like this:
    package net.jimmc.util

    //For subscribers of things that turn on and off
    class AbledPublisher extends Publisher[AbledPublisher.Abled]

    // use "import AbledPublisher._" to pick up these definitions
    object AbledPublisher {
        sealed abstract class Abled { val state:Boolean }
        case object Enabled extends Abled { override val state = true }
        case object Disabled extends Abled { override val state = false }
        object Abled {
            def apply(b:Boolean) = if (b) Enabled else Disabled
        }
    }
Given the above AbledPublisher class and object, I modified my code so that the action method called by each menu checkbox publishes an Enabled or Disabled event that matches the new state of the checkbox, and for each place in the old code where an action method called a state-changing method on another component I set up that target component as a subscriber to the appropriate publisher that, when it receives a published event, takes appropriate action on itself.
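To make that wiring concrete without pulling in Swing, here is a small self-contained sketch. The Publisher trait is cut down to the two methods needed (no locking, no unsubscribe), and FakeComponent is a hypothetical stand-in for a real UI component; both are simplifications for illustration only.

```scala
object AbledPublisherUsage {
  // Cut-down stand-in for the Publisher trait (no locking).
  trait Publisher[E] {
    type S = E => Unit
    private var subscribers: List[S] = Nil
    def subscribe(subscriber: S): Unit = {
      subscribers = subscriber :: subscribers
    }
    def publish(event: E): Unit = subscribers.foreach(_.apply(event))
  }

  class AbledPublisher extends Publisher[AbledPublisher.Abled]
  object AbledPublisher {
    sealed abstract class Abled { val state: Boolean }
    case object Enabled extends Abled { override val state = true }
    case object Disabled extends Abled { override val state = false }
    object Abled {
      def apply(b: Boolean): Abled = if (b) Enabled else Disabled
    }
  }
  import AbledPublisher._

  // Hypothetical stand-in for a Swing component.
  class FakeComponent { var visible = true }

  val comp = new FakeComponent
  val publisher = new AbledPublisher

  // The target component subscribes and updates itself on each event.
  publisher.subscribe { ev => comp.visible = ev.state }

  // The checkbox action only publishes; it knows nothing about comp.
  publisher.publish(Abled(false))
}
```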

With the above changes, and a slight change to my SCheckBoxMenuItem class so that it passes itself to the action callback, the code now looks like this:
    import net.jimmc.util.AbledPublisher
    import net.jimmc.util.AbledPublisher._

    class ViewListGroup ... {
      vlg:ViewListGroup =>
        ...
        private val showFileInfoPublisher = new AbledPublisher
        private val showSingleViewerPublisher = new AbledPublisher
        private val showDirectoriesPublisher = new AbledPublisher
        ...
        def getComponent():Component = {
            ...
            val singleComp = playViewSingle.getComponent()
            showSingleViewerPublisher.subscribe((ev)=> {
                singleComp.setVisible(ev.state)
                singleComp.getParent.asInstanceOf[JSplitPane].resetToPreferredSizes()
            })
            ...
            //Add our menu items
            val mShowFileInfo = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowFileInfo")((cb)=>
                    showFileInfo(cb.getState))
            mShowFileInfo.setState(true)
            showFileInfoPublisher.subscribe((ev)=>
                    mShowFileInfo.setState(ev.state) )
            m.add(mShowFileInfo)

            val mShowFileIcons = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowFileIcons")((cb)=>
                    showFileIcons(cb.getState))
            mShowFileIcons.setState(false)
            showFileInfoPublisher.subscribe((ev)=>
                    mShowFileIcons.setState(ev.state) )
            m.add(mShowFileIcons)

            val mShowDirDates = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowDirDates")((cb)=>
                    showDirDates(cb.getState))
            mShowDirDates.setState(playViewList.includeDirectoryDates)
            mShowDirDates.setVisible(includeDirectories)
            showFileInfoPublisher.subscribe((ev)=>
                    mShowDirDates.setState(ev.state) )
            showDirectoriesPublisher.subscribe((ev)=>
                    mShowDirDates.setVisible(ev.state) )
            m.add(mShowDirDates)

            val mShowSingleViewer:SCheckBoxMenuItem = new SCheckBoxMenuItem(
                    viewer,"menu.List.ShowSingleViewer")((cb)=>
                    showSingleViewer(cb.getState))
            mShowSingleViewer.setState(true)
            showSingleViewerPublisher.subscribe((ev)=>
                    mShowSingleViewer.setState(ev.state) )
            m.add(mShowSingleViewer)
            showSingleViewer(mShowSingleViewer.getState)
                //make sure window state is in sync with menu item state
            ...
        }
        ...
        def showFileInfo(b:Boolean) {
            playViewList.showFileInfo(b)
            showFileInfoPublisher.publish(Abled(b))
        }

        def showFileIcons(b:Boolean) {
            playViewList.showFileIcons(b)
            playViewList.redisplayList()
        }

        def showDirDates(b:Boolean) {
            playViewList.includeDirectoryDates = b
            playViewList.redisplayList()
        }

        def showSingleViewer(b:Boolean) {
            showSingleViewerPublisher.publish(Abled(b))
            playViewList.requestSelect
        }
        ...
    }
The total number of lines of code in ViewListGroup is actually a bit more than before, but I find the code a little easier to understand because all of the code that acts on a UI component is now localized in one place in the source file. All of the vars that held pointers to those components are now gone, replaced by a few vals for the publishers. The publishers use vars to maintain internal state, but that state is simple and easily understood, well encapsulated and multi-thread safe.

There is still more cleanup work to be done in Mimprint. For example, in the above code the checkbox action methods such as showFileInfo and showFileIcons call methods on the playViewList object as well as publishing an Abled event. Instead, I could set up playViewList as a listener on each of the published events, then make the menu checkbox actions directly publish an event and get rid of the showXXX methods. I will leave that for another round of cleanup.

Thursday, October 1, 2009

Initializing Immutable Variables in Scala

One of the guidelines I picked up when I learned Scala is to use immutable variables as much as possible. Besides the trivial but satisfying detail of making the declaration of an immutable variable (val) take no more characters than a mutable one (var), Scala also provides some interesting ways to set the values into those immutable variables.

In Scala, immutable variables are identified by declaring them using the val keyword rather than var. In Java, immutable variables are identified by adding the final qualifier to the variable declaration. But a Java final variable has slightly different semantics than a Scala val: in Java, you can declare a final variable without specifying a value for it, then fill in the value later. Java allows the variable to be assigned once, after which it can not be assigned again. In Scala, a concrete val must have its value assigned as part of the definition.

Consider this sample Java class, Interval, which represents an interval on the real number line. We want to allow the constructor to be called with endpoints in either order, but we want to store them internally in sorted order.
    //Java code
    public class Interval {
        final double start;
        final double end;   //invariant: end>=start

        public Interval(double x1, double x2) {
            if (x1>x2) {
                start = x2;
                end = x1;
            } else {
                start = x1;
                end = x2;
            }
        }
        //other methods that use start and end go here
    }
If you try this idiom in Scala, by replacing each final variable with a val but continuing to use the same initialization construct, you will get a compiler error "reassignment to val". When using a concrete val in Scala, you must supply the value in the statement where you declare the val.

For relatively simple cases, as in this example, we can take advantage of the fact that Scala allows us to build expressions with if in them, so we can express the same functionality as in the above Java code as follows:
    class Interval(x1:Double, x2:Double) {
        val start = if (x2>x1) x1 else x2
        val end = if (x2>x1) x2 else x1
        //other methods that use start and end go here
    }
Sometimes the logic to calculate the values for the immutable variables is much more complicated than this and more expensive to calculate. Perhaps, as in our Java example, we don't want to recalculate that condition over again for each variable. We might also be more comfortable building up our values using mutable variables. We could take the easy and straightforward way and just use var rather than val for our variables, but it is worth a bit of effort to retain the immutability of our variables. Here is an approach I sometimes take:
    class Interval(x1:Double, x2:Double) {
        val (start, end) = {
            def intervalNeedsReversing(a:Double,b:Double) = (a>b)
            if (intervalNeedsReversing(x1,x2))
                (x2, x1)
            else
                (x1, x2)
        }
        //other methods that use start and end go here
    }
In the above approach, we have a block of code that calculates our values. Though not needed in this case, the intervalNeedsReversing function is an example of how you can define functions within a block in order to refactor that code or better organize it. The value of the block is a tuple, which we then assign using a tuple-assignment to our immutable variables start and end.

A tuple-assignment is a pattern-matching operation that pulls apart the tuple data and stores each piece into the separate variables. It looks like the second line in this example:
    val t2 = (123, "abc")   //the type of t2 is Tuple2[Int, java.lang.String]
    val (n, s) = t2         //assigns n=123, s="abc"
You can use any expression in place of t2 that has the same type, including a function call, a variable, a literal tuple, or a code block.
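As a short sketch of those alternatives, the following uses a function call, a literal tuple and a code block on the right hand side. The minMax method and the variable names are made up for this example.

```scala
object TupleAssignmentDemo {
  // A method that returns a tuple.
  def minMax(xs: List[Int]): (Int, Int) = (xs.min, xs.max)

  val (lo, hi) = minMax(List(3, 1, 2))   // from a function call
  val (n, s) = (123, "abc")              // from a literal tuple
  val (sum, count) = {                   // from a code block
    var total = 0
    var k = 0
    for (x <- List(3, 1, 2)) { total += x; k += 1 }
    (total, k)
  }
}
```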

You can include a type on each variable name; if the types of the assigned variables don't match the corresponding types of the value on the right hand side, you will get a compiler error.
    val (n:Int, s:String) = t2      //ok
    val (s:String, n:Int) = t2      //error
The tuple syntax of parentheses around a comma-separated list of values is actually a shorthand for the TupleN class. For each pair of lines below, the first line is a shorthand ("syntactic sugar") for the second.
    (a, b)
    Tuple2(a, b)

    (1, "x", "y")
    Tuple3(1, "x", "y")

    val (n, s) = t2
    val Tuple2(n, s) = t2
The last of the three examples above is a pattern-matching assignment statement.

You can use the List pattern in an assignment as well:
    val a :: b :: c = List(1,2,3,4)
    //This assigns a:Int=1, b:Int=2, c:List[Int]=List(3,4)
The List and Tuple classes can be used in a pattern-matching assignment like this because they each have an extractor defined by the unapply method in their companion object. You can use any extractor (that is, any declared object that includes an unapply method) in this way. For example, a case class can be used:
    case class Foo(num:Int, str:String)
    val f = Foo(42,"ok")
    val Foo(n,s) = f    //assigns n:Int=42, s:String="ok"
This works even if the case class happens to use mutable fields: the values at the time of the pattern match assignment are set into the new variables, which are immutable.
    case class Bar(var num:Int, var str:String)
    val b = Bar(42,"ok")
    b.num += 1
    b.str = "no"
    val Bar(n,s) = b    //assigns n:Int=43, s:String="no"
    b.num += 1          //does not change n
For example, if you have a large number of values to set at once, you could declare a case class to represent them, and match on that to assign the values:
    class AnotherExample {
        case class MyArgs(var name:String, var pathPart:String, var someNumber:Int)
        val MyArgs(path, part, num) = {
            val m = MyArgs("/path/foo/bar", "partX", 123)
            //change values of fields in m as desired
            m
        }
    }
You thus get the benefit of having immutable variables for use in your constructed object, but you can use mutable private data within the block to make it easier to do your construction.

You can use this technique to initialize immutable variables within a method as well. Effectively, you are using mutable variables only for the limited scope in which they are desired. By enclosing them in a block you prevent code outside that block from modifying those mutable values.
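For instance, a method might compute two results with local vars and expose only vals to the rest of its body. This is a sketch with made-up names (stats, mean, spread), not code from Mimprint.

```scala
object LocalInitDemo {
  // Returns (mean, spread) for a non-empty list.
  def stats(xs: List[Double]): (Double, Double) = {
    val (mean, spread) = {
      // Mutable variables are confined to this block.
      var sum = 0.0
      var lo = Double.MaxValue
      var hi = Double.MinValue
      for (x <- xs) {
        sum += x
        if (x < lo) lo = x
        if (x > hi) hi = x
      }
      (sum / xs.size, hi - lo)
    }
    // sum, lo and hi are out of scope here; mean and spread are vals.
    (mean, spread)
  }

  val result = stats(List(1.0, 2.0, 3.0))
}
```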

Since this technique is based on pattern matching, you can use it with any legal pattern. Pattern matching is typically used in the case clauses of match statements.
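As one illustration, a hand-written extractor works in a val assignment just as it does in a case clause. The Email object below is a hypothetical extractor written for this sketch.

```scala
object ExtractorDemo {
  // Hypothetical extractor: splits "user@host" into its two parts.
  object Email {
    def unapply(s: String): Option[(String, String)] = {
      val i = s.indexOf('@')
      if (i > 0) Some((s.substring(0, i), s.substring(i + 1))) else None
    }
  }

  // Pattern-matching assignment using the extractor.
  val Email(user, host) = "jim@example.com"

  // The same pattern in a match statement.
  val described = "no-at-sign" match {
    case Email(u, h) => u + " at " + h
    case _ => "not an email"
  }
}
```

One difference is worth keeping in mind: a match statement can fall through to another case, but a val assignment whose pattern does not match throws a MatchError at run time.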

Patterns can include nested constructs, which allows you to pull out values from deep within a structure when that structure is known. By using the @ operator within a pattern you can extract the value of an entire subpattern:
    case class Foo(n:Int, var s:String)
    case class Baz(f:Foo, b:Option[Baz])
    val data = 123 :: Baz(Foo(3,"c"),Some(Baz(Foo(4,"d"),None))) :: 456 :: Nil
    val _ :: Baz(Foo(_,a),Some(b @ Baz(c @ Foo(d,e),_))) :: f :: _ = data
    // The above val statement assigns these values:
    //   a = "c"
    //   b = Baz(Foo(4,"d"),None)
    //   c = Foo(4,"d")
    //   d = 4
    //   e = "d"
    //   f = 456
The underscore indicates a placeholder for a part of the pattern whose value we don't care about and don't want assigned to anything.

Note that the variable c refers to the same object as the Foo object that appears in variable b. We defined Foo with a var for s. If we change the value of the Foo object referenced by variable c, then we will see that change when we ask for the value of variable b:
    scala> b
    res0: Baz = Baz(Foo(4,d),None)

    scala> c
    res1: Foo = Foo(4,d)

    scala> c.s = "x"

    scala> c
    res2: Foo = Foo(4,x)

    scala> b
    res3: Baz = Baz(Foo(4,x),None)
Although b and c are themselves immutable variables, if they point to the same mutable object then changes made to that object through one variable will be visible through the other variable.

As you learn Scala and see examples of case statements, remember that any syntax that is valid as the pattern match in a case statement is also valid as a pattern match in a val assignment.
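A few more patterns that are often seen in case clauses, used here in val assignments instead. This is only a sketch; the names are invented for the example, and it assumes Scala 2.x, where a val pattern that might not match is accepted by the compiler and fails with a MatchError at run time if the value does not fit.

```scala
object ValPatternDemo {
  // A Regex can be used as an extractor; each group becomes a variable.
  val Date = """(\d{4})-(\d{2})-(\d{2})""".r
  val Date(year, month, day) = "2009-10-01"

  // An Option pattern.
  val Some(first) = List(10, 20).headOption

  // A sequence pattern that ignores the rest of the list.
  val List(x, y, _*) = List(1, 2, 3, 4)
}
```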