Perhaps you ask, Why? If I were being selfish I might answer, Because I want more people to use Scala so that it becomes easier for me to use it where I work. Or if I were being arrogant about the language I might say, So that you will be ready for its inevitable rise as the successor to Java. But instead I will give an answer that I hope will have a stronger and more immediate appeal: Because it will make you a better Java programmer.
Learning Scala will not necessarily make everybody who knows Java a better programmer. If you are a Java expert and you also happen to know Haskell, ML and Erlang inside out, then perhaps Scala does not hold much new for you. But if you do know those languages, you probably consider yourself something a little different than "a Java programmer", which if you notice is the phrase I used above.
Why
What's wrong with just knowing Java? Some people claim learning Java stunts your intellectual growth as a developer. Here are just a few negative comments about Java:
- "Java takes the science out of computer science."
- "Computer Science (CS) education is neglecting basic skills. ... Java ... is in part responsible for this decline."
- "Java is not a good teaching language."
It is common advice that you should learn multiple programming languages. Some people say you should learn a new language every year. Each language will have at least some little corners that will present you with new concepts. You should make a point to learn other languages that have deep roots in Computer Science to ensure that at least some of those new concepts are substantial.
Why learn Scala rather than Haskell or something else? Scala integrates functional programming with object-oriented programming. When coming from the object-oriented Java world, Scala allows you to gradually learn functional techniques while still being able to use familiar object-oriented techniques. For a Java programmer, learning Scala may be easier than learning other functional languages that are not object-oriented.
Scala also has the advantage of running on the JVM, and it lets you easily make direct calls to Java code. There are other languages that run on the JVM and can call Java, but none that do so as easily as Scala and that integrate the functional and object-oriented approaches as well. This means you can immediately start using Scala code along with the rest of your Java code, and you can leverage your knowledge of all those Java libraries.
Dr. Dobb's, in an article about learning Scala if you use Java, says "[Scala is] the Java route to [Functional Programming]". The article also points out that there is an Eclipse plugin for Scala, so you can continue to use Eclipse. Some other Java tools also work with Scala. I happen to like the jswat debugger and have used it on my Scala programs.
Some things you will learn from Scala:
- The importance of immutable values.
- The simpler composition of functions/methods that have no side effects.
- The Actor model for concurrent processing.
- How to think using higher order functions.
- A better understanding of variance (covariance and contravariance).
But first, an overview of Scala.
What
Scala is a combined object-oriented and functional language. It was created by Martin Odersky, a computer scientist at EPFL in Lausanne, Switzerland. Odersky co-designed and implemented (with Philip Wadler) the Pizza and GJ (Generic Java) extensions to Java, after which he was hired by Sun to work on the Java compiler, to which he added generics.
Scala was originally intended to run on both the Java Virtual Machine (JVM) and the .NET Common Language Runtime (CLR). Unfortunately, the .NET implementation seems to have fallen by the wayside, so Scala has de facto become a JVM-only language.
As of this writing, Scala has been around for over five years, has been relatively stable for over a year, and is now at version 2.7.3.
With that little bit of background about Scala out of the way, let's get back to those Things You Will Learn.
Immutable Values
Using immutable values makes it easier to write code without side effects, reduces the likelihood of concurrency bugs, and can make code easier to read and understand. Scala separates the concept of a `val` from a `var`. A `val` in Scala is like a `final` variable in Java: once a value has been set, it cannot be changed. In Java, you have to add the `final` keyword to a variable declaration to make it immutable. In Scala, you have to use either `val` or `var`. This forces you to think about that choice, and since it is just as easy to type `val` as `var`, there is little reason not to do so if you don't think the value should change. Sure, you can just add `final` in Java, but the language does not encourage you to think about that detail, and the default is for everything to be mutable.
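Here is a minimal sketch of the difference, something you could paste into the Scala interpreter. The names `ValVarDemo`, `greeting`, `counter` and `bump` are made up for illustration.

```scala
object ValVarDemo {
  // A val is fixed once it has been initialized, like a final variable in Java.
  val greeting: String = "hello"
  // greeting = "goodbye"   // uncommenting this line is a compile error

  // A var can be reassigned, like an ordinary non-final Java variable.
  var counter: Int = 0

  // Mutates counter and returns its new value.
  def bump(): Int = {
    counter += 1
    counter
  }
}
```

Because the choice between `val` and `var` has to be written down at every declaration, the mutable spots in a program stand out when you read it.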
Referential Transparency (No Side Effects)
To take a paragraph from one of my previous posts:
Referential transparency is a phrase from the functional programming world which means, basically, "no side effects". Side effects include reading any state which is not passed in as an argument or setting any state which is not part of what is passed back as the result. If a function is referentially transparent, then a call to that function with a specific set of values as arguments will always return exactly the same value.
Functions with side effects are harder to test, harder to reason about, and in general harder to get right. As you compose functions with side effects, the side effects tend to accumulate, making the composed function even more difficult to get right.
In imperative languages such as Java the natural way of writing many functions is to use variables (mutable data) and loops. In functional languages there are other ways things are more typically done, including the use of recursion and higher order functions, that don't require the use of mutable variables.
In object-oriented languages objects may contain state (mutable instance data) as well as immutable data. When a method reads or writes mutable instance data, it has side effects, with all of the additional considerations that entails.
It is almost inherent in the nature of object-oriented languages to encourage the use of instance state data. But because Scala has one foot in the functional language community, if you learn Scala you will also be drawn into that community and will learn some techniques for writing code without side effects, along with the advantages of doing so.
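To make the contrast concrete, here is a small sketch. The object and method names (`PurityDemo`, `addToTotal`, `sum`, `sumFold`) are hypothetical, chosen only for this illustration. The first method depends on mutable instance state, so it is not referentially transparent; the other two compute a sum using recursion and a higher order function, with no mutable variables at all.

```scala
object PurityDemo {
  // Impure: reads and updates state that is not passed in as an argument,
  // so two calls with the same argument return different results.
  private var runningTotal = 0
  def addToTotal(n: Int): Int = {
    runningTotal += n
    runningTotal
  }

  // Pure: the result depends only on the argument. Recursion replaces the
  // loop-plus-accumulator idiom. (Not tail recursive, so meant for short lists.)
  def sum(xs: List[Int]): Int = xs match {
    case Nil          => 0
    case head :: tail => head + sum(tail)
  }

  // The same computation expressed with a higher order function.
  def sumFold(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)
}
```

Calling `PurityDemo.sum(List(1, 2, 3))` gives 6 every time, whereas the result of `PurityDemo.addToTotal(5)` depends on how many times it has been called before, which is exactly what makes it harder to test.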
Actor Concurrency
Writing correct concurrent code is hard. Java made it easier to do by introducing monitors and building thread control and synchronization into the language. (As a commentary on how hard it is to get concurrency right, even in the original Java Language Specification they did not quite get it right, and had to redefine the memory model with JSR-133 for Java 1.5.)
With Java's threads and the `synchronized` keyword it is easier to write code that doesn't corrupt data due to simultaneous access by multiple threads, as long as you are careful to synchronize all access to shared data.
The monitors that are used for synchronization are external to the methods that lock on them, which means any method that uses `synchronized` is not referentially transparent (it has the side effect of locking the monitor, which is visible outside the function while the function is running), which in turn implies that functions that use `synchronized` are harder to compose.
In fact, this is precisely the case: the more functions you compose that use `synchronized`, the more likely you are to run into a deadlock problem, which is an undesired interaction between those side effects of the functions. The Java thread/monitor model works well enough for a small number of threads dealing with a very small number of shared objects, but it is very difficult to manage a large program with many simultaneous threads accessing multiple shared objects.
Scala supports the Java approach to concurrency using threads and `synchronized`, but it also provides another model for concurrency that scales up much better: the Actor model. Actors are a message-passing concurrency mechanism borrowed from Erlang, a language designed for high concurrency.
An Actor is an object that is responsible for maintaining data (or access to any other resource) that needs to be shared by multiple threads. The Actor is the only object allowed to access that data (or resource). Other threads communicate with the Actor by sending messages to it; the Actor can respond by sending messages back (if the other thread is an Actor). Typically the messages are immutable. Scala does not enforce this, but using mutable messages makes it more difficult to scale. Each Actor has a message inbox where incoming messages are queued. Scala's Actor library handles all of the message transfers, so the programmer does not have to deal with synchronizing any code.
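The idea of an inbox plus a single owner of the data can be sketched without any actor library at all. To be clear, the code below is not the API of Scala's Actor library: it is a hand-rolled mailbox built on `java.util.concurrent.LinkedBlockingQueue`, and `CounterActor`, `Add`, `Get`, `Stop` and `demo` are names I made up for the illustration. It also uses one thread per actor, which a real actor library avoids.

```scala
import java.util.concurrent.LinkedBlockingQueue

object CounterActorSketch {
  // Immutable messages understood by the counter.
  sealed trait Message
  final case class Add(n: Int) extends Message
  final case class Get(reply: LinkedBlockingQueue[Int]) extends Message
  case object Stop extends Message

  final class CounterActor {
    // The inbox where incoming messages are queued.
    private val inbox = new LinkedBlockingQueue[Message]()

    // Only the worker thread below ever reads or writes this variable.
    private var total = 0

    private val worker = new Thread(new Runnable {
      def run(): Unit = {
        var running = true
        while (running) {
          inbox.take() match {
            case Add(n)     => total += n
            case Get(reply) => reply.put(total)
            case Stop       => running = false
          }
        }
      }
    })
    worker.setDaemon(true)
    worker.start()

    // Sending a message just queues it; callers never lock anything themselves.
    def send(msg: Message): Unit = inbox.put(msg)
  }

  // Four threads each send 1000 Add(1) messages, then we ask for the total.
  def demo(): Int = {
    val counter = new CounterActor
    val senders = (1 to 4).map { _ =>
      new Thread(new Runnable {
        def run(): Unit = (1 to 1000).foreach(_ => counter.send(Add(1)))
      })
    }
    senders.foreach(_.start())
    senders.foreach(_.join())
    val reply = new LinkedBlockingQueue[Int]()
    counter.send(Get(reply))
    val result = reply.take()
    counter.send(Stop)
    result
  }
}
```

None of the sending threads uses `synchronized`, yet no increments are lost, because the messages are processed one at a time by the only thread that owns `total`.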
There are many levels of possible problems that can arise with concurrent programs:
- Data corruption due to concurrent access.
- Deadlock.
- Resource bottleneck or starvation.
`synchronized` makes it easier to write concurrent code that does not suffer from data corruption due to multiple concurrent access, but we still have to worry about deadlock. Scala's Actor library helps get past the next level: the Actor model can support huge numbers of active actors, all with shared access to a very large number of shared resources, without deadlock, allowing the programmer to focus on ensuring that the higher level issues such as resource bottlenecks will not be a problem. According to Haller's paper (page 14), he was able to run 1,200,000 simultaneously active Actors, whereas the equivalent test using threads on the same hardware ran out of memory and was unable to create 5,500 threads.
I have heard that there is an Actor library for Java called Kilim, but I have not tried it.
Higher Order Functions
This is really what functional programming is all about. In a functional language, functions are first-class objects that can be assigned to variables and passed to other functions, the same as any other data type. This allows for a style of factoring that sometimes allows code to be written much more concisely, which (assuming you understand the whole concept of passing functions around as objects) often also makes the code easier to understand.
You can do something like passing a function around in Java by defining an interface with a named method, passing an object that implements that interface, and invoking it by using the method name. While this sort of works, it requires an annoying amount of boilerplate and it doesn't necessarily make the resulting code easier to read.
Scala provides a set of classes called `Function0`, `Function1`, `Function2`, etc., and a bunch of special compiler syntax so that you can write relatively concise functional code, which the compiler then translates into the appropriate classes, instances and method calls to make it all work in the Java VM. The code is not quite as concise as in some other functional languages, because of limitations due to how the type system works (object-oriented type hierarchies and global type inference don't mix very well), but it's much more concise than the equivalent Java code.
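As a rough illustration, here is the same one-argument function written twice: once with the concise function-literal syntax, and once spelled out as an explicit `Function1` instance, which is conceptually what the literal stands for. The names `HigherOrderDemo`, `timesTwo`, `applyTwice` and `adder` are hypothetical.

```scala
object HigherOrderDemo {
  // A function literal. The type Int => Int is shorthand for Function1[Int, Int].
  val timesTwo: Int => Int = x => x * 2

  // The same function written out by hand as a Function1 with an apply method.
  val timesTwoExplicit: Function1[Int, Int] = new Function1[Int, Int] {
    def apply(x: Int): Int = x * 2
  }

  // A higher order function: it takes another function as an argument.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // A higher order function that returns a function.
  def adder(n: Int): Int => Int = x => x + n

  // Library methods such as map take functions too.
  val doubled: List[Int] = List(1, 2, 3).map(timesTwo)
}
```

Both `timesTwo` and `timesTwoExplicit` can be passed to `applyTwice` or `map` interchangeably, since they have the same type.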
Variance
Variance has to do with higher-order types, such as `List<String>` in Java or `List[String]` in Scala.
Before generics were introduced in Java, there was no type information for higher-order types, except for arrays, so there was no way to do anything about covariance or contravariance. With the addition of generics to Java, covariance and contravariance checks became possible. Unfortunately, because of Java's legacy of having started off without the higher-order type information, the generics definition has a few problems that can make the whole concept a bit harder to understand.
In Scala, variance was designed in from early on so the whole thing is cleaner. It does admittedly have some problems: although cleaner than Java, it's not as clean as the pure functional languages like Haskell; it has its own share of odd corner cases (although they are much further into the corners than in Java); and, because Scala has to run on the JVM, it has the same limitations as Java relating to the lack of runtime information about higher order types (type erasure).
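A small sketch of what declaring variance looks like in Scala, using the `+` and `-` annotations on type parameters. The classes here (`Animal`, `Cat`, `Box`, `Printer`) are invented for the example. The last method is an assumption-laden aside: it uses a cast to view a JVM `String[]` as an `Object[]`, which mimics what Java permits for arrays without any cast, and shows that the mistake is only caught at runtime.

```scala
object VarianceDemo {
  class Animal { def name: String = "animal" }
  class Cat extends Animal { override def name: String = "cat" }

  // Covariant in A (+A): a Box[Cat] can be used where a Box[Animal] is expected.
  final class Box[+A](val value: A)

  // Contravariant in A (-A): a Printer[Animal] can be used where a
  // Printer[Cat] is expected, since it can print any Animal, cats included.
  trait Printer[-A] { def show(a: A): String }

  val animalPrinter: Printer[Animal] = new Printer[Animal] {
    def show(a: Animal): String = "a " + a.name
  }

  val catBox: Box[Cat] = new Box(new Cat)
  val animalBox: Box[Animal] = catBox          // accepted because of +A
  val catPrinter: Printer[Cat] = animalPrinter // accepted because of -A

  // Scala's Array is invariant, so assigning an Array[String] to an
  // Array[AnyRef] does not compile. The cast below forces the view that Java
  // arrays allow implicitly; storing a non-String then fails at runtime.
  def storeIntoStringArray(): Boolean = {
    val strings: Array[String] = Array("a", "b")
    val objects = strings.asInstanceOf[Array[AnyRef]]
    try {
      objects(0) = Integer.valueOf(1)
      false
    } catch {
      case _: ArrayStoreException => true
    }
  }
}
```

The compiler checks the annotations: for instance, giving the covariant `Box` a method that takes an `A` as a parameter would be rejected, which is the kind of check Java arrays never got.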
Java arrays are broken in terms of variance. After learning about variance you will understand why you can't safely cast
How
How can these lessons be applied to Java? Here's a brief list of some things you can do.
- immutable values: use "final" more.
- no side effects: write more pure functions (no mutable variables), write methods that do not use global or instance state, use more recursion.
- Actor model: check out Kilim.
- higher order functions: you can use interfaces, although it is not nearly as convenient. Maybe you will be able to adopt a more functional style in Java if one of the closure proposals gets implemented, in which case after learning Scala you'll be ahead of the game in Java because you will already know how to use higher order functions effectively.
Perhaps, after reading all of the above, you have decided that you should learn Scala. Great! How can you go about that? My basic advice:
- Read about Scala: articles, blogs, books, newsgroups.
- Write some code. As soon as you can, and as much as you can. Applets, programs, libraries, anything. There is no substitute for writing code.
- Run the Scala interpreter and type things in.
Here are some pointers to some things you can read to get you started:
- For an overview of Scala's main characteristics: A Tour of Scala.
- For a quick-start guide: First Steps to Scala by Bill Venners, Martin Odersky and Lex Spoon.
- For continued reading: the Scala Reference Manuals on the official Scala web site, including A Brief Scala Tutorial and Scala By Example.
- For a series of articles aimed at Java developers: Ted Neward's Busy Java Developer's Guide To Scala, with 11 articles in the series. Or you can try Daniel Spiewak's Scala for Java Refugees blog entries.
- For in-depth technical details: The Scala Language Specification.
- For a condensed summary of Scala's syntax: my Scala Syntax Primer.
- For the authoritative tutorial: buy the 750 page book Programming in Scala by Martin Odersky, Lex Spoon and Bill Venners, available in both paper and as an eBook.
- For some general programming exercises that you can use when looking for something to code in Scala: Project Euler.
Updated 2009-01-18: Fixed var/val typo as pointed out by Doug.