Perhaps you ask, Why? If I were being selfish I might answer, Because I want more people to use Scala so that it becomes easier for me to use it where I work. Or if I were being arrogant about the language I might say, So that you will be ready for its inevitable rise as the successor to Java. But instead I will give an answer that I hope will have a stronger and more immediate appeal: Because it will make you a better Java programmer.
Learning Scala will not necessarily make everybody who knows Java a better programmer. If you are a Java expert and you also happen to know Haskell, ML and Erlang inside out, then perhaps Scala does not hold much new for you. But if you do know those languages, you probably consider yourself something a little different than "a Java programmer", which if you notice is the phrase I used above.
Why
What's wrong with just knowing Java? Some people claim learning Java stunts your intellectual growth as a developer. Here are just a few negative comments about Java:
- "Java takes the science out of computer science."
- "Computer Science (CS) education is neglecting basic skills. ... Java ... is in part responsible for this decline."
- "Java is not a good teaching language."
It is common advice that you should learn multiple programming languages. Some people say you should learn a new language every year. Each language will have at least some little corners that will present you with new concepts. You should make a point to learn other languages that have deep roots in Computer Science to ensure that at least some of those new concepts are substantial.
Why learn Scala rather than Haskell or something else? Scala integrates functional programming with object-oriented programming. When coming from the object-oriented Java world, Scala allows you to gradually learn functional techniques while still being able to use familiar object-oriented techniques. For a Java programmer, learning Scala may be easier than learning other functional languages that are not object-oriented.
Scala also has the advantage of running on the JVM, and it lets you easily make direct calls to Java code. There are other languages that run on the JVM and can call Java, but none that do so as easily as Scala and that integrate the functional and object-oriented approaches as well. This means you can immediately start using Scala code along with the rest of your Java code, and you can leverage your knowledge of all those Java libraries.
A Dr. Dobb's journal entry about learning Scala if you use Java says "[Scala is] the Java route to [Functional Programming]". It also points out that there is an Eclipse plugin for Scala, so you can continue to use Eclipse. Some other Java tools also work with Scala. I happen to like the jswat debugger and have used it on my Scala programs.
Some things you will learn from Scala:
- The importance of immutable values.
- The simpler composition of functions/methods that have no side effects.
- The Actor model for concurrent processing.
- How to think using higher order functions.
- A better understanding of variance (covariance and contravariance).
But first, an overview of Scala.
What
Scala is a combined object-oriented and functional language. It was created by Martin Odersky, a computer scientist at EPFL in Lausanne, Switzerland. Odersky codesigned and implemented (with Philip Wadler) the Pizza and GJ (Generic Java) extensions to Java, after which he was hired by Sun to work on the Java compiler, to which he added generics.
Scala was originally intended to run on both the Java Virtual Machine (JVM) and the .NET Common Language Runtime (CLR). Unfortunately, the .NET implementation seems to have fallen by the wayside, so Scala has de facto become a JVM-only language.
As of this writing, Scala has been around for over five years, has been relatively stable for over a year, and is now at version 2.7.3.
With that little bit of background about Scala out of the way, let's get back to those Things You Will Learn.
Immutable Values
Using immutable values makes it easier to write code without side effects, reduces the likelihood of concurrency bugs, and can make code easier to read and understand.
Scala separates the concept of a val from a var. A val in Scala is like a final variable in Java: once a value has been set, it cannot be changed. In Java, you have to add the final keyword to a variable declaration to make it immutable. In Scala, you have to use either var or val. This forces you to think about that choice, and since it is just as easy to type val as var, there is little reason not to do so if you don't think the value should change. Sure, you can just add final in Java, but the language does not encourage you to think about that detail, and the default is for everything to be mutable.
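Here is a minimal sketch of the val/var distinction, assuming a recent Scala release (2.13 or 3) rather than the 2.7.3 mentioned above. The object and member names are hypothetical and chosen only for illustration.

```scala
object ImmutableValuesSketch {
  // A val can be assigned exactly once, like a final variable in Java.
  val greeting: String = "hello"
  // greeting = "goodbye"   // would not compile: reassignment to val

  // A var can be reassigned, like a non-final Java variable.
  var counter: Int = 0

  // Mutates the var above, so each call returns a different value.
  def bump(): Int = {
    counter += 1
    counter
  }

  // Uses only immutable values: nothing is ever reassigned.
  def total(xs: List[Int]): Int = xs.sum
}
```

As with final in Java, val only prevents the name from being rebound. If the value it refers to is itself a mutable object, that object can still change.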
Referential Transparency (No Side Effects)
To take a paragraph from one of my previous posts:
Referential transparency is a phrase from the functional programming world which means, basically, "no side effects". Side effects include reading any state which is not passed in as an argument or setting any state which is not part of what is passed back as an argument. If a function is referentially transparent, then a call to that function with a specific set of values as arguments will always return exactly the same value.
Functions with side effects are harder to test, harder to reason about, and in general harder to get right. As you compose functions with side effects, the side effects tend to accumulate, making the composed function even more difficult to get right.
In imperative languages such as Java the natural way of writing many functions is to use variables (mutable data) and loops. In functional languages there are other ways things are more typically done, including the use of recursion and higher order functions, that don't require the use of mutable variables.
In object-oriented languages objects may contain state (mutable instance data) as well as data. When a method uses mutable instance data, it now has side effects, with all of the additional considerations that requires.
It is almost inherent in the nature of object-oriented languages to encourage the use of instance state data. But because Scala has one foot in the functional language community, if you learn Scala you will also be drawn into that community and will learn some techniques for writing code without side effects and the advantages of doing so.
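To make the contrast concrete, here is a small sketch, assuming a recent Scala (2.13 or 3). It shows the same sum written with a loop, with recursion, and with a higher order function, followed by one method that is not referentially transparent. All names are hypothetical.

```scala
import scala.annotation.tailrec

object PuritySketch {
  // Imperative style: a mutable accumulator and a loop.
  // The var is local, so callers cannot observe the mutation.
  def sumLoop(xs: List[Int]): Int = {
    var total = 0
    for (x <- xs) total += x
    total
  }

  // Recursive style: no mutable variables at all.
  @tailrec
  def sumRec(xs: List[Int], acc: Int = 0): Int = xs match {
    case Nil          => acc
    case head :: tail => sumRec(tail, acc + head)
  }

  // Higher order style: foldLeft takes the combining function as an argument.
  def sumFold(xs: List[Int]): Int = xs.foldLeft(0)(_ + _)

  // Not referentially transparent: the result depends on hidden
  // instance state, so two identical calls return different values.
  private var calls = 0
  def nextId(): Int = {
    calls += 1
    calls
  }
}
```

The three sum functions always return the same result for the same list, so any of them can be swapped for another or composed freely. nextId cannot, because its result depends on how many times it has been called before.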
Actor Concurrency
Writing correct concurrent code is hard. Java made it easier to do by introducing monitors and building thread control and synchronization into the language. (As a commentary on how hard it is to get concurrency right, even in the original Java Language Specification they did not quite get it right, and had to redefine the memory model with JSR-133 for Java 1.5.)
With Java's threads and the synchronized keyword it is easier to write code that doesn't corrupt data due to simultaneous access by multiple threads, as long as you are careful to synchronize all access to shared data.
The monitors that are used for synchronization are external to the methods that lock on them, which means any method that uses synchronized is not referentially transparent (it has the side effect of locking the monitor, which is visible outside the function while the function is running), which in turn implies that functions that use synchronized are harder to compose. In fact, this is precisely the case: the more functions you compose that use synchronized, the more likely you are to run into a deadlock problem, which is an undesired interaction between those side-effects of the functions.
The Java thread/monitor model works well enough for a small number of threads dealing with a very small number of shared objects, but it is very difficult to manage a large program with many simultaneous threads accessing multiple shared objects.
Scala supports the Java approach to concurrency using threads and synchronized, but it also provides another model for concurrency that scales up much better: the Actor model. Actors are a message-passing concurrency mechanism borrowed from Erlang, a language designed for high concurrency.
An Actor is an object that is responsible for maintaining data (or access to any other resource) that needs to be shared by multiple threads. The Actor is the only object allowed to access that data (or resource). Other threads communicate with the Actor by sending messages to it; the Actor can respond by sending messages back (if the other thread is an Actor). Typically the messages are immutable. Scala does not enforce this, but using mutable messages makes it more difficult to scale. Each Actor has a message inbox where incoming messages are queued. Scala's Actor library handles all of the message transfers, so the programmer does not have to deal with synchronizing any code.
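The following is a hand-rolled sketch of the idea only. It does not use Scala's Actor library, and every name in it is hypothetical. It assumes a recent Scala (2.13 or 3) and uses a plain thread plus a java.util.concurrent queue as the inbox. Replies come back through a CompletableFuture instead of a message to another actor, which is a simplification.

```scala
import java.util.concurrent.{CompletableFuture, LinkedBlockingQueue}

object ActorSketch {
  // Immutable messages understood by the counter actor.
  sealed trait Message
  final case class Add(n: Int) extends Message
  final case class Get(reply: CompletableFuture[Int]) extends Message
  case object Stop extends Message

  final class CounterActor {
    // The inbox where incoming messages are queued.
    private val inbox = new LinkedBlockingQueue[Message]()
    // Only the actor's own thread reads or writes this state.
    private var total = 0

    private val worker = new Thread(() => run())
    worker.setDaemon(true)
    worker.start()

    // Other threads communicate with the actor only by sending messages.
    def send(msg: Message): Unit = inbox.put(msg)

    // Handle one message at a time, in arrival order.
    private def run(): Unit = {
      var running = true
      while (running) {
        inbox.take() match {
          case Add(n)     => total += n
          case Get(reply) => reply.complete(total)
          case Stop       => running = false
        }
      }
    }
  }

  // Sends two Add messages, then asks for the total.
  def demo(): Int = {
    val counter = new CounterActor
    counter.send(Add(2))
    counter.send(Add(3))
    val reply = new CompletableFuture[Int]()
    counter.send(Get(reply))
    val result = reply.get() // blocks until the actor answers
    counter.send(Stop)
    result
  }
}
```

No code outside CounterActor needs synchronized, because the queue is the only point of contact between threads. This sketch spends a whole thread on each actor, so it would not scale to large numbers of actors the way a real actor library is meant to.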
There are many levels of possible problems that can arise with concurrent programs:
- Data corruption due to concurrent access.
- Deadlock.
- Resource bottleneck or starvation.
synchronized makes it easier to write concurrent code that does not suffer from data corruption due to multiple concurrent access, but we still have to worry about deadlock. Scala's Actor library helps get past the next level: the Actor model can support huge numbers of active actors, all with shared access to a very large number of shared resources, without deadlock, allowing the programmer to focus on ensuring that the higher level issues such as resource bottlenecks will not be a problem.
According to Haller's paper (page 14), he was able to run 1,200,000 simultaneously active Actors, whereas the equivalent test using threads on the same hardware ran out of memory and was unable to create 5500 threads.
I have heard that there is an Actor library for Java called Kilim, but I have not tried it.
Higher Order Functions
This is really what functional programming is all about. In a functional language, functions are first-class objects that can be assigned to variables and passed to other functions, the same as any other data type. This allows for a style of factoring that sometimes allows code to be written much more concisely, which (assuming you understand the whole concept of passing functions around as objects) often also makes the code easier to understand.
You can do something like passing a function around in Java by defining an interface with a named method, passing an object that implements that interface, and invoking it by using the method name. While this sort of works, it requires an annoying amount of boilerplate and it doesn't necessarily make the resulting code easier to read.
Scala provides a set of classes called Function0, Function1, Function2, etc., and a bunch of special compiler syntax so that you can write relatively concise functional code, which the compiler then translates into the appropriate classes, instances and method calls to make it all work in the Java VM. The code is not quite as concise as in some other functional languages, because of limitations due to how the type system works (object-oriented type hierarchies and global type inference don't mix very well), but it's much more concise than the equivalent Java code.
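As a small illustration, assuming a recent Scala (2.13 or 3), the sketch below writes the same function twice: once with the concise function-literal syntax and once spelled out as a Function1 instance. The names are hypothetical.

```scala
object HigherOrderSketch {
  // Function literal. The type Int => Int is shorthand for Function1[Int, Int].
  val double: Int => Int = x => x * 2

  // The same function written out by hand as a Function1 instance,
  // close to what you would write with an interface in Java.
  val doubleExplicit: Function1[Int, Int] = new Function1[Int, Int] {
    def apply(x: Int): Int = x * 2
  }

  // A two-argument function has type Function2[Int, Int, Int].
  val add: (Int, Int) => Int = _ + _

  // A higher order function: it takes another function as a parameter.
  def applyTwice(f: Int => Int, x: Int): Int = f(f(x))

  // Library methods such as filter and map are higher order functions too.
  def evensDoubled(xs: List[Int]): List[Int] =
    xs.filter(_ % 2 == 0).map(double)
}
```

Calling double(3) is itself shorthand for double.apply(3), which is how a function value stays an ordinary object on the JVM.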
Variance
Variance has to do with higher-order types, such as List<String> in Java or List[String] in Scala.
Before generics were introduced in Java, there was no type information for higher-order types, except for arrays, so there was no way to do anything about covariance or contravariance. With the addition of generics to Java, covariance and contravariance checks became possible. Unfortunately, because of Java's legacy of having started off without the higher-order type information, the generics definition has a few problems that can make the whole concept a bit harder to understand.
In Scala, variance was designed in from early on so the whole thing is cleaner. It does admittedly have some problems: although cleaner than Java, it's not as clean as the pure functional languages like Haskell; it has its own share of odd corner cases (although they are much further into the corners than in Java); and, because Scala has to run on the JVM, it has the same limitations as Java relating to the lack of runtime information about higher order types (type erasure).
Java arrays are broken in terms of variance. After learning about variance you will understand why you can't safely cast String[] to Object[] in Java.
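Here is a short sketch of covariance, contravariance and the array problem, assuming a recent Scala (2.13 or 3). Box and Printer are hypothetical types made up for the example. The last method uses an explicit cast to reproduce, from Scala, what Java allows implicitly for arrays.

```scala
object VarianceSketch {
  // Covariant (+A): a read-only Box[String] can be used as a Box[Any].
  final class Box[+A](val value: A)

  // Contravariant (-A): a Printer[Any] can be used as a Printer[String].
  trait Printer[-A] { def show(a: A): String }

  val stringBox: Box[String] = new Box("hi")
  val anyBox: Box[Any] = stringBox // allowed because Box is covariant

  val anyPrinter: Printer[Any] = new Printer[Any] {
    def show(a: Any): String = s"<$a>"
  }
  val stringPrinter: Printer[String] = anyPrinter // allowed: contravariant

  // Scala arrays are invariant, so this would not compile:
  //   val strs: Array[String] = Array("a")
  //   val objs: Array[AnyRef] = strs

  // Java treats arrays as covariant. Forcing the same view with a cast
  // shows the hole: the store type-checks but fails at runtime.
  def javaStyleArrayHole(): Boolean = {
    val strs: Array[String] = Array("a", "b")
    val objs = strs.asInstanceOf[Array[AnyRef]]
    try {
      objs(0) = Integer.valueOf(1) // storing an Integer in a String[]
      false
    } catch {
      case _: ArrayStoreException => true
    }
  }
}
```

A mutable container can be both read from and written to, so neither covariance nor contravariance is safe for it. That is why Java's covariant arrays need a runtime check on every store.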
How
How can these lessons be applied to Java? Here's a brief list of some things you can do.
- immutable values: use "final" more.
- no side effects: write more pure functions (no mutable variables), write methods that do not use global or instance state, and use more recursion.
- Actor model: check out Kilim.
- higher order functions: you can use interfaces, although it is not nearly as convenient. Maybe you will be able to adopt a more functional style in Java if one of the closure proposals gets implemented, in which case after learning Scala you'll be ahead of the game in Java because you will already know how to use higher order functions effectively.
Perhaps, after reading all of the above, you have decided that you should learn Scala. Great! How can you go about that? My basic advice:
- Read about Scala: articles, blogs, books, newsgroups.
- Write some code. As soon as you can, and as much as you can. Applets, programs, libraries, anything. There is no substitute for writing code.
- Run the Scala interpreter and type things in.
Here are some pointers to some things you can read to get you started:
- For an overview of Scala's main characteristics: A Tour of Scala.
- For a quick-start guide: First Steps to Scala by Bill Venners, Martin Odersky and Lex Spoon.
- For continued reading: the Scala Reference Manuals on the official Scala web site, including A Brief Scala Tutorial and Scala By Example.
- For a series of articles aimed at Java developers: Ted Neward's Busy Java Developer's Guide To Scala, with 11 articles in the series. Or you can try Daniel Spiewak's Scala for Java Refugees blog entries.
- For in-depth technical details: The Scala Language Specification.
- For a condensed summary of Scala's syntax: my Scala Syntax Primer.
- For the authoritative tutorial: buy the 750 page book Programming in Scala by Martin Odersky, Lex Spoon and Bill Venners, available in both paper and as an eBook.
- For some general programming exercises that you can use when looking for something to code in Scala: Project Euler.
Updated 2009-01-18: Fixed var/val typo as pointed out by Doug.
10 comments:
"A var in Scala is like a final variable in Java: once a value has been set, it can not be changed"
You mean val not var
"it is just as easy to type val as var"
which is exactly why one of the two should be dropped from scala, or changed to something more distinct.
Probably the simplest solution would be to just add 'final' again, or else combine it with a non-null ! symbol:
MyClass! m = ...
or else make immutable/non-nullable the default and have a symbol for the reverse:
def mymethod( param1 : MyClass?)
Jim. Thanks for all your hard work promoting Scala. I've found your articles, especially the syntax and operator primers, invaluable.
@Doug,
You are free to drop var from your programming and borrow a page from ML with a Ref[T] class for mutable variables.
scala> final case class Ref[T](var value : T) {
| def ! = value
| def set(x : T) = value = x
| }
defined class Ref
scala> val x = Ref(3)
x: Ref[Int] = Ref(3)
scala> x!
res3: Int = 3
scala> x set 4
scala> x!
res5: Int = 4
Scala tries to strike a narrow balance: on one hand it strives to make it easy to write functional code. On the other it strives to be a comfortable language for imperative programmers, especially those from a Java background. The val/var distinction meets those goals quite admirably.
Doug: An ironic typo. If I had been writing real Scala code using a syntax-coloring editor, I expect I would not have made that mistake. I have updated the article to fix that error.
If we had to get rid of one of val or var, I would rather keep val the simple one and make var more verbose. But I don't think that's necessary. The difference between + and - or * and / is also a single character, but I doubt many people would argue that one of them should be dropped or changed for that reason.
dr: You're welcome, thanks for your encouragement.
Thanks for the articles Jim
I was looking for a Scala quick start tutorial. Do you know one?
Thanks
Satya: Take a look at the links in the list of bullets at the end of my post, in particular the second bullet.
thanks for the many posts, Jim. I was curious about that Dewar Schonberg article linked from your post that results in a 404.
i tracked it down to here: http://www.crosstalkonline.org/storage/issue-archives/2008/200801/200801-Dewar.pdf
If going to functional programming while maintaining Java functionality and compatibility was really the ultimate goal, wouldn't Clojure be a better option?
amethod: I wanted to focus on comparing Java to Scala rather than comparing Scala to other JVM languages. As I state in my post, "Scala allows you to gradually learn functional techniques while still being able to use familiar object-oriented techniques." I think moving from Java to Scala is enough of a jump for most Java programmers; convincing those programmers to jump straight to Clojure would be a more difficult task, which I will leave to someone else.