Friday, September 12, 2008

Scala Syntax Primer

Scala runs on the JVM and can directly call and be called from Java, but source compatibility was not a goal. Scala has a lot of capabilities not in Java, and to help those new features work more nicely, there are a number of differences between Java and Scala syntax that can make reading Scala code a bit of a challenge for Java programmers when first encountering Scala. This primer attempts to explain those differences. It is aimed at Java programmers, so some details about syntax which are the same as Java are omitted.

This primer is not intended to be a complete tutorial on Scala; it is more of a syntax reference. For a much better introduction to the language, you should buy the book Programming in Scala by Martin Odersky, Lex Spoon and Bill Venners.

Most of these syntax differences can be explained by two of Scala's major goals:
  1. Minimize verbosity.
  2. Support a functional style.
The logic behind some of these syntax rules may at first seem arbitrary, but the rules support each other surprisingly well. Hopefully by the time you finish this primer, you will have no trouble understanding code fragments like this one:
    (0/:list)(_+_)

Scala is an integrated object/functional language. In the discussion below, the terms "method" and "function" are used interchangeably.

You can run the Scala interpreter and type in these examples to get a better feel for how these rules work.



General

  • Scala classes do not need to go into files that match the class name, as they do in Java. You can put any Scala class in any Scala file. The only time it makes a difference is when you have a class and an object of the same name and want them to be companions, in which case they must be in the same file.
  • Semicolons are optional. If you put one statement on each line, you don't need semicolons. If you want to put multiple statements on a line, you can do so by separating them with semicolons. There are specific rules about when statements can span lines, so sometimes you have to be a bit careful when doing so.
  • Every value is an object on which methods can be called. For example, 123.toString() is valid.
  • Scala includes implicit conversions that allow objects to be used in unexpected ways. If you see source code where a method is called on an instance of a class that does not define that method, the instance is probably being implicitly converted to a class on which that method is defined.
  • Scala's "Uniform Access Principle" means that variables and functions without parameters are accessed in the same way. In particular, a variable definition using the val or var keyword can be converted to a function definition simply by replacing that keyword with the def keyword. The syntax at the calling sites does not change.
  • Scala includes direct support for XML. This code fragment assigns an instance of type scala.xml.Elem to val x:
    val x = <head><title>The Title</title></head>
    You can mix Scala code in with the XML by putting it in braces. This code fragment produces the same resulting value as the above code:
    val title = "The Title"
    val x = <head><title>{ title }</title></head>
  • Just about everything can be nested. Packages can be nested inside packages, classes can be nested inside classes, defs can be nested inside defs.
  • As in Java, annotations are indicated by the @ character.
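
The Uniform Access Principle mentioned in the list above can be sketched as follows. The two class names are hypothetical and exist only for this illustration; the point is that the calling code reads the same whether area is a val or a def.

```scala
object UniformAccessDemo {
  // Hypothetical classes, named only for this illustration.
  class StoredCircle(r: Double) {
    val area: Double = math.Pi * r * r   // computed once and stored
  }
  class ComputedCircle(r: Double) {
    def area: Double = math.Pi * r * r   // recomputed on every access
  }

  def main(args: Array[String]): Unit = {
    val a = new StoredCircle(1.0)
    val b = new ComputedCircle(1.0)
    // The calling syntax is identical for the val and the def.
    println(a.area == b.area)
  }
}
```

Switching a member between val and def is therefore a change the calling sites do not have to follow.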


Keywords

  • There is no static keyword. Methods and variables that you would declare static in Java go into an object rather than a class in Scala. Objects are singletons.
  • Scala has no break or continue statements. Fortunately, Scala's support of a functional programming style reduces the need for these.
  • Access modifiers such as protected and private can include a scope in square brackets, such as private[this] or protected[MyPackage1.MyPackage2]. The default access is public.
  • The val keyword declares an immutable value (a val), similar to the final keyword in Java. The var keyword declares a mutable variable.
  • Multiple items can be imported with one import statement:
    import java.text.{DateFormat,SimpleDateFormat}
  • Imported symbols can be renamed to other names, which provides a means to work around the problem of importing two symbols of the same name. For example, if you want to import both java.util.Date and java.sql.Date and be able to use them both without having to type the whole qualified name each time, you could do this:
    import java.util.{Date=>UDate}
    import java.sql.{Date=>SDate}
  • If an import is renamed to _, that symbol will not be imported. This allows importing everything except a specified symbol:
    import java.util.{Date=>_,_}
  • The abstract keyword is only used for abstract classes and traits. To declare an abstract function (def), val, variable (var), or type, you omit the = character and body of the item.
  • When overriding a concrete method in a superclass, the override modifier must be specified. Overriding such a method without the override modifier, or using the modifier when not overriding a superclass method, results in a compilation error.
  • Some other keywords: lazy, implicit
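
Several of the keyword rules above fit into one small sketch. The names Shape, Square and Counter are made up for this example; it shows renamed imports, abstract members declared by omitting the = and body, the override modifier, and an object standing in for Java statics.

```scala
import java.util.{Date => UDate}
import java.sql.{Date => SDate}

object KeywordsDemo {
  // Members without "=" and a body are abstract.
  abstract class Shape {
    def name: String                               // abstract def
    def describe: String = "shape " + name         // concrete def
  }

  class Square extends Shape {
    def name: String = "square"                    // implements the abstract member
    override def describe: String = "a " + name    // overriding requires "override"
  }

  // Members of an object play the role of Java statics.
  object Counter {
    private var count = 0
    def next(): Int = { count += 1; count }
  }

  def main(args: Array[String]): Unit = {
    // Both Date classes are usable side by side under their new names.
    val utilDate: UDate = new UDate(0L)
    val sqlDate: SDate = new SDate(0L)
    println(utilDate.getTime == sqlDate.getTime)
    println(new Square().describe)
    println(Counter.next())
  }
}
```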

Symbols and Literals

  • Scala allows multi-line strings quoted with triple-quotes:
    val longString = """Line 1
    Line 2
    Line 3""";
  • Symbol names can include almost any character. In particular, they can include all of the characters normally used as operators, such as *, +, ~ and :. The backslash character (\) is a valid symbol character, and in fact is used as a method name in the scala.xml.Elem class. Note that "abc\"def" is a seven character String with a double quote in the middle, but if abc is an instance of scala.xml.Elem, then abc\"def" is a call passing a three character String to the backslash method, a method that accepts a String argument and returns an instance of scala.xml.NodeSeq.
  • The underscore character (_) is used as a wildcard character rather than asterisk (*), such as in an import statement or in a case statement to represent a "don't care" value. This is because asterisk is a valid symbol character in Scala.
  • As in Java, by convention class names and object names start with an upper case letter, variable names start with a lower case letter.
  • In one case, whether a symbol starts with an upper or lower case actually matters to the compiler: in a case statement, that difference is used to disambiguate between a constant value, such as PI if Math.PI has been imported (starts with upper case) and a placeholder name being introduced (whose scope is then limited to the body of the case statement).
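
The upper/lower case rule for patterns is easy to miss, so here is a minimal sketch of it. The names Limit and classify are assumptions made for this example only.

```scala
object PatternCaseDemo {
  val Limit = 10   // upper-case name: treated as a constant inside patterns

  def classify(n: Int): String = n match {
    case Limit => "exactly the limit"            // compares n with the value of Limit
    case other => "some other value: " + other   // lower case: introduces a new name
  }

  def main(args: Array[String]): Unit = {
    println(classify(10))
    println(classify(3))
  }
}
```

Had the constant been named limit, the first case would have bound a new variable and matched every value.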


Expressions

  • Any place an expression is expected, a block of expressions surrounded by braces can be used instead. The braces act as parentheses. For example, the expression 5 * { val a = 1; a + 2 } is valid and yields the value 15.
  • As mentioned above, symbol names can include almost any character, such as * and +. This is useful for defining methods that will be used as operators (see the rules under Function Calls for functions with one argument).
  • The precedence of operators is determined by their first character, and is hardwired to match the usual precedence. Thus if you create the operator methods *^ and +^, the *^ will have higher precedence. The one exception to this rule is that if the operator ends with an equal sign (=) and is not one of the standard relational operators, then it will have the same precedence as simple assignment.
  • When used as binary operators, any symbol which ends with a colon (:) is right-associative; all other symbols are left associative. The controlling object for right-associative operators goes on the right side of the operator, with its one argument on the left side. However, the left hand argument is still evaluated before the right hand argument.
  • The characters +, -, ! and ~ can be used as prefix operators in any class by defining the method unary_+, unary_- etc.
  • Every statement is an expression whose value is the last expression within that statement that was evaluated. For example, there is no ?: ternary operator in Scala. Instead, you use a standard if/else expression:
    val x = if (n>0) "positive" else "negative"
  • When used on the right hand side of a variable (var) declaration, the underscore character means assign the default value. This is the same as not specifying a value in Java; in Scala, not specifying an initial value declares an abstract variable.
    val n = 123     //specific initial value
    var x:Int       //abstract variable
    var y:Int = _   //default initial value
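
The operator rules in this section can be tried out with a small class. Num and its operator names are hypothetical, chosen only to exercise the precedence, associativity and prefix-operator rules described above.

```scala
object OperatorDemo {
  // Num and its operators are made up for this sketch.
  final case class Num(v: Int) {
    def +^(o: Num): Num = Num(v + o.v)        // first character '+': additive precedence
    def *^(o: Num): Num = Num(v * o.v)        // first character '*': multiplicative precedence
    def unary_- : Num = Num(-v)               // enables the prefix form -x
    def +:(o: Num): Num = Num(o.v * 10 + v)   // ends in ':' so it is right-associative
  }

  val precedence = Num(1) +^ Num(2) *^ Num(3)  // parsed as Num(1) +^ (Num(2) *^ Num(3))
  val negated = -Num(4)                        // calls Num(4).unary_-
  val rightAssoc = Num(1) +: Num(2)            // invoked as Num(2).+:(Num(1))
  val block = 5 * { val a = 1; a + 2 }         // a block used as an operand
  val sign = if (block > 0) "positive" else "negative"

  def main(args: Array[String]): Unit = {
    println(precedence)
    println(negated)
    println(rightAssoc)
    println(block + " is " + sign)
  }
}
```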

Case and Patterns

  • Like Java, Scala has a case statement to allow selecting one possible code path from many based on a value. A simple switch statement in Java might look like this:
    //Java code
    switch (n) {
        case 0: action0(); break;
        case 1: action1(); break;
        default: actionDflt(); break;
    }
    The equivalent code in Scala would be:
    n match {
        case 0 => action0()
        case 1 => action1()
        case _ => actionDflt()
    }
  • Instead of the switch keyword as in Java, Scala uses a match expression. The match keyword comes after the value being matched, unlike the relative positions of the switch keyword and the value in Java.
  • match works on all types, not just ints. For example, you can match on a String variable and have case statements each with a constant String value.
  • No break statement is required, and execution does not fall through to the next case.
  • match statements return values. The value of a match statement is the value of whichever branch was executed.
  • The underscore is used to indicate the default case.
  • In addition to constants, a case expression can include patterns, which allow for more complex matching. Case matching is handled by extractors, which can be implemented by writing an unapply method in an object.
  • A case pattern can include a variable declaration with a type, in which case the variable is defined with that type and set to the value of the matched data within the body of that case.
    n match { //assume n is of type Any
        case i:Int => //i is an Int here, like (int)n would be in Java
        case d:Double => //d is a Double here, like (double)n in Java
        case _ => //no values were defined in this case
    }
  • When matching a more complex expression, you can assign a variable name to an internal part of the pattern by writing the variable name and the @ character before the pattern:
    case Foo(a,b @ Bar(_)) //b gets set to the part that matches Bar(_)
  • Case expressions can be followed by a pattern guard before the =>. The pattern guard is the keyword if followed by a boolean expression:
    x match { //assume Foo is a case class with two parameters
        case Foo(a,b) if a==b => //here only when Foo with a==b
        case Foo(a,b) => //here for all other Foo
        case _ => //here for all non-Foo
    }
  • Case expressions work for XML. In this example, the variable b gets set to whatever is inside the body element if there is only one element there:
    case <body>{ b }</body> => //b is the contents
    To match multiple elements, use _* to match any sequence:
    case <body>{ b @ _* }</body> => //b is the contents
  • The catch block of a try/catch statement uses the same syntax as the body of a match statement:
    try {
        // code that might throw an exception goes here
    } catch {
        case e1:IllegalArgumentException => //e1 is IllegalArgumentException here
        case e2:ArrayIndexOutOfBoundsException => //e2 is valid here
        case e3:RuntimeException => //any other RuntimeException comes here
        case _ => //all other exceptions come here
    }
  • A pattern can be used on the left hand side of an assignment. For example, the following code results in assigning the values 3 and 5 to the new variables x1 and y1:
    case class Point(x:Int,y:Int) //defines a simple value class
    val Point(x1,y1) = Point(3,5)
    This example is contrived, but the same kind of assignment works when calling a method that returns a value which is an object.
  • A pattern can be used on the left side of the <- operator in a generator in a for expression.
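
To make the extractor, pattern guard and @ binding rules concrete, here is a rough sketch. The Email object and the describe function are invented for this example and are not part of any library.

```scala
object ExtractorDemo {
  // A hand-written extractor: splits "user@host" into its two parts.
  object Email {
    def unapply(s: String): Option[(String, String)] = {
      val parts = s.split("@")
      if (parts.length == 2) Some((parts(0), parts(1))) else None
    }
  }

  def describe(x: Any): String = x match {
    // Extractor pattern plus a pattern guard.
    case Email(user, host) if host == "example.com" => "local user " + user
    // "whole @ ..." names the entire value matched by the pattern.
    case whole @ Email(_, _) => "external address " + whole
    // Typed pattern: n is an Int inside this case.
    case n: Int => "the number " + n
    // Wildcard: the default case.
    case _ => "something else"
  }

  def main(args: Array[String]): Unit = {
    println(describe("alice@example.com"))
    println(describe("bob@other.org"))
    println(describe(42))
    println(describe(3.5))
  }
}
```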

For Expressions

For Expressions are also called For Comprehensions.
  • A for expression consists of the for keyword, a sequence of specific kinds of elements separated by semicolons or newlines and surrounded by parentheses, and the yield keyword followed by an expression:
    for ( n <- 0 to 6 ; e = n%2; if e==0 ) yield n*n
  • The elements inside the parentheses can be any of the following:
    • A generator, such as n <- 0 to 6, which produces multiple values and assigns them to a new val (here n). The new val name appears to the left of the <- operator; to the right of that operator is a value which implements the foreach method to generate a series of values.
    • A definition, such as e = n%2, which introduces a new value by performing the specified calculation.
    • A filter, such as if e==0, which filters out the values which do not satisfy that expression.
  • The val name in a generator can instead be a pattern, similarly to how a pattern can be used in an assignment statement or a case expression. For example:
    val list = List((1,2),(3,4),(5,6))
    for ( (a,b) <- list) yield a+b
    yields List(3, 7, 11).
  • The elements can be placed inside of braces rather than parentheses and separated by newlines rather than semicolons:
    for {
        n <- 0 to 6
        e = n%2
        if e==0
    } yield n*n
  • When multiple generators are specified, each generator is repeated for each value produced by the preceding generator. For example, the expression
    for ( x <- 0 to 4 ; y <- 0 until 3) yield (x,y)
    produces a value starting with (0,0), (0,1), (0,2) and ending with (4,2).
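  • The type of the value produced by a for expression is determined by the type of the first generator. For example, a for expression whose first generator is a List yields a List.
  • As an alternative to using yield followed by an expression, you can omit the yield keyword and use a block of code in place of a single expression. In that form the for expression is executed only for its side effects and produces no useful value.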
  • A for statement can always be translated into a series of foreach, filter, map and flatMap method calls. In that sense, the for statement is syntactic sugar.
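
The translation described in the last bullet can be written out by hand. This is a sketch of roughly what the compiler produces, not its exact output; note that current compilers call withFilter where the text above says filter.

```scala
object ForDesugarDemo {
  // A for expression with two generators and a filter.
  val viaFor: List[Int] =
    for (x <- List(1, 2, 3); y <- List(10, 20); if (x + y) % 2 == 1) yield x + y

  // Approximately the same computation as explicit method calls.
  val viaCalls: List[Int] =
    List(1, 2, 3).flatMap { x =>
      List(10, 20).withFilter(y => (x + y) % 2 == 1).map(y => x + y)
    }

  def main(args: Array[String]): Unit = {
    println(viaFor)
    println(viaFor == viaCalls)
  }
}
```

The first generator is a List, so both forms produce a List.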


Arrays

  • Array indexes are specified with parentheses rather than square brackets.
  • Array access is implemented the same way as function access, using the apply method.
  • The code arr(index) is converted to arr.apply(index).
  • The code arr(index) = newval is converted to arr.update(index,newval).
  • Arrays are declared using the Array class and with the element type in square brackets, rather than using empty square brackets after a type as is done in Java. For example, an array with space for three Strings would be declared like this:
    val x = new Array[String](3)
  • A two dimensional 3 by 3 array of Strings would be declared like this:
    val x = new Array[Array[String]](3, 3)
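
A short sketch of the apply/update translation for arrays follows. One assumption to flag: it uses Array.ofDim for the two dimensional array, which is the form accepted by Scala 2.8 and later, rather than the two-argument constructor shown above.

```scala
object ArrayDemo {
  def build(): Array[String] = {
    val names = new Array[String](3)   // three slots, initially null
    names(0) = "a"                     // sugar for names.update(0, "a")
    names.update(1, "b")               // the desugared form, written by hand
    names
  }

  def main(args: Array[String]): Unit = {
    val names = build()
    println(names(0) == names.apply(0))    // indexing is a call to apply

    val grid = Array.ofDim[String](3, 3)   // 3 by 3 array (Scala 2.8 and later)
    grid(1)(2) = "x"
    println(grid(1)(2))
  }
}
```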


Tuples

  • Scala has built in support for Tuples, from one element to 22 elements. A Tuple is a small ordered collection of objects, where each object can have a different type.
  • The types for Tuples of various sizes are Tuple1 through Tuple22. These types have N type parameters, where N is the Tuple size. For example, a two element Tuple with an Int and a String has type Tuple2[Int,String].
  • The Pair object allows that word to be used instead of Tuple2 for building and matching two element Tuples.
  • The Triple object allows that word to be used instead of Tuple3 for building and matching three element Tuples.
  • You can create a tuple by enclosing the objects in parentheses and separating them by commas: (1, 2, "foo") is a Tuple3[Int,Int,String].
  • You can create a Tuple2 (a Pair) by using the -> operator, which works on any value: "a" -> 25 is the same as ("a", 25). The following expression is true:
    ("a" -> 25) == ("a", 25)
    This is done by an implicit conversion from Any to Predef.ArrowAssoc, which contains the -> method.
  • The elements of a Tuple can be accessed as member fields _1, _2, _3 etc.
  • If an expression returns a Tuple, that can be assigned to a set of variables or vals. The following code assigns 5 to the new val tens and 8 to the new val ones.
    def div10(n:Int):Tuple2[Int,Int] = (n/10, n%10)
    val (tens, ones) = div10(58)
    This is a case of using a pattern on the left hand side of an assignment, as mentioned in the section on Case and Patterns.
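
The tuple rules above can be checked in a few lines. This sketch reuses the div10 function from the text; the other names are invented for the example.

```scala
object TupleDemo {
  def div10(n: Int): (Int, Int) = (n / 10, n % 10)

  val triple = (1, 2, "foo")      // a Tuple3[Int,Int,String]
  val pair = "a" -> 25            // the same value as Tuple2("a", 25)
  val (tens, ones) = div10(58)    // a pattern on the left hand side

  def main(args: Array[String]): Unit = {
    println(triple._3)              // member access by position
    println(pair == Tuple2("a", 25))
    println(tens + " tens and " + ones + " ones")
  }
}
```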


Classes

  • The primary constructor for a class is coded in-line in the class definition, i.e. the constructor statements are not contained within a definition inside the class. The constructor parameters are declared immediately after the class name, and superclass arguments are placed after the name of the class being extended.
  • Class parameters can be preceded by val to make them immutable instance values (vals), or by var to make them instance variables.
  • Class parameters can be preceded by an access modifier such as private or protected. By default, class parameters using val or var are public.
  • The primary constructor can be made private by adding the access modifier private before the parameter list.
  • trait is like interface in Java, but can include implementation code. Classes in Scala don't implement traits; they extend them, the same as classes. If a class extends multiple traits, or extends a class plus traits, the keyword with is used to separate them rather than commas as in Java.
  • Case classes are defined by adding the case keyword before the class keyword. This automatically does the equivalent of the following:
    • Prepends val to all parameters, making them immutable instance values.
    • Creates equals and hashCode methods so that instances of that class can safely be used in collections.
    • Creates a companion object of the same name with an apply method with the same args as declared for the class, to allow creation of instances without using the new keyword, and with an unapply method to allow the class name to be used as an extractor in case statements.
  • Anonymous classes can be defined without reference to an extending class, in which case they extend Object:
    val x = new { def cat(a:String, b:String) = a+b }
  • The type-parameterized isInstanceOf method is used to determine if an object is an instance of a specific class:
    if (x.isInstanceOf[Double]) ...
  • Similar to isInstanceOf, a value can be cast to a specific type by using the type-parameterized asInstanceOf method:
    if (x.isInstanceOf[Double]) {
        val d = x.asInstanceOf[Double]
        //operate on d
    } else {
        //not a double
    }
    However, the above construct is not typically used; instead, that functionality is implemented with a case statement, which simultaneously tests for a type and sets a new value of that type:
    x match {
        case d:Double => //operate on d
        case _ => //not a double
    }
  • The isInstanceOf method can be used to test if an object matches a trait as well as a class. It can also be used to test an instance against a structural definition, which can be used to test if an instance implements a specific method:
    type HasAddActionListenerMethod = {
        def addActionListener(a:ActionListener)
    }
    uiElement match {
        case c:HasAddActionListenerMethod =>
            c.addActionListener(new ActionListener() {
                override def actionPerformed(ev:ActionEvent) {
                    //insert your actionPerformed code here
                }
            })
    }
  • Class literals are written classOf[MyClass] as opposed to MyClass.class as in Java.
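
Here is a compact sketch that puts several of the class rules together: an in-line primary constructor with val and private var parameters, traits mixed in using with, and a case class. All of the names (Greeter, Loud, Person, Point) are assumptions made for this illustration.

```scala
object ClassDemo {
  trait Greeter {
    def name: String                          // abstract: no "=" and no body
    def greet: String = "Hello, " + name      // traits may carry implementation
  }
  trait Loud extends Greeter {
    override def greet: String = super.greet.toUpperCase
  }

  // Primary constructor in-line; "val name" becomes a public immutable member,
  // "private var age" a private mutable one.
  class Person(val name: String, private var age: Int) extends Greeter with Loud {
    def birthday(): Int = { age += 1; age }
  }

  // A case class gets vals, equals/hashCode and a companion with apply/unapply.
  case class Point(x: Int, y: Int)

  def sum(p: Any): Int = p match {
    case Point(a, b) => a + b                 // the class name used as an extractor
    case _ => 0
  }

  def main(args: Array[String]): Unit = {
    val ann = new Person("Ann", 30)
    println(ann.greet)
    println(ann.birthday())
    println(Point(3, 5) == Point(3, 5))       // no "new" needed, structural equality
    println(sum(Point(3, 5)))
  }
}
```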


Types

  • All values in Scala are objects, so (except for compatibility with Java) there is no int/Integer or double/Double distinction. All integers are of type Int and all doubles are of type Double. (In previous versions of Scala, either upper case Int or lower case int was accepted, but convention now is to use only the upper case version, and this may be enforced by the compiler in the future.)
  • Type specifications are written as name:type rather than type name as in Java. This is to allow the type to be omitted in many cases, since Scala does type inference. For example, write n:Int rather than int n.
  • Types for generics are specified in square brackets [T] rather than in angle brackets <T> as in Java. Thus a generic type might be specified as F[A,B,C].
  • Scala supports covariant and contravariant type specifications at the definition site. These are declared with a leading + for covariant types and a leading - for contravariant types. Thus a type declaration F[+A,-B] means F is parameterized by a covariant type A and a contravariant type B.
  • Types can be specified with upper and lower bounds. The expression T<:U means type U is an upper bound for T, whereas T>:U means type U is a lower bound for type T.
  • Types can be specified with view bounds, which are similar to upper bounds: The expression T<%U means type U is a view bound for T, which requires only that T be implicitly convertible to U and can thus support more actual types.
  • A parameterized type with two type parameters, such as Pair[String,Int], can be written in infix notation by placing the type name between its two type arguments, such as String Pair Int. This makes more sense if the type name happens to use operator characters such as +. Thus when you see a type such as Quantity[M + M2], as used in the Quantity class in this file, that is the same as Quantity[+[M,M2]], so look for a type called + that takes two type parameters.
  • Existential types are supported with an expression like this:
    T forSome { type T }
    where the contents of the braces is some type declaration. This is mainly used when interfacing to Java code that either has raw types or uses Java's ? wildcard type.
  • The type T in an existential type specification can be replaced by a more complex expression:
    List[T] forSome { type T <: Component }
    In the above example, we are saying T is some type which is a subtype of Component.
  • The shorthand
    List[_]
    is the same thing as
    List[T] forSome { type T }
  • The shorthand
    List[_ <: Component]
    is the same thing as
    List[T] forSome { type T <: Component }
  • Type variables can be defined by using the type keyword. Similar to a typedef in C, the type variable simplifies code when a complicated type is used many times:
    type ALS = Array[List[String]]
    val a:ALS
    val b:ALS
    Type variables can also be abstract, in which case they must eventually be defined by a subclass.
  • A trait may include code that accesses another trait, in which case the class that includes the first trait must also include the second trait. In order to make this work, the first trait must include a "self type" referencing the second trait. The self type declaration is the first line of the body, usually declaring a type for this, but optionally using a different name in place of this:
    trait foo {
        //method for foo trait
    }
    trait bar {
        this : foo =>
        //methods for bar trait, which can access foo methods
    }
  • Inner class types can be referenced using the outer and inner class names separated by a dot (.) as in Java, or using a pound sign (#). The dot syntax specifies a path-dependent type; the pound syntax specifies the generic inner class. For example, if you had this code:
    class Outer { class Inner {} }
    then you would use Outer#Inner to refer generically to that inner class. If you had an instance x of class Outer, you would refer to the specific class Inner in that instance by using x.Inner, which is a distinct type from the Inner class within any other instance of Outer, and a subtype of the generic Outer#Inner class.
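
A few of the type rules above, sketched with made-up names (Animal, Dog, Box, Kennel): a covariant type parameter, an upper bound, a type alias, and the difference between x.Inner and Outer#Inner. Existential types and view bounds are left out on purpose, since their syntax is not accepted by every Scala version.

```scala
object TypesDemo {
  class Animal { def sound: String = "..." }
  class Dog extends Animal { override def sound: String = "Woof" }

  // Covariant in A: a Box[Dog] can be used where a Box[Animal] is expected.
  class Box[+A](val content: A)

  // Upper bound: T must be Animal or a subtype of Animal.
  def chorus[T <: Animal](xs: List[T]): String = xs.map(_.sound).mkString(" ")

  // A type alias, declared with the type keyword.
  type Kennel = List[Box[Dog]]

  class Outer { class Inner }

  def main(args: Array[String]): Unit = {
    val animalBox: Box[Animal] = new Box[Dog](new Dog)   // allowed by covariance
    println(animalBox.content.sound)
    println(chorus(List(new Dog, new Dog)))

    val kennel: Kennel = List(new Box(new Dog))
    println(kennel.size)

    val o1 = new Outer
    val o2 = new Outer
    val i1: o1.Inner = new o1.Inner     // path-dependent type
    val general: Outer#Inner = i1       // o1.Inner is a subtype of Outer#Inner
    // val wrong: o2.Inner = i1         // would not compile: o2.Inner is a distinct type
    println(general eq i1)
  }
}
```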

Function Definitions

  • The return type of a function is written after the function's parameter list and preceded by a colon, similar to the type specification for a variable. For example, a function which would be declared in Java as
    //Java code
    public String toString(StringBuffer buf)
    would be declared in Scala as
    def toString(buf:StringBuffer):String
  • Functions which do not return a value are declared as having the type Unit rather than void as in Java. If a function never returns (such as if it always throws a Throwable) the return type is Nothing.
  • A function with no parameters can be declared without parentheses, in which case it must be called with no parentheses. This provides support for the Uniform Access Principle, such that the caller does not know if the symbol is a variable or a function with no parameters.
  • The function body is preceded by "=" if it returns a value (i.e. the return type is something other than Unit), but the return type and the "=" can be omitted when the type is Unit (i.e. it looks like a procedure as opposed to a function).
  • Braces around the body are not required (if the body is a single expression); more precisely, the body of a function is just an expression, and any expression with multiple parts must be enclosed in braces (an expression with one part may optionally be enclosed in braces).
  • Vararg parameters are declared by appending an asterisk to the argument, like this:
    def printf(format:String, args:Any*):String
    The parameter gets turned into an array within the method, so in the above example the args parameter would have the type Array[Any] within the body of the printf function.
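
The definition forms in this section, collected into one sketch. The function names are invented for the example. One caveat is worth stating: in Scala 2.8 and later the vararg parameter is seen inside the method as a Seq rather than an Array, so the sketch only uses mkString, which works either way.

```scala
object FunctionDefDemo {
  // Return type after the parameter list, preceded by a colon.
  def describe(buf: StringBuffer): String = "buffer of length " + buf.length()

  // Unit plays the role of Java's void.
  def log(msg: String): Unit = println(msg)

  // Nothing: this function never returns normally.
  def fail(msg: String): Nothing = throw new IllegalStateException(msg)

  // Declared without parentheses, so it must be called without them.
  def greeting: String = "hello"

  // Vararg parameter: zero or more values of type Any.
  def joinAll(sep: String, parts: Any*): String = parts.mkString(sep)

  def main(args: Array[String]): Unit = {
    log(describe(new StringBuffer("abc")))
    log(greeting)
    log(joinAll("-", 1, "two", 3.0))
  }
}
```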

Function Calls

  • When a class has an apply method, foo(bar) (where foo is an instance of that class) translates to foo.apply(bar).
    • Likewise for an object. If you see Foo(bar) that is most likely a call to the apply method of object Foo.
    • As with any method, the apply method can be overloaded, with different versions having different signatures.
    • Functions are instances of a class (Function1, Function2, etc), so the same rule applies to any function object.
    • A method named unapply in an object definition is also treated specially: it is invoked as an extractor when the object name is used in a case statement pattern.
  • Functions with zero or one argument can be called without the dot and parentheses.
    • But any expression can have parentheses around it, so you can omit the dot and still use parentheses.
    • And since you can use braces anywhere you can use parentheses, you can omit the dot and put in braces, which can contain multiple statements.
  • Functions with no arguments can be called without the parentheses. For example, the length() function on String can be invoked as "abc".length rather than "abc".length(). If the function is a Scala function defined without parentheses, then the function must be called without parentheses.
  • By convention, functions with no arguments that have side effects, such as println, are called with parentheses; those without side effects are called without parentheses.
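
The calling rules above, sketched with a hypothetical Multiplier class and companion object. Depending on the compiler version and flags, the dot-free call of an alphanumeric method may draw a warning, but it shows the rule as the text describes it.

```scala
object CallDemo {
  class Multiplier(factor: Int) {
    def apply(n: Int): Int = n * factor
    def times(n: Int): Int = n * factor
  }
  object Multiplier {
    def apply(factor: Int): Multiplier = new Multiplier(factor)
  }

  val by3 = Multiplier(3)          // Multiplier.apply(3) on the object
  val a = by3(5)                   // by3.apply(5) on the instance
  val b = by3 times 5              // one argument: dot and parentheses omitted
  val c = "abc".length             // no arguments: parentheses omitted

  val inc: Int => Int = _ + 1      // a function object, an instance of Function1
  val d = inc(1)                   // the same thing as inc.apply(1)

  def main(args: Array[String]): Unit = {
    println(List(a, b, c, d))
  }
}
```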

Function Sugar

"Syntactic sugar" is added syntax to make certain constructs easier or more natural to specify. The step in which the compiler replaces these constructs by their more verbose equivalents is called "desugaring".
  • Functions with one parameter (including anonymous functions) are instances of type Function1[A,B], functions with two parameters are of type Function2[A,B,C], etc. The last type in the list of parameter types is the return value type, so there is always one more than the number N of parameters. A function with no parameters is an instance of Function0[A]. The name Function with no number is equivalent to Function1.
  • (A,B)=>C is shorthand ("syntactic sugar") for Function2[A,B,C].
  • A Function1[A,B] can be written as (A)=>B, or as just A=>B.
  • A Function0[A] (i.e. a function with no parameters) can be written as ()=>A. This function can be called with or without parentheses (as mentioned in the Function Definitions section).
  • A function with no parameter list can also be specified with no parentheses as =>A. This function must be called without parentheses. If you are declaring a variable x of this type, the declaration looks like x: =>A. This signature is often used for call-by-name parameters.
  • When passing an anonymous function (also called a function literal), you can use a shorthand in which you directly write the body of the function, using underscores where each of the function parameters is to go (as long as Scala has enough information to infer the type). For example, if you are folding a list to sum all the elements, you can write it the long way:
    val list = List(1,2,3,4,5)
    list.foldLeft(0) { (a:Int, b:Int) => a+b }
    or, by taking advantage of Scala's type inference (and using the same value for list):
    list.foldLeft(0) { (a, b) => a+b }
    or, using underscores as in-line parameter placeholders:
    list.foldLeft(0) { _ + _ }
    or using the equivalent method /: (which also does a foldLeft):
    (0/:list)(_+_)
    In this last form, we are taking advantage of the following shorthands:
    • The /: operator is equivalent to the foldLeft method (the List class defines both methods).
    • The foldLeft method (and the equivalent /: operator method) uses a curried parameter list, with the first parameter list having only one parameter. This allows us to take advantage of the next step.
    • Since the foldLeft method takes only one parameter (in the first parameter list), we can invoke it without the dot and parentheses.
    • Since the operator name ends with a colon, it is right-associative, so the list object goes on the right and the 0 argument goes on the left.
    • The second parameter list contains only one item (the function to apply to the fold), and the function we are passing in has only one expression, so we can use parentheses rather than braces.
    • Scala has enough information to infer the types of the two parameters in the function literal, so we do not need to specify the types of the parameters.
    • We are only using each parameter in the literal once, so we can use the underscore shorthand and not have to declare the names of the parameters in the function literal.
    • We can remove all the space without creating ambiguity.
  • If a function literal, as used in the above example, is a single method call that takes only one argument, then the method name alone may be specified. Under this rule, this:
    args.foreach( (x:Any) => println(x) )
    becomes this (the other intermediate forms given above are also valid):
    args.foreach(println)
  • Instead of using an underscore as a placeholder for an argument, if a function name is followed by a space and an underscore, the underscore is a placeholder for an entire argument list. This is a partially applied function.
  • Another example, this one from Tony Morris's Introduction to High-level Programming with Scala:
    def compose[A, B, C](f: B => C, g: A => B): A => C = (a: A) => f(g(a))
    • This defines a compose function with three type parameters A, B and C. The regular function parameters are in parentheses.
    • The parameter f is a function that takes one argument of type B and returns a value of type C. The parameter g is also a function.
    • The return type of the compose function, which appears after the colon that follows the parentheses, is a function that takes one argument of type A and returns a value of type C.
    • The body of the function appears after the = character, and is a function which has a parameter a of type A and returns the value f(g(a)).
    • Note that executing the compose function does NOT execute functions f and g, but rather returns a function object which, when invoked, will execute f(g(a)) on its argument a.
    • This would thus be used as follows:
      def plus1(n:Int):Int = n+1
      def intToParenString(n:Int) = "("+n.toString+")"
      val plus1string = compose(intToParenString,plus1)
      val x = plus1string(10) //this executes plus1 and intToParenString,
                              //sets x to the string (11)
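
A few of the remaining sugar forms from this section, sketched with invented names: the underscore placeholder, a bare method name passed as a function, a partially applied function written with a trailing underscore, and a call-by-name parameter of type =>Int. Newer compilers may warn about the trailing underscore form, and the /: operator is left out because it is deprecated in recent library versions.

```scala
object SugarDemo {
  val list = List(1, 2, 3, 4, 5)

  val longForm = list.foldLeft(0) { (a: Int, b: Int) => a + b }
  val shortForm = list.foldLeft(0)(_ + _)    // underscores as parameter placeholders

  def inc(n: Int): Int = n + 1
  val bumped = list.map(inc)                 // method name alone as the function

  def add(a: Int, b: Int): Int = a + b
  val addFn = add _                          // placeholder for the whole argument list

  // Call-by-name parameter: body is re-evaluated each time it is used.
  def twice(body: => Int): Int = body + body

  def evaluationCount(): (Int, Int) = {
    var calls = 0
    val result = twice { calls += 1; 10 }
    (result, calls)
  }

  def main(args: Array[String]): Unit = {
    println(longForm == shortForm)
    println(bumped)
    println(addFn(2, 3))
    println(evaluationCount())
  }
}
```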
Updated 2008-10-19: added classOf, added infix type notation.


Chris Bouzek said...

Very nice reference-I can see I'll have to keep this one handy.

Michael Campbell said...

Wow, epic post. Very nice. Posted at 8:47 am? Even more impressive.

Anonymous said...

Excellent reference piece. Definitely bookmarked!

s3bastien said...

Excellent article, thanks a lot !

Tim Ruffles said...

Thanks Jim, there's so much to learn in Scala that it's great to have somewhere to get instant reminders.

Adam Coates said...

Wow; this is great. This is a much better introduction to Scala for already-seasoned programmers than anything on Many thanks!

Anonymous said...

That's a bookmark