A crash course in Scala types

kingsley.daviesMay 24th, 2012Last Updated: October 22nd, 2012

0 25 15 minutes read

After many years of Java development, discovering Scala’s type system and related features was something of a departure for me. Suffice to say GADT wasn’t my first four letter utterance when learning about pattern matching on types, let alone what, when and how to use variance annotations and generalized type constraints. To kick things off, here’s a ‘small but powerful‘ few lines of buzzword bingo on the type system:

…Scala is a statically, strongly, typed language, with implicit type inference and support for structural and existential types. It also features parameterized types, abstract and phantom types and is capable of implicitly converting between datatypes. These core capabilities are utilized by context and view bounds and complimented by generalized type constraints, to provide powerful compile time contracts. Furthermore, Scala supports declaration site type annotations to facilitate invariant, covariant and contravariant type variance…

In a word, Ouch !

In the remainder of this post, I’ll try to demystify these concepts and sew the seeds of intrigue for further investigation.

In order to keep the post at a manageable length, some detail will be eluded to and links provided, (both at the end and inline) for the reader to pursue independently. Time and space permitting, I’ll try to briefly cover the hows and whys of some of these features to give context of their practical importance and implications. Apologies in advance for the length of the post, but there’s a lot of ground to cover and lots of opportunity for the reader to skim. Keep with it as there’s gold in them there types, unsurprising given the adventurous depth of features that the language tries to mine.

So first up, what is a type system ?

One authors definition (which I found thorough, if not a slightly cryptic):

‘..A type system is a tractable syntactic method for proving the absence of certain program behaviors by classifying phrases according to the kinds of values they compute’ from Ben Pierce’s ‘Types and Programming Languages‘ – a bible for anyone interested in type system theory..

And again, Ouch !

I prefer the lay interpretation of this whereby a type provides some sort of label to a type system. In turn, this enables the type system to then prove (or constrain) some property of the programs’ behaviour.

Practically a type system let’s either the compiler [typically] or runtime add some meaning to the data and values/variables, (something I’ll henceforth refer generically as fields/elements due to the overloading of the terms val and var in Scala), in order to react or fail appropriately.

2 fat ladies…

So deconstructing the buzzword bingo above, the primary attribute listed is that of being statically typed. So what does this mean ? Static typing provides COMPILE TIME constraints, checks and balances and, as such, provides the first line of defence, QA and feedback against program errors. The converse of this is dynamic typing, where type attribution to elements is determined at runtime. Besides the early feedback loop, other commonly cited benefits of static typing are:

better performance and ability to optimise the code – as the compiler can perform more optimisations and removes the need to do type checks at runtime.
better implicit and explicit documentation support – as method signatures make this implicit in the code and explicit in any code generated from the source, also type information can be used to convey the authors intent.
better tooling support and ability to analyse the code – as the tools can check the types being passed between methods and into constructors etc..
better support for correctness (see Rice’s theorem, this slide deck on structural induction and the type checker and the Curry-Howard isomorphism) – as we’ll see later in this piece, correctness can be further supported by judicious use of type constriants.
better abstraction/modularity – as support for abstract types allows the author to frame the problem differently and (potentially), in a more modular fashion.

Having said that, in practise, few languages are exclusively dynamic or statically typed. Given this list of static type system features, why would anyone use dynamic typing ? Dynamic typing is typically considered to provide greater flexibility for building loosely coupled systems and for rapid prototyping. These are fairly natural benefits of dynamic languages as they do not have to adhere to the constraints imposed by the compiler, but this comes at the cost of the static typing benefits aforementioned.

Learning to let go… letting go to learn…

Again, coming from a Java background, my natural instincts assumed: a) I had a reasonably good understanding of static typing having spent many years doing Java development and; b) that statically typed languages are inherently bloated and require a lot of boilerplate code. From the Scala perspective, it is both interesting to see how quickly you hit the limits of Javas support for types ‘out-of-the-box’ (i.e. without adding external libraries or significant hacking of the core system), and how Scala shatters the assumption that static typing == code bloat.

One such cruft cutting feature in Scala is evident with type inference support. With this, elements only need to provide their type information ‘on the right’ when they are declared. So the syntax for Scala element declaration tends be ordered by salience [IMHO] with the following precedence in place: the elements mutability status; the name of the element, then; the type information (which is held exclusively, (and not repeated.. that’s right, not repeated

) ‘on the right’.

val a = 1 // this type gets inferred to be of type Int
val b = "b" // this type in inferred to be a String
val c = 2.0 // this type is inferred to be a Double

case class SomeThing
class SomeOtherThing

val d = SomeThing // this type is instantiated as SomeThing. Not no need for the 'new' keyword as this is a case class
val e = new SomeOtherThing // This type requires the new keyword as no factory method is created for non case classes

A further point of note around the type inference strategy used by Scala is that “local type inference” aka “flow type inference” that is used instead of the Damas-Milner (aka Hindley-Milner aka HM) strategy used in other statically typed, implicitly inferred languages (see also System F and its variants).

Types cubed

Indeed, much has been made of the complexity and richness of Scala’s type support, with Scala’s abstract types being a cause of confusion and much distress (e.g. see this post for an exmaple). As the name suggests, abstract types in Scala allow for types to be referred to in the abstract and hence used as field level members of classes. This means that type fields can act as placeholders for types to be realized at a later date, (making it possible to lazily design the concrete types as the solution to the problem uncovered.. providing just enough (lagom™) design input). In a sense they are similar to type references used in parameterized types (i.e. types that require a type parameter for their declaration such as Java generics), though broken out of their Collection containers. (Note the idiomatic distinction between parameterized types and abstract types tends to distinguish between the type indicator being used for a collection vs other scenarios See ‘A Statically Safe Alternative to Virtual Types‘).

import scala.collection.GenSeq

trait SimpleListTypeContainer {
    type Simple                         // declare an abstract type with the label Simple
    type SimpleList <: GenSeq[Simple]   // Constrain the Simple List abstract type based on the previously defined abstract type
}

An additional feature of abstract types is that type bounds can be used, (i.e the actual concrete types that are permitted for a declaration can be constrained programmatically and enforced at compile time). This allows for the intent in the code to be made explicit, such as when trying to suggest certain types of specialisation (e.g. family polymorphism).

Coupled with self types, this makes for a powerful set of type tools. Self types, allow ‘this’ references to be explicitly tied to another class using the ’self’ keyword, (so within the code it is possible to make the ‘this’ reference mean a different type than the actual containing type).

trait DB { def startDB: Unit }  // defines an abstract start for a DB component
trait MT { def startMT: Unit }  // defines an abstract start for an MT component

trait Oracle extends DB { def startDB = println("Starting Oracle") } // Some actual concrete instances.. dummied for example
trait Service extends MT { def startMT = println("Starting Service") }

trait App {
    self: DB with MT => // declare that self for App refers to a DB with MT, so that we have access to the startXX ops

    startDB
    startMT
}

object DummyApp extends App with Oracle with Service // create a concrete instance with an actual DB and MT instance
DummyApp.run // run it and see "Starting Oracle" then "Starting Service"

Self types have a number of uses, such as: providing traits or abstract classes visibility to the fields and/or methods of the class they are masquerading as/ mixed into; as a type-safe, (i.e. compile time checked) way to perform declarative dependency injection (see the cake pattern).

One little duck

Scala also facilitates type-safe duck typing via structural types (see here for a side by side comparison of structural typing and duck typing).

class Duck {
    def squawk = println("Quack")
    def waddle = println("Duck walk")
}
class Pengiun {
    def squawk = println("Squeek")
    def waddle = println("Penguin walk")
}
class Person { }

// everybody's heard about the word...
def birdIsTheWord(bird : { def squawk; def waddle}) = {
    bird.squawk
    bird.waddle
}

birdIsTheWord(new Duck())      // prints "Quack" then "Duck walk"
birdIsTheWord(new Penguin())   // prints "Squeek" then "Penguin walk"
birdIsTheWord(new Person())    // Will not compile

The ability, to use classes according to some feature(s) of their structure, (rather than by name), has a number of uses, such as: in conjunction with implicits (still to be come) as part of a Smart Adapter Pattern; for creating quick ad-hoc prototyping code; as an enabler for method reuse in cases where client classes are unrelated, but share an internal structural feature (Note: the typical example touted here being a Gun and a Camera being unrelated items, but both having a shoot() method ! An example that has the dual purpose of also highlighting the inherent dangers of structural and dynamic typing per se.. for some reason, this example always reminds me of the early 90’s films Let him have it).

An observation thus far is that the few aforementioned core type system building blocks have the effect of triggering a shift in mindset and approach to problems. In fact, the notion of ‘thinking in Scala’ is not one of syntactic complexity (IMHO), but rather what is the ‘best‘ (by any subjective measure of best) idiomatic use of the extensive feature set provided. Personally, I’ve found myself deconstructing problems into their expected inputs and desired outputs, and looking at modelling my problem domain in terms of types and operations that happen on those types as opposed to looking at the World through Object tinged spectacles.

Luckily there are [link to bottom of page]some[/link] resources that help when trying to investigate further, and some idioms (like theorems) come for free ! One such construct is that of dependent types, for which Tuple instantiation provides the simplest example. Extending the notion of dependent types and building upon the inherent nesting capabilities, Scala also supports path-dependent types. As the name suggests, any types created are pivoted around the namespace in which they are created. Idiomatically, path dependent types have been used in making component oriented software and in the Cake Pattern method for handing dependency injection. Interestingly, path and value dependent types can also be interleaved as this example demonstrates.

// First a simple example of Tuple creation

val myTuple2 = ("A", 1)
val myTuple3 = ("B", 2, true) // creates a Tuple3 of String, Int, Boolean

// Not on to the path dep types stuff

trait BaseTrait {
 type T
 val baseA : T
}

class ChildA[String](param : String) extends BaseTrait { 
 type T = String
 override val baseA = param
    println(baseA) 
}

val childA = new ChildA("A")

type B = String

class ChildB[String](param : String) extends BaseTrait { 
 // type T = B  // Won't compile - baseA (below) has an incompatible type !
 // type T = childA.T // Won't compile - baseA (below) has an incompatible type !
 type T = String
 override val baseA = param
    println(baseA) 
}

val childB = new ChildB("B")

Variance annotations.. ooh.. pardon ?

Generic types in Scala are invariant by default (i.e. they can only accept the exact type as a parameter with which they were declared). Scala provides other variance annotations to permit covariance type usage (i.e. permitting any children of the declared types), and contravariance of declared types (i.e. parents of the declared type are also permitted by no child types). These variance declarations (aka variance annotations) are specified at the declaration site (i.e. where the initial parameterized type is declared), as opposed to at usage site (as is the case in Java e.g. each instantiation is free to define what is expected in the Parameterized type for the specific use). So what does all this mean and how does this manifest itself in the code ? And what is the use of having such constraints on parameterised types ? Probably the clearest explanation of this I have read is from O’Reillys excellent programming Scala. To supplement the description given in programming scala, let’s walk through the implications of their func2-script scala:

// Taken from the excellent tutorial here: 
// http://programming-scala.labs.oreilly.com/ch12.html#VarianceUnderInheritance
// Note this sample leverages the Function1 trait: Function1[-T, +R]

// WON'T COMPILE

class CSuper                { def msuper = println("CSuper") }
class C      extends CSuper { def m      = println("C") }
class CSub   extends C      { def msub   = println("CSub") }

def useF(f: C => C) = {
  val c1 = new C     // #1
  val c2: C = f(c1)  // #2
  c2.msuper          // #3
  c2.m               // #4
}

useF((c: C)      => new C)        // #5
useF((c: CSuper) => new CSub)     // #6
useF((c: CSub)   => {println(c.msub); new CSuper})   // #7: ERROR!

Taken from: O’Reilly Programming Scala

Given the trait Function1 has the following declaration Function1[-T, +R], the variance annotation for the type for the single argument parameter [the -T bit of the declaration], is contravariant and therefore accepts the explicit declared type and any parent type of T, whereas the the return type [the +R bit] is covariant and so enforces that any return parameter is either of type R or a child type of R i.e. the return type ‘is-an’ R. What this means is that we contractually expect any function to be able to take (as a minimum) an instance of a T and return an R. Hence any client code using this function, can be confident in assuming it will be able to call any methods advertised on a type R on the value returned from the function. The contravariant input parameter of type T here also allows for broader application of the function, i.e. a more broad/general purpose function could be substituted for an instance that just accepts the type T. For example, given the following (pseudo) Function1 description: Function1(Ninja, Pirate), a substitute (strictly speaking a child function type) function that is able to take a more generic type (such as a Person i.e. Function1(Person, Pirate)) and return a Pirate would be a valid substitue function in accordance with the advertised contract. By the same token any function that could accept a Ninja and return Long John Silver would be a valid conversion for a Ninja to a Pirate. In this instance the function is actually returning a specialisation of the Pirate return type. While covering parameterized types in Scala, it’s a good juncture to mentioning context bounds in this… erm context ! Context bounds extend the functionality of a given type by using the seed type to be used as the type for a parameterised type. For exmaple, given class A and a trait B[T], context types provide syntactic sugar to allow for the an instance of B[A], (hence allowing method calls from type B to be issued against an instance of type A). The syntactic sugar here would be def someMethod[A: B](a: A) = implicitly[B[A]].someMethodOnB(a) (Note: there is an excellent writeup of context bounds provided by Debasish Ghosh here).

One more time… 79 !

Having previously discussed abstract types, it’s worth seeiung how they relate to the variance annotations just mentioned. Functionally, (though with a degree of shoehorning), abstract types and variance annotations could be used interchangeably. In practise, the two constructs have different histories and different intents. Variance annotations essentailly apply to construct decalrations and are commonly used for parameterized types (e.g. in declaring Lists of things, or Options of specific types). In the Java world, parameterized types have manifested as Java Generics and their use is much more widespread within the OO domain where inheritence is a key feature.

A corollary of abstact types and type bounds support in Scala is the availability of phantom types. Essentially, phantom types are type variables that are not instantiated at runtime, but used by the compiler to enforce constraints in the source code. As such, phantom types can be used for type level programming in Scala and adding another tier of support for program correctness. (e.g. a sample use of phantom types in type safe reflection and a type-safe builder pattern with phantom types in Scala). View bounds are related to phantom types as they further constrain the use of types based on their ability to be be converted into other types. An example of this use of view bounds was provided in the [link]structural typing example[/link] earlier. View bounds are also inherent to the ‘pimp my library’ pattern, where a class can have its functionality extended by adding functionality from other classes, but [importantly] without changing the original class. This means that the original class can be returned from a call even though it may get used under the guise of another class. Also, although Scala supports implicit conversion of types, (most explicitly seen via view bounds), it is still a strongly typed language, (i.e. only explcitly defined implicit conversions are possible, the compiler doesn’t automagically try to deduce conversions and use them, which can lead to unpredictable runtime consequences).

Making it special…

As a quick recap, so far we’ve mentioned that Scala is a statically typed language and presented some of the pros and cons of being such. Although Scala is statically and strongly typed, powerful type inference, structural typing and implicit type conversions provide a great amount of flexibility and remove lots of unneccesary boilerplate code. Also, as Scala is predominantly a language and compiler, (i.e. discounting the awesome Scala community, libraries and frameworks that are also available), we’ve seen that one of the big wins from a users perspective is in the increased opportunity to program for correctness granted by the ability to contrain, bound and extend types. Thus far we have focussed on type contraints via variance annotations for parameterized types and view bounds for methods. From Scala 2.8 onwards parameterized types have been afforded even more constriant capabilities via generalised type constraint classes. These classes enable further specialisation in methods, and complement context bounds, as follows:

A =:= B asserts that A and B must be the equal
A <:< B asserts that A must be a subtype of B

A sample usage of these classes would be to enable a specialisation for addition of numeric elements in a collection, or for bespoke print formatting, or to allow for customised liability calculations on specific bet or fund types in a traders portfolio. For example:

case class PrintFormatter[T](item : T) {
 def formatString(implicit evidence: T =:= String) = { // Will only work for String PrintFormatters
     println("STRING specialised printformatting...")
 }
    def formatPrimitive(implicit evidence: T <:< AnyVal) = { // Will only work for Primitive PrintFormatters
  println("WRAPPED PRIMITIVE specialised printformatting...")
 }
}

val stringPrintFormatter = PrintFormatter("String to format...")
stringPrintFormatter formatString
// stringPrintFormatter formatPrimitive  // Will not compile due to type mismatch

val intPrintFormatter = PrintFormatter(123)
intPrintFormatter formatPrimitive
//  intPrintFormatter formatString   // Will not compile due to type mismatch

House !

Finally, and as a segue into a somewhat contrived example, it’s worth noting that Scala supports existential types primarily as a means of integrating better with Java (for both generics and primitive type support). In an effort to compensate for the type erasure as implemented in Java’s generics support, Scala includes a feature called ‘Manifests’ to (effectively) keep a record of the classes used in paramaterised types in use. So, let’s close with an extended example using Manifest based (pseudo i.e. compiler derived) reification and specialisation constriants on a few helper methods to show some of Scala’s type system tricks and support in action.

class ReifiedManifest[T <: Any : Manifest](value: T) { 

  val m = manifest[T]   // So at this point we have the manifest for the Parameterized type
                        // At which point we could either do an if() expression on what type is contained in our manifest

  if (m equals manifest[String]) {        
    println("The manifest contains a String")
  } else if (m <:< manifest[AnyVal]) {        // A subtype check using the <:< operation on the Manifest trait
    println("The manifest contains a subtype of AnyVal")
  } else if (m <:< manifest[AnyRef]) {  
    println("The manifest contains a subtype of AnyRef")
  } else {
    println("Not sure what type is contained ?")
  }
  
  // or we could grab the erased type from the manifest and do a match on some attribute of the type              
  
  m.erasure.toString match {
    case "class java.lang.String" => println("ERASURE: pattern matches on a String")
    case "double" | "int"  => println("ERASURE: pattern matches on a Numeric value.")
    case x => println("ERASURE: has picked up another type not spec'd in the pattern match: " + x)
    }
}

new ReifiedManifest("Test")                     // Contains a String  / matches on a String
new ReifiedManifest(1)                          // Contains an AnyVal / matches on a Numeric
new ReifiedManifest(1.2)                        // Contains an AnyVal / matches on a Numeric
new ReifiedManifest(BigDecimal("3.147"))        // Contains an AnyRef / matches on a an unspecified type

The final word

This is really just scratching the surface of both the generic power and complexities of type systems and what is possible and how in Scala. There is a wealth of resources and further reading available in this area, but hopefully some of those listed below should help. I’ve found that having a grasp of the fundamentals of types in Scala has helped me understand other features of the language and actually affected how I approach problems and what I expect from their solutions. I hope this write up has been useful and, until next time, happy hacking !

Links

Reference: A crash course in Scala types from our JCG partner Kingsley Davies at the Scalabound blog.