Scala

First steps with Scala, say goodbye to bash scripts…

Those who know me are aware that I’ve been following the Play framework, and actively taking part in its community, for a couple of years.

Play framework 2.0 is right around the corner, and its core is written in Scala, so it’s a wonderful opportunity to give this object-oriented / functional hybrid beast a try…
Like many others, I will pick a very simple script to take my first steps…

Finding an excuse to give Scala a try

A couple of friends and I are translating the Play framework documentation into Spanish (go have a look at it at http://playdoces.appspot.com/; by the way, you are more than welcome to collaborate with us).
The documentation is composed of a bunch of .textile files, and I had a very simple and silly bash script to track our progress. Every file that has not yet been translated has the phrase “todavía no ha sido traducida” in its first line.

echo pending: `grep "todavía no ha sido traducida" * | wc -l` / `ls | wc -l`

Which produced something like

pending: 40 / 63

Pretty simple, right?

I just wanted to develop a simple Scala script to count the translated files, and also their size, to know how much work we had ahead.

Scala as a scripting language

Using Scala as a scripting language is pretty simple. Just enter some Scala code in a text file, and execute it with “scala file.scala”. You can also try it in the interactive interpreter, better known as the REPL (well, it’s not really an interpreter, but a Read-Evaluate-Print Loop, that’s where the REPL name comes from).

On Linux, you can also execute scripts directly from the shell by marking the Scala file as executable and adding these lines at the beginning of the file.

#!/bin/sh
exec scala "$0" "$@"
!#

Tip: you can speed up script execution A LOT by adding the -savecompiled option, as described in the scala command man page, like this:

#!/bin/sh
exec scala -savecompiled "$0" "$@"
!#

Classes and type inference in scala

So I created a DocumentationFile class, with a name, a length and an isTranslated property.

class DocumentationFile(val file: File) {

  val name = file.getName
  val length = file.length
  val isTranslated = (firstLine.indexOf("Esta página todavía no ha sido traducida al castellano") == -1)

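  // Read only the first line, always closing the reader; an empty file yields ""
  def firstLine = {
    val reader = new BufferedReader(new FileReader(file))
    try Option(reader.readLine).getOrElse("") finally reader.close()
  }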

}

Scala takes away a lot of boilerplate code. The constructor is right there, along with the class declaration. In our case, the DocumentationFile constructor takes a java.io.File as argument.

Scala also makes heavy use of type inference to relieve us of having to declare every variable’s type. That’s why you don’t have to specify that name is a String, length a Long and isTranslated a Boolean. You still have to declare the types of method arguments, but usually you can omit them everywhere else.
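To make the inference a bit more tangible, here is a small sketch that declares the same kind of values twice, once with inferred and once with explicit types. The file name "home.textile" and the describe helper are made up for illustration, and the file does not need to exist.

```scala
import java.io.File

// Hypothetical path, used only for illustration; the file need not exist.
val file = new File("home.textile")

// Inferred types: the compiler works them out from the right-hand side.
val name = file.getName                      // String
val length = file.length                     // Long
val isTextile = name.endsWith(".textile")    // Boolean

// The same declarations with the types written out explicitly.
val nameExplicit: String = file.getName
val lengthExplicit: Long = file.length
val isTextileExplicit: Boolean = nameExplicit.endsWith(".textile")

// Method parameters always need a type; the result type (String) is inferred.
def describe(f: File) = f.getName + " (" + f.length + " bytes)"

println(describe(file))
```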

Working with collections

Next I needed to get all textile files from the current directory, instantiate a DocumentationFile for each of them, and save them in an Array for later processing.

import java.io._

val docs = new File(".").listFiles
  .filter(_.getName.endsWith(".textile"))   // process only textile files
  .map(new DocumentationFile(_))

Technically speaking, it is just one line of code. The “_” is just syntactic sugar; we could have written it in a more verbose way like this:

val docs = new File(".").listFiles
  .filter( file => file.getName.endsWith(".textile") )   // process only textile files
  .map( file => new DocumentationFile(file) )

Or if you are a curly braces fan:

val docs = new File(".").listFiles
  .filter { file => 
    file.getName.endsWith(".textile")         // process only textile files
  }   
  .map { file => 
    new DocumentationFile(file)
  }
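If you want to convince yourself that the three spellings really are equivalent, a quick sketch like the following can be pasted into a script. The file names are made up and stand in for whatever listFiles would return.

```scala
// Hypothetical file names, standing in for the result of listFiles.
val names = Array("home.textile", "logo.png", "routes.textile")

// Placeholder syntax: each "_" stands for the single parameter of the function.
val withPlaceholder = names.filter(_.endsWith(".textile")).map(_.toUpperCase)

// The same functions with a named parameter.
val withNamedParam = names.filter(n => n.endsWith(".textile")).map(n => n.toUpperCase)

// And with curly braces instead of parentheses.
val withBraces = names.filter { n => n.endsWith(".textile") }.map { n => n.toUpperCase }

println(withPlaceholder.mkString(", "))                  // HOME.TEXTILE, ROUTES.TEXTILE
println(withPlaceholder.toList == withNamedParam.toList) // true
println(withNamedParam.toList == withBraces.toList)      // true
```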

Higher order functions

Once we have all textile files, we’ll need the translated ones.

val translated = docs.filter(_.isTranslated)

Here we are passing a function as a parameter to the filter method (a method that takes a function as an argument is what is called a higher-order function). That function is evaluated for every item in the Array, and if it returns true, that item is added to the resulting Array. The “_.isTranslated” stuff is once again just syntactic sugar. We could have also written the function as follows:

val translated = docs.filter( (doc: DocumentationFile) => doc.isTranslated )
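The function handed to filter does not have to be written inline. As a rough sketch, it can also live in a value or be an ordinary method. Doc, translatedOnly and isPending below are hypothetical names, with Doc standing in for DocumentationFile so the snippet runs on its own.

```scala
// Hypothetical stand-in for DocumentationFile, so the sketch runs on its own.
case class Doc(name: String, isTranslated: Boolean)

val docs = Array(Doc("home", true), Doc("routes", false), Doc("cache", true))

// A function stored in a value...
val translatedOnly: Doc => Boolean = doc => doc.isTranslated

// ...and an ordinary method with the same shape.
def isPending(doc: Doc): Boolean = !doc.isTranslated

val translated = docs.filter(translatedOnly)
val pending = docs.filter(isPending)   // the method is converted to a function here

println(translated.map(_.name).mkString(", "))   // home, cache
println(pending.map(_.name).mkString(", "))      // routes
```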

Functional versus imperative: To var or not to var

Now I need to calculate the number and size of the translated and not yet translated files. Counting the files is pretty easy: I just have to use “translated.length” to know how many files have been translated so far. But to get their total size I have to sum the size of each one of them.

This was my first attempt:

var translatedLength = 0L
translated.foreach( translatedLength += _.length ) 

In Scala we can declare variables with the “var” and “val” keywords: the former are mutable, while the latter are immutable. Mutable variables are read-write, while immutable variables can’t be reassigned once their value has been established (think of them like final variables in Java).
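A tiny sketch of the difference, with made-up names (total, counter, buffer). Note that val only prevents reassignment; it does not freeze the object it refers to.

```scala
val total = 10L      // immutable: writing `total = 11L` is rejected by the compiler
var counter = 0L     // mutable: reassignment is allowed
counter += total
counter += 5

// val fixes the reference, not the object it points to.
val buffer = new StringBuilder
buffer.append("a").append("b")

println(counter)   // 15
println(buffer)    // ab
```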

While Scala allows you to work in an imperative or a functional style, it really encourages the latter. Programming in Scala, kind of the Scala bible, even teaches how to refactor your code to avoid the use of mutable variables and get your head used to a more functional programming style.
These are several ways I’ve found to calculate it in a more functional style (thanks to Stack Overflow!):

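// foldLeft (not fold) is needed here, because the accumulator (Long) and the elements (DocumentationFile) have different types
val translatedLength: Long = translated.foldLeft(0L)( (acum: Long, element: DocumentationFile) => acum + element.length )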

//type inference to the rescue
val translatedLength = translated.foldLeft(0L)( (acum, element) => acum + element.length )

//syntactic sugar
val translatedLength = translated.foldLeft(0L)( _ + _.length )

// yes, if is also an expression, just like Java's a ? b : c operator
val translatedLength = if (translated.length == 0) 0L else translated.map(_.length).sum

I’ve finally settled on this simple and short form:

val translatedLength = translated.map(_.length).sum
val docsLength = docs.map(_.length).sum
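As a sanity check, here is a sketch that runs the fold and the map/sum variants on sample data and confirms they agree. It also shows that sum on an empty collection is simply 0, which is why the if guard is not really needed. SizedDoc and its values are invented for the example.

```scala
// Hypothetical stand-in for DocumentationFile, with made-up sizes.
case class SizedDoc(name: String, length: Long)

val translated = Array(SizedDoc("home", 1200L), SizedDoc("cache", 800L))
val noDocs = Array.empty[SizedDoc]

val viaFoldLeft = translated.foldLeft(0L)(_ + _.length)
val viaSum = translated.map(_.length).sum
val emptySum = noDocs.map(_.length).sum

println(viaFoldLeft)   // 2000
println(viaSum)        // 2000
println(emptySum)      // 0
```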

Default parameters and passing functions as arguments

Now I have all the information I need, so I just have to show it on screen. I also wanted to show the file sizes in kilobytes.
Once again, this was my first attempt:

println( 
  "translated size: " + asKB(translatedLength) + "/" + asKB(docsLength) + " " + 
  translatedLength * 100 / docsLength + "% "
)

println( 
  "translated files: " + translated.length + "/" + docs.length + " " + 
  translated.length * 100 / docs.length + "% "
)

def asKB(length: Long) = (length / 1000) + "kb"

And this was the output:

translated size: 256kb/612kb 41% 

translated files: 24/64 37% 
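A side note on the arithmetic: everything here is integer division, so results are truncated rather than rounded. The sketch below uses the figures from the output above; the variable names and the rounded variant are only for illustration, and asKB is rewritten with string interpolation.

```scala
// Figures taken from the output above: 24 of 64 files, 256kb of 612kb.
val filesPercent = 24 * 100 / 64                 // 2400 / 64 = 37.5, truncated to 37
val sizePercent  = 256 * 100 / 612               // truncated to 41
val divideFirst  = 24 / 64 * 100                 // 24 / 64 is already 0, so this is 0
val rounded      = math.round(24 * 100.0 / 64)   // floating point, then rounded: 38

// The size helper truncates in the same way.
def asKB(length: Long) = s"${length / 1000}kb"

println(s"$filesPercent $sizePercent $divideFirst $rounded")   // 37 41 0 38
println(asKB(1999L))                                            // 1kb
```

Multiplying by 100 before dividing is what keeps the percentage from collapsing to 0.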

Well, it worked, but it could definitely be improved: there was too much code duplication.
So I created a function that took care of it all:

def status(
  title: String = "status", 
  current: Long, total: Long, 
  format: (Long) => String = (x) => x.toString): String = {

  val percent = current * 100 / total

  title + ": " + format(current) + "/" + format(total) + " " +
  percent + "%" +
  " (pending " + format(total - current) + " " +
  (100-percent) + "%)"
}

The only tricky part is the format parameter. It’s just a function passed as an argument (which makes status a higher-order function), and by default it simply converts the given number to a String.
We use that function like this:

println( 
  status("translated size", translatedLength, docsLength, (length) => asKB(length) ) 
)

println( 
  status("translated files", translated.length, docs.length) 
)
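One detail worth sketching: because title has a default value but comes first in the parameter list, the only way to actually fall back on that default is to pass the other arguments by name. The snippet repeats the status function so it runs on its own. The byte counts are made up, and the inline kb formatter stands in for asKB.

```scala
def status(
  title: String = "status",
  current: Long, total: Long,
  format: Long => String = x => x.toString): String = {

  val percent = current * 100 / total

  title + ": " + format(current) + "/" + format(total) + " " +
  percent + "%" +
  " (pending " + format(total - current) + " " +
  (100 - percent) + "%)"
}

// Positional arguments, default format.
val files = status("translated files", 24, 64)

// Named arguments, so both title and format fall back to their defaults.
val unnamed = status(current = 24, total = 64)

// Custom format function; the byte counts are hypothetical.
val sizes = status("translated size", 256000L, 612000L, n => (n / 1000).toString + "kb")

println(files)     // translated files: 24/64 37% (pending 40 63%)
println(unnamed)   // status: 24/64 37% (pending 40 63%)
println(sizes)     // translated size: 256kb/612kb 41% (pending 356kb 59%)
```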

And that’s it.
It’s really easy to achieve this kind of stuff using Scala as a scripting language, and along the way you may learn a couple of interesting concepts and take your first steps into functional programming.
This is the complete script; there is also a GitHub gist, and you can find it in the Play Spanish documentation project.

#!/bin/sh
exec scala "$0" "$@"
!#

import java.io._

val docs = new File(".").listFiles
  .filter(_.getName.endsWith(".textile"))   // process only textile files
  .map(new DocumentationFile(_))

val translated = docs.filter(_.isTranslated)    // only already translated files

val translatedLength = translated.map(_.length).sum
val docsLength = docs.map(_.length).sum

println( 
  status("translated size", translatedLength, docsLength, (length) => asKB(length) ) 
)

println( 
  status("translated files", translated.length, docs.length) 
)

def status(
  title: String = "status", 
  current: Long, total: Long, 
  format: (Long) => String = (x) => x.toString): String = {

  val percent = current * 100 / total

  title + ": " + format(current) + "/" + format(total) + " " +
  percent + "%" +
  " (pending " + format(total - current) + " " +
  (100-percent) + "%)"
}

def asKB(length: Long) = (length / 1000) + "kb"

class DocumentationFile(val file: File) {

  val name = file.getName
  val length = file.length
  val isTranslated = (firstLine.indexOf("Esta página todavía no ha sido traducida al castellano") == -1)

  override def toString = "name: " + name + ", length: " + length + ", isTranslated: " + isTranslated
  
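  // Read only the first line, always closing the reader; an empty file yields ""
  def firstLine = {
    val reader = new BufferedReader(new FileReader(file))
    try Option(reader.readLine).getOrElse("") finally reader.close()
  }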

}

Reference: First steps with Scala, say goodbye to bash scripts…   from our JCG partner Sebastian Scarano  at the Having fun with Play framework! blog


3 Comments
Sergey
9 years ago

Well, this article is not about Scala for scripting but about Scala syntax.

Only the first paragraph is about scripting.

But that might be any language, even Java or C.

Daniel Macias
9 years ago

Hey Sebastian,

You can take it a whole step further by using sbt Scripts:

http://www.scala-sbt.org/0.13/docs/Scripts.html

It will allow you to use sbt to launch your scripts. Within your script, you can define dependencies and sbt will resolve them for you! Also, sbt’s IO libraries allow you to simplify file operations greatly, e.g.

Writing: IO.writeLines(file1, xs)

or

Recursive Directory Search: (srcDir ** "*.scala").get

Ruben
7 years ago
Reply to  Daniel Macias

Thanks thanks thanks :)
