You may have heard about Scala. This language is taking the big data world by storm. If you’re an object-oriented/procedural programmer, this post will show you how to carry the concepts you already know over to Scala’s lightweight functional syntax.
Here are some of the highlights of Scala:
- Optionally functional, optionally object-oriented
- Runs on the JVM (so Scala programs can run anywhere a Java program can)
- High performance from concise code (parallelism can take very little effort, though caveats exist)
Getting started with Scala is easy. I recommend installing IntelliJ IDEA Community Edition with its Scala plugin. It provides helpful type checking and does a pretty good job holding your hand.
Functional programming prides itself on immutability. Scala supports this by providing two kinds of variable declaration, `var` and `val`. A `var` can be reassigned after it is initialized; a `val` cannot. Note that `val` fixes the reference, not the object it points to, so a `val` that holds a mutable collection can still see its contents change.
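To make the difference concrete, here is a minimal sketch. The object name `ValVarSketch` and the variables inside it are made up for illustration, and `ArrayBuffer` is used only as a convenient mutable collection from the standard library.

```scala
import scala.collection.mutable.ArrayBuffer

object ValVarSketch {
  // A `var` can be reassigned after it is initialized.
  var counter: Int = 1
  counter = counter + 1

  // A `val` cannot be reassigned; uncommenting the second line is a compile error.
  val greeting: String = "hello"
  // greeting = "goodbye"

  // A `val` fixes the reference, not the contents: this buffer is still mutable.
  val buffer: ArrayBuffer[Int] = ArrayBuffer(1, 2)
  buffer += 3

  def main(args: Array[String]): Unit = {
    println(counter)
    println(greeting)
    println(buffer)
  }
}
```

In practice, reaching for `val` with immutable collections first and switching to `var` only when reassignment is really needed tends to keep code easier to reason about.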
Flipping the switch: learning to think functional
At first blush, it’s easy to think through and implement solutions in Scala using a procedural approach. Scala allows you to use as much OO style as you like, and as much functional style as you prefer. As a result, it’s easy to fall back into old habits.
The Pattern: transforming a collection of objects
Take this C# example. Here is a Person class with a first name, last name, and an age.
    class Person
    {
        public String FirstName { get; set; }
        public String LastName { get; set; }
        public int Age { get; set; }
        public Person(String _FirstName, String _LastName, int _Age)
        {
            this.FirstName = _FirstName;
            this.LastName = _LastName;
            this.Age = _Age;
        }
    }
Here is code that instantiates some people and then builds a list of strings in the form “Last Name, First Name”.
    var people = new List<Person>()
    {
        new Person("Brad", "Kovach", 26),
        new Person("Jane", "Doe", 29),
        new Person("John", "Doe", 30)
    };
    var result = new List<String>();
    foreach (Person person in people)
    {
        result.Add(String.Format("{0}, {1}", person.LastName, person.FirstName));
    }
Note that the `result` list had to be created explicitly, and the `foreach` loop had to name both the item and the collection it comes from. This code works as expected.
Let’s accomplish the same thing using Scala and functional syntax.
case class Person(FirstName: String, LastName: String, age: Int)
Using a “case class” simplifies code because the compiler generates the construction plumbing for you (no `new` required), along with value-based equality, `hashCode`, and `toString`. Beyond that, the class carries no logic of its own.
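As a rough illustration of what a case class gives you for free, consider the sketch below. The wrapper object `CaseClassSketch` and the value names are hypothetical; `Person` is the case class from this post.

```scala
object CaseClassSketch {
  case class Person(FirstName: String, LastName: String, age: Int)

  // No `new` needed: the compiler generates a companion `apply` method.
  val brad = Person("Brad", "Kovach", 26)
  val sameBrad = Person("Brad", "Kovach", 26)

  // Equality compares field values, not object identity.
  val equalByValue: Boolean = brad == sameBrad

  // `copy` returns a new instance with selected fields changed; `brad` is untouched.
  val olderBrad = brad.copy(age = 27)

  def main(args: Array[String]): Unit = {
    println(brad)          // uses the generated toString
    println(equalByValue)
    println(olderBrad.age)
  }
}
```

Because `copy` returns a new instance instead of modifying the old one, case classes fit naturally with the `val`-first style described earlier.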
Then we build a List and transform it using `map`. `map` applies a function to every element of a collection and returns a new collection holding the results. The part with `person =>` defines an anonymous function that takes a person as input. The function does not say “return” because the value of a Scala function is the value of its last expression.
    // instantiate the list (List is immutable; "val" prevents reassignment)
    val people = List(
      Person("Brad", "Kovach", 26),
      Person("Jane", "Doe", 29),
      Person("John", "Doe", 30)
    )
    // transform the list
    val result = people.map(person => "%s, %s".format(person.LastName, person.FirstName))
Notice that no “result” list had to be created up front to accomplish this transformation. `map` builds the new list and returns it, so that bookkeeping stays behind the scenes.
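For readers who want something they can paste and run, here is a self-contained sketch of the same transformation. The wrapper object `MapSketch` is hypothetical, and string interpolation (`s"..."`) is swapped in for `format` purely as a stylistic alternative.

```scala
object MapSketch {
  case class Person(FirstName: String, LastName: String, age: Int)

  val people = List(
    Person("Brad", "Kovach", 26),
    Person("Jane", "Doe", 29),
    Person("John", "Doe", 30)
  )

  // `map` applies the function to each element and returns a new List[String].
  val result: List[String] =
    people.map(person => s"${person.LastName}, ${person.FirstName}")

  def main(args: Array[String]): Unit = {
    println(result)
    println(people.size) // the original list is left unchanged
  }
}
```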
The Pattern: performing an action on several pieces of data
It is common to perform some small piece of work on each item without needing a result back. For these examples, I will simply print each “Last, First” string built above.
In C#, this is another loop:
    foreach (var name in result)
    {
        Console.WriteLine(name);
    }
In Scala, `foreach` plays the role of the loop, but the loop body is a function passed as an argument:
    result.foreach(name => println(name))
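Since `foreach` just expects a function, there are a few equivalent ways to hand it one. The sketch below is only illustrative: `ForeachSketch`, `names`, and `shout` are made-up names, not part of the original example.

```scala
object ForeachSketch {
  val names = List("Kovach, Brad", "Doe, Jane", "Doe, John")

  // A named function of type String => Unit: it performs an action and returns nothing useful.
  def shout(name: String): Unit = println(name.toUpperCase)

  def main(args: Array[String]): Unit = {
    names.foreach(name => println(name)) // explicit function literal
    names.foreach(println)               // same effect, passing the method directly
    names.foreach(shout)                 // any String => Unit function works
  }
}
```

Unlike `map`, `foreach` returns `Unit`, which signals that it is being called for its side effect and not for a value.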
The Pattern: Accumulating results in a loop
Assume that we are performing a sum of the ages of our three people.
    var sum = 0;
    foreach (Person person in people)
    {
        sum += person.Age;
    }
This code is relatively straightforward. We initialize a sum accumulator to 0 and, for each person, add their age to the running total. `sum` has to be mutable for this to work.
The same operation can be performed in Scala…
val total = people.foldLeft(0)( (sum, person) => sum + person.age)
This seems cryptic, so let me walk through it token by token:
- `val total` specifies that we are creating an immutable variable named “total.”
- `people.foldLeft` specifies that we’re going to fold over the `people` list, working from left to right
- `(0)` specifies that this is the starting sum before we begin
- `(sum, person)` names the parameters of the anonymous function. `foldLeft` will pass the accumulator (sum) in at the first position and the current item in at the second
- `=> sum + person.age` specifies that the sum plus the person’s age is the new sum. Since this is the last (and only) expression in the function, no `return` is necessary. `sum + person.age` will be calculated and passed to the next iteration as `sum`
When all “people” have been processed, the “total” variable will contain the combined age of all people.
This code works exclusively with immutable variables and leaves the accumulator bookkeeping to `foldLeft`.
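If the fold still feels opaque, it can help to look at the intermediate accumulator values. The sketch below uses `scanLeft`, a standard collection method that performs the same steps as `foldLeft` but keeps every intermediate result. The wrapper object `FoldSketch` and the names `runningSums` and `totalViaSum` are assumptions for illustration.

```scala
object FoldSketch {
  case class Person(FirstName: String, LastName: String, age: Int)

  val people = List(
    Person("Brad", "Kovach", 26),
    Person("Jane", "Doe", 29),
    Person("John", "Doe", 30)
  )

  // The fold from the post: start at 0, then add each age in turn.
  val total: Int = people.foldLeft(0)((sum, person) => sum + person.age)

  // scanLeft runs the same steps but keeps every intermediate accumulator,
  // starting with the initial value.
  val runningSums: List[Int] = people.scanLeft(0)((sum, person) => sum + person.age)

  // For a plain sum, extracting the ages and calling `sum` is a common shorthand.
  val totalViaSum: Int = people.map(_.age).sum

  def main(args: Array[String]): Unit = {
    println(runningSums) // List(0, 26, 55, 85)
    println(total)       // 85
    println(totalViaSum) // 85
  }
}
```

`foldLeft` earns its keep when the accumulator is something other than a simple number, such as a map or a tuple; for straightforward sums the shorthand is usually easier to read.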
Conclusion
My goal with this post was to show you how to translate common object-oriented tasks into a functional style. These examples show how the language works behind the scenes to do work that normally chews up programmer time and lines of code.
Next time, I’ll show you the Scala way to work with regular expressions and go parallel, and introduce you to pattern matching.