Hit the ground running: Scala, and the future of functional

You may have heard about Scala.  This language is taking the big data world by storm.  If you’re an object-oriented/procedural programmer, this post will show you how to adapt the concepts you already know into Scala’s lightweight functional syntax.

Here are some of the highlights of Scala:

  • Optionally functional, optionally object-oriented
  • Runs on the JVM (so Scala programs can run anywhere a Java program can)
  • Extreme performance with low code (effortless parallelism [caveats exist])

Getting started with Scala is easy.  I recommend installing IntelliJ Community Edition and using their IDE.  It provides helpful type checking and does a pretty good job holding your hand.

Functional programming prides itself on immutability.  Scala supports this with two kinds of variable declaration, var and val.  A var represents data that is allowed to change after instantiation, and a val represents data that will not be reassigned after instantiation.
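A small sketch of the difference.  The names here are made up for illustration.

```scala
// val: the name is bound once. Reassigning it is a compile error.
val greeting: String = "hello"
// greeting = "goodbye"   // would not compile: reassignment to val

// var: the name may be rebound, which a counting loop relies on.
def countTo(n: Int): Int = {
  var counter = 0
  while (counter < n) counter += 1
  counter
}
```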

Flipping the switch: learning to think functional

At first blush, it’s easy to think and implement solutions in Scala using a procedural approach.  Scala allows you to use as much OO style as you like, and as much functional style as you prefer.  As a result, it’s easy to get caught in old ways.

The Pattern: transforming a collection of objects

Take this C# example.  Here is a Person class with a first name, last name, and an age.

Here is code that instantiates some people, and then returns a list of strings “Last Name, First Name”.

It is important to note that the result data structure had to be explicitly created, and the for loop must be explicitly told what the “item” is for the “collection.”  This code works as expected.

Let’s accomplish the same thing using Scala, and functional syntax.

Using a “case class” simplifies code because it is automatically its own constructor, and there is no assumed “logic” with the object beyond sensible equality checks.

Then we build a List and transform it using “map.”  Map applies a function to every element of a collection and returns a new collection of the results.  The part with person => defines an anonymous function that takes a person as input.  The function does not say “return” because Scala uses the value of the last expression in a function as its return value.

Notice that no “result” array needed to be created in order to accomplish this transformation.  These data structures are automatically instantiated and kept behind the scenes.
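The original listing is not shown here, so what follows is a minimal sketch of the case class and the map call described above.  The field names and the sample people are assumptions.

```scala
// A case class gets a constructor, equals, hashCode and toString for free.
case class Person(firstName: String, lastName: String, age: Int)

// Sample data, made up for illustration.
val people = List(
  Person("Ada", "Lovelace", 36),
  Person("Alan", "Turing", 41),
  Person("Grace", "Hopper", 85)
)

// map applies the function to each element and returns a new List[String].
val names = people.map(person => s"${person.lastName}, ${person.firstName}")
```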

The Pattern: Perform an action on several pieces of data.

Performing some small piece of work without needing its result is common.  For these examples, I will simply print the “Last, First” string for each person.

In C#, this is another loop.

In Scala, this is also a loop, but a function is passed as an argument.
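A sketch of what that might look like, assuming the same hypothetical Person shape as before.  The helper names formatName and printNames are illustrative only.

```scala
case class Person(firstName: String, lastName: String, age: Int)

def formatName(person: Person): String =
  s"${person.lastName}, ${person.firstName}"

// foreach runs the function on each element for its side effect and returns Unit.
def printNames(people: List[Person]): Unit =
  people.foreach(person => println(formatName(person)))
```

Unlike map, foreach throws away whatever the function returns, which is what you want when the work is the side effect.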

The Pattern: Accumulating results in a loop

Assume that we are performing a sum of the ages of our three people.

This code is relatively straightforward.  We instantiate a sum accumulator (0) and for each person, we just add their age to whatever sum was last.  sum is required to be mutable.

The same operation can be performed in Scala…
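Pieced together from the walk-through that follows, the line would look roughly like this.  The Person fields and the three sample people are assumptions for illustration.

```scala
case class Person(firstName: String, lastName: String, age: Int)

val people = List(
  Person("Ada", "Lovelace", 36),
  Person("Alan", "Turing", 41),
  Person("Grace", "Hopper", 85)
)

// Start at 0, then add each person's age to the running sum, left to right.
val total = people.foldLeft(0)((sum, person) => sum + person.age)
```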

This seems cryptic, so let me walk through this token by token

  1. val total specifies that we are creating an immutable variable named “total.”
  2. people.foldLeft specifies that we’re going to be performing a “left to right” operation on the people object
  3. (0) specifies that this is the starting sum before we begin
  4. (sum, person) specifies the signature for the inline function.  foldLeft will pass the accumulator (sum) in at the first position and the item in at the second
  5. => sum + person.age specifies that the sum plus the person’s age are the new sum.  Since this is the last line of the function, no return was necessary.  sum + person.age will be calculated and passed to the next iteration as sum

When all “people” have been processed, the “total” variable will contain the combined age of all people.

This code works exclusively with immutable variables and relies on the language to maintain structures to work through the problem.

Conclusion

My goal with this post was to show you how to transform common object-oriented tasks into a functional paradigm.  These examples show how the language works behind the scenes to do work that normally chews up programmer time and lines.

Next time, I’ll show you the Scala way to regex data, go parallel, and introduce you to pattern matching.

Creating an Invisible Application: Adding email as an interface for an application

This post is designed to serve as a brief technical overview of a recent feature added to ServiceSpark, a community service management platform I develop as a volunteer for the United Way of Albany County.

ServiceSpark emails volunteers about new events and about new comments on events that the volunteer is connected to.  The email includes a link back to ServiceSpark.org, and encourages the user to RSVP and comment on the event.  Unfortunately, this requires a click and a login, and users rarely follow through with the process.

The Challenge

Use email as an interface for ServiceSpark, allowing users to “reply” to an email to leave comments, or allow users to RSVP using their client’s native calendar support.

The Implementation

My implementation of this required some sort of way to generate unique reply email addresses for each email that was sent, and a way to make note of the reply address, so that replies, if any, can be processed.

Dealing with the reply is also problematic.  Many replies include the chain of emails behind the reply, or signatures.  These artifacts need to be stripped from the reply before it is posted, or else the comments will become cluttered.

Emails also need to be attached to the user’s ServiceSpark identity.  Every message should come in and appear as if the volunteer logged into ServiceSpark and created the comment, or submitted an RSVP manually.

Finding a way to receive email at any possible address seemed challenging, though.  For starters, standardizing a way to communicate with an email platform is difficult.  Hijacking qmail or some other mail queue is just tedious and feels like a kludge.

The Tools

Receiving emails turned out to be easy with Mandrill.  Mandrill has an “incoming message API” that allows for your application to receive email via webhooks.

Briefly, this is how Mandrill works:

  1. Set up a custom reply domain.  This is a DNS MX entry that will set Mandrill as handler for a custom domain.  This takes about 5 minutes.
  2. Set up a route from your Mandrill dashboard.  This maps incoming messages to a webhook on your server.  For mine, I set up a wildcard, so that all addresses get sent to the webhook.
  3. Emails received by Mandrill will be pushed as a JSON object to your server in nearly real time.

Since I’m using CakePHP, which is MVC, I set up a controller that is a single endpoint for all incoming webhooks.  A token is used as a shared secret between the application and Mandrill.

  1. Application receives a POST at /webhooks/incoming/<token>
  2. Application fires an event Webhook.Incoming.<token>
  3. An event handler is set to listen to Webhook.Incoming.<token> and parse the incoming data.
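The three steps above can be sketched as a tiny dispatcher.  This is written in Scala rather than the CakePHP the application actually uses, and EventBus and handleIncoming are hypothetical names, not part of CakePHP or Mandrill.

```scala
import scala.collection.mutable

// Hypothetical in-memory event bus; CakePHP's real event system is not modelled here.
class EventBus {
  private val handlers = mutable.Map.empty[String, List[String => Unit]]

  def on(eventName: String)(handler: String => Unit): Unit =
    handlers.update(eventName, handlers.getOrElse(eventName, Nil) :+ handler)

  // Fires the event and returns how many handlers were invoked.
  def fire(eventName: String, payload: String): Int = {
    val listeners = handlers.getOrElse(eventName, Nil)
    listeners.foreach(handler => handler(payload))
    listeners.size
  }
}

// Handles a POST to /webhooks/incoming/<token>. An unknown token has no listener,
// so nothing runs and 0 is returned.
def handleIncoming(bus: EventBus, path: String, body: String): Int = {
  val prefix = "/webhooks/incoming/"
  if (path.startsWith(prefix) && path.length > prefix.length)
    bus.fire("Webhook.Incoming." + path.stripPrefix(prefix), body)
  else 0
}
```

Because the token is part of the event name, a request with the wrong token simply matches no handler.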

Once the webserver was receiving messages from Mandrill, I began work to parse the messages.  Dealing with the “junk” in email, like signatures and quoted threads, was a huge requirement.  Posting this information publicly would clutter the application greatly, and annoy users.

GitHub to the rescue–literally!  GitHub wrote an email reply parsing library, and the library has been ported to PHP.  Including this library and parsing the text from the Mandrill request was trivial.
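To give a feel for the problem the library solves, here is a deliberately naive stand-in.  It is not the library's algorithm or API; visibleReply is a hypothetical helper that keeps text only up to the first quoted line, “On … wrote:” header, or signature marker.

```scala
// Naive stand-in for a reply parser: keep lines until the first quote header,
// quoted line, or signature marker. Real-world mail needs far more cases than this.
def visibleReply(body: String): String = {
  val quoteHeader = """^On .+ wrote:$""".r
  body.linesIterator
    .takeWhile { line =>
      val t = line.trim
      !(t.startsWith(">") || t == "--" || quoteHeader.findFirstIn(t).isDefined)
    }
    .mkString("\n")
    .trim
}
```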

Mapping the incoming email address to a specific action is accomplished by a database table that has fields for GUID, user_id, event_type, and event_data.

So Far

  1. Emails are generated with special GUID@myreplydomain.org Reply-to addresses.
  2. GUIDs and the corresponding event information are saved to the database.
  3. User receives an email.  If they would like to respond, they may do so using their email client.  Replies go to <guid>@myreplydomain.org.
  4. Emails are received by Mandrill, parsed and sent as a JSON object to a webhook on my web server.
  5. The server looks up the guid and processes the email appropriately.
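Steps 1, 2 and 5 can be sketched as follows.  The case class mirrors the table fields described earlier, but the field types, the in-memory Map standing in for the database, and the function names are all assumptions.

```scala
import java.util.UUID

// Mirrors the GUID, user_id, event_type and event_data fields; types are assumed.
case class ReplyRoute(guid: String, userId: Int, eventType: String, eventData: String)

// Creates a route with a fresh GUID for one outgoing email.
def newRoute(userId: Int, eventType: String, eventData: String): ReplyRoute =
  ReplyRoute(UUID.randomUUID().toString, userId, eventType, eventData)

// Builds the Reply-To address for a route.
def replyAddress(route: ReplyRoute, domain: String = "myreplydomain.org"): String =
  s"${route.guid}@$domain"

// Looks up the route for the address a reply was sent to, ignoring case.
def findRoute(routes: Map[String, ReplyRoute], recipient: String): Option[ReplyRoute] =
  routes.get(recipient.takeWhile(_ != '@').toLowerCase)
```

Returning an Option makes the “no matching GUID” case explicit, so unknown addresses can be dropped without special handling.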

At this point, email to comment is working beautifully.  The next challenge is sending valid meeting requests and processing the responses into the application.

Standards are your friend.

The global standard for calendar data exchange is iCalendar (RFC 5545), usually called ICS or ICAL.  This text format specifies events, recipients, and metadata so that loosely coupled applications can synchronize state (like meeting cancellations).

Exchanging event information with ICS/ICAL follows a protocol.

ICS/ICAL crash course

  1. Lines are allowed to be at most 75 characters long (the standard counts octets, not including the line break).  If your line needs to wrap, the continuation line should start with a single space.  Ideally, you should construct your file, and then wrap the lines one at a time at the end.
  2. ICS files start with BEGIN:VCALENDAR and end with END:VCALENDAR (calendar boundaries)
  3. Key VCALENDAR fields are
    1. PRODID: a string describing application vendor and application that generated the ICS file.  The conventional format is -//vendor//product//LANGUAGE
    2. METHOD: a string describing the nature of the ICS/ICAL file.  Common values include PUBLISH (for publishing event information), REQUEST (for requesting an RSVP), CANCEL (for cancelling an event)
    3. VERSION: the version of the ICS/ICAL standard used.  This is commonly just 2.0.
  4. Events lie within calendar boundaries.  The event boundaries are BEGIN:VEVENT and END:VEVENT
  5. Key VEVENT fields are
    1. SUMMARY: a title for your event
    2. DESCRIPTION: descriptive text about your event.  Newlines should be replaced with the literal string “\n”
    3. DTSTART: the start of the event, ideally in UTC
    4. DTEND: the end of the event, ideally in UTC
    5. DTSTAMP: the time the ICS/ICAL file was generated
    6. UID: a unique identifier that can be used to reference the event in subsequent event updates or cancellations.  This can be anything (URL, GUID, SHA-256 hash), but you need to record it, or you’re going to bungle the entire protocol.
    7. ATTENDEE: encoded metadata describing the recipient’s relation to the event. Subfields include…
      1. RSVP: true or false, depending on whether you want the recipient to respond (Google and Outlook will not show RSVP buttons without this)
      2. CN: the name of the recipient
      3. MAILTO: the email address of the recipient
    8. LOCATION: a string describing the location of the event
    9. URL: an absolute URL for the event. Subfields include…
      1. VALUE: the actual URL for the event
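The wrapping rule and the DESCRIPTION newline rule from the list above can be sketched as two small helpers.  foldLine and escapeText are hypothetical names, and this version counts characters rather than octets, so text outside ASCII would need more care.  Besides newlines, RFC 5545 also requires backslashes, semicolons and commas in text values to be escaped, which escapeText handles.

```scala
// Folds one logical line: CRLF plus a single space starts each continuation line.
// Assumption: counts characters, whereas the standard counts octets.
def foldLine(line: String, limit: Int = 75): String = {
  if (line.length <= limit) line
  else {
    val (head, rest) = line.splitAt(limit)
    // Continuation lines hold one fewer content character because of the leading space.
    head + rest.grouped(limit - 1).map("\r\n " + _).mkString
  }
}

// Escapes a text value such as DESCRIPTION. Backslashes go first so the
// escapes added afterwards are not escaped a second time.
def escapeText(value: String): String = {
  value
    .replace("\\", "\\\\")
    .replace(";", "\\;")
    .replace(",", "\\,")
    .replace("\r\n", "\\n")
    .replace("\n", "\\n")
}
```

Escaping comes first and folding last, which matches the advice to wrap lines only once the file is otherwise complete.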

Example ICS/ICAL meeting invite file
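The original file is not reproduced here.  As a rough stand-in, this sketch assembles an invite from the fields in the crash course.  meetingInvite is a hypothetical helper, the PRODID is a placeholder, and the ORGANIZER line is an addition not covered above (the scheduling standard, RFC 5546, expects one on a REQUEST).  Values are assumed to be already escaped and short enough not to need wrapping.

```scala
// Builds a minimal METHOD:REQUEST invite. Timestamps are UTC in the form 20180601T150000Z.
def meetingInvite(uid: String, summary: String, startUtc: String, endUtc: String,
                  stampUtc: String, organizerEmail: String,
                  attendeeName: String, attendeeEmail: String): String = {
  List(
    "BEGIN:VCALENDAR",
    "PRODID:-//Example Org//Example App//EN", // placeholder vendor and product
    "VERSION:2.0",
    "METHOD:REQUEST",
    "BEGIN:VEVENT",
    s"UID:$uid",
    s"DTSTAMP:$stampUtc",
    s"DTSTART:$startUtc",
    s"DTEND:$endUtc",
    s"SUMMARY:$summary",
    s"ORGANIZER:mailto:$organizerEmail", // assumption: not part of the field list above
    s"ATTENDEE;RSVP=TRUE;CN=$attendeeName:mailto:$attendeeEmail",
    "END:VEVENT",
    "END:VCALENDAR"
  ).mkString("\r\n") + "\r\n"
}
```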

Attaching a file like this to an outgoing message will cause (most) email clients to display RSVP buttons!

When an ICS/ICAL file is formed properly, Google will show RSVP buttons.

When an RSVP choice is selected, an ICS will be sent as a reply to the endpoint.  The email address that sends the ICS response cannot be relied on, and should not be used to identify the recipient.  For example, Google uses one notification endpoint for all of their users, and you will be unable to reliably determine who is RSVP’ing to the event.   Instead, the UID should be used exclusively, or else updates to the event will cause duplicates, and all manner of chaos.

Receiving the RSVP Reply

When Mandrill receives the reply, they will perform a POST to your specified webhook.  The payload will contain an ICS file.  The ICS file follows the same format as outlined above: it begins and ends with VCALENDAR boundaries, containing at least one VEVENT inside the VCALENDAR.

A number of parsers exist for robust ICS parsing, but we are not interested in anything beyond the latest response for the UID.  When the ICS was sent, the UID and the unique reply address were saved.  As a result, we can look up the UID based on the reply address the message was sent to.  If an email is received, and there isn’t a valid link between that address and a UID, then nothing will be done.

  1. Look up the UID based on email address.  This returns the user, and the corresponding action (event RSVP modification, in this case).  It is worth noting that these email events should expire eventually, so these email endpoints will automatically deactivate.
  2. Regex the incoming ICS response for the VEVENT region containing UID.  This is not a multi-line regex.  This will capture and return the entire VEVENT.
  3. Regex the VEVENT region for a valid going/not going/tentative response.  Please note this is a multi-line regex.  This will check each line and then capture and return the DECLINED, ACCEPTED, or TENTATIVE state of the RSVP.
  4. Once we have determined the new state of the RSVP, update the user’s RSVP.
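Steps 2 and 3 might look something like the sketch below.  It is not the application's actual code: the patterns and the name rsvpFor are assumptions, and it reads the response state from the PARTSTAT parameter, which is where iCalendar replies carry ACCEPTED, DECLINED or TENTATIVE.

```scala
import scala.util.matching.Regex

// Captures each whole VEVENT block; (?s) lets "." match across line breaks.
val eventBlock: Regex = """(?s)BEGIN:VEVENT.*?END:VEVENT""".r
// Captures the attendee's participation status.
val partStat: Regex = """PARTSTAT=(ACCEPTED|DECLINED|TENTATIVE)""".r

// Returns the RSVP state for the event with the given UID, if present.
def rsvpFor(ics: String, uid: String): Option[String] = {
  // Undo line wrapping first so a wrapped UID or ATTENDEE line reads as one line.
  val unfolded = ics.replaceAll("\r?\n[ \t]", "")
  eventBlock.findAllIn(unfolded)
    .find(block => block.linesIterator.exists(_.trim == s"UID:$uid"))
    .flatMap(block => partStat.findFirstMatchIn(block).map(_.group(1)))
}
```

A result of None covers both an unknown UID and a reply with no recognizable state, so step 4 only runs when there is something to update.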

Conclusion

The pieces of this project demonstrate the beauty of event-driven programming.  Using modern web development techniques like webhooks allows decoupled applications to seamlessly interact with each other.  Mandrill’s service integrates so smoothly with my application that the interface is invisible, but robust.  The end result is an incredibly rich interface to an application, where the user interacts with and derives value from the application without even logging in.

Limitations

Mandrill’s email service does not allow for the appropriate attachment headers (specifically METHOD: REQUEST) to be included in the message.  As a result, Outlook (desktop and web) will not show RSVP buttons.  Outlook (iOS and Android) performs according to specification and will present RSVP buttons.