Leveraging Scala

In this article, I am going to introduce the computer programming language "Scala". I am going to use "Scala" to enhance my mobile site generation utility (see previous article for details).

Project Functional Enhancements
I want to add the following functions to the mobile site generator utility:

  • Publish a "Site Map" document that allows search engines to index my mobile web site.
  • Replace the current mobile site's document file naming scheme. The current file naming scheme is numeric (E.G. doc-1.html, doc-2.html). My main website is built with the Drupal content management system (CMS). Drupal provides a facility for generating "Search Engine Optimization" (SEO) file names, and thus SEO friendly links. I want to use the SEO friendly file names/links in the mobile version of my website.
  • Add a mobile-friendly contact mechanism to each article. Currently, each article contains a footer section which directs readers to contact me via my website's main contact page/form. That contact page/form does not render properly on a mobile browser. I've already composed a simple contact mechanism that renders on a mobile device. An example is located at the bottom of this page. Thus, when republishing an article for the mobile version of my website, I want to replace the current contact box with the "mobilized" contact box.

Scala
Scala runs on the Java Virtual Machine (JVM). Thus, in a single program you can use both "Scala" and "Java" instructions. For this release, I replaced all of the Java functions with Scala, with a single exception: I decided to use the Apache Commons IO library to perform file/directory operations. Other than the file/directory operations, I was able to rewrite the mobile site generator fairly quickly in Scala.

Generating the SiteMap with Scala
A "Site Map" is an XML document. Scala has XML operations built directly into the base language. In addition, Scala allows you to intersperse language instructions with document fragments. Most computer languages provide some facility to mix HTML/XML (document fragments) with business objects (computer instructions). This feature is sometimes referred to as "templates" (E.G. JavaScript/HTML templates, Apache Velocity templates) or as "embedding" instructions (E.G. Embedded Ruby/ERB). In other cases the feature is implemented as a framework extension (E.G. Microsoft's ASP framework, JavaServer Pages (JSP), etc).

The ability to embed Scala statements in an XML block turned out to be quite handy. The implementation details bear this out.

Here is the basic implementation of the Site Map generation.

import scala.collection.mutable.ListBuffer

class SiteMap {
  
  val baseAddr = "http://public-action.org/mob"  

  def writeOutputFile(fileName: String, items: ListBuffer[DocNode]) = {
    val locationPrefix = "http://public-action.org/mob/doc-"
    val index = -1
    val priority = 0.5
    val changefreq = "daily"
    // Note! On my site the rss feed is published in descending order by publish date,
    // thus the first article in the rss feed (for my site) has the most recent date

    try {
      val output =
        <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
          xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9
          http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
          <url>
            <loc>{baseAddr}</loc>
            <changefreq>daily</changefreq>
            <priority>1.0</priority>
            <lastmod>{convertDate(items(0).pubDate)}</lastmod>
          </url>
          {for (item <- items) yield
            <url>
                <loc>{fullLink(item.link)}</loc>
                <lastmod>{convertDate(item.pubDate)}</lastmod>
            </url>
          }
        </urlset>
      scala.xml.XML.save(fileName,output,"UTF-8",true)
    } catch {
      // Print the message; the original expression built the string but discarded it
      case ex: Exception => println("An error occurred: " + ex.getMessage)
    }
   }

   def fullLink(link: String) = {
      baseAddr + "/" + link + ".html"
   }
  
  def convertDate(inDate: String) : String = {
    val split = inDate.split(" ")
    split(3) + "-" + convertMonth(split(2)) + "-" + split(1)
  }

  def convertMonth(monthStr: String) : String = {
    monthStr match {
      case "Jan" => "01"
      case "Feb" => "02"
      case "Mar" => "03"
      case "Apr" => "04"
      case "May" => "05"
      case "Jun" => "06"
      case "Jul" => "07"
      case "Aug" => "08"
      case "Sep" => "09"
      case "Oct" => "10"
      case "Nov" => "11"
      case "Dec" => "12"
      case _ => ""
    }  
  }
}

Note the line

 scala.xml.XML.save(fileName,output,"UTF-8",true)

By setting the last parameter to "true", Scala writes the XML declaration at the top of the document (the Site Map protocol expects a UTF-8 encoded XML document). The date-time format used in the RSS feed does not match the date format required by the Site Map document definition, hence the convertDate routine. Scala's match/case construct is similar to Java's switch/case construct. Scala's implementation of "match/case" extends the functionality by allowing you to match on many kinds of patterns (E.G. match by class type). In the above code, I am performing a simple text pattern match.
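
To illustrate the "match by class type" remark, here is a minimal sketch that is not part of the site generator. The object name MatchDemo and the function describe are made up for this example.

```scala
object MatchDemo {
  // Hypothetical helper: describes a value by matching on its runtime type.
  def describe(value: Any): String = value match {
    case s: String       => "text of length " + s.length // match by class type
    case n: Int if n < 0 => "negative integer"            // type match plus a guard
    case n: Int          => "integer " + n
    case Some(inner)     => "option holding " + inner     // match by structure
    case _               => "something else"              // default, like Java's "default:"
  }
}
```

The cases are tried from top to bottom, so the guarded Int case has to come before the general Int case.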

When I look at the SiteMap class, the source code presents a visual representation of both the XML output and the process. To me, that visualization makes the source listing easier to understand, and thus easier to maintain.

The logical process of generating a Site Map.
What I am doing is reading my current website's RSS feed (an XML representation of the website's articles). I then select each individual article and get its:
- Link Address (location relative to the website E.G. http://public-action.org/mob/some-article.html)
- The date in which the article was first published (in YYYY-MM-DD format)

Each link and date is then inserted into the Site Map XML document.

The "sitemap" document definition allows a webmaster to assign priority values to specific articles. For my mobilized website, I am leaving all of the articles at the same priority value.

Implementation details of generating a Site Map.
The site's RSS feed file consists of a series of "items". Each item contains an individual article.

Note the following block:

{for (item <- items) yield
   <url>
       <loc>{fullLink(item.link)}</loc>
       <lastmod>{convertDate(item.pubDate)}</lastmod>
    </url>
}

"{for (item <- items) yield" means loop through the list of articles one article at a time. Each iteration gets an article's "link" and publishing date, then adds those values to the XML output.

(<loc>{fullLink(item.link)}</loc>) adds our mobile site's base address as a prefix, and a ".html" suffix, to each file name/link.

(<lastmod>{convertDate(item.pubDate)}</lastmod>) re-formats the date.
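
As an aside, the same re-formatting could probably be done with the JDK's java.time classes instead of splitting the string by hand. This is only a sketch, and it rests on two assumptions: the feed's dates follow the usual RSS style (E.G. "Wed, 02 Oct 2002 13:00:00 GMT"), and the program runs on Java 8 or later. The object name DateDemo is made up for this example.

```scala
import java.time.OffsetDateTime
import java.time.format.DateTimeFormatter

object DateDemo {
  // Assumes an RFC 1123 style date, which is what RSS feeds normally publish.
  // Parsing throws a DateTimeParseException if the text has another format.
  def convertDate(inDate: String): String =
    OffsetDateTime
      .parse(inDate, DateTimeFormatter.RFC_1123_DATE_TIME)
      .toLocalDate // drop the time and offset
      .toString    // LocalDate prints as YYYY-MM-DD
}
```

Unlike the hand-written version, this rejects malformed dates instead of producing a partial result. Whether that is preferable depends on how clean the feed is.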

"yield" adds the iteration result to the XML document we are building.

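The "for ... yield" construct is not specific to XML. Here is a minimal sketch of the same loop over a plain list. YieldDemo and its function names are made up for this example. The base address is the one used in the SiteMap class.

```scala
object YieldDemo {
  val baseAddr = "http://public-action.org/mob"

  // One output element per input element, collected into a new list.
  def fullLinks(links: List[String]): List[String] =
    for (link <- links) yield baseAddr + "/" + link + ".html"

  // The compiler translates the loop above into a call to map.
  def fullLinksWithMap(links: List[String]): List[String] =
    links.map(link => baseAddr + "/" + link + ".html")
}
```

In the Site Map listing the yielded values are XML fragments rather than strings, and the resulting sequence is placed inside the enclosing urlset element.
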
A few notes on Scala language features.
If you aren't sure what "type" of data the Scala compiler is returning, you can run your Scala code in the "scala" read-eval-print loop (REPL), which works as an interactive Scala interpreter and prints the type of each result.

A few notes about Scala syntax and structure. In Scala:

  • Use of semi-colons to designate end-of-statement is optional.
  • A property's type is placed to the right of the property's name. E.G. firstName: String. "firstName" is the property name, "String" is the property type.
  • All functions return a value. The "return" statement is optional. The result of the last statement in the function is what the function returns.
  • In a try/catch/finally clause, the catch block is written with "match" style case patterns.
  • A singleton instance is designated by the keyword object (instead of class).
  • You can generate getter/setter methods for class properties with the following notation: var firstName: String = _
  • Values declared with val are immutable, similar to final in Java. Values declared with var can be reassigned.
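
Most of the notes above fit in one small sketch. Nothing here comes from the site generator. SyntaxDemo and its members are made up for this example.

```scala
// "object" instead of "class": a singleton, used without "new".
object SyntaxDemo {
  // The type goes to the right of the name; no semi-colon is needed.
  // "val" cannot be reassigned after this line.
  val greeting: String = "Hello"

  // No "return" statement: the last expression is the result.
  def shout(text: String): String = text.toUpperCase + "!"

  // The catch block uses "case" patterns, just like match.
  def parseOrZero(text: String): Int =
    try {
      text.trim.toInt
    } catch {
      case _: NumberFormatException => 0
    }
}
```

Note that try/catch is itself an expression, so parseOrZero returns either the parsed number or the value of the matching case.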

There are many other Scala language features that I've found to be very useful. My experience level with Java is far greater than my experience with Scala. Even so, I was able to rewrite and enhance the whole mobile site generator project in Scala in roughly the same time as the initial write in Java.

I should note that the mobile site generator program is simple in many ways. It's a single user application, which means there are no thread contention issues. Nonetheless, I thought this small project was a good introduction to migrating from Java to Scala.

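There has also been an experimental port of Scala to Microsoft's .NET runtime and framework, though it has not kept pace with the JVM version. Thus, a similar .NET program written in C# may also be a candidate for migration, with that caveat.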

A quick example of Scala's concise notation.

class DocNode {
  var title: String = _
  var body: String = _
  var link: String = _
  var pubDate: String = _
}

The above DocNode class generates getter/setter pairs for each property (title, body, link, pubDate). The "link" property provides the SEO friendly file name/link convention mentioned at the top of this article. The "link" values are "parsed" from the website's RSS XML feed (see SAX Parser below for details).
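
Here is a short usage sketch of those generated accessors. The DocNode class is repeated so the example stands alone. DocNodeDemo and the sample values are made up for this example.

```scala
class DocNode {
  var title: String = _
  var body: String = _
  var link: String = _
  var pubDate: String = _
}

object DocNodeDemo {
  def sample(): DocNode = {
    val node = new DocNode
    node.title = "Leveraging Scala" // looks like field access, calls the generated setter
    node.link = "leveraging-scala"  // hypothetical SEO friendly file name
    node
  }
}
```

Reading node.title calls the generated getter. A property that was never assigned, such as body in this sample, holds the default value for its type, which is null for a String.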

Using the SAX Parser in Scala
I could have converted the current "SAX XML processing" (see our previous article for details) to the construct used for the sitemap. However, as a learning exercise, I decided to re-implement the SAX processing in Scala. I was able to literally copy and paste my existing Java code, make a few changes, and voila, a Scala implementation.

A quick example of our Scala implementation. The following is a segment of the Sax event handler:

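import org.xml.sax.Attributes
import org.xml.sax.helpers.DefaultHandler
import scala.collection.mutable.ListBuffer

class DocNodeListHandler extends DefaultHandler {
  val docNodeList = ListBuffer[DocNode]()
  var start = false
  var node: DocNode = null
  val buffer: StringBuilder = new StringBuilder()

  // Called by the SAX parser at each opening tag
  override def startElement(namespaceURI: String, localName: String, qName: String, atts: Attributes): Unit = {
    buffer.setLength(0) // discard text collected for the previous element
    if (qName == "language") {
      start = true
    } else if (qName == "item" && start) {
      node = new DocNode()
    }
  }
  // ... remaining handler methods omitted from this segment
}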

Compare the above Scala implementation with the Java implementation (NodeListHandler) here. There is not much difference.
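
For readers who have not used SAX from Scala, here is a self-contained sketch of how such a handler is driven by the JDK's parser. It is not the project's handler. LinkHandler, SaxDemo and parseLinks are made up for this example, and it collects only the link text of each item.

```scala
import java.io.ByteArrayInputStream
import javax.xml.parsers.SAXParserFactory
import org.xml.sax.Attributes
import org.xml.sax.helpers.DefaultHandler
import scala.collection.mutable.ListBuffer

object SaxDemo {
  // Hypothetical handler: collects the text of every <link> found inside an <item>.
  class LinkHandler extends DefaultHandler {
    val links = ListBuffer[String]()
    private val buffer = new java.lang.StringBuilder
    private var inItem = false

    override def startElement(uri: String, localName: String, qName: String, atts: Attributes): Unit = {
      buffer.setLength(0) // start collecting text for the new element
      if (qName == "item") inItem = true
    }

    override def characters(ch: Array[Char], start: Int, length: Int): Unit = {
      buffer.append(ch, start, length)
    }

    override def endElement(uri: String, localName: String, qName: String): Unit = {
      if (qName == "link" && inItem) links += buffer.toString.trim
      if (qName == "item") inItem = false
    }
  }

  // Parses RSS text held in memory and returns the item links in feed order.
  def parseLinks(rss: String): List[String] = {
    val handler = new LinkHandler
    val parser = SAXParserFactory.newInstance().newSAXParser()
    parser.parse(new ByteArrayInputStream(rss.getBytes("UTF-8")), handler)
    handler.links.toList
  }
}
```

The parser calls startElement, characters and endElement as it streams through the document, so the handler only has to remember where it is (here, the inItem flag).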

Replacing the "contact box" implementation details.
To replace the existing contact box text, I could have processed the article's body as XML. However, HTML is often "malformed" in terms of XML document structure definitions. Thus, I decided to use a simple String function to perform the search and replace. This only works because I replicate the "contact box" exactly from article to article. That means identifying the contact box within the document is a simple pattern match.
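
A minimal sketch of that search and replace follows. The two markup strings are placeholders, not the site's actual contact box, and ContactBoxDemo and mobilize are made up for this example.

```scala
object ContactBoxDemo {
  // Placeholder markup: the real contact box text is not shown in this article.
  val oldBox = "<div class=\"contact\">Use the main contact form.</div>"
  val mobileBox = "<div class=\"contact-mobile\">Use the form below.</div>"

  // String.replace does a literal (non-regex) replacement of every occurrence.
  def mobilize(body: String): String = body.replace(oldBox, mobileBox)
}
```

If an article's contact box differs by even one character, replace finds nothing and returns the body unchanged, which is why the exact replication mentioned above matters.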

Simplifying creation of the website's index.html file
As a bonus, Scala allowed me to consolidate the code required to produce both the site's articles and the site's index page. The index page is a "table of contents", a list of links to the site's articles.

Again, Scala's built-in templating facility turned out to be really handy. All I had to do was insert a couple of Scala statements inside an HTML template.

When I "mobilize" an article, I add links to the index page (home link). Each mobilized article has a "home link" at the top of the article and a "home link" at the bottom of the article. Scala allowed me to embed an "if" clause inside the HTML template. "If" we are creating an article, add the home links. If we are creating the index (home) page, do not add the home links.

Here is a quick snippet. "homeLinks" is a boolean value which is set to either true or false.

 { if (homeLinks) {
    <div class="pull-right">
      <a href="index.html"><i class="icon-home"></i> - Home</a>
    </div>
   }
 }

Again, to me, the above snippet is a superior representation of the intended HTML output. You can perform this type of templating in other languages. However, for Scala, the feature is built into the base language. Thus, for me, it is a real productivity booster.

Full source code available on GitHub repository.
I added a new repository on GitHub for my Scala version of the "mobile site generator". You can peruse (and use) all of the source code located here.

Build and Internal Documentation Generator
To maintain some consistency between the Java and Scala versions of this project, I added an Ant build file. Thus, if folks want to explore both versions, they can use the same build utility, Apache Ant.

As I mentioned in the previous article on the Java version of this project, the Ant build is optional. You can compile the Scala source files from the command line. The Ant build is provided simply as a convenience.

Scala has its own "ScalaDoc" generation utility. I added a target named "docs" to the Ant build file, which generates the documentation for the project. The command line to generate the documentation is:

$>ant docs

Purpose of Documentation
It seems relevant to add a note about documentation. By documentation I mean an explicit, detailed explanation of "what" your program is doing, and "how" your program is doing it. When you are under pressure to get the current release out the door, it's human nature to avoid spending time on documentation.

When a program is successful, there are many releases. Spending time on documenting the current release may expedite the production of future releases. For example, your shop has many projects written in many different computer languages. Your shop is growing and bringing in new programmers. You are woken up at 3:00 AM for an emergency bug fix. A detailed, explicit, even overstated explanation helps. In my opinion, Scala's concise, brief notation neither hinders nor guarantees that your code is "self documenting" (I.E. no further explanation required). Thus, documentation is worth considering.

I recommend use of Scala.
For shops or developers interested in "Functional" programming (especially those coming from a Java background), I highly recommend Scala. Hopefully, I've demonstrated some of the productivity boosts that Scala provides.

About the Author:
Lorin M Klugman - I'm an experienced developer. My main interest is in new technology. Please use our contact box here if you are interested in hiring me. Please no recruiters :)

Comments

Interesting Article

Hi,
I've been trying to choose a "Functional Programming" language to learn. This article is a good introduction. Gives me something to consider. Thank you .