Polyglot

Speaking or writing several languages.

Data Mining with Clojure and Datomic

In this article we take a look at a few items:

  • Applying a "Functional Style" to our code. Making better use of Clojure's programming features.
  • Introduction to Datomic, a database written in Clojure.
  • Use of Clojure Test fixtures.

For our project, we are going to analyze job skills. We use Stack Overflow Careers 2.0 as our data source. For example, I query Stack Overflow Careers 2.0 for jobs within a 10 mile radius of my zip code. I search for job postings that contain the keyword "java" or "clojure". I capture the search results as an RSS XML document. I save the XML document to disk.

We use Datomic to persist our data. We also use Datomic to query our data. We then create reports from the queried data.
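
The article does not show the reports themselves. As one hedged illustration, a report could count how often each skill appears across the job postings. The sketch below works on plain Clojure maps rather than Datomic query results; the `:skills` key, the sample jobs, and the function name `skill-counts` are assumptions made for this example.

```clojure
;; Hypothetical report: count how many job postings mention each skill.
;; Jobs are plain maps here; the :skills key is an assumption for illustration.
(defn skill-counts
  "Returns a sequence of [skill count] pairs, most frequent skill first."
  [jobs]
  (->> jobs
       (mapcat :skills)                     ; flatten every job's skills into one sequence
       frequencies                          ; e.g. {"java" 2, "sql" 1}
       (sort-by (juxt (comp - val) key))))  ; by count descending, then by skill name

(def sample-jobs
  [{:title "Backend Developer" :skills ["java" "sql"]}
   {:title "Data Engineer"     :skills ["clojure" "java"]}
   {:title "Office Manager"    :skills nil}])

(skill-counts sample-jobs)
;; => (["java" 2] ["clojure" 1] ["sql" 1])
```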

Datomic is not a relational database. However, it's pretty simple to define a Datomic schema which includes logical relationships.

For our project we first establish a "snapshot". A "snapshot" simply tells us when we collected our data. A "snapshot" includes a short description. Here is the "snapshot" schema definition supplied to Datomic:

{:db/id #db/id[:db.part/db]
 :db/ident :snapshot/time
 :db/valueType :db.type/instant
 :db/cardinality :db.cardinality/one
 :db/doc "time data was extracted. milliseconds"
 :db.install/_attribute :db.part/db}

{:db/id #db/id[:db.part/db]
 :db/ident :snapshot/description
 :db/valueType :db.type/string
 :db/cardinality :db.cardinality/one
 :db/unique :db.unique/identity
 :db/doc "free form description of snapshot"
 :db.install/_attribute :db.part/db}

{:db/id #db/id[:db.part/db]
 :db/ident :snapshot/job-set
 :db/valueType :db.type/ref
 :db/cardinality :db.cardinality/many
 :db/doc "List of Jobs obtained during snapshot"
 :db.install/_attribute :db.part/db}

Note the last attribute ":snapshot/job-set". That states a "snapshot" contains zero to many "jobs". The cardinality is "many", the datatype is "ref". We are storing references to "jobs". This definition is similar to a relational database "foreign key".

We use the same concept to relate a "job" to "skills". Each "job" has an attribute which stores zero to many "skill" references. A skill being something like "java", "sql", "python", etc.
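
By analogy with ":snapshot/job-set", the job-to-skill attribute might be declared roughly as follows. This is a sketch, not a copy of schema.dtm: the attribute name `:jobs/skill-set` is taken from the job code later in this article, while the doc string is an assumption.

```clojure
;; Sketch only: modeled on the :snapshot/job-set definition above.
;; The doc string is an assumption; see schema.dtm for the real definition.
{:db/id #db/id[:db.part/db]
 :db/ident :jobs/skill-set
 :db/valueType :db.type/ref
 :db/cardinality :db.cardinality/many
 :db/doc "Skills listed for the job"
 :db.install/_attribute :db.part/db}
```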

The complete schema is located on GitHub in schema.dtm.

The process for updating the database is simple. I first process the "skills". I query Datomic to determine whether a skill already exists in our database. If the skill does not exist in our database, we want Datomic to generate a unique identifier, AKA entity id. If the skill already exists, we use the existing entity id.

Here is the relevant code.

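;; assumes the namespace declares (:use [datomic.api :only [q db] :as d])
;; conn parameter is the database connection
;; queries the database for a particular skill (E.G. "programming")
;; if found, returns the database entity id; otherwise returns nil
(defn get-skill-entity-id [conn skill]
  (let [results (q '[:find ?c :in $ ?t :where [?c :skill-set/skill ?t]]
                   (db conn) skill)]
    ;; results is a set of tuples such as #{[id]}; ffirst unwraps the id
    (ffirst results)))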

;; returns true if the skill can not be found
;; in the database (E.G. do we have "programming" stored in the database)
(defn skill-not-exists? [conn skill]
   (nil? (get-skill-entity-id conn skill)))

(defn process-skill [conn skill]
  (when (skill-not-exists? conn skill)
    (add-skill conn skill)))

Now, let's add the skill.

(defn add-skill [conn skill]
   (let [ temp_id (d/tempid :db.part/user) ]
      @(d/transact conn
      [[:db/add temp_id :skill-set/skill skill]])))

Now that the skills are processed, we can process jobs. Remember, when we process a job we need to associate zero to many skills with it. We now have entity ids for all the skills, so we can use the skills' entity ids as references for our jobs. Here is a portion of our code which stores a job.

(defmulti add-job (fn [one two three four five] (class five)))

;; job has skill list
(defmethod add-job clojure.lang.PersistentVector
  [conn entity-id title job-key skill-list]
  @(d/transact conn [{:db/id entity-id
                      :jobs/title title,
                      :jobs/job-key job-key,
                      :jobs/skill-set skill-list}]))

;; job does not have any skills 
(defmethod add-job nil
  [conn entity-id title job-key skill-list]
  @(d/transact conn [{:db/id entity-id
                      :jobs/title title,
                      :jobs/job-key job-key}]))

We define 2 method signatures, 1) a job with skills, 2) a job without any skills.
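
To see the dispatch rule in isolation, here is a minimal, Datomic-free sketch. The name `describe-job` and its return strings are hypothetical; only the dispatch function mirrors add-job, which selects a method from the class of the skill-list argument.

```clojure
;; Dispatch on the class of the last argument, as add-job does.
(defmulti describe-job (fn [title skill-list] (class skill-list)))

;; Chosen when skill-list is a vector, e.g. ["java" "sql"].
(defmethod describe-job clojure.lang.PersistentVector
  [title skill-list]
  (str title " requires " (count skill-list) " skill(s)"))

;; Chosen when skill-list is nil, because (class nil) returns nil.
(defmethod describe-job nil
  [title _]
  (str title " lists no skills"))

(describe-job "Developer" ["java" "sql"]) ; => "Developer requires 2 skill(s)"
(describe-job "Receptionist" nil)         ; => "Receptionist lists no skills"
```

One consequence of dispatching on the concrete class: a skill list that arrives as a list or lazy sequence rather than a vector matches neither method, and the call fails with an IllegalArgumentException. Converting a non-nil sequence with `vec` before the call avoids that.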

We use the same logic that we used for skills to establish an entity id. We look up the job. If the job is already present in the database, we use the existing entity id. If the job does not exist, we let Datomic create a new entity id.
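
The look-up-or-create rule can be sketched without Datomic. In the hypothetical model below, a plain map stands in for the database, and `lookup-or-create` returns both the (possibly updated) map and the entity id. None of these names come from the project; in the real code Datomic assigns the id.

```clojure
;; A plain map stands in for the database (an assumption for illustration).
;; :ids maps a job key to its entity id; :last-id is the last id handed out.
(def empty-db {:last-id 0 :ids {}})

(defn lookup-or-create
  "Returns [db id]. Reuses the existing id for job-key when present;
   otherwise allocates the next id and records it in the returned db."
  [db job-key]
  (if-let [id (get-in db [:ids job-key])]
    [db id]
    (let [id (inc (:last-id db))]
      [(-> db
           (assoc :last-id id)
           (assoc-in [:ids job-key] id))
       id])))

(let [[db1 first-id]  (lookup-or-create empty-db "job-42")
      [db2 second-id] (lookup-or-create db1 "job-42")
      [_   other-id]  (lookup-or-create db2 "job-43")]
  [first-id second-id other-id])
;; => [1 1 2]
```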

Finally, we can add our snapshot to the database. The same logic applies. The snapshot contains zero to many job references. Now that we have an entity id for each job, we use the job entity ids as references in our snapshot.
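
Datomic transaction data is ordinary Clojure data, so the snapshot entity can be built by a pure function and handed to `d/transact` afterwards. The sketch below uses the attribute names from the snapshot schema; the function name `snapshot-tx` and its arguments are hypothetical, and the temporary id is passed in so that the example runs without Datomic on the classpath.

```clojure
;; Builds the entity map for one snapshot. temp-id would normally come from
;; (d/tempid :db.part/user); job-ids are the entity ids of the stored jobs.
(defn snapshot-tx
  [temp-id description taken-at job-ids]
  (cond-> {:db/id temp-id
           :snapshot/time taken-at              ; a java.util.Date for :db.type/instant
           :snapshot/description description}
    ;; Only include :snapshot/job-set when there is at least one job,
    ;; mirroring the two add-job methods.
    (seq job-ids) (assoc :snapshot/job-set (vec job-ids))))

;; :temp-snapshot is a placeholder id for this demo, not a real Datomic tempid.
(snapshot-tx :temp-snapshot "java or clojure, 10 mile radius"
             (java.util.Date.) [101 102])
```

With a connection in hand, the result would be submitted in the same way as the job maps, for example `@(d/transact conn [(snapshot-tx ...)])`.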

Testing the Java REST-MVC Server Tier

Article Summary
This article is part of a series. We use the "book club" project to explore various programming languages and frameworks. The book club's business and data requirements are described in a prior article, "Leveraging Ruby on Rails and ClojureScript".

We've created a new version of the book club's server tier using a stack of Java components. See the article REST-MVC using Java for details of implementation. This article details automated testing for the Java/REST-MVC server tier.

Automated Unit Tests
In this article we will not enforce a strict differentiation between "Unit" and "Integration" tests.

This article refers to "automated testing". That means the programmer writes one or more separate programs to "test" functionality. We use the JUnit testing framework along with some plugins/extensions (Mockito, JsonPath and Hamcrest).

Organizing Tests Into A Plan
In this section of the article we discuss "what" we test.

We need to consider the business and functional requirements of our application and then plan our test suite. So, let's recap our application and organize our test plan. We have entities (and a relationship). Entities are Authors, Books, Categories and Reviews. A Book-Category is a relationship between a book and a category (E.G. Tom Sawyer is Fiction).

We also have application services.

  • A repository is responsible for reading and writing data, to and from the database (and the application data objects).
  • A controller accepts requests, dispatches the request to a handler, then routes the response. Our server has 2 types of controllers. We have controllers which route the response to a server-generated view (E.G. jsp/html/css). We also have controllers which generate a JSON response (the client renders the view).

Finally, our application has some specific implementation/optimization requirements. We require the database (as opposed to the application code) to enforce unique Authors, Books, Categories and Book-Categories. We require the database to delete related records. For example, books require an author. If we delete an author, the database should automatically delete all of the books written by that author.

So here are the "operation/method" tests:

  • Select records
  • Insert a record
  • Modify a record
  • Delete a record
  • Attempt to insert a duplicate record.
  • Attempt to modify an existing record into a duplicate.
  • Delete an Author and verify all related books, reviews and book-categories are also deleted.
  • Delete a Book and verify all related reviews and book-categories are also deleted.

We perform each of the above tests for the following services:

  • Repository
  • Controller
  • Rest Controller

For each service, we test the following entities:

  • Author
  • Book
  • Category
  • Review

For each service, we test the following relationship:

  • Book-Category

Testing Clojurescript - A simple approach

Article Summary
This article is part of a series. We use the "book club" project to explore various programming languages and frameworks. The book club's business and data requirements are described in a prior article, "Leveraging Ruby on Rails and ClojureScript".

This article details automated testing of our ClojureScript client. Please see the corresponding article ClojureScript - Single Page Application - A simple approach for details on business requirements and implementation details.

We employ a stack of Java libraries and tools. Source code and build files are located on GitHub, at the book-site-clojurescript repository.

In this article we'll detail automated testing of our ClojureScript client. I'm using cemerick/clojurescript.test, a port of clojure.test. I am using PhantomJS as a test runner (container).

Managing the Project with Leiningen
We are using Leiningen to manage the ClojureScript project. Settings for Leiningen are placed in project.clj (located at the root directory of your project).

We need to add a contributed library to support automated tests for ClojureScript. We are using cemerick/clojurescript.test, a port of clojure.test to ClojureScript.

We make the following changes to the project.clj file.

We add the test library to the lists of dependencies and plugins:

:dependencies [[org.clojure/clojure "1.5.1"]
               [org.clojure/clojurescript "0.0-1859"]
               [com.cemerick/clojurescript.test "0.2.1"]]
:plugins [[lein-cljsbuild "0.3.3"]
          [com.cemerick/clojurescript.test "0.2.1"]
          [marginalia "0.7.1"]
          [lein-marginalia "0.7.1"]]

Adding a new test target for the compiler.
Next we instruct the ClojureScript compiler to build 2 separate targets. The first target includes our unit tests. The second target is what we'll use when deploying on the server (and ultimately delivered to the web browser client).

:cljsbuild
  { :builds [ { :source-paths ["src/cljs" "test" ]
                :compiler {:output-to "target/cljs/wstestable.js"
                         :optimizations :whitespace :pretty-print true}}
              {:id "prod" :source-paths ["src/cljs"]
                :compiler {:output-to "target/cljs/books_cljs.js"
                         :optimizations :whitespace :pretty-print true}}]
     :test-commands { "phantom-ws" [ "target/cljs/wstestable.js"]}
  }

Compiling Multiple Targets
With the above configuration we can compile just the "prod" (production) target with the following command line:

$dev> lein cljsbuild once prod

Or, if we want our code compiled each time we save changes to disk, we replace "once" with "auto":

$dev> lein cljsbuild auto prod

If we want to build both targets, just remove "prod" from the command line. For example to compile both targets just once:

$dev> lein cljsbuild once

Writing the tests.
In general these tests are much closer to the definition of "unit tests". The functions we are testing are very small. For example, consider a test of addAuthor[]. Let's look at the addAuthor source code:

(defn addAuthor
  ([id first_name last_name ]  
    (swap! AuthorList conj (Author. id first_name last_name )))
  ([jsonObj] (swap! AuthorList conj (jsonToAuthor jsonObj)) )
)

The above function can be called 2 different ways:

  1. With 3 parameters: an id, the author's first name, and the author's last name
  2. With 1 parameter: a JSON object
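
A test for both call forms might look like the following. Because clojurescript.test is a port of clojure.test, this sketch uses clojure.test on the JVM so that it can be run at a plain Clojure REPL. The `Author` record, the `AuthorList` atom, and the use of a Clojure map in place of a JSON object (converted with the record's generated `map->Author` instead of `jsonToAuthor`) are stand-ins for the project's own definitions.

```clojure
(require '[clojure.test :refer [deftest is run-tests]])

;; Stand-ins for the project's definitions (assumptions for this sketch).
(defrecord Author [id first_name last_name])
(def AuthorList (atom []))

(defn addAuthor
  ([id first_name last_name]
   (swap! AuthorList conj (Author. id first_name last_name)))
  ;; A map stands in for the JSON object; map->Author replaces jsonToAuthor.
  ([m] (swap! AuthorList conj (map->Author m))))

(deftest add-author-test
  (reset! AuthorList [])                        ; start from a known state
  (addAuthor 1 "Mark" "Twain")                  ; 3-parameter form
  (addAuthor {:id 2 :first_name "Jane" :last_name "Austen"}) ; 1-parameter form
  (is (= 2 (count @AuthorList)))
  (is (= "Twain" (:last_name (first @AuthorList))))
  (is (= (Author. 2 "Jane" "Austen") (second @AuthorList))))

(run-tests)
```

In the ClojureScript version, the same `deftest` and `is` forms are provided by the library's `cemerick.cljs.test` namespace.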

ClojureScript - Single Page Application - A simple approach

Article Summary
This article is part of a series. We use the "book club" project to explore various programming languages and frameworks. The book club's business and data requirements are described in a prior article, "Leveraging Ruby on Rails and ClojureScript".

This article details enhancements to our client tier implementation using ClojureScript. ClojureScript is a compiler which takes Clojure source code and emits (generates) JavaScript. Please see our prior article Leveraging ClojureScript for details on ClojureScript, as well as details on the client's business requirements.

In this installment, we add create, update and delete (CRUD) functionality to our client.

Source code and build files are located on GitHub, at the book-site-clojurescript repository.

Single Page Application (SPA)

The JavaScript client dynamically renders web pages in the web browser. The JavaScript client is referred to as a "single page app".

To the user, the application appears to be displaying multiple web pages. However, technically, the web browser has loaded one single html document (and JavaScript engine instance).

In a traditional web application, the server generates html pages. Each time the web browser requests a new "page", the web browser loads a new instance of the JavaScript engine. On a desktop web browser, the traditional architecture works fine.

SPA and Mobile Devices

On a mobile web browser, loading a new instance of the JavaScript engine is not a trivial task. A SPA loads a single JavaScript engine (once). Thus, a SPA is beneficial on a mobile web browser.

In addition, a SPA requires less traffic going across the network.

On a mobile client, both of these optimizations (reduced network traffic, a single JavaScript engine instance) are worth considering.

HTML Document Structure

Our html layout is set with a mobile display in mind. We are using the Pure CSS framework to provide a responsive design.

We added a new area of our UI (at the top portion of the display screen). I called the new area (a div) "entity-menu". Entity menu contains a child "ul", identified as "entity-menu-details". The application changes the contents of "entity-menu-details" as the user navigates through the application.

<body>
  <header id="header-container" class="header" style="text-align:center;">
    <h4 id="menu-header"></h4>
    <div class="pure-menu pure-menu-open pure-menu-horizontal" id="entity-menu">
      <ul id="entity-menu-details">
      </ul>
    </div>
  </header>

DOM Event Listeners

As in the prior version of the client we load the event listeners once. The event listeners target a container element. The container element and the corresponding listener are both maintained throughout the application's life-cycle.

The strategy is to keep the task of managing listeners simple. The number of listeners should remain constant.

We load all of the event listeners when the application starts. In a web browser, the "document ready" event signals the application start. The following snippet demonstrates.

(defn doc-ready-handler []
  (let[ ready-state (. js/document -readyState)]
    (if (= "complete" ready-state)
      (do
        (add-router)
        (do-ajax "GET" "../rest/export/book/all" ajax-response-handler)
        (do-ajax "GET" "../rest/export/authors/all" ajax-response-handler)
        (do-ajax "GET" "../rest/export/review/all" ajax-response-handler)
        (do-ajax "GET" "../rest/export/category/all" ajax-response-handler)
        (do-ajax "GET" "../rest/export/book_category/all" ajax-response-handler)
      ))))

(defn on-doc-ready []
  (aset js/document "onreadystatechange" doc-ready-handler ))

(on-doc-ready)

The last statement (on-doc-ready) is our application entry point.

(on-doc-ready) invokes the on-doc-ready[] function. The on-doc-ready[] function simply assigns the doc-ready-handler[] function as the DOM listener for "ready state change".

Once the Document's ready state is "complete", the add-router[] function is called.

(defn add-router []
  (let [ report (by-id "report")
          menu (by-id "footer-menu")
          entity-menu-details (by-id "entity-menu-details")]
          (.addEventListener report "click" list-click-listener true)
          (.addEventListener menu "click" menu-listener)
          (.addEventListener entity-menu-details "click" entity-menu-listener) ))

The add-router[] function loads our event listeners.

REST-MVC using Java

Article Overview

This article is part of a series. We use the "book club" project to explore various programming languages and frameworks. The book club's business and data requirements are described in a prior article, "Leveraging Ruby on Rails and ClojureScript".

This article details a new server tier implementation. We employ a stack of Java libraries and tools. Source code and build files are located on GitHub, at the book-site-jpa repository.

Server Tier Software Stack

Spring

Spring was originally designed as a lightweight alternative to J2EE. Thus, Spring provides the "big picture" benefits of J2EE. In particular, you can change your application components via Spring's configuration (I.E. instead of changing your application's source code).

Let's look at the historical perspective before going to some of Spring's details.

J2EE was designed for large applications and large organizations. Hence the term "enterprise". When J2EE was first introduced, it defined several development roles within an organization (see Chapter 3 - Applying Enterprise JavaBeans). An "application assembler", "deployer" and "system administrator" combined to serve as an "application server administrator": a role, or group, whose responsibility is similar to that of a database administrator (DBA). Instead of administering a database, these developers focus on solving an application's business logic by configuring the J2EE components and settings.

Managing Spring's configuration is much simpler than what the original J2EE specification required. The latest J2EE specification has adopted many of its simplifications from Spring.

Annotation versus XML configuration settings.

Actually, I shouldn't use the word "versus". Almost all of Spring's settings can now be performed through annotations. In addition, you can use both XML and annotated settings together. If your shop is large enough to have a dedicated "application server administrator", you may prefer to keep settings in XML files. XML files are external to the Java source code files. Thus, an "administrator" can tweak the application without touching the source (when settings are stored in XML). Conversely, if your shop combines development with operations, then you may prefer storing settings in an annotated form.

Note! The same is true for other components of our stack (Hibernate, Jersey, JPA, etc). We can define settings in both annotations and XML.

Data Persistence

We'll use Hibernate to provide a transaction manager and map our database schema to Java objects. Java Persistence API (JPA) adds a vendor agnostic API on top of Hibernate.

Spring allows us to annotate all the services together. Spring allows us to inject transaction manager services and JPA repositories (classes which provide basic CRUD functionality).

H2 Database Engine

We use the H2 Database engine in "server mode" as our system database. H2 is very lightweight and easy to set up.

Business Logic

Spring MVC provides a framework for our business logic. We locate most of our business logic in Spring Controllers.

Views
Spring's MVC provides a framework for many different types of views. For the book club, we are using web page views (SPRING MVC + JSP/HTML/CSS etc.). Most of the web page artifacts are generated by the server dynamically.

Note! We also implement web pages which are generated dynamically by the web browser (client). We implement a set of Spring Controllers which communicate via JSON with a RESTful interface. See "Jersey JAX-RS/JSON" below.

Web Server - Servlet Engine

For the book club, we are using Jetty, specifically version 9.0.4, as our web server and servlet engine.

CSS Framework

We also use the Pure CSS framework to lay out the web page views.

Jersey JAX-RS/JSON

We use Jersey JAX-RS to read JSON data sent by our ClojureScript client. We also use Jersey to provide a JSON representation of Java objects when preparing a response (to the client).

Leveraging ClojureScript.

This article is part of a series. The previous article Leverage Ruby on Rails and ClojureScript detailed the project's business requirements. In this article I'll detail the application's mobile client implementation. The mobile client implementation is written in ClojureScript.

Complete source code is available on my GitHub repository here.

What is ClojureScript?

Clojure is a Lisp dialect. Clojure's default implementations run on the Java Virtual Machine (JVM) and Microsoft's CLR.

ClojureScript is a compiler, written in Clojure. The ClojureScript compiler emits (generates) JavaScript.

The programmer writes client code in ClojureScript. The programmer compiles the Clojurescript. The ClojureScript compiler generates JavaScript.

Why ClojureScript

A popular theory holds that the fewer source code statements a program contains, the lower the chance of bugs and the more functionality the program can provide (in other words, the application can be more complex). Clojure and ClojureScript expressions are very short and concise. Let's look at an example.

The following code adds an HTML document ready listener. The ready listener then sends a set of asynchronous HTTP requests to our server. Each request to the server is assigned a "handler function" (here the same function, ajax-response-handler, for every request). The handler function (not shown here) parses the server's response.

(defn doc-ready-handler []
  (let[ ready-state (. js/document -readyState)]
    (if (= "complete" ready-state)
      (do
        (add-router)
        (doget "GET" "/books" ajax-response-handler)
        (doget "GET" "/authors" ajax-response-handler)
        (doget "GET" "/reviews" ajax-response-handler)
        (doget "GET" "/categories" ajax-response-handler)
        (doget "GET" "/book_categories" ajax-response-handler)
      ))))

(defn on-doc-ready []
  (aset  js/document "onreadystatechange" doc-ready-handler ))

(on-doc-ready)

The first thing you might note is that I am using vanilla JavaScript services. There are many high level JavaScript libraries which ease the development process. Libraries such as jQuery and Backbone have plenty of use cases. The ClojureScript community has already contributed libraries that provide calls to jQuery from ClojureScript.

One of the main focuses of this project is the learning experience. I am not averse to high level libraries. I'm using vanilla JavaScript here to follow the "walk then run" philosophy.

Translation Please
Let's translate the above ClojureScript code.

  • "defn" defines a function. We defined 2 functions, doc-ready-handler and on-doc-ready.
  • "aset" assigns a value to a property of an object. In the function "on-doc-ready", we assign a "ready state" listener to our document.
  • "js/document" is the notation for our document. "js/" designates a namespace. In this case, the namespace is js (vanilla JavaScript). We are instructing ClojureScript to issue a vanilla JavaScript instruction.
  • "doget" is a call to a custom function (shown below).

The "doc-ready-handler" function is called by our document several times as the document is loaded into the web browser. The "doc-ready-handler" function reads the "readyState" document property to determine whether the document is "ready". In other words, this is the same functionality as jQuery's ready() function.

The last expression, (on-doc-ready), runs the function "on-doc-ready".

We are calling a function "doget" several times. That function sends an asynchronous HTTP GET request to the server. The function sets the GET request headers so that the server responds with a JavaScript Object Notation (JSON) response. Finally, the function assigns a listener callback function ("handler-function"). "handler-function" processes the server's response when it arrives. Here is my implementation:

(defn doget [request-type url handler-function ]
  (let [x  (js/XMLHttpRequest.)  ]
    (aset  x "onreadystatechange" handler-function )
    (.open x request-type url)
    (.setRequestHeader x "Content-Type" "application/json" )
    (.setRequestHeader x "Accept" "application/json" )
    (.send x)))

Leveraging Ruby on Rails

This article is part of a series. The prior article, "Leveraging Ruby on Rails and ClojureScript", detailed the project's business and data requirements.
In this article, I'll detail the server implementation using Ruby on Rails, Version 4 (ROR) and SQLite (database server).

Note! Full source on my GitHub repository.

Customizing and Optimizing the Database Schema

ROR provides a database abstraction layer. One of the purposes of a database abstraction layer is to provide a single interface for a diverse set of database engines. However, database engines tend to vary. Each database server and version may have unique features. Thus, ROR's database abstraction layer has limits as to what it can do. ROR's database abstraction layer provides a great foundation. The developer has the option to customize and extend that foundation. I'll demonstrate how to customize the schema instructions generated by ROR.

One of the features of ROR is boilerplate code generation. For the application database, ROR has an option to define database schemas and validations. ROR also has an option to store changes to the database schema. Either defining or changing a database schema is referred to as "Migration" in ROR terminology.

For this application's requirements, we'll use the "Migration" files generated by ROR. However, we'll customize the contents of the "Migration" files to meet our detailed requirements. In other words, we will edit the "Migration" files generated by ROR.

Unique Records

One of our detailed requirements is that the database prevents duplicate records. For example, we don't want to have two authors with the name "Mark Twain". We could set the application to enforce that rule. However, it is better for both performance and architecture to set those rules in the database schema itself.

Note! We are using ROR's default database engine SQLite. You must verify your SQLite installation is built with foreign key operation support.

For reference as to why we are setting the unique rules in the database schema, note the recommendation in the ROR Guide, Section 1.1 Why Use Validations:

" it may be a good idea to use some constraints at the database level. Additionally, database-level validations can safely handle some things (such as uniqueness in heavily-used tables) that can be difficult to implement otherwise."

Leveraging Ruby on Rails and ClojureScript.

Our upcoming sample project focuses on application frameworks and compilers.

Ruby on Rails (ROR) is an application framework written in Ruby.

ClojureScript is a compiler which emits (generates) JavaScript. The ClojureScript compiler is written in Clojure.

Our project will focus on a couple of customizations.

  1. We'll define database integrity rules in the database server (versus the application code).
  2. We'll verify that our JavaScript client code marks nodes removed from the document as available for deletion. In other words, we will verify that after we remove an html element from the document, its underlying object is empty. We will verify the web browser's garbage collector removes the nodes from memory.

As with other projects in this series, this project is primarily a learning device. We will build a useful application. However, the main benefit is to learn technology.

Business Requirements

One of my hobbies is visiting libraries, used book stores, and reading. I want to carry a list of books, authors and categories (E.G. science fiction) with me on my mobile phone. I'd also like to refer to book reviews. Many folks who work in libraries or book stores are great sources for book and author recommendations. Communicating your prior reviews helps folks recommend new books and authors.

Application Architecture

Server will run on a Local Area Network (LAN). That means, access is provided by my LAN. We won't implement application layer authentication and authorization for the initial release.

Our application provides two user interfaces (UI), each with its own user experience (UX).

  1. Full size web browser. Supports create, report, update, and delete (CRUD).
  2. Mobile web browser. Supports reports.

Leveraging Clojure

Project Enhancements
This article is part of a series where we build and enhance the "mobile site generation" project. Each installment in this article series looks at a different computer language. A quick recap of our project: the "mobile site generation" project is a command line utility. The command line utility reads an RSS XML document and then generates a custom website. The custom website is viewable on mobile devices. You can read more details in previous articles. In this installment, we:

  • Add new navigation links.
    • The logo header now contains a link to our table of contents (index.html) page.
  • Add a meta-tag to the html source code. The meta-tag is a list of keywords. The keywords are used by search engines like Bing and Google to categorize the website. We dynamically build the keyword list based on the RSS XML "category" tags.
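
As a rough sketch of the keyword step, the function below gathers the categories of every article, removes duplicates, and renders one meta-tag. The `:categories` key matches the article maps used later in this article; the function name `keywords-meta-tag` and the sample data are assumptions, and HTML escaping is omitted for brevity.

```clojure
(require '[clojure.string :as string])

(defn keywords-meta-tag
  "Builds a keywords meta-tag from the categories of all articles."
  [articles]
  (let [keywords (distinct (mapcat :categories articles))] ; keeps first-seen order
    (str "<meta name=\"keywords\" content=\""
         (string/join ", " keywords)
         "\">")))

(keywords-meta-tag
  [{:title "Leveraging Clojure" :categories ["Polyglot" "Clojure"]}
   {:title "Leveraging Ruby"    :categories ["Polyglot" "Ruby"]}])
;; => "<meta name=\"keywords\" content=\"Polyglot, Clojure, Ruby\">"
```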

Clojure
In this article we implement and enhance our project application with Clojure.

Clojure is a dialect of Lisp. Like Scala, Clojure's default implementation runs on the Java Virtual Machine (JVM). Like Scala, there is also an implementation that runs on Microsoft's .NET CLR.

Concise Coding
Clojure fits the dictionary description of the word concise. Clojure expressions are very brief, yet very comprehensive.

To compensate for Clojure's brevity, I added a fair amount of comments to my source code. Writing Clojure code reminded me of writing "C" expressions. In "C" you can write a function that returns a pointer to a function which returns an array of pointers to structures. When you are writing such expressions, the ideas are fresh in your mind. But, if you haven't coded in "C" for a period of time, then you need an explicit explanation of those same, concise "C" expressions. Thus, I did the same for my Clojure code. I added explicit explanations to the Clojure source code. Along the same lines, I formatted the Clojure code in an outline mode. Probably not the typical format for a Clojure project. However, for folks following this series, I felt it would be easier to compare corresponding functionality (expressed in Java, Scala and Ruby).

Functional Programming
Both Scala and Clojure are referred to as "Functional" computer languages. "Functional" as opposed to "Object Oriented" (E.G. Java, Ruby), or "Procedural" (E.G. "C", Basic). In an Object Oriented language like Java, you can compose an object tree, parent and child objects (E.G. inner classes). In a "Functional" language you express function trees (E.G. higher level functions that contain either named or anonymous child-like functions).

For example, the following code listing includes 2 anonymous functions. Those functions are defined inside the definition of a larger function (not shown here, see function "main-process"). The whole expression iterates first through a list of articles (as represented by "nodes"). Each article contains one to many categories. The expression then iterates through the list of categories. The inner expression returns true if any of the categories match the criteria, "is equal to the word Polyglot". If the inner expression is true, the article is included in the collection represented by the variable "poly-articles".

To summarize, the expression compiles a list of articles that have been categorized as "Polyglot".

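(let [poly-articles
      (filter
        (fn [n]                                       ; outer anonymous function: one article
          (some #(= "Polyglot" %) (:categories n)))   ; inner anonymous function: one category
        nodes)]
  ;; the rest of the let body is not shown in this excerpt; see "main-process"
  poly-articles)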

Data Symbols
Note! In most computer languages, a symbol representing data is referred to as a "variable" (E.G. Java Integer myNum = 1;). I'll use the term "variable" loosely, just to make the code explanation a little more familiar.

However, there is an important distinction. In Clojure, and in "Functional Programming", the data sent into a function (I.E. a parameter) is not mutable. The data parameter does not change. The data parameter does not "vary". Thus the term "variable" doesn't quite fit.

Again, I'll use the term "variable" here, only, because most programmers understand "variable" means "data symbol".
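
A small illustration of that distinction, using only core Clojure. The names `add-category`, `original`, and `updated` are made up for this example.

```clojure
;; conj does not modify its argument; it returns a new vector.
(defn add-category [categories category]
  (conj categories category))

(def original ["Polyglot" "Clojure"])
(def updated  (add-category original "Datomic"))

original ; => ["Polyglot" "Clojure"]  (unchanged by the call)
updated  ; => ["Polyglot" "Clojure" "Datomic"]
```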

Leveraging Ruby

This article is part of a series where we build and enhance the "mobile site generation" project. Each installment in this article series looks at a different computer language.

Project Summary
The "mobile site generation" project is a command line utility. The command line utility generates a section of my website which can be viewed on a mobile web browser. Please refer to previous installments of this series for more details.

Why Specifications Matter
This is the third version of my "mobile site generation" project. The first version was written in Java, the second version in Scala, and now, the third version in Ruby. The processes, data structures, data flow and basic logic have remained constant through all 3 versions. These constants are typically recorded in the "system specification". The system specification describes "what" the project does and "how" the project does it. The system specification can be expressed in many different forms (E.G. a formal document, diagrams, even test plans). The point is, the system specification has some value and it exists independent of the computer language used to implement the system.

Ruby
Ruby is the computer language name. "Ruby on Rails" is an application framework. A good portion of "Ruby on Rails" is written with Ruby (the computer language). However, "Ruby" and "Ruby on Rails" are not the same. In this article series, we are looking at computer languages (not application frameworks). Thus, this article is about "Ruby" the computer language (not "Ruby on Rails" the application framework).

Ruby's Elegance
Ruby's syntax and program statements are concise (short). As in Scala, we can use a line break to delimit statements. Ruby delimits functions with a simple "def" and "end". Again, as in Scala, your Ruby functions (methods) don't require an explicit "return" instruction. The function returns the result of its last statement.

I used a SAX parser and SAX event handler to parse the website's RSS XML document. I used a Ruby Gem (I.E. library) LibXml. As an example of how elegant and concise Ruby is, here is my SAX event handler implementation:
