Friday, December 25, 2009

How to write a Clojure reader macro, part 2

This is a follow-up to my earlier post about writing a reader macro. Here's the disclaimer again.


DISCLAIMER


I completely agree with Rich's decision to NOT support reader macros. In an activity as simple as writing this post, I found many, many places to make a mistake the hard way. This is an extremely difficult activity to get right, and the end result is something that is not quite the same as normal Clojure. Use the following information at your own risk.



Okay, now that that's been said, let's get on with it. We're going to write a multi-character delimited reader macro this time. Since I'm a point free junkie, we're going to use partial for our example.

Here's what the final use case is going to be:

user=>#[+ 1]
#< core$partial ...>

Let's start modifying LispReader.java again. The first thing we're going to do is insert a static symbol:

//Inserted at line 40
static Symbol INTERPOLATE_S = Symbol.create("clojure.core", "partial");

Now, let's take a look around line 84. You'll see the following entry in the array:

macros['#'] = new DispatchReader();

The # character is bound to a DispatchReader. This is the object Clojure uses to implement multiple character reader macros (ever notice that they all start with #?). You'll also notice that there is a dispatchMacros array with several entries in it. Add the following entry:

dispatchMacros['['] = new PartialReader();

We also need to define the PartialReader class. It is based on the VectorReader class, which can be found around line 994.



The heavy lifting is done by the readDelimitedList method. Note that the closing delimiter needs to be provided, and the recursive flag should be set to true. It returns an IPersistentList object. The only thing that needs to be done is to prepend a partial to the list. That is why the cons method is used to add a partial symbol (you still need classic macro-fu).
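The effect of that cons step can be imitated at a REPL without touching the reader. This is only a sketch in plain Clojure: forms-read stands in for whatever the reader collects between #[ and ], and the names forms-read, expanded and add-one are made up for illustration.

```clojure
;; Stand-in for the forms read between #[ and ] in "#[+ 1]" (hypothetical name).
(def forms-read '(+ 1))

;; Prepend the fully qualified partial symbol, mirroring the cons call described above.
(def expanded (cons 'clojure.core/partial forms-read))

;; Evaluating the expanded form gives an ordinary partial function.
(def add-one (eval expanded))

(map add-one [1 2 3])
```

The expanded form is just the list (clojure.core/partial + 1), which is why the result behaves like any other call to partial.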

We've added everything we need to add to LispReader.java. All that's left to do is recompile clojure.jar and test the results:

user=>(map #[+ 1] [1 2 3])
(2 3 4)

Of course, now that I think about it, comment might be a better symbol to use than partial...

That's how you add a delimited reader macro. Next time we'll look at creating a new dispatch character, and properly escaping everything.

Thursday, December 17, 2009

Proposal for 1.2 : New Seq Utilities

Time to brainstorm everyone!

I've been trying to come up with new sequence functions. Of course, there's nothing really new about these at all. They just aren't in core or c.c.seq-utils. Do any of these belong in contrib or core?

alternate [pred coll]
Returns an alternating collection. Similar to a regex partition.

split [pred coll]
Returns a collection of collections, split by matching the predicate.

take-last [n coll]
The mirror of drop-last.

take-until [pred coll]
Returns a lazy sequence of successive items from coll while (pred item) returns false. pred must be free of side-effects.

drop-until [pred coll]
Returns a lazy sequence of the items in coll starting from the first item for which (pred item) returns true.

rotate [n coll]
Takes a collection and rotates it left n steps. If n is negative, the collection is rotated right. Executes in O(n) time.

rotate-while [pred coll]
Rotates a collection left while (pred item) is true. Will return an unrotated sequence if (pred item) is never true. Executes in O(n) time.

rotate-until [pred coll]
Rotates a collection left while (pred item) is false or nil. Will return an unrotated sequence if (pred item) is always true. Executes in O(n) time.
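To make a few of these descriptions concrete, here is a minimal sketch of how take-until, drop-until and rotate might be written. These are not the definitions from the linked code; the bodies are assumptions built only from the descriptions above.

```clojure
;; Assumed implementations, written from the prose descriptions only.

(defn take-until
  "Lazy seq of items from coll up to the first item for which (pred item) is true."
  [pred coll]
  (take-while (complement pred) coll))

(defn drop-until
  "Lazy seq of items from coll starting at the first item for which (pred item) is true."
  [pred coll]
  (drop-while (complement pred) coll))

(defn rotate
  "Rotates coll left by n steps; a negative n rotates right."
  [n coll]
  (let [v (vec coll)
        c (count v)]
    (if (zero? c)
      ()
      (let [k (mod n c)]               ; mod keeps k in [0, c) even for negative n
        (concat (drop k v) (take k v))))))

(take-until even? [1 3 4 5])   ; => (1 3)
(drop-until even? [1 3 4 5])   ; => (4 5)
(rotate 2 [1 2 3 4 5])         ; => (3 4 5 1 2)
(rotate -1 [1 2 3 4 5])        ; => (5 1 2 3 4)
```

Note that this rotate realizes the whole collection, so it only makes sense for finite input.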

Anyway, I define all of these functions and more here:


Do others have ideas for missing seq utilities?

Tuesday, December 15, 2009

How to write a Clojure reader macro

This is an article on how to write a basic reader macro in Clojure.


DISCLAIMER


I completely agree with Rich's decision to NOT support reader macros. In an activity as simple as writing this post, I found many, many places to make a mistake the hard way. This is an extremely difficult activity to get right, and the end result is something that is not quite the same as normal Clojure. Use the following information at your own risk.


The first thing to identify is a behavior that you would like to have a reader macro for. For our example, I am going to use a modified form of Chas Emerick's amazing string interpolation macro. You can find his original article here. I took a modified version of his code and placed it in core.clj (be sure to create a new git branch). The code I used is below.



Now that the desired functionality is in core, it is time to modify the reader. In this case we need to modify the file LispReader.java. I defined a static variable INTERPOLATE_S, and I am going to assign it the "|" reader macro.

//Inserted at line 40
static Symbol INTERPOLATE_S = Symbol.create("clojure.core", "interpolate-s");

Now, in order for this to be found we need to make an entry in the macros array. This can be done like so:

//Inserted at line 86
macros['|'] = new WrappingReader(INTERPOLATE_S);

The WrappingReader class takes a Symbol object, and wraps it around the next form that is read. Recall how the following form

@a-ref

is expanded to (deref a-ref). In our case

|"A string ~(+ 2 2)"

will be expanded to

(interpolate-s "A string ~(+ 2 2)")
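The wrapping behaviour is easy to observe with the stock reader. The sketch below reads @a-ref as data to show the deref expansion, then imitates the same wrapping for the new macro with a hypothetical helper, wrap-form, which is not part of Clojure or LispReader.

```clojure
;; The built-in @ reader macro wraps the next form in clojure.core/deref.
(def deref-form (read-string "@a-ref"))
;; deref-form is the list (clojure.core/deref a-ref)

;; Hypothetical helper imitating the wrapping: put a symbol in front of one form.
(defn wrap-form [sym form]
  (list sym form))

;; What the reader would hand to the compiler for |"A string ~(+ 2 2)"
;; (symbol left unqualified here, as in the text above).
(def interpolate-form (wrap-form 'interpolate-s "A string ~(+ 2 2)"))
```

Nothing is evaluated at this stage; the reader only produces the list, and the compiler later treats it like any hand-written call.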

Let's rebuild clojure.jar and try this out at a REPL.



As you can see this works just like Chas' macro. There are still a few things that need to be covered, such as:

* How to create a multiple character reader macro
* How to create a delimited reader macro

These will be topics for another day.

Tuesday, December 8, 2009

Uses for take/drop-while

I've found a use for take-while & drop-while in a map.

To start, let's define the following:

user=>(defn sort-map [m] (apply sorted-map (apply concat m)))

That's right. A function to convert hash-maps to tree-maps (sorted maps).

user=>(sort-map {3 :a 1 :c 2 :b})
{1 :c, 2 :b, 3 :a}

Why would I do that? Because I now have a way of making subsets based on the keys.

user=>(keys-pred take-while #(< % 2) (sort-map {3 :a 1 :c 2 :b}))
{1 :c}

user=>(keys-pred drop-while #(< % 2) (sort-map {3 :a 1 :c 2 :b}))
{2 :b, 3 :a}
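keys-pred is not defined in this post. A minimal sketch of what it could look like follows, assuming it applies a sequence function to the map entries with the predicate composed onto key and pours the result back into a map. The body is a guess; only the name and the calling convention come from the calls above.

```clojure
(defn sort-map [m] (apply sorted-map (apply concat m)))

;; Hypothetical definition: f is a seq function such as take-while or drop-while,
;; pred is tested against each entry's key, and the result is rebuilt as a map.
(defn keys-pred [f pred m]
  (into {} (f (comp pred key) m)))

(def m (sort-map {3 :a 1 :c 2 :b}))

(keys-pred take-while #(< % 2) m)   ; => {1 :c}
(keys-pred drop-while #(< % 2) m)   ; => {2 :b, 3 :a}
```

Sorting first is what makes take-while and drop-while meaningful here, since they depend on the order in which the entries are visited.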

So, when would this have an application?




Let's define the following map



And there you go, a use for take/drop while with maps :)

Monday, November 23, 2009

Feedback for a visitor closure generator

Multimethods are a great abstraction in Clojure. They are great for adapting a wide variety of inputs to a function. However, I often need to decouple the adaptation of data from the function that operates on the adapted data. Also, after the work is complete, I need to adapt the data back to the original format.

As such, I've found the need to create visitor closures in my code. Along the way I came up with a helper function to generate visitor closures. I'd like feedback on the visitor generator.

* Getting Motivated: String functions *

I write the following code a lot to deal with keywords & symbols:

(keyword (some-str-fn (name a-keyword) str-fn-args))
(symbol (some-str-fn (name a-symbol ) str-fn-args))

Each of these can be abstracted out as:

(defn visit-keyword
  [str-fn a-keyword & args]
  (keyword
    (apply str-fn (name a-keyword) args)))

(defn visit-symbol
  [str-fn a-symbol & args]
  (symbol
    (apply str-fn (name a-symbol) args)))

So, the code is now called as follows:

(visit-keyword str-fn a-keyword args)
(visit-symbol str-fn a-symbol args)

This is one possible implementation of a visitor pattern for symbols, keywords & strings.
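For a concrete feel, here are the two helpers applied to real string functions. The definitions are repeated from above so the snippet runs on its own; the inputs are made up, and clojure.string is assumed to be available (it arrived in Clojure 1.2, after this post was written).

```clojure
(require '[clojure.string :as string])

(defn visit-keyword
  [str-fn a-keyword & args]
  (keyword (apply str-fn (name a-keyword) args)))

(defn visit-symbol
  [str-fn a-symbol & args]
  (symbol (apply str-fn (name a-symbol) args)))

(visit-keyword string/upper-case :foo)   ; => :FOO
(visit-keyword str :foo "-bar")          ; => :foo-bar
(visit-symbol str 'foo "-bar")           ; => foo-bar
```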

* More Problems: Map Functions *

It's common to filter on values or keys when working with maps. I've written a lot of code like this:

;keys
(into {} (some-pred-fn (comp a-pred key) map-fn-args))
;values
(into {} (some-pred-fn (comp a-pred val) map-fn-args))

A first approach would be to write a specific helper function for each case. However, this can be turned into a visitor as well.

(defn visitor-keys-pred
  [pred-fn pred-arg & args]
  (into {}
    (apply pred-fn (#(comp % key) pred-arg)
      args)))

(defn visitor-vals-pred
  [pred-fn pred-arg & args]
  (into {}
    (apply pred-fn (#(comp % val) pred-arg)
      args)))

The definition looks a little funky, but we'll see why in a minute. For now the functions can be called as follows:

(visitor-keys-pred pred-fn pred-arg args)
(visitor-vals-pred pred-fn pred-arg args)
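As a quick check, here are both helpers run against filter and remove. The definitions are copied from above; the sample map is made up for illustration.

```clojure
(defn visitor-keys-pred
  [pred-fn pred-arg & args]
  (into {} (apply pred-fn (#(comp % key) pred-arg) args)))

(defn visitor-vals-pred
  [pred-fn pred-arg & args]
  (into {} (apply pred-fn (#(comp % val) pred-arg) args)))

(def sample {1 10, 2 15, 3 20})

(visitor-keys-pred filter odd? sample)    ; keeps entries whose key is odd
(visitor-vals-pred filter even? sample)   ; keeps entries whose value is even
(visitor-vals-pred remove even? sample)   ; drops entries whose value is even
```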

* Putting it together: Visitor Closures *

As you can see, a pattern is starting to develop. Each visitor function has a similar signature:

[f first-arg & rest-args]

This obviously becomes

[f & args]

Each visitor follows a similar pattern, too.
1. Apply a visit-fn to get to a common base type
2. Do work on the base type
3. Apply a return-fn to get back to the original type.

We can turn this description into a closure generating function:

(defn visitor
  "Used to implement visitor patterns. (first args) is transformed by visit-fn before f is applied, and the result is wrapped in a return-fn call."
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn
      (apply f
        (visit-fn (first args))
        (rest args)))))

We now have a new way to define previous visitor functions. Here's what they look like using this new helper function:

(def visit-keyword (visitor name keyword))
(def visit-symbol (visitor name symbol ))

;Works w/ predicate functions
(def visit-keys-pred (visitor #(comp % key) (partial into {})))
(def visit-vals-pred (visitor #(comp % val) (partial into {})))

As you can see, the only thing we described is the visit-fn and return-fn. The rest of the behavior is defined by the visitor pattern itself.

* Second form *

The one catch is that visitor assumes that the dispatched data is the first argument to f. Sometimes it is the last argument to f, as in take & drop. It is easy enough to define visitor* that works on the last argument of a function. The definition is in the link.

http://gist.github.com/241144
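For readers who don't follow the link, a last-argument variant could look roughly like this. It is a sketch written from the description above, not the gist's definition, and visit-chars is a made-up example name.

```clojure
;; Assumed shape of visitor*: same idea as visitor, but the visited
;; data is the LAST argument to f instead of the first.
(defn visitor*
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn
      (apply f
        (concat (butlast args)
                [(visit-fn (last args))])))))

;; Hypothetical example: treat a string as a seq of characters,
;; then glue the result back into a string.
(def visit-chars (visitor* seq (partial apply str)))

(visit-chars take 3 "abcdef")   ; => "abc"
(visit-chars drop 2 "abcdef")   ; => "cdef"
```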

* Back to Multimethods *

The power of the individual closures can be amplified when wrapped in a multimethod. Consider our String/Symbol/Keyword group.

(defmulti visit-string (fn [& args] (class (second args))))

(defmethod visit-string clojure.lang.Symbol
[f & args]
(apply visit-symbol f args))

(defmethod visit-string clojure.lang.Keyword
[f & args]
(apply visit-keyword f args))

(defmethod visit-string :default
[f & args]
(apply f args))

We now have a visit-string function that can truly visit anything. Our function calls above become:

(visit-string str-fn a-keyword args)
(visit-string str-fn a-symbol args)
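Pulled together, the whole chain can be tried at a REPL. The sketch repeats the definitions so it stands alone and assumes the multimethod dispatches on the class of the visited argument, which is what the Symbol and Keyword methods expect; clojure.string is assumed to be available.

```clojure
(require '[clojure.string :as string])

(defn visitor
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn (apply f (visit-fn (first args)) (rest args)))))

(def visit-keyword (visitor name keyword))
(def visit-symbol  (visitor name symbol))

;; Dispatch on the class of the data being visited (the argument after f).
(defmulti visit-string (fn [& args] (class (second args))))

(defmethod visit-string clojure.lang.Symbol
  [f & args]
  (apply visit-symbol f args))

(defmethod visit-string clojure.lang.Keyword
  [f & args]
  (apply visit-keyword f args))

;; Anything else (plain strings, for instance) is passed straight to f.
(defmethod visit-string :default
  [f & args]
  (apply f args))

(visit-string string/upper-case :foo)    ; => :FOO
(visit-string string/upper-case 'foo)    ; => FOO
(visit-string string/upper-case "foo")   ; => "FOO"
```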

* Closing questions *

So now that you've seen how & why this works, I've got some questions for the group:

* Did I accidentally duplicate functionality in core?
* How can this be more flexible?
* What maintenance problems are there that I don't see?
* Will speed be an issue?
* Is the signature of visitor sensible? Is the signature of the generated function sensible? How could it be better?

Thanks in advance,
Sean

Wednesday, September 30, 2009

Short circuiting reductions

Sometimes reduce needs to short circuit.


(defn reducer
  "Returns a reduction closure that terminates when pred is false. Applies f to the last value that returns true. f defaults to identity if not provided. Behaves like reduce if pred is always true."
  ([pred] (reducer pred identity))
  ([pred f] (do-stuff:TBD)))
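The body above is still a stub. As a separate illustration of the idea, and not a completion of reducer, here is a minimal sketch of a short-circuiting reduction. It relies on reduced, which only exists in Clojure 1.5 and later (well after this post), and reduce-while is a hypothetical name with a signature of my own choosing.

```clojure
;; Hypothetical helper: reduce coll with f, but stop as soon as the next
;; accumulated value fails pred, returning the last value that passed.
(defn reduce-while
  [pred f init coll]
  (reduce (fn [acc x]
            (let [nxt (f acc x)]
              (if (pred nxt)
                nxt
                (reduced acc))))   ; reduced ends the reduction early
          init
          coll))

;; Sum numbers while the running total stays below 10.
(reduce-while #(< % 10) + 0 [1 2 3 4 5 6])   ; => 6

;; With a predicate that is always true this behaves like plain reduce.
(reduce-while (constantly true) + 0 [1 2 3 4 5 6])   ; => 21
```

Because reduced stops the traversal, this also terminates on an infinite sequence as long as pred eventually fails.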

Wednesday, September 23, 2009

Uses for juxt

The juxtaposition operator can be used in parsing text.

Suppose we have the following text to process:

INSERT TEXT HERE


Consider the following function, which gets information from the clipboard:

(defn bom-parser
  [part-num]
    (map (&
      (juxt last second (constantly part-num))
      (p split #"\s+")
      trim)
      ((& split-lines
        (p gsub #"\"" "")
        (p gsub #"NOT SHOWN" ""))
      (get-clip))))


In the mapping operation you can see a call to juxt. This helps turn a block of text into a list of vectors.
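The juxt step can be tried in isolation. The sketch below assumes that & and p in the function above are shorthand for comp and partial, and uses clojure.string (Clojure 1.2 and later) in place of the helpers in the original. The sample line and part number are invented, since the original input text is not shown.

```clojure
(require '[clojure.string :as string])

;; Hypothetical single-line version of the mapping step: trim the line,
;; split it on whitespace, then pull out the last token, the second token,
;; and the constant part number, all in one pass with juxt.
(defn parse-line
  [part-num line]
  ((comp (juxt last second (constantly part-num))
         #(string/split % #"\s+")
         string/trim)
   line))

(parse-line "ASM-1" "  1 BOLT-10 4 ")
;; => ["4" "BOLT-10" "ASM-1"]
```

juxt calls each function on the same token vector and collects the results in a vector, which is how every line of text ends up as one row.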