Monday, November 23, 2009

Feedback for a visitor closure generator

Multimethods are a great abstraction in Clojure. They are great for adapting a wide variety of inputs to a function. However, I often need to decouple the adaptation of data from the function that operates on the adapted data. Also, after the work is complete, I need to adapt the data back to the original format.

As such, I've found the need to create visitor closures in my code. Along the way I came up with a helper function to generate visitor closures. I'd like feedback on the visitor generator.

* Getting Motivated: String functions *

I write the following code a lot to deal with keywords & symbols:

(keyword (some-str-fn (name a-keyword) str-fn-args))
(symbol (some-str-fn (name a-symbol ) str-fn-args))

Each of these can be abstracted out as:

(defn visit-keyword
  [str-fn a-keyword & args]
  (keyword (apply str-fn (name a-keyword) args)))

(defn visit-symbol
  [str-fn a-symbol & args]
  (symbol (apply str-fn (name a-symbol) args)))

So, the code is now called as follows

(visit-keyword str-fn a-keyword args)
(visit-symbol str-fn a-symbol args)

This is one possible implementation of a visitor pattern for symbols, keywords & strings.
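As a rough sketch of how these helpers behave at the REPL, here is a self-contained version. It assumes the helpers wrap the result back up with keyword or symbol, as the inline forms do, and it uses clojure.string, which the post does not mention, purely as a source of sample string functions.

```clojure
(require '[clojure.string :as str]) ; assumption: only used for sample string fns

;; Unwrap with name, apply the string function, rewrap with keyword.
(defn visit-keyword
  [str-fn a-keyword & args]
  (keyword (apply str-fn (name a-keyword) args)))

;; Same shape, but rewrap with symbol.
(defn visit-symbol
  [str-fn a-symbol & args]
  (symbol (apply str-fn (name a-symbol) args)))

(visit-keyword str/upper-case :foo)          ; => :FOO
(visit-symbol str/replace 'foo-bar "-" "_")  ; => foo_bar
```

Note that the extra arguments are passed individually after the keyword or symbol, not bundled into a list.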

* More Problems: Map Functions *

It's common to filter on values or keys when working with maps. I've written a lot of code like this:

(into {} (some-pred-fn (comp a-pred key) map-fn-args))
(into {} (some-pred-fn (comp a-pred val) map-fn-args))

A first approach would be to write a specific helper function for each case. However, this can be turned into a visitor as well.

(defn visitor-keys-pred
  [pred-fn pred-arg & args]
  (into {}
        (apply pred-fn (#(comp % key) pred-arg) args)))

(defn visitor-vals-pred
  [pred-fn pred-arg & args]
  (into {}
        (apply pred-fn (#(comp % val) pred-arg) args)))

The definition looks a little funky, but we'll see why in a minute. For now the functions can be called as follows:

(visitor-keys-pred pred-fn pred-arg args)
(visitor-vals-pred pred-fn pred-arg args)
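To make the call shape concrete, here is a small sketch with filter and remove standing in for pred-fn. The sample map scores is hypothetical, and the function bodies assume the remaining arguments are simply passed through to pred-fn.

```clojure
;; Compose the predicate with key, run the filtering fn, pour back into a map.
(defn visitor-keys-pred
  [pred-fn pred-arg & args]
  (into {} (apply pred-fn (#(comp % key) pred-arg) args)))

;; Same, but the predicate sees the value of each entry.
(defn visitor-vals-pred
  [pred-fn pred-arg & args]
  (into {} (apply pred-fn (#(comp % val) pred-arg) args)))

(def scores {:a 1 :b 2 :c 3}) ; hypothetical sample data

(visitor-vals-pred filter even? scores) ; => {:b 2}
(visitor-keys-pred remove #{:a} scores) ; => {:b 2, :c 3}
```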

* Putting it together: Visitor Closures *

As you can see, a pattern is starting to develop. Each visitor function has a similar signature:

[f first-arg & rest-args]

Folding first-arg into the rest arguments, this becomes:

[f & args]

Each visitor follows a similar pattern, too.
1. Apply a visit-fn to get to a common base type
2. Do work on the base type
3. Apply a return-fn to get back to the original type.

We can turn this description into a closure generating function:

(defn visitor
  "Used to implement visitor patterns. (first args) is modified by the visitor function, and the result is wrapped in a return-fn call."
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn
      (apply f
             (visit-fn (first args))
             (rest args)))))

We now have a new way to define previous visitor functions. Here's what they look like using this new helper function:

(def visit-keyword (visitor name keyword))
(def visit-symbol (visitor name symbol ))

;Works w/ predicate functions
(def visit-keys-pred (visitor #(comp % key) (partial into {})))
(def visit-vals-pred (visitor #(comp % val) (partial into {})))

As you can see, the only thing we described is the visit-fn and return-fn. The rest of the behavior is defined by the visitor pattern itself.
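Here is a minimal end-to-end sketch of the generator in use. It assumes visitor wraps the result of f in return-fn, as its docstring describes, and again borrows clojure.string (my assumption, not part of the post) for a sample string function.

```clojure
(require '[clojure.string :as str]) ; assumption: only used for a sample string fn

(defn visitor
  "Returns a closure that visits the first argument, applies f,
  and converts the result back with return-fn."
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn
      (apply f (visit-fn (first args)) (rest args)))))

(def visit-keyword   (visitor name keyword))
(def visit-vals-pred (visitor #(comp % val) (partial into {})))

;; f comes first, then the data to be visited, then any remaining arguments.
(visit-keyword str/upper-case :foo)         ; => :FOO
(visit-vals-pred filter even? {:a 1 :b 2})  ; => {:b 2}
```

In the predicate case the "visited" argument is the predicate itself: it is composed with val before filter ever sees it, and the filtered entries are poured back into a map afterwards.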

* Second form *

The one catch is that visitor assumes that the dispatched data is the first argument to f. Sometimes it is the last argument to f, as in take & drop. It is easy enough to define visitor* that works on the last argument of a function. The definition is in the link.
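For readers without the linked definition at hand, one way visitor* could look is sketched below. This is my own guess at its shape, not the linked code; the name visit-keyword* and the seq-to-string step in its return-fn are hypothetical and exist only to make take and drop round-trip back to a keyword.

```clojure
(defn visitor*
  "Like visitor, but visits the last argument of f instead of the first.
  A guess at the shape, not the linked definition."
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn
      (apply f (concat (butlast args)
                       [(visit-fn (last args))])))))

;; take and drop return a seq of characters, so the return-fn joins
;; them into a string before converting back to a keyword.
(def visit-keyword* ; hypothetical name
  (visitor* name (comp keyword #(apply str %))))

(visit-keyword* take 2 :hello) ; => :he
(visit-keyword* drop 2 :hello) ; => :llo
```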

* Back to Multimethods *

The power of the individual closures can be amplified when wrapped in a multimethod. Consider our String/Symbol/Keyword group.

(defmulti visit-string (fn [& args] (class (second args))))

(defmethod visit-string clojure.lang.Symbol
  [f & args]
  (apply visit-symbol f args))

(defmethod visit-string clojure.lang.Keyword
  [f & args]
  (apply visit-keyword f args))

(defmethod visit-string :default
  [f & args]
  (apply f args))

We now have a visit-string function that can truly visit anything. Our function calls above become:

(visit-string str-fn a-keyword args)
(visit-string str-fn a-symbol args)
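A self-contained sketch of the multimethod in action follows. It assumes the dispatch function looks at the class of the second argument (the data, since f comes first) and that visitor applies return-fn to the result; clojure.string is again my own choice of sample functions.

```clojure
(require '[clojure.string :as str]) ; assumption: only used for a sample string fn

(defn visitor
  [visit-fn return-fn]
  (fn [f & args]
    (return-fn
      (apply f (visit-fn (first args)) (rest args)))))

(def visit-keyword (visitor name keyword))
(def visit-symbol  (visitor name symbol))

;; Dispatch on the class of the data argument, which follows f.
(defmulti visit-string (fn [& args] (class (second args))))

(defmethod visit-string clojure.lang.Symbol
  [f & args]
  (apply visit-symbol f args))

(defmethod visit-string clojure.lang.Keyword
  [f & args]
  (apply visit-keyword f args))

;; Anything else (e.g. a plain string) is passed straight to f.
(defmethod visit-string :default
  [f & args]
  (apply f args))

(visit-string str/upper-case :foo)  ; => :FOO
(visit-string str/upper-case 'foo)  ; => FOO
(visit-string str/upper-case "foo") ; => "FOO"
```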

* Closing questions *

So now that you've seen how & why this works, I've got some questions for the group:

* Did I accidentally duplicate functionality in core?
* How can this be more flexible?
* What maintenance problems are there that I don't see?
* Will speed be an issue?
* Is the signature of visitor sensible? Is the signature of the generated function sensible? How could it be better?

Thanks in advance,