Konrad Hinsen recently posted some ideas for handling namespaces:
http://onclojure.com/2010/02/17/managing-namespaces
There are a lot of good ideas in there, and right now I'd like to talk specifically about the proposed :like & :clone clauses. I have a bit of a different approach.
Suppose we add a new project file, namespace_config.clj. It would be a basic rules engine for configuring namespaces. I'm thinking there would be a macro, extend-ns, that is roughly defined as follows. The exact implementation needs work.
This lets us set up a series of predicates & resulting actions that would be applied to each ns in the project. Everything would be done in one central location, so (hopefully) that would cut down on maintenance.
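To make the rules-engine idea concrete, here is a rough sketch using plain functions and data. It is not extend-ns itself, whose implementation still needs work. The names example-rules and actions-for are made up for illustration, and the "actions" only return a description of what would be done rather than touching any namespace.

```clojure
;; Hypothetical rule table: each rule pairs a predicate on the namespace
;; symbol with an action. Here an action just returns data describing
;; the configuration step it stands for.
(def example-rules
  [{:pred   (fn [ns-sym] (.startsWith (name ns-sym) "myapp.test."))
    :action (fn [ns-sym] [:use-in ns-sym 'clojure.test])}
   {:pred   (constantly true)
    :action (fn [ns-sym] [:use-in ns-sym 'clojure.set])}])

(defn actions-for
  "Returns a vector with the result of every rule whose predicate
  matches ns-sym, in rule order."
  [rules ns-sym]
  (vec (for [{:keys [pred action]} rules
             :when (pred ns-sym)]
         (action ns-sym))))

(actions-for example-rules 'myapp.core)
;; => [[:use-in myapp.core clojure.set]]
```

Keeping the rules as one vector in one file is what gives the single central location; a real version would run the actions inside each ns as it is loaded.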
Thursday, February 18, 2010
Sunday, January 24, 2010
Code Kata: A data sifter
Here's a problem inspired by a Scheme example I discovered the other day.
http://programming-musings.org/2006/02/07/scheme-code-kata/
Write a data sifter, sift, that partitions a string into a list of lists. Start with the case of using letters as a delimiter, and numbers as data. There can be any number of repetitions of numbers & letters.
user=>(sift "a1b2cd34")
(("a" ("1")) ("b" ("2")) ("c" ()) ("d" ("3" "4")))
Next, add the ability to your sift function to accept a list as input, as well as a string.
user=>(sift '("a" "1" "b" "2" "c" "d" "3" "4"))
(("a" ("1")) ("b" ("2")) ("c" ()) ("d" ("3" "4")))
After that, add the ability to take a vector/array as an input.
user=>(sift ["a" "1" "b" "2" "c" "d" "3" "4"])
(("a" ("1")) ("b" ("2")) ("c" ()) ("d" ("3" "4")))
Finally, let your sift accept a collection of any object, and an arbitrary predicate. If the predicate is true, the object is a delimiter (e.g. a String). If the predicate is false, the object is data (e.g. a Number).
user=>(sift string? ["a" 1 "b" 2 "c" "d" 3 4])
(("a" (1)) ("b" (2)) ("c" ()) ("d" (3 4)))
I'll be posting my solution in about a week.
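If you want something to compare against in the meantime, here is one possible sketch. It is only one way to do it, and it makes two assumptions the kata does not spell out: any data that appears before the first delimiter is dropped, and in the one-argument form a "letter" is a string made only of A-Z or a-z.

```clojure
(defn sift
  "Pairs each delimiter with the list of data items that follow it.
  With one argument, treats strings of letters as delimiters."
  ([coll]
   ;; (map str ...) turns the characters of a string into one-character
   ;; strings and leaves a list or vector of strings unchanged.
   (sift #(re-matches #"[A-Za-z]+" %) (map str coll)))
  ([delimiter? coll]
   (lazy-seq
     ;; Skip leading data (an assumption), then take one delimiter.
     (when-let [[d & more] (seq (drop-while (complement delimiter?) coll))]
       (let [[data remaining] (split-with (complement delimiter?) more)]
         (cons (list d data) (sift delimiter? remaining)))))))

(sift "a1b2cd34")
;; => (("a" ("1")) ("b" ("2")) ("c" ()) ("d" ("3" "4")))
(sift string? ["a" 1 "b" 2 "c" "d" 3 4])
;; => (("a" (1)) ("b" (2)) ("c" ()) ("d" (3 4)))
```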
Sunday, January 3, 2010
1.2 fn Proposal: same & multisame
Hello Clojure Developers,
Writing software frequently follows the same process. Observing & understanding the processes and coming up with effective solutions is the task of library design. An API is judged by how well it fits into the process.
Application design is a slightly different process. For the purposes of this proposal, it involves three oversimplified steps.
- Convert problem data to a form the API can understand.
- Use the API to come up with a solved version of the problem.
- Convert the API produced solution back to the problem domain solution.
However, there currently is not much work done to bring steps 2 & 3 closer to each other. This is evidenced by the fact that there are specialized namespaces in contrib for handling strings (str-utils2), functor application (generic.functor), and I was in the middle of proposing additions to the map-utils library.
All this code duplication started to smell. Here we all are writing specialized routines to AVOID using the sequence functions in our code. This is not right.
I have a proposal to eliminate this smell. I've written a higher order function called same. Here's the doc:
lib.sfd.same/same
([index? seq-fn & args])
"same is a multimethod that is designed to "undo" seq. It expects a seq-fn that returns a normal seq, and the appropriate args. By default it converts the resulting seq into the same type as the last argument. An optional leading integer, index, can be provided to specify the index of the argument that should be used to convert the seq. If it is a sorted seq, the comparator is preserved.
This operation is fundamentally eager, unless a lazy seq is detected. In this case no conversion is attempted, and laziness is preserved."
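To give a feel for the behavior before you look at the real code, here is a stripped-down sketch. It is not the implementation in lib.sfd.same: same-sketch is a made-up name, it is a plain function instead of a multimethod, it ignores the optional index argument, and it only knows about strings, maps, vectors and sets.

```clojure
(defn same-sketch
  "Applies seq-fn to args, then pours the resulting seq back into the
  same kind of collection as the last argument."
  [seq-fn & args]
  (let [target (last args)
        result (apply seq-fn args)]
    (cond
      (string? target) (apply str result)
      ;; (empty target) keeps the comparator of sorted maps and sets.
      (or (map? target) (vector? target) (set? target))
      (into (empty target) result)
      ;; Anything else (e.g. a lazy seq) is returned unconverted.
      :else result)))

(same-sketch map inc [1 2 3])                           ;; => [2 3 4]
(same-sketch take 2 "abcd")                             ;; => "ab"
(same-sketch filter (fn [[_ v]] (even? v)) {:a 1 :b 2}) ;; => {:b 2}
```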
Please take a moment to review a fairly robust list of examples now:
http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/test/lib/sfd/same_test.clj
Afterwards, you can peruse the code here:
http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/src/lib/sfd/same.clj
This one function will provide the same functionality as the proposed map-utils, some of c.c.str-utils2, c.c.generic.functor, or any desired set & vector utils. It's based on a multimethod, so you are a simple defmethod addition away from keyword-utils or symbol-utils (assuming you'd want to treat them like strings).
I've also designed a method, multi-same, for functions that take a sequence in and split it into several sequences. Here's a quick example, as the uses for multi-same are still being developed.
user=>(multi-same partition 2 "abcd")
("ab" "cd")
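Here is a similar sketch for the multi-same idea, again under assumptions: multi-same-sketch and like are made-up names, and this is not the code in lib.sfd.same. The only difference from same is that each piece of the result is converted, instead of the result as a whole.

```clojure
(defn like
  "Hypothetical helper: pours the seq xs into the same kind of
  collection as target (strings, maps, vectors and sets only)."
  [target xs]
  (cond
    (string? target) (apply str xs)
    (or (map? target) (vector? target) (set? target)) (into (empty target) xs)
    :else xs))

(defn multi-same-sketch
  "Applies seq-fn to args, expecting a seq of seqs back, and converts
  each inner seq to the type of the last argument."
  [seq-fn & args]
  (map #(like (last args) %) (apply seq-fn args)))

(multi-same-sketch partition 2 "abcd")     ;; => ("ab" "cd")
(multi-same-sketch partition 2 [1 2 3 4])  ;; => ([1 2] [3 4])
```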
One thing that I find VERY fascinating is the areas where same & multi-same do NOT allow str-utils2 to be replaced out of the box. Some of these can easily be explained. str-utils2/trim is a very string specific piece of code. However, others cannot easily be explained. Why is it that there is no way to split a sequence similar to a regular expression?
I think these areas where string processing is easier represent places we need to improve our sequence library. I've included some new functions in lib.sfd.seq-utils, and I would ask this group to consider adding them to c.c.seq-utils or core.
There also isn't a parser that works with predicates & sequences in core yet. I suspect fn-parse may be a start. I'd appreciate help from anyone that is good with parsers/monads.
So, here's a chance to simultaneously reduce the amount of code in contrib and add lots of functionality to Clojure. In summary, here's what I'm proposing:
- Add same to core
- Add multi-same to core
- Add new sequence fns to contrib or core
- Add a new sequence parser to contrib or core
UPDATE 1/3: I re-wrote same & multi-same to work with a protocol, per Stuart Sierra's suggestion.
I look forward to the discussion,
Sean
Friday, December 25, 2009
How to write a Clojure reader macro, part 2
This is a follow up to my earlier post about writing a reader macro. Here's the disclaimer again.
DISCLAIMER
I completely agree with Rich's decision to NOT support reader macros. In an activity as simple as writing this post, I found many, many places to make a mistake the hard way. This is an extremely difficult activity to get right, and the end result is something that is not quite the same as normal Clojure. Use the following information at your own risk.
Okay, now that that's been said, let's get on with it. We're going to write a multi-character delimited reader macro this time. Since I'm a point free junkie, we're going to use partial for our example.
Here's what the final use case is going to be
user=>#[+ 1]
#< core$partial ...>
Let's start modifying LispReader.java again. The first thing we're going to do is insert a static symbol
//Inserted at line 40
static Symbol INTERPOLATE_S = Symbol.create("clojure.core", "partial");
Now, let's take a look around line 84. You'll see the following entry in the array:
macros['#'] = new DispatchReader();
The # character is bound to a DispatchReader. This is the object Clojure uses to implement multiple character reader macros (ever notice that they all start with #?). You'll also notice that there is a dispatchMacros array with several entries in it. Add the following entry:
dispatchMacros['['] = new PartialReader();
We also need to define the PartialReader class. It is based on the VectorReader class, which can be found around line 994.
The heavy lifting is done by the readDelimitedList method. Note that the closing delimiter needs to be provided, and the recursive flag should be set to true. It returns an IPersistentList object. The only thing that needs to be done is to prepend a partial to the list. That is why the cons method is used to add a partial symbol (you still need classic macro-fu).
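You can see what that cons step produces without patching anything. The following runs in a stock Clojure REPL and builds by hand the form I expect the patched reader to produce for #[+ 1]; the names expanded and add-one are just for this illustration.

```clojure
;; readDelimitedList would return the list (+ 1); consing the partial
;; symbol onto the front gives the form handed to the compiler.
(def expanded (cons 'clojure.core/partial '(+ 1)))
;; expanded is now (clojure.core/partial + 1)

;; Evaluating that form yields the partially applied function.
(def add-one (eval expanded))

(map add-one [1 2 3])
;; => (2 3 4)
```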
We've added everything we need to add to LispReader.java. All that's left to do is recompile clojure.jar and test the results
user=>(map #[+ 1] [1 2 3])
(2 3 4)
Of course, now that I think about it, comment might be a better symbol to use than partial...
That's how you add a delimited reader macro. Next time we'll look at creating a new dispatch character, and properly escaping everything.
Thursday, December 17, 2009
Proposal for 1.2 : New Seq Utilities
Time to brainstorm everyone!
I've been trying to come up with new sequence functions. Of course, there's nothing really new about these at all. They just aren't in core or c.c.seq-utils. Do any of these belong in contrib or core?
alternate [pred coll]
Returns an alternating collection. Similar to a regex partition.
split [pred coll]
Returns a collection of collections, split by matching the predicate.
take-last [n coll]
The mirror of drop-last
take-until [pred coll]
Returns a lazy sequence of successive items from coll while (pred item) returns false. pred must be free of side-effects.
drop-until [pred coll]
Returns a lazy sequence of the items in coll starting from the first item for which (pred item) returns true.
rotate [n coll]
Takes a collection and rotates it left n steps. If n is negative, the collection is rotated right. Executes in O(n) time.
rotate-while [pred coll]
Rotates a collection left while (pred item) is true. Will return an unrotated sequence if (pred item) is never true. Executes in O(n) time.
rotate-until [pred coll]
Rotates a collection left while (pred item) is false or nil. Will return an unrotated sequence if (pred item) is always true. Executes in O(n) time.
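To pin down the descriptions above, here is a small sketch of five of them built from existing core functions. These are written from the descriptions only and may differ in detail from the definitions in my utils library.

```clojure
(defn take-until
  "Items from coll up to, but not including, the first match of pred."
  [pred coll]
  (take-while (complement pred) coll))

(defn drop-until
  "Items from coll starting at the first match of pred."
  [pred coll]
  (drop-while (complement pred) coll))

(defn rotate
  "Rotates coll left n steps; a negative n rotates right."
  [n coll]
  (let [c (count coll)]
    (if (zero? c)
      ()
      (let [k (mod n c)]            ; mod maps a negative n into [0, c)
        (concat (drop k coll) (take k coll))))))

(defn rotate-while
  "Moves the leading run of items matching pred to the end."
  [pred coll]
  (let [[head tail] (split-with pred coll)]
    (concat tail head)))

(defn rotate-until
  "Moves the leading run of items not matching pred to the end."
  [pred coll]
  (rotate-while (complement pred) coll))

(rotate 1 [1 2 3 4])             ;; => (2 3 4 1)
(rotate -1 [1 2 3 4])            ;; => (4 1 2 3)
(rotate-until even? [1 3 4 5])   ;; => (4 5 1 3)
```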
Anyway, I define all of these functions and more here:
http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/src/lib/sfd/seq_utils.clj
http://github.com/francoisdevlin/devlinsf-clojure-utils/blob/master/src/lib/sfd/patterns.clj
Do others have ideas for missing seq utilities?
Tuesday, December 15, 2009
How to write a Clojure reader macro
This is an article on how to write a basic reader macro in Clojure.
DISCLAIMER
I completely agree with Rich's decision to NOT support reader macros. In an activity as simple as writing this post, I found many, many places to make a mistake the hard way. This is an extremely difficult activity to get right, and the end result is something that is not quite the same as normal Clojure. Use the following information at your own risk.
The first thing to identify is a behavior that you would like to have a reader macro for. For our example, I am going to use a modified form of Chas Emerick's amazing string interpolation macro. You can find his original article here. I took a modified version of his code and placed it in core.clj (be sure to create a new git branch). The code I used is below.
Now that the desired functionality is in core, it is time to modify the reader. In this case we need to modify the file LispReader.java. I defined a static variable INTERPOLATE_S, and I am going to assign it the "|" reader macro.
//Inserted at line 40
static Symbol INTERPOLATE_S = Symbol.create("clojure.core", "interpolate-s");
Now, in order for this to be found we need to make an entry in the macros array. This can be done like so:
//Inserted at line 86
macros['|'] = new WrappingReader(INTERPOLATE_S);
The WrappingReader class takes a Symbol object, and wraps it around the next form that is read. Recall how the following form
@a-ref
is expanded to (deref a-ref). In our case
|"A string ~(+ 2 2)"
will be expanded to
(interpolate-s "A string ~(+ 2 2)")
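The wrapping behavior is easy to check in a stock REPL, since @ already works this way. The sketch below reads the @ form as data, then builds by hand the list that the | macro is meant to produce; wrap is a made-up helper for illustration and is not part of LispReader.

```clojure
;; The stock reader already wraps the form that follows @.
;; Note that it uses the fully qualified symbol.
(read-string "@a-ref")
;; => (clojure.core/deref a-ref)

(defn wrap
  "Hypothetical stand-in for the wrapping step: puts sym in front of
  the next form that was read."
  [sym form]
  (list sym form))

(wrap 'interpolate-s "A string ~(+ 2 2)")
;; => (interpolate-s "A string ~(+ 2 2)")
```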
Let's rebuild clojure.jar and try this out at a REPL.
As you can see this works just like Chas' macro. There are still a few things that need to be covered, such as:
* How to create a multiple character reader macro
* How to create a delimited reader macro
These will be topics for another day.
Tuesday, December 8, 2009
uses for take/drop-while
I've found a use for take-while & drop-while in a map.
To start let's define the following
user=>(defn sort-map [m] (apply sorted-map (apply concat m)))
That's right. A function to cast hash-maps to tree-maps.
user=>(sort-map {3 :a 1 :c 2 :b})
{1 :c, 2 :b, 3 :a}
Why would I do that? Because I now have a way of making subsets based on the keys.
user=>(keys-pred take-while #(< % 2) (sort-map {3 :a 1 :c 2 :b}))
{1 :c}
user=>(keys-pred drop-while #(< % 2) (sort-map {3 :a 1 :c 2 :b}))
{2 :b, 3 :a}
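keys-pred is not defined in this post, so here is a minimal sketch of what it could look like, assuming it applies a sequence function to the map entries with the predicate tested against each key. The definition is a guess that is consistent with the REPL output above, not necessarily the version in my utils library.

```clojure
(defn sort-map [m] (apply sorted-map (apply concat m)))

(defn keys-pred
  "Applies seq-fn (e.g. take-while, drop-while, filter) to the entries
  of m, testing pred against each key, and returns a map of the same
  kind as m."
  [seq-fn pred m]
  ;; (empty m) keeps a sorted map sorted.
  (into (empty m) (seq-fn (comp pred key) m)))

(keys-pred take-while #(< % 2) (sort-map {3 :a 1 :c 2 :b}))
;; => {1 :c}
(keys-pred drop-while #(< % 2) (sort-map {3 :a 1 :c 2 :b}))
;; => {2 :b, 3 :a}
```

The sort matters here: take-while and drop-while only make sense on keys when the entries come out in key order, which a hash-map does not promise.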
So, when would this have an application?

Let's define the following map
And there you go, a use for take/drop while with maps :)