clj.orcery

Language, Expression and Design

Tuesday, 13 September 2016

hara.io.file - nio for clojure

by Chris Zheng

For the longest time, I detested writing the delete function for file system manipulation. It really bothered me how painful it was to delete a folder. Here is a comparison of how a folder is deleted in various environments:

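(require 'clojure.java.io)

(defn delete-recursively [fname]
  ;; func takes itself as its first argument so that it can recurse
  (let [func (fn [func f]
               ;; a directory must be emptied before it can be deleted
               (when (.isDirectory f)
                 (doseq [f2 (.listFiles f)]
                   (func func f2)))
               (clojure.java.io/delete-file f))]
    (func func (clojure.java.io/file fname))))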

Firstly, that's too much code to write. I'd also like to point out that the code uses clojure.java.io - which really should have solved this problem but hasn't. Secondly, it's not that flexible. If I only wanted to delete all the .DS_Store files recursively in my directory, then I would have to write something else. And why is it so difficult to delete a directory?

When I was picking a filesystem library, I noticed that there was this built-in package called java.nio that looked a little scary. After all, why would I want to use anything else but java.io.File and all its juicy, familiar goodness?

It's interesting to note that java.nio stands for 'New IO' - but it's not very new. It was introduced with J2SE 1.4 in 2002. That's about a decade and a half ago - and the library is several years older than Clojure itself. Java SE 7 then introduced the 'More New I/O' APIs, nicknamed NIO.2, which is where the java.nio.file package comes from. I started looking around at different articles explaining the differences between java.io and java.nio, looking to convince myself as to whether I should give java.nio a go over java.io.

At first, I just looked at simple comparisons:

And then I came across this article:

This article tripped a switch in my mind. I thought about all the things I disliked about the java.io package: Yes, of course there should be better exceptions for failed file operations. Yes, of course there should be richer functions for file system manipulation instead of hand-rolling them every single time. Yes, listing files in a directory should not cause me so much headache (especially in Clojure). After all those issues with java.io finally came out, I had to make a decision. Stick with the familiar and safe, or go with my principles.
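To make the point about exceptions concrete, here is a minimal sketch comparing the two APIs on a file that does not exist. The helper names (io-delete, nio-delete) and the generated path are assumptions made up for this illustration; the Java calls themselves are standard. java.io.File's delete only returns false, whereas java.nio.file.Files/delete throws a specific exception such as NoSuchFileException (or DirectoryNotEmptyException for a non-empty directory).

```clojure
(import '(java.io File)
        '(java.nio.file Files NoSuchFileException Paths))

;; A path assumed not to exist (hypothetical name, for illustration only).
(def ^String missing
  (str (System/getProperty "java.io.tmpdir") File/separator
       "no-such-file-" (java.util.UUID/randomUUID) ".tmp"))

(defn io-delete
  "java.io style: returns true or false, with no reason given on failure."
  [^String path]
  (.delete (File. path)))

(defn nio-delete
  "java.nio style: failure raises an exception that names the cause."
  [^String path]
  (try
    (Files/delete (Paths/get path (make-array String 0)))
    :deleted
    (catch NoSuchFileException e
      :no-such-file)))
```

Calling (io-delete missing) gives back a bare false, while (nio-delete missing) lands in the catch clause for the specific failure.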

What moved me was the fact that the java.io package was not consistent enough and, although efforts have been made to make it consistent, java.io was simply not the most dependable of libraries to use. I bit the bullet and started reading the java.nio documentation. I realised something about filesystems that I never had before. We should be thinking about operations on many files, instead of on one file. If an operation can be done on a whole group of files, then that operation can be readily done on a single file. That saves a lot of redundant code because code for many can always be used by code for one - but not necessarily the other way round.

The workhorse of the java.nio.file package is a class called Files. It consists of a whole bunch of static methods that operate on the filesystem. It's really well done and the most useful method is one called walkFileTree - which takes a whole bunch of arguments that are completely foreign for anyone coming from the java.io world. In its full form, the method takes a java.nio.file.Path, a java.util.Set of java.nio.file.FileVisitOption, an int maximum depth and a java.nio.file.FileVisitor. There is also a shorter form that takes just the Path and the FileVisitor.
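As a rough illustration of what calling walkFileTree directly from Clojure can look like, here is a minimal sketch of a recursive delete using the two-argument form (a start Path and a FileVisitor). The function name delete-tree is made up for this example and this is not the hara.io.file implementation; it only uses standard java.nio.file classes.

```clojure
(import '(java.nio.file Files FileVisitResult Path SimpleFileVisitor))

(defn delete-tree
  "Deletes root and everything beneath it. Illustrative sketch only."
  [^Path root]
  (Files/walkFileTree
   root
   (proxy [SimpleFileVisitor] []
     ;; called once for every file encountered
     (visitFile [file attrs]
       (Files/delete ^Path file)
       FileVisitResult/CONTINUE)
     ;; called after all entries of a directory have been visited,
     ;; so the directory is empty by the time it is deleted
     (postVisitDirectory [dir exc]
       (when exc (throw exc))
       (Files/delete ^Path dir)
       FileVisitResult/CONTINUE))))
```

The visitor receives callbacks as the tree is traversed, which is why the same walk can be reused for listing, copying or selecting simply by swapping what happens inside visitFile.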

The crazy amount of types and classes and options put me off for a couple of days, but then there were about 15 years' worth of tutorials on the internet and I started thinking a bit more about how I could make everything work. In the end, I just did it - writing a simple, customisable wrapper function around the inputs for walkFileTree. With walk written, the function then provided the framework for all complex file manipulation tasks in the hara.io.file namespace (list, delete, copy and select).
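To give a feel for the idea, here is a hypothetical sketch of such a wrapper. It is not the actual hara.io.file code: the name walk-files, the suffix matching, and the way :depth, :include and :exclude are interpreted here are all assumptions made for illustration. It uses the four-argument form of walkFileTree (Path, Set of FileVisitOption, maximum depth, FileVisitor).

```clojure
(import '(java.nio.file Files FileVisitResult Path Paths SimpleFileVisitor)
        '(java.nio.file.attribute BasicFileAttributes)
        '(java.util HashSet))

(defn walk-files
  "Returns a vector of file Paths under root, filtered by filename suffix.
   Hypothetical sketch only; the option names are assumed."
  [root {:keys [depth include exclude]
         :or   {depth Integer/MAX_VALUE include [] exclude []}}]
  (let [start    (Paths/get (str root) (make-array String 0))
        matches? (fn [suffixes ^Path p]
                   (let [fname (str (.getFileName p))]
                     (boolean (some (fn [^String s] (.endsWith fname s)) suffixes))))
        keep?    (fn [p]
                   (and (or (empty? include) (matches? include p))
                        (not (matches? exclude p))))
        found    (atom [])]
    (Files/walkFileTree
     start
     (HashSet.)          ; no FileVisitOption set: symbolic links are not followed
     (int depth)
     (proxy [SimpleFileVisitor] []
       (visitFile [file attrs]
         ;; directories sitting at the depth limit also arrive here, so skip them
         (when (and (not (.isDirectory ^BasicFileAttributes attrs))
                    (keep? file))
           (swap! found conj file))
         FileVisitResult/CONTINUE)))
    @found))
```

Because walkFileTree also accepts a plain file as its starting point, (walk-files "project.clj" {}) visits just that one file: the code written for many files handles the single-file case for free.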

Now that the same method is providing support underneath, I can do some pretty nifty things. copy is used as an example, but the same filter/option mechanism can be applied to delete and select as well:

(require '[hara.io.file :as fs])

copy file:

(fs/copy "project.clj" "project.clj.bak")

copy directory:

(fs/copy "src" "src.bak")

copy directory - 2 levels:

(fs/copy "src" "src.2" {:depth 2})

copy only images to new directory:

(fs/copy "resources" "images.bak" {:include [".png" ".jpg"]})

copy only images to new directory - but this time, just simulate it:

(fs/copy "resources" "images.bak" {:include [".png" ".jpg"] :simulate true})

copy only links to new directory (this could be a bit nicer I suppose):

(fs/copy ""
         "../links"
         {:include [(fn [{:keys [path]}]
                      (fs/link? path))]})

copy everything except .cljs files to new directory:

(fs/copy "src" "clojure.bak" {:exclude [".cljs"]})

For those that continued reading because all they wanted to do was to delete those pesky .DS_Store files, here it is =)

(fs/delete "" {:include [".DS_Store"]})

If you don't want to recurse, just say so. It works the opposite way to the shell command - the operation is recursive unless you explicitly state otherwise:

(fs/delete "" {:include [".DS_Store"]
               :depth 1 ;; or :recursive false
               })

With the java.nio.file package as the backbone, the API is much more comfortable to use, it is more consistent and it feels safer. Functionality breaks the same way and improvements to one also result in improvements to many. Personally, it's an upgrade that I believe is well worth making, and I hope everyone that reads this thinks so too.

Please have a play.