Code Density?

I fully agree with Donald Knuth when he wrote:

Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

However, writing code for other people is far less precise. ;-) What is readable to one person may look like Perl code to another.

Maps versus Comprehensions

Whenever I write Python code, I’m always surprised that the lint program flags my map calls with a warning to use a comprehension instead. Clearly, that is not an objective judgment, as I personally prefer the visual clarity of a map call:

floats = map(float, numbers)

Seems more understandable than:

floats = [ float(n) for n in numbers ]

Perhaps that is just because, as an old Lisp programmer, I’m used to the map syntax.

Once you add anonymous functions, I might be inclined to agree, as Python’s lambda expressions can make the intention less obvious:

vfloats = map(lambda n: abs(float(n)), numbers)

Compared to:

vfloats = [ abs(float(n)) for n in numbers ]

Since Python prefers exceptions over None values, and wants to eventually grow up into an object-oriented language with more methods than functions, dirty data, like this collection of strings:

numbers = ["3.4", "bob", "-2.1"]

means we can’t use lambdas, and need to create named functions:

def convertToFloat(n):
    try:
        return abs(float(n))
    except ValueError:
        return None

def actualNums(n):
    return n is not None

vfloats = filter(actualNums, map(convertToFloat, numbers))

If I used a comprehension, I couldn’t specify the order as I did here, because a comprehension’s if clause filters the input items before the function is called, not after. This means that I need to use nested comprehensions:

vfloats = [y for y in [convertToFloat(x) for x in numbers] if y is not None]

In order to make this more readable, I would need to split that command and use intermediate variables.

vfloat_or_none = [convertToFloat(x) for x in numbers]
vfloats = [y for y in vfloat_or_none if y is not None]

Not much of an improvement.
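
For what it’s worth, newer Pythons (3.8 and later) offer another option: an assignment expression inside the if clause. A minimal sketch, assuming the same convertToFloat helper and sample data as above (repeated here so the snippet stands alone):

```python
def convertToFloat(n):
    """Return abs(float(n)), or None when n is not a valid number."""
    try:
        return abs(float(n))
    except ValueError:
        return None

numbers = ["3.4", "bob", "-2.1"]

# The := operator binds the converted value inside the `if` clause,
# so the conversion runs once per item and the filter sees its result.
vfloats = [y for x in numbers if (y := convertToFloat(x)) is not None]
```

Whether that reads better than the filter and map version is, again, a matter of taste.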

If our collection of strings was clean, the map function in Clojure could use the abbreviated lambda syntax, which I find clear for simple expressions:

(map #(Math/abs (Float/parseFloat %)) numbers)

However, since Clojure is heavily dependent on Java (and its idiom for preferring exceptions), we need to use named functions if our data is dirty:

(defn convert-to-float [n]
  (try
    (Math/abs (Float/parseFloat n))
    (catch NumberFormatException e nil)))

(filter some? (map convert-to-float numbers))

In languages that are more functional, we can expect them to supply existing functions for null filtering, like the some? usage above.

My point here is not to dis on Python, as I firmly believe one can write good, readable code in any language. But as an old Lisp-head, my expectations of readable code may be different from those of others.

Some Languages more Readable?

This is why I appreciated the long discussion on Twitter that happened after Bodil wrote:

I think Python’s syntax is the most skimmable I’ve encountered. Maybe because it’s hard to get too fancy in Python. Maybe that’s good.

This reminded me of an old colleague who would remove all spaces, blank lines and especially comments from his Java code. He would even go so far as to put multiple statements per line in order to see more code on the screen. Somewhat understandable with Java’s excessively verbose and ornamented syntax.

On the flip side, some languages allow one to write cryptic, nearly impenetrable sequences of symbols masquerading as programs. It seems we need a nice balance between expressing the solution without unnecessary syntax and writing terse, dense, unreadable code.

Complexity Happens

Ran into the following function in Daniel Higginbotham’s Clojure for the Brave and True:

(defn comparator-over-maps
  [comparison-fn ks]
  (fn [maps]
    (zipmap ks
            (map (fn [k] (apply comparison-fn (map k maps)))
                 ks))))

I like the book, as it is interesting and well written, and the code, for the most part, is quite clear. However, I do believe this function illustrates what many people dislike about Clojure.

The idea with this function is to create functions that compare the values of each key across a collection of hashes. For instance, one could create a min function that behaves like:

(min [{:a 1 :b 3} {:a 5 :b 0}])  ; => {:a 1 :b 0}
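
For readers who find the Clojure dense, the same idea can be sketched in Python. This is only a rough transliteration: the name comparator_over_maps, the min_over variable and the use of dicts with string keys are my own choices for illustration, not something from the book.

```python
def comparator_over_maps(comparison_fn, keys):
    """Build a function that applies comparison_fn per key across many dicts."""
    def compare(maps):
        # For each key, gather that key's value from every dict and
        # hand the values to comparison_fn as separate arguments.
        return {k: comparison_fn(*(m[k] for m in maps)) for k in keys}
    return compare

# A "min" that works key by key over a list of dicts
min_over = comparator_over_maps(min, ["a", "b"])
result = min_over([{"a": 1, "b": 3}, {"a": 5, "b": 0}])
```

Here result holds the smallest value seen for each key, matching the Clojure example above.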

Perhaps splitting that function into two or three parts may make it more readable (I certainly would consider bringing out the inner lambda passed to the map), but complexity happens. Perhaps, we should drop a comment before such code:

;; Here be dragons!

Threading Macro for the Win

Often the initial goals and ideas that spawn a beautiful language run aground on the rocks of new realities. And fixing most languages is impossible. Which is why Bodil later wrote:

Can somebody invent a syntax with Lisp’s elegance of structure and Haskell’s visual elegance, pls? And if it’s easy to skim that’s nice too.

One cannot really talk about Lisp’s syntax, as it really hasn’t any. This comes with some advantages, the primary one being the ability to fix the language with macros. For years, Lisps have been plagued with unreadable, nested expressions. Consequently, the Clojure community embraced the threading macro.

This flattens and reorders nested expressions, and is now being ported to other Lisps. Its other offering is removing the need for naming lots of transient variables. Let me explain by way of an example.

A colleague of mine recently posted the following code in our work’s “code review” system:

# Variable, status_output, contains textual output from a command
status_array = status_output.split("\n")
clean_status_array = status_array.reject { |line| line.empty? }
process_array = clean_status_array.select { |process| not process.match(/^==/) }
status_data = process_array.map { |line| line.split }

It is a simple data transformation, but the only way to follow it is by walking through the variable references, and naming variables is hard. The same code, rewritten in Clojure, would be:

(def status-data (->> status-output
                      (split-lines)                 ; From string library
                      (remove blank?)
                      (remove #(re-find #"^==" %))  ; re-find, as re-matches must match the whole line
                      (map #(split % #"\s+"))))     ; clojure.string/split needs a regex

Ignoring Ruby’s block syntax (which, for simple one-line calls, tends to make code less readable¹), removing the variable names with Clojure’s threading macro allows one to easily see the data transformations.

Makes me wish I could port the threading macro to Ruby and other languages.
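
As a thought experiment, something in the spirit of the thread-last macro can be approximated in Python with a small helper. This is only a sketch: thread_last is a hypothetical function of my own (not part of the standard library), it is an ordinary function rather than a macro, and the sample status_output text is made up for illustration.

```python
from functools import reduce

def thread_last(value, *steps):
    """Pass value through each step in order, loosely like Clojure's ->>.

    A step is either a callable, or a tuple (func, arg1, ...) in which
    case the running value is supplied as the LAST argument.
    """
    def apply_step(acc, step):
        if callable(step):
            return step(acc)
        func, *args = step
        return func(*args, acc)
    return reduce(apply_step, steps, value)

# Made-up command output, standing in for status_output above
status_output = "== header ==\nnginx 1234 running\n\nredis 5678 stopped\n"

status_data = thread_last(
    status_output,
    str.splitlines,
    (filter, lambda line: line.strip()),                # drop blank lines
    (filter, lambda line: not line.startswith("==")),   # drop header lines
    (map, str.split),                                   # split on whitespace
    list,
)
```

No intermediate variable names are needed, although the tuples and lambdas add noise of their own that the Clojure version avoids.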

Is Reduce Readable?

I’m still not always sure what is readable. Perhaps we only know it when we see it. Last week, I had to write a script that could call the ps command with various options and attempt to find the maximum CPU usage value, RSS and virtual memory values.

I created a processes list by splitting the ps output, and wrote a single line of code that figured out the maximum values from the ps output.

max_CPU, max_RSS, max_VSIZE = processes.reduce([0, 0, 0], &method(:only_max))

The behavior is contained in the only_max function, but that seems pretty simple:

def only_max (max_values, line)
  max_cpu, max_rss, max_vsize = max_values
      cpu,     rss,     vsize = line.split.map(&:to_i)

  # The new memo to return is the max values:
  [ [max_cpu,   cpu].max,
    [max_rss,   rss].max,
    [max_vsize, vsize].max ]
end

I love reduce, as it is such a useful tool, but if map is little known or understood, then reduce is even more so. I am concerned that using it makes my little script less readable to others.
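
For comparison, roughly the same fold can be sketched in Python with functools.reduce. The sample lines are made up, and I am assuming each line holds three whitespace-separated integers in cpu, rss, vsize order:

```python
from functools import reduce

def only_max(max_values, line):
    """Fold step: element-wise max of the running maxima and this line."""
    values = [int(field) for field in line.split()]
    return [max(m, v) for m, v in zip(max_values, values)]

# Made-up sample lines in "cpu rss vsize" order
processes = ["3 1200 48000", "17 800 52000", "5 2400 31000"]

# Start from [0, 0, 0] and fold every line into the running maxima
max_cpu, max_rss, max_vsize = reduce(only_max, processes, [0, 0, 0])
```

The shape is the same as the Ruby version: a starting memo, a step function, and one call that walks the list.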

Is using map and reduce akin to JAPH, or does it highlight the intention of the data transformation?

Summary

My programming goal is to write code that is readable to someone with little knowledge of the language used to write the code. Why? How many times have you found a bug in code written by someone else? How quickly can they get around to fixing it for you? Isn’t submitting a pull request the ultimate aspiration of open source?

So I agree with Bodil; you don’t need to know Python in order to read an algorithm coded in Python. What about Ruby? While most Ruby projects build a DSL first, I would still say so. Java? Hard to see the forest for the trees. The real challenge is writing readable Lisp code, since the average “visual” programmer has difficulty parsing the parens.

One feature of Clojure that I like is the with-test macro. It allows me to both define a function and write some associated unit tests, which demonstrate the function’s behavior without the reader having to dig through the unit test suite. Since I don’t want to clutter my code with tests, lately I’ve been placing just a few usage examples there (enough to get the gist), and letting the test suite be more comprehensive (see this exercism example).

What is your opinion? What makes readable code?

Footnotes:

¹ Ruby blocks were a novel approach to creating syntactic sugar around passing functions as variables. I liked how Groovy functions that had, as their last parameter, a function, could accept either an inline block or a function. Not sure why Ruby made completely separate data types for blocks and functions. Could they not be the same?

This is why, if you want to convert a string of numbers into an array of numbers, you need to write:

line.split.map(&:to_i)

instead of what I think would have been more obvious:

line.split.map(to_i)

Which is much more readable than the typical Rubyism:

line.split.map { |n| n.to_i }