Code Density?

I fully agree with Donald Knuth when he wrote:

Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

However, writing code for other people is far less precise. ;-) What is readable to one person may be Perl code to another.

Maps versus Comprehensions

Whenever I write Python code, I’m always surprised that the lint program flags my map calls with a warning to use a comprehension instead. Clearly, not an objective judgment as I personally prefer the visual clarity of a map call:

floats = map(float, numbers)

Seems more understandable than:

floats = [ float(n) for n in numbers ]

Perhaps that is just because, as an old Lisp programmer, I’m used to the map syntax.
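One caveat is worth a quick sketch, assuming Python 3: there, map returns a lazy iterator rather than a list, so the two forms only give the same result once the map call is wrapped in list. The sample numbers below are made up for illustration.

```python
# Hypothetical sample data; every string here is a clean number.
numbers = ["3.4", "10", "-2.1"]

# In Python 3, map() is lazy: nothing is converted until it is consumed.
lazy_floats = map(float, numbers)

# Wrapping it in list() yields the same result as the comprehension.
floats_from_map = list(lazy_floats)
floats_from_comprehension = [float(n) for n in numbers]

print(floats_from_map == floats_from_comprehension)  # True
```

Once consumed, the map object is empty, which is one practical argument the lint program might make for the comprehension.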

Once you add anonymous functions, I might be inclined to agree, as Python’s lambda expressions can make the intention less obvious:

vfloats = map(lambda n: abs(float(n)), numbers)

Compared to:

vfloats = [ abs(float(n)) for n in numbers ]

Since Python prefers exceptions over None values, and wants to eventually grow up into an object-oriented language with more methods than functions, dirty data, like this collection of strings:

numbers = ["3.4", "bob", "-2.1"]

means we can’t use lambdas, and need to create named functions:

def convertToFloat(n):
    try:
        return abs(float(n))
    except ValueError:
        return None

def actualNums(n):
    return n is not None

vfloats = filter(actualNums, map(convertToFloat, numbers))

If I used comprehensions, I couldn’t specify the order as I did here, because a comprehension’s if clause filters the items before the function is called, not after. This means that I need to use nested comprehensions:

vfloats = [y for y in [convertToFloat(x) for x in numbers] if y is not None]

In order to make this more readable, I would need to split that command and use intermediate variables.

vfloat_or_none = [convertToFloat(x) for x in numbers]
vfloats = [y for y in vfloat_or_none if y is not None]

Not much of an improvement.
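For what it is worth, newer Pythons offer a way around the ordering problem. A minimal sketch, assuming Python 3.8 or later, where an assignment expression lets the if clause both call the function and keep its result (convert_to_float is just the function above renamed to snake_case):

```python
numbers = ["3.4", "bob", "-2.1"]

def convert_to_float(n):
    """Return the absolute float value of n, or None if n is not numeric."""
    try:
        return abs(float(n))
    except ValueError:
        return None

# The walrus operator (:=) binds the converted value inside the filter,
# so each string is converted once and the None results are dropped.
vfloats = [y for n in numbers if (y := convert_to_float(n)) is not None]

print(vfloats)  # [3.4, 2.1]
```

Whether that reads better than the filter and map pipeline is, again, a matter of taste.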

If our collection of strings was clean, the map function in Clojure could use the abbreviated lambda syntax, which I find clear for simple expressions:

(map #(Math/abs (Float/parseFloat %)) numbers)

However, since Clojure is heavily dependent on Java (and its idiom for preferring exceptions), we need to use named functions if our data is dirty:

(defn convert-to-float
  [n]
  (try
    (Math/abs (Float/parseFloat n))
    (catch NumberFormatException e nil)))

(filter some? (map convert-to-float numbers))

In languages that are more functional, we can expect them to supply existing functions for null filtering, like the some? usage above.

My point here is not to dis on Python, as I firmly believe one can write good, readable code in any language. But as an old Lisp-head, my expectations of readable code may be different from other people’s.

Some Languages more Readable?

This is why I appreciated the long discussion on Twitter that happened after Bodil wrote:

I think Python’s syntax is the most skimmable I’ve encountered. Maybe because it’s hard to get too fancy in Python. Maybe that’s good.

This reminded me of an old colleague who would remove all spaces, blank lines and especially comments from his Java code. He would even go so far as to put multiple statements per line in order to see more code on the screen. Somewhat understandable with Java’s excessively verbose and ornamented syntax.

On the flip side, some languages allow one to write cryptic, nearly impenetrable sequences of symbols masquerading as programs. Seems we need a nice balance between expressing the solution without unnecessary syntax and avoiding terse, dense, unreadable code.

Complexity Happens

Ran into the following function in Daniel Higginbotham’s Clojure for the Brave and True:

(defn comparator-over-maps
  [comparison-fn ks]
  (fn [maps]
    (zipmap ks
            (map (fn [k] (apply comparison-fn (map k maps)))
                 ks))))

I like the book, as it is interesting and well written, and the code, for the most part, is quite clear. However, I do believe this function illustrates what many people dislike about Clojure.

The idea with this function is to create functions that compare the values of the given keys across a collection of hashes. For instance, one could create a min function that behaves like:

(min [{:a 1 :b 3} {:a 5 :b 0}])  ; => {:a 1 :b 0}
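For readers who don't speak Clojure, here is a rough Python sketch of the same idea. The names comparator_over_maps and min_over are my own, and dictionaries with string keys stand in for Clojure's maps and keywords, so treat it as an illustration rather than a faithful translation:

```python
def comparator_over_maps(comparison_fn, keys):
    """Build a function that applies comparison_fn key by key across dicts."""
    def compare(maps):
        # For each key, gather that key's value from every dict and
        # hand them all to comparison_fn as separate arguments.
        return {k: comparison_fn(*(m[k] for m in maps)) for k in keys}
    return compare

min_over = comparator_over_maps(min, ["a", "b"])
print(min_over([{"a": 1, "b": 3}, {"a": 5, "b": 0}]))  # {'a': 1, 'b': 0}
```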

Perhaps splitting that function into two or three parts may make it more readable (I certainly would consider bringing out the inner lambda passed to the map), but complexity happens. Perhaps, we should drop a comment before such code:

# Here be dragons! 🐉

Threading Macro for the Win

Often the initial goals and ideas that spawn a beautiful language run aground on the rocks of new realities. And fixing most languages is impossible. Which is why Bodil later wrote:

Can somebody invent a syntax with Lisp’s elegance of structure and Haskell’s visual elegance, pls? And if it’s easy to skim that’s nice too.

One cannot really talk about Lisp’s syntax, since it hasn’t really any. This comes with some advantages, the primary one being the ability to fix the language with macros. For years, Lisps have been plagued with unreadable, nested expressions. Consequently, the Clojure community embraced the threading macro.

This flattens and reorders nested expressions, and is now being ported to other Lisps. Its other offering is removing the need for naming lots of transient variables. Let me explain by way of an example.

A colleague of mine recently posted the following code in our work’s “code review” system:

# Variable, status_output, contains textual output from a command
status_array = status_output.split("\n")
clean_status_array = status_array.reject { |line| line.empty? }
process_array = clean_status_array.select { |process| not process.match(/^==/) }
status_data = process_array.map { |line| line.split }

Simple data transformation, but the only way to follow it is by walking through the variable references, and naming variables is hard. The same code, rewritten in Clojure, would be:

;; Assumes (require '[clojure.string :as str])
(def status-data (->> status-output
                      (str/split-lines)
                      (remove str/blank?)
                      (remove #(re-find #"^==" %))
                      (map #(str/split % #"\s+"))))

Ignoring Ruby’s block syntax (which, for simple one-line calls like those above, tends to make code less readable), removing the variable names allows one to easily see the data transformations.

Makes me wish I could port the threading macro to Ruby and other languages.
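As a thought experiment, here is a minimal sketch of what such a port might look like in Python. The thread_last helper is hypothetical (my own name, not a standard library function), it only mimics the "pass the value along" behavior of ->> with plain functions rather than rewriting code the way a macro does, and the sample status_output text is invented:

```python
from functools import partial, reduce

def thread_last(value, *steps):
    """Feed value through each step in order (a stand-in for Clojure's ->>)."""
    return reduce(lambda acc, step: step(acc), steps, value)

# Invented sample of the command's textual output.
status_output = "== header ==\nnginx running 1234\n\nredis stopped 0\n"

status_data = thread_last(
    status_output,
    str.splitlines,
    partial(filter, str.strip),                               # drop blank lines
    partial(filter, lambda line: not line.startswith("==")),  # drop headers
    partial(map, str.split),                                  # split into fields
    list,
)

print(status_data)  # [['nginx', 'running', '1234'], ['redis', 'stopped', '0']]
```

The partial calls are the price of not having a macro: each step must already be a one-argument function before the helper can chain it.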

Summary

However, I’m still not always sure what is readable. Perhaps we only know it when we see it. Last week, I had to write a script that could call the ps command with various options and attempt to find the maximum CPU usage value, RSS and virtual memory values.

I created a processes list by splitting the ps output, and then:

max_CPU, max_RSS, max_VSIZE = processes.reduce([0, 0, 0], &method(:only_max))

The only_max function was pretty simple:

def only_max(max_values, line)
  max_cpu, max_rss, max_vsize = max_values
      cpu,     rss,     vsize = line.split.map(&:to_i)

  # The new memo to return is the max values:
  [ [max_cpu,   cpu].max,
    [max_rss,   rss].max,
    [max_vsize, vsize].max ]
end

I love reduce, as it is such a useful tool, but if map is little known or understood, then reduce is even more so. I am quite concerned that using it makes my little script less readable.
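For comparison, a rough Python rendering of the same reduce, assuming each line holds three whole numbers separated by whitespace. The processes list is invented sample data, not real ps output:

```python
from functools import reduce

def only_max(max_values, line):
    """Return the element-wise maximum of the running maxima and this line."""
    values = [int(field) for field in line.split()]
    return [max(pair) for pair in zip(max_values, values)]

# Hypothetical ps-style lines: CPU, RSS and virtual size as whole numbers.
processes = ["12 2048 4096", "40 1024 8192", "7 512 1024"]

max_cpu, max_rss, max_vsize = reduce(only_max, processes, [0, 0, 0])
print(max_cpu, max_rss, max_vsize)  # 40 2048 8192
```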

What is your opinion? Is using map and reduce akin to JAPH, or does it highlight the intention of the data transformation?