Code Density?
I fully agree with Donald Knuth when he wrote:
Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
However, writing code for other people is a far less precise business than instructing a computer. ;-) What is readable to one person may be Perl code to another.
Maps versus Comprehensions
Whenever I write Python code, I’m always surprised that the lint program flags my map calls with a warning to use a comprehension instead. Clearly, that is not an objective judgment, as I personally prefer the visual clarity of a map call:
floats = map(float, numbers)
Seems more understandable than:
floats = [ float(n) for n in numbers ]
Perhaps that is just because, as an old Lisp programmer, I’m used to the map syntax. Once you add anonymous functions, I might be inclined to agree, as Python’s lambda expressions can make the intention less obvious:
vfloats = map(lambda n: abs(float(n)), numbers)
Compared to:
vfloats = [ abs(float(n)) for n in numbers ]
Since Python prefers exceptions over None values, and wants to eventually grow up into an object-oriented language with more methods than functions, dirty data, like this collection of strings:
numbers = ["3.4", "bob", "-2.1"]
means we can’t use lambdas, and need to create named functions:
def convertToFloat(n):
    try:
        return abs(float(n))
    except ValueError:
        return None

def actualNums(n):
    return n is not None

vfloats = filter(actualNums, map(convertToFloat, numbers))
If I used comprehensions, I couldn’t specify the order as I did here, because with comprehensions the if clause filters the values before the function is called, not after. This means that I need to use nested comprehensions:
vfloats = [y for y in [convertToFloat(x) for x in numbers] if y is not None]
To make this more readable, I would need to split that expression and use intermediate variables:
vfloat_or_none = [convertToFloat(x) for x in numbers]
vfloats = [y for y in vfloat_or_none if y is not None]
Not much of an improvement.
If our collection of strings were clean, the map function in Clojure could use the abbreviated lambda syntax, which I find clear for simple expressions:
(map #(Math/abs (Float/parseFloat %)) numbers)
However, since Clojure is heavily dependent on Java (and its idiom of preferring exceptions), we need to use named functions if our data is dirty:
(defn convert-to-float [n]
  (try
    (Math/abs (Float/parseFloat n))
    (catch NumberFormatException e nil)))

(filter some? (map convert-to-float numbers))
In languages that are more functional, we can expect them to supply existing functions for null filtering, like the some? usage above.
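Clojure actually goes a step further: if we do not mind fusing the mapping and the filtering, the built-in keep function does both in one pass. A small sketch, reusing the convert-to-float function from above:

;; keep returns only the non-nil results of applying the function,
;; so the explicit (filter some? ...) step disappears.
(keep convert-to-float numbers)
;; => (3.4 2.1)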
My point here is not to diss Python, as I firmly believe one can write good, readable code in any language. But as an old Lisp-head, my expectations of readable code may be different from others’.
Some Languages More Readable?
This is why I appreciated the long discussion on Twitter that happened after Bodil wrote:
I think Python’s syntax is the most skimmable I’ve encountered. Maybe because it’s hard to get too fancy in Python. Maybe that’s good.
This reminded me of an old colleague who would remove all spaces, blank lines and especially comments from his Java code. He would even go so far as to put multiple statements per line in order to see more code on the screen. Somewhat understandable with Java’s excessively verbose and ornamented syntax.
On the flip side, some languages allow one to write cryptic, nearly impenetrable sequences of symbols masquerading as programs. It seems we need a balance: expressing the solution without unnecessary syntax, yet without tipping into terse, dense, unreadable code.
Complexity Happens
I ran into the following function in Daniel Higginbotham’s Clojure for the Brave and True:
(defn comparator-over-maps [comparison-fn ks]
  (fn [maps]
    (zipmap ks
            (map (fn [k] (apply comparison-fn (map k maps)))
                 ks))))
I like the book, as it is interesting and well written, and the code, for the most part, is quite clear. However, I do believe this function illustrates what many people dislike about Clojure.
The idea behind this code is the ability to create a function that compares the values of given keys across a collection of maps. For instance, one could create a min function that behaves like this:
(min [{:a 1 :b 3} {:a 5 :b 0}]) ; => {:a 1 :b 0}
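That min is not shown above, but it would presumably be built by handing comparator-over-maps the ordinary numeric min, something like the sketch below (the :a/:b key vector is just for this example):

;; Presumed construction: a map-aware `min` built from clojure.core/min.
;; Note that it shadows clojure.core/min, hence the fully qualified name inside.
(def min (comparator-over-maps clojure.core/min [:a :b]))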
Perhaps splitting that function into two or three parts may make it more readable (I certainly would consider bringing out the inner lambda passed to map), but complexity happens. Perhaps we should drop a comment before such code:
# Here be dragons! 🐉
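If I did take my own advice and pull out that inner lambda, a rough sketch of the refactoring might look like this (the compare-key helper and its name are mine, not from the book):

;; Apply the comparison function across the values of one key,
;; taken from every map in the collection.
(defn- compare-key [comparison-fn maps k]
  (apply comparison-fn (map k maps)))

(defn comparator-over-maps [comparison-fn ks]
  (fn [maps]
    (zipmap ks (map #(compare-key comparison-fn maps %) ks))))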
Threading Macro for the Win
Often the initial goals and ideas that spawn a beautiful language run aground on the rocks of new realities, and fixing most languages is impossible. Which is why Bodil later wrote:
Can somebody invent a syntax with Lisp’s elegance of structure and Haskell’s visual elegance, pls? And if it’s easy to skim that’s nice too.
One cannot really talk about Lisp’s syntax, because it really doesn’t have any. This comes with some advantages, the primary one being the ability to fix the language with macros. For years, Lisps have been plagued with unreadable, nested expressions. Consequently, the Clojure community embraced the threading macro.
This flattens and reorders nested expressions, and it is now being ported to other Lisps. Its other benefit is removing the need to name lots of transient variables. Let me explain by way of an example.
A colleague of mine recently posted the following code in our work’s “code review” system:
# Variable, status_output, contains textual output from a command
status_array = status_output.split("\n")
clean_status_array = status_array.reject { |line| line.empty? }
process_array = clean_status_array.select { |process| not process.match(/^==/) }
status_data = process_array.map { |line| line.split }
Simple data transformation, but the only way to follow it is to walk through the variable references, and naming variables is hard. The same code, rewritten in Clojure, would be:
(def status-data
  (->> status-output
       (str/split-lines)
       (remove str/blank?)
       (remove #(re-find #"^==" %))
       (map #(str/split % #"\s+"))))
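To see what the macro buys us, here is (roughly) the nested expression that ->> rewrites for us, with each result fed in as the last argument of the next call:

;; The same pipeline written inside out, without the threading macro.
(def status-data
  (map #(str/split % #"\s+")
       (remove #(re-find #"^==" %)
               (remove str/blank?
                       (str/split-lines status-output)))))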
Ignoring Ruby’s block syntax (which, for simple one-line calls like those above, tends to make the code less readable), removing the variable names allows one to easily see the data transformations.
Makes me wish I could port the threading macro to Ruby and other languages.
Summary
However, I’m still not always sure what is readable. Perhaps we only know it when we see it. Last week, I had to write a script that called the ps command with various options and attempted to find the maximum CPU usage, RSS, and virtual memory size values. I created a processes list by splitting the ps output, and then:
max_CPU, max_RSS, max_VSIZE = processes.reduce([0, 0, 0], &method(:only_max))
The only_max function was pretty simple:
def only_max(max_values, line)
  max_cpu, max_rss, max_vsize = max_values
  cpu, rss, vsize = line.split.map(&:to_i)

  # The new memo to return is the max values:
  [ [max_cpu, cpu].max,
    [max_rss, rss].max,
    [max_vsize, vsize].max ]
end
I love reduce, as it is such a useful tool, but if map is little known or understood, then reduce is even more so. I am quite concerned that this little bit of code makes the script less readable.
What is your opinion? Is using map and reduce akin to JAPH (Just Another Perl Hacker), or does it highlight the intention of the data transformation?