Code Density?
I fully agree with Donald Knuth when he wrote:
Let us change our traditional attitude to the construction of programs. Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.
However, writing code for other people is far less precise. ;-) What is readable to one person may look like Perl code to another.
Maps versus Comprehensions
Whenever I write Python code, I’m always surprised that the lint program flags my map calls with a warning to use a comprehension instead. Clearly, this is not an objective judgment, as I personally prefer the visual clarity of a map call:
floats = map(float, numbers)
Seems more understandable than:
floats = [ float(n) for n in numbers ]
Perhaps that is just because, as an old Lisp programmer, I’m used to the map syntax.
Once you add anonymous functions, I might be inclined to agree, as Python’s lambda expressions can make the intention less obvious:
vfloats = map(lambda n: abs(float(n)), numbers)
Compared to:
vfloats = [ abs(float(n)) for n in numbers ]
Since Python prefers exceptions over None values, and wants to eventually grow up into an object-oriented language with more methods than functions, dirty data, like this collection of strings:
numbers = ["3.4", "bob", "-2.1"]
means we can’t use lambdas (a try/except is a statement, and a lambda body must be a single expression), and need to create named functions:
def convertToFloat(n):
    try:
        return abs(float(n))
    except ValueError:
        return None

def actualNums(n):
    return n is not None

vfloats = filter(actualNums, map(convertToFloat, numbers))
If I used comprehensions, I couldn’t specify the order as I did here, for in a comprehension the if clause filters the numbers before the function is called, not after. This means that I need to use nested comprehensions:
vfloats = [y for y in [convertToFloat(x) for x in numbers] if y is not None]
In order to make this more readable, I would need to split that command and use intermediate variables.
vfloat_or_none = [convertToFloat(x) for x in numbers]
vfloats = [y for y in vfloat_or_none if y is not None]
Not much of an improvement.
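To make the ordering point concrete, here is a minimal sketch. It repeats convertToFloat so that it runs on its own, and it swaps the inner list for a generator expression, which is my substitution rather than anything the lint program suggests:

```python
def convertToFloat(n):
    """Return abs(float(n)), or None when n is not a number."""
    try:
        return abs(float(n))
    except ValueError:
        return None


numbers = ["3.4", "bob", "-2.1"]

# The `if` clause sees the raw string before the output expression runs,
# so filtering on the converted value means converting every item twice:
twice = [convertToFloat(x) for x in numbers
         if convertToFloat(x) is not None]

# An inner generator expression converts each string once and filters
# afterwards, in the same order as filter(..., map(...)):
once = [y for y in (convertToFloat(x) for x in numbers) if y is not None]

print(twice)  # [3.4, 2.1]
print(once)   # [3.4, 2.1]
```

Both give the same answer, but only the second keeps the convert-then-filter order without doing the work twice.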
If our collection of strings were clean, the map function in Clojure could use the abbreviated lambda syntax, which I find clear for simple expressions:
(map #(Math/abs (Float/parseFloat %)) numbers)
However, since Clojure is heavily dependent on Java (and its idiom of preferring exceptions), we need to use named functions if our data is dirty:
(defn convert-to-float [n]
  (try
    (Math/abs (Float/parseFloat n))
    (catch NumberFormatException e nil)))

(filter some? (map convert-to-float numbers))
Languages that are more functional tend to supply ready-made functions for null filtering, like the some? used above.
My point here is not to dis on Python, as I firmly believe one can write good, readable code in any language. But as an old Lisp-head, my expectations of readable code may be different from others’.
Some Languages more Readable?
This is why I appreciated the long discussion on Twitter that happened after Bodil wrote:
I think Python’s syntax is the most skimmable I’ve encountered. Maybe because it’s hard to get too fancy in Python. Maybe that’s good.
This reminded me of an old colleague who would remove all spaces, blank lines and especially comments from his Java code. He would even go so far as to put multiple statements per line in order to see more code on the screen. Somewhat understandable with Java’s excessively verbose and ornamented syntax.
On the flip side, some languages allow one to write cryptic, nearly impenetrable sequences of symbols masquerading as programs. Seems we need a nice balance between expressing the solution without unnecessary syntax, and writing terse, dense, unreadable code.
Complexity Happens
Ran into the following function in Daniel Higginbotham’s Clojure for the Brave and True:
(defn comparator-over-maps
  [comparison-fn ks]
  (fn [maps]
    (zipmap ks
            (map (fn [k] (apply comparison-fn (map k maps)))
                 ks))))
I like the book, as it is interesting and well written, and the code, for the most part, is quite clear. However, I do believe this function illustrates what many people dislike about Clojure.
The idea with this program is the ability to create functions that compare the values of the keys across a collection of hashes. For instance, one could create a min function that behaves like:
(min [{:a 1 :b 3} {:a 5 :b 0}]) ; => {:a 1 :b 0}
Perhaps splitting that function into two or three parts would make it more readable (I certainly would consider bringing out the inner lambda passed to the map), but complexity happens. Perhaps we should drop a comment before such code:
;; Here be dragons!
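For readers who find the Clojure hard to parse, here is a rough Python sketch of the same idea, with the inner lambda pulled out into a named helper. The names compare_key and min_over_maps are hypothetical (they are not from the book), and string keys stand in for Clojure keywords:

```python
def compare_key(comparison_fn, key, maps):
    """Apply comparison_fn to the values stored under key in every map."""
    return comparison_fn(*(m[key] for m in maps))


def comparator_over_maps(comparison_fn, keys):
    """Build a function that compares a list of dicts key by key."""
    def compare(maps):
        return {key: compare_key(comparison_fn, key, maps) for key in keys}
    return compare


# Assumes at least two maps: Python's min(1) raises a TypeError,
# whereas Clojure's (min 1) happily returns 1.
min_over_maps = comparator_over_maps(min, ["a", "b"])
print(min_over_maps([{"a": 1, "b": 3}, {"a": 5, "b": 0}]))  # {'a': 1, 'b': 0}
```

Whether three small named pieces read better than one dense expression is, of course, exactly the question.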
Threading Macro for the Win
Often the initial goals and ideas that spawn a beautiful language run aground on the rocks of new realities. And fixing most languages is impossible. Which is why Bodil later wrote:
Can somebody invent a syntax with Lisp’s elegance of structure and Haskell’s visual elegance, pls? And if it’s easy to skim that’s nice too.
One cannot really talk about Lisp’s syntax. It really hasn’t any. This comes with some advantages, the primary one being the ability to fix the language with macros. For years, Lisps have been plagued with unreadable, nested expressions. Consequently, the Clojure community embraced the threading macro.
This flattens and reorders nested expressions, and is now being ported to other Lisps. Its other offering is removing the need for naming lots of transient variables. Let me explain by way of an example.
A colleague of mine recently posted the following code in our work’s “code review” system:
# Variable, status_output, contains textual output from a command
status_array = status_output.split("\n")
clean_status_array = status_array.reject { |line| line.empty? }
process_array = clean_status_array.select { |process| not process.match(/^==/) }
status_data = process_array.map { |line| line.split }
Simple data transformation, but the only way to follow it is by walking through the variable references, and naming variables is hard. The same code, rewritten in Clojure, would be:
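(def status-data
  (->> status-output
       (split-lines)                  ; clojure.string/split-lines
       (remove blank?)                ; clojure.string/blank?
       (remove #(re-find #"^==" %))   ; re-find, as re-matches must match the whole line
       (map #(split % #"\s+"))))      ; clojure.string/split needs a regex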
Ignoring Ruby’s block syntax (which, for simple one line calls, tends to make code less readable [1]), removing the variable names with Clojure’s threading macro allows one to easily see the data transformations.
Makes me wish I could port the threading macro to Ruby and other languages.
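A minimal sketch of what such a port might look like in Python, assuming a hypothetical pipe helper (not a standard library function) that feeds a value through a series of single-argument steps. The sample status_output is invented for illustration:

```python
from functools import reduce


def pipe(value, *steps):
    """Pass value through each step in order, like a tiny thread-last."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Invented sample of textual command output
status_output = "== Header ==\nnginx 1234 running\n\nredis 5678 stopped\n"

status_data = pipe(
    status_output,
    str.splitlines,
    lambda lines: [line for line in lines if line.strip()],
    lambda lines: [line for line in lines if not line.startswith("==")],
    lambda lines: [line.split() for line in lines],
)

print(status_data)
# [['nginx', '1234', 'running'], ['redis', '5678', 'stopped']]
```

The intermediate names are gone, though the lambdas bring back some of the noise that Clojure’s macro avoids, since a function cannot rewrite its arguments the way a macro can.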
Is Reduce Readable?
I’m still not always sure what is readable. Perhaps we only know it when we see it. Last week, I had to write a script that could call the ps command with various options and attempt to find the maximum CPU usage, RSS and virtual memory values.
I created a processes list by splitting the ps output, and wrote a single line of code that figured out the maximum values from that list:
max_CPU, max_RSS, max_VSIZE = processes.reduce([0, 0, 0], &method(:only_max))
The behavior is contained in the only_max function, but that seems pretty simple:
def only_max(max_values, line)
  max_cpu, max_rss, max_vsize = max_values
  cpu, rss, vsize = line.split.map(&:to_i)

  # The new memo to return is the max values:
  [ [max_cpu, cpu].max, [max_rss, rss].max, [max_vsize, vsize].max ]
end
I love reduce as it is such a useful tool, but if map is little known/understood, then reduce is even more so. I am concerned that using it makes my little script less readable.
Is using map and reduce akin to JAPH, or does it highlight the intention of the data transformation?
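One way to judge is to put the reduce next to the loop it replaces. Here is a sketch in Python, assuming each line holds three whitespace-separated integers (the sample processes list is made up, not real ps output):

```python
from functools import reduce


def only_max(max_values, line):
    """Return the element-wise maximum of the memo and this line's values."""
    values = [int(field) for field in line.split()]
    return [max(pair) for pair in zip(max_values, values)]


# Made-up sample: CPU, RSS and VSIZE per process
processes = ["3 1200 4000", "17 800 9000", "5 2500 100"]

# The reduce version: fold only_max over every line, starting from zeros
max_cpu, max_rss, max_vsize = reduce(only_max, processes, [0, 0, 0])

# The same computation spelled out as an explicit loop
maxima = [0, 0, 0]
for line in processes:
    maxima = only_max(maxima, line)

print(max_cpu, max_rss, max_vsize)  # 17 2500 9000
print(maxima)                       # [17, 2500, 9000]
```

The loop needs no prior knowledge of reduce, while the one-liner says "collapse this list into one answer" up front.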
Summary
My programming goal is to write code that is readable to someone with little knowledge of the language used to write the code. Why? How many times have you found a bug in code written by someone else? How quickly can they get around to fixing it for you? Isn’t submitting a pull request the ultimate aspiration of open source?
So I agree with Bodil; you don’t need to know Python in order to read an algorithm coded in Python. What about Ruby? While most Ruby projects build a DSL first, I would still say so. Java? Hard to see the forest for the trees. The real challenge is writing readable Lisp code, since the average “visual” programmer has difficulty parsing the parens.
One feature of Clojure that I like is the with-test macro. It allows me to both define a function and write some associated unit tests, which demonstrate the function’s behavior without having to dig through the unit test suite. Since I don’t want to clutter my code with tests, lately I’ve been placing just a few usage examples there (enough to get the gist), and letting the test suite be more comprehensive (see this exercism example).
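Python has no with-test, but its doctest module offers a comparable habit: a few usage examples live in the docstring and can be run as tests. This is a different mechanism from Clojure’s macro, offered only as a rough analogue, and it reuses a snake_case version of the earlier conversion function:

```python
import doctest


def convert_to_float(n):
    """Return the absolute float value of n, or None if n is not numeric.

    A few examples, enough to get the gist:

    >>> convert_to_float("-2.1")
    2.1
    >>> convert_to_float("bob") is None
    True
    """
    try:
        return abs(float(n))
    except ValueError:
        return None


if __name__ == "__main__":
    # Runs the docstring examples; silent when they all pass
    doctest.testmod()
```

The fuller set of edge cases would still belong in the regular test suite.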
What is your opinion? What makes readable code?
Footnotes:
[1] Ruby blocks were a novel approach to creating syntactic sugar around passing functions as variables. I liked how Groovy functions that had, as their last parameter, a function, could substitute either an inline block or a function. Not sure why Ruby made completely separate data types for blocks and functions. Could they not be the same?
This is why, if you want to turn a string of numbers into an array of numbers, you need to write:
line.split.map(&:to_i)
instead of what I think would have been more obvious:
line.split.map(to_i)
Which is much more readable than the typical Rubyism:
line.split.map { |n| n.to_i }