Optimize for readability first

Nowadays when someone uses the word optimization they usually mean optimization for execution time, unless they've explicitly stated that they are going to optimize for GPU memory consumption, network traffic, etc.


Know what you are optimizing for

When I started programming we not only had slow processors, we also had very limited memory — sometimes measured in kilobytes. So we had to think about memory and optimize memory consumption wisely. University taught us about two extremes in optimization:

  • You either optimize for execution speed sacrificing memory,
  • Or you optimize for memory consumption by calculating stuff over and over again.
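The two extremes above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and Fibonacci numbers stand in for any calculation whose intermediate results could be reused.

```python
from functools import lru_cache


def fib_recompute(n: int) -> int:
    """Stores nothing between calls, so the same subproblems are solved again and again."""
    if n < 2:
        return n
    return fib_recompute(n - 1) + fib_recompute(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n: int) -> int:
    """Remembers every result it has computed: more memory used, far fewer calls made."""
    if n < 2:
        return n
    return fib_cached(n - 1) + fib_cached(n - 2)


if __name__ == "__main__":
    print(fib_recompute(20), fib_cached(20))
    # cache_info() shows how many results are being held in memory
    print(fib_cached.cache_info())
```

Both functions return the same values. The only difference is what you pay with: CPU time in the first case, memory in the second.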

These days nobody cares about memory much (except demosceners, embedded systems engineers and sometimes mobile game developers), and I mean not only RAM but hard drive space too. Just look at the Watch Dogs install, which takes about 25 GB on disk. In addition, I'm writing this in a Chrome tab which ate 130 MB of RAM.

But there are more optimization types:

  • Optimization for power consumption, which is getting more attention with the growth of the smartphone market
  • Optimization for readability to make reading and debugging code easier, thus reducing development time and cost
  • I'll just stop here...

Optimization for readability — making code easier to read, follow and understand.

You should understand that you can't optimize everything at once. For example, while working on performance optimization you will likely make your app consume more memory and make your code less readable.


Why readability

Developers spend most of their productive time reading code, not writing it: debugging, reviewing commits made by others, learning new libraries, etc.

While doing this, developers essentially work as interpreters (and not very good ones, actually): they execute code in their heads while trying to keep the current state of execution in mind. This is why programmers tend to be particularly grumpy when interrupted during this process.

Time == money

The most important thing to understand is that your and your coworkers' time costs a lot. And even if a developer works hard they can still waste time by:

  • Working on things which are not needed now and might never be used.
  • Working on something which doesn't add perceivable value.
    For example, spending a week to shave 10 ms off the execution time of a function which is called once an hour.
  • Writing hard-to-debug code and trying to find bugs in it.
  • Writing code which slows down other people.
    Remember that "other people" might as well be YOU in just a week.

This is provided that the developer in question is an experienced one and knows how to write efficient algorithms and clean code. Otherwise this list would be too long.
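To put a number on the "10 ms, once an hour" example from the list above, here is a back-of-the-envelope calculation. It assumes a working week of 5 days at 8 hours each and compares developer time with machine time one-for-one, which is a simplification; the variable names are made up for this sketch.

```python
# All durations are in milliseconds to keep the arithmetic exact.
SAVED_PER_CALL_MS = 10                  # the speed-up gained per call
CALLS_PER_DAY = 24                      # the function runs once an hour
WORK_WEEK_MS = 5 * 8 * 60 * 60 * 1000   # assumption: 5 days x 8 hours of developer time

saved_per_day_ms = SAVED_PER_CALL_MS * CALLS_PER_DAY
break_even_days = WORK_WEEK_MS / saved_per_day_ms
break_even_years = break_even_days / 365

print(f"Saved per day: {saved_per_day_ms} ms")
print(f"Break-even after {break_even_days:,.0f} days (about {break_even_years:,.0f} years)")
```

Under these assumptions the optimization saves less than a quarter of a second per day, so the week spent on it would take many lifetimes to pay for itself.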


Optimize for readability

There's a well-known quote by Donald Knuth. I bet you've heard it numerous times.

"Premature optimization is the root of all evil (or at least most of it) in programming." by D.Knuth, 1974.

I have seen a lot of people who memorized it but didn't understand what this quote really means. The most common mistake looks like this:

- Why is the code for this simple task so complicated?
- I optimized X and Y because in the future...
- Haven't you heard that premature optimization is the root of all evil?
- Sure, but this is not premature optimization, I know that this way it will work faster.

I assume this is because the term premature optimization is not well-defined. This is why the person in the example doesn't consider what he has done to be a premature optimization at all. How can we define this term?

Premature optimization — everything one tries to optimize before profiling and running tests on a working system.

Everything except readability. So instead of saying what one shouldn't do, we'd better say what one should do. Then the quote becomes:

Optimize for readability first.
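The "profiling on a working system" step from the definition above might look like the following sketch, which uses Python's built-in cProfile and pstats modules. The functions parse_records, summarize and main are hypothetical stand-ins for a real application.

```python
import cProfile
import io
import pstats


def parse_records(count: int) -> list[int]:
    # Deliberately wasteful: converts every number to a string and back.
    return [int(str(i)) for i in range(count)]


def summarize(records: list[int]) -> int:
    return sum(value * value for value in records)


def main() -> int:
    records = parse_records(50_000)
    return summarize(records)


# Measure the working program first...
profiler = cProfile.Profile()
profiler.enable()
result = main()
profiler.disable()

# ...then look at where the time actually went before touching any code.
report_buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=report_buffer)
stats.sort_stats("cumulative").print_stats(5)
report = report_buffer.getvalue()
print(report)
```

Only the functions at the top of such a report are candidates for optimization. Everything else can stay as simple and readable as it was.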


What slows down developers when reading code

Well, we agreed that we should make our code easier to read so that we spend less time reading it and waste less money, right? But what does that really mean?

There are two fundamental things that immensely slow down a developer while reading code:

  1. The code is hard to understand,
  2. The code is hard to follow.

The code is hard to understand

Unfortunately, people are not like software interpreters, which don't need to know what the code means in order to add two numbers or call a function (as long as it compiles, of course).

To find out why the code doesn't work, a programmer needs to understand exactly what it does as typed in the source file, and what its original purpose was.

What makes code hard to understand?

From here I'll be writing about an experienced developer who knows the language the code is written in and algorithms used in the application's domain (i.e. he has enough knowledge to understand this code).

  1. The code stinks. Monstrosities of one-letter variables and 1000-line functions.
  2. It is not properly or consistently formatted.
  3. It has unnecessary one-liners.
  4. There are undocumented low-level optimizations.
  5. The code is too clever.

I'll skip the first two because you shouldn't be reading bad code anyway. If someone in your company writes it, teach them or get rid of them. And of course you need to enforce strict coding style for your entire code base.

3. It has unnecessary one-liners

Also known as number-of-lines optimizations. Long lines with nested function calls and ?: operators are harder to parse. Of course, you may say that this argument is subjective. But some people just feel that they must leave as few lines in the source code as possible, sacrificing readability.
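Here is a small made-up illustration in Python, where the conditional expression plays the role of the ?: operator. The shipping rules and function names are invented for this example. Both functions behave identically.

```python
def shipping_one_liner(total: float, express: bool, pickup: bool) -> str:
    # Everything on one line: the reader has to untangle three nested conditions.
    return "free" if total >= 50 else "express" if express else "pickup" if pickup else "standard"


def shipping_readable(total: float, express: bool, pickup: bool) -> str:
    # Same logic, one decision per line.
    if total >= 50:
        return "free"
    if express:
        return "express"
    if pickup:
        return "pickup"
    return "standard"
```

The second version is seven lines longer, but each rule can be read, changed and stepped through in a debugger on its own.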

4. Undocumented low-level optimizations

A long time ago the code was readable and worked fine, but at some point someone decided to optimize it. It might have been a good optimization made after serious profiling, but now the code looks like a combination of arrays, bitwise operators and magic numbers. Nobody knows what exactly it does or even what it was supposed to do, because the guy who made the optimization didn't leave any comments.

You have probably heard that good code doesn't need comments. But optimized code (especially if it's good) DOES need them.

The most harmless consequence of such optimizations might be undocumented lines like this in your code base:

if (val != val) { ... }
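Assuming val is a floating-point number, the line above is most likely a NaN check: under IEEE 754, NaN is the only value that is not equal to itself. The Python sketch below shows the same trick next to a version that says what it means. The function names are invented for this example.

```python
import math


def is_nan_cryptic(val: float) -> bool:
    # Looks like a typo unless you already know the trick.
    return val != val


def is_nan_documented(val: float) -> bool:
    # NaN ("not a number") is the only float that compares unequal to itself,
    # so the standard library check does the same job and states the intent.
    return math.isnan(val)
```

If the self-comparison really has to stay for performance reasons, a one-line comment explaining it costs nothing at runtime.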

5. The code is too clever

As software developers we learn more and more academic tricks which we then use in production code. We are Computer Scientists after all, not just mere coders!

Some languages even encourage developers to use bleeding edge techniques, making code more expressive and academic. You get the same feeling of accomplishment when you build a perfectly robust system in code as when you prove a hard theorem in math using methods 99.997% of educated individuals don't understand.

Even if code is well-organized into modules/classes/functions and each of these blocks contains perfectly readable imperative code, for someone else to read this code they will need a broader view of its architecture and they must know all the techniques and patterns used.

Once again remember that "someone else" might as well be YOU in a week.

This is most likely why I know only two people who use Scala in production. Personally, I like Scala a lot. For me it has been an academic playground where I could build glass castles in a vacuum. But the more you know about it and the more you use its features, the more you understand that it's essentially a write-only language (please don't quote me on this!).

Not as write-only as Perl, but even the most beautiful code base will require changes and updates. And now you are stuck looking for a person who can understand this beautiful code...

It may seem controversial that clean, clever code is harder to read.

"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." by Brian Kernighan


The code is hard to follow

When reading code a developer jumps frequently from one method or class to another. This is where knowing your IDE of choice will save you a lot of time. By using Go to Declaration, Find Usages, Navigate to, Inspect and other features of an IDE (Visual Studio for example) you will be able to think about code as a connected graph.

It's fine to write code in Notepad but if you want to read code efficiently you will have to master a good IDE.

Well, what is a connected graph exactly?

A graph which is connected in the sense of a topological space, i.e., there is a path from any point to any other point in the graph. (source)

In other words, in "connected" code you can easily follow from one method to another, building a model in your mind of what this code does.

If a connection is broken somewhere in the code (in the sense that the IDE can't help you jump from one method to another), you usually have to spend some time looking for this connection yourself. The more broken connections there are in the code, the harder it is to follow, and the harder it is to read.

So, why might the code graph be disconnected? There are many reasons; I'll list the ones I see most often:

1. Methods or properties are referenced as strings

Some frameworks just love doing this. They pass "callbacks" as strings and use reflection when needed. Here you just have to use good old CMD+F.

The most evil ones make these strings dynamic... in dynamic languages. All hail JavaScript! Or AS3 for that matter.
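A minimal Python sketch of the difference, with made-up class and function names. In the first dispatcher the method name is assembled from a string at runtime, so Find Usages on on_save will not find this call. In the second the method itself is passed around, and the IDE can follow it.

```python
from typing import Callable


class SaveDialog:
    def __init__(self) -> None:
        self.saved = False

    def on_save(self) -> None:
        self.saved = True


def dispatch_by_name(target: object, event: str) -> None:
    # The connection exists only at runtime: "on_" + "save" -> on_save.
    # Text search for "on_save" will not even match this line.
    handler_name = "on_" + event
    getattr(target, handler_name)()


def dispatch_direct(handler: Callable[[], None]) -> None:
    # The caller passes dialog.on_save itself, so the reference is visible to tools.
    handler()


dialog = SaveDialog()
dispatch_by_name(dialog, "save")   # broken link in the code graph
dispatch_direct(dialog.on_save)    # link the IDE can follow
```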

2. Code is separated into disconnected parts

For example, half of your code is written in C# and the other half is assembled in a visual node editor. You will have a hard time jumping between these two worlds.

The same goes for Dependency Injection frameworks and other XML-configured bullshit. They don't say it out loud, but writing XML configs is coding too. It's called declarative programming (not to mention those insane people who build imperative languages on top of XML).

3. Huge graph nodes

20 links lead to this 1000-line method? Ouch. You have no use for a graph which contains such nodes.

4. Everything is too abstract

Go to Declaration takes you to an interface or an abstract class, and you have to figure out which implementations there might be. It gets worse with Dependency Injection, abstract factories and all the other techniques for fighting dependencies. Connections between nodes in the code graph become too abstract.
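The situation can be sketched like this in Python. All names here (Storage, DiskStorage, CloudStorage, create_storage, export_report) are hypothetical and stand in for whatever abstraction and factory a real project uses.

```python
from abc import ABC, abstractmethod


class Storage(ABC):
    @abstractmethod
    def save(self, key: str, value: str) -> str:
        """Jumping to the declaration of save() lands here, on an empty method."""


class DiskStorage(Storage):
    def save(self, key: str, value: str) -> str:
        return f"disk:{key}={value}"


class CloudStorage(Storage):
    def save(self, key: str, value: str) -> str:
        return f"cloud:{key}={value}"


STORAGE_BY_NAME = {"disk": DiskStorage, "cloud": CloudStorage}


def create_storage(config: dict[str, str]) -> Storage:
    # Which class gets created is decided by configuration, not by the code you are reading.
    return STORAGE_BY_NAME[config["storage"]]()


def export_report(storage: Storage) -> str:
    # From here a reader cannot tell which save() will run without finding the config.
    return storage.save("report", "42")
```

With two implementations this is still manageable. The trouble starts when the mapping lives in an external config file and there are dozens of candidates.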

It might seem that I hate DI and XML. DI is a great tool for avoiding spaghetti code and making your architecture more modular and testable. But like many other good things, it gets ugly when abused.

I was completely discouraged once when I was inspecting an app and I realised that I couldn't figure out where it starts... Like where the entry point is. It all was just automagically assembled from a huge XML config on start.

I do hate XML configs though.


***

So, here's what you should have learned:

  • Master your IDE,
  • Keep code graphs as connected as possible,
  • Write simple code first,
  • By writing unnecessary code you are wasting a lot of money.

It will take time. Forcing yourself to write simple code and resisting optimizations at early stages will definitely be hard.

But two hours before the deadline, having been awake for 48 hours straight, you will thank your past self if the code you are working with can be debugged with a half-asleep brain.

P.S.

Don't miss the great discussions on Reddit and Hacker News.
Thanks to /u/Arandur for correcting numerous grammatical errors!