prose :: and :: conz


On naming

There are only two hard things in Computer Science: cache invalidation and naming things.

– Phil Karlton

At work a teammate and I have a long-running joke about our disagreement over identifier naming. He favors long “descriptive” names, and I like short, terse names. We never get angry over this disparity in our views, and mostly find good compromises during code review. As long as I’m not working alone, I’m more than willing to lay aside my preferences for the greater good of the team. Yet these past few days, my bias towards shorter names has been triggered into some passionate responses.

The most notable was a recent podcast from Functional Geekery featuring an interview with Adi Bolboaca. In this episode, Adi discusses his experiences leading code retreats with host Steven Proctor. I really got worked up when Adi spoke of how he battles functional programmers and their short naming. His claim was that descriptive naming is a principle of clean code that transcends paradigms. I just really felt that he got that one wrong.

But why? Why do I feel strongly about using shorter names? I’ve put some thought into it and I’ve identified a few factors. Firstly, I was a math geek before a software geek. I’m not talking about that math minor they forced you to get with your CS degree where you’re playing the role of the computer and not the mathematician. I’m talking about the “advanced” mathematics courses where the focus is on clearly communicating abstractions. I can’t quite recall names longer than a single character in my math courses. We’d rather reach into the bag of Greek letters to retain the brevity. If a second letter is introduced, it is as a suffix to indicate something else significant. Not a character is wasted. The reader is expected to apply the short names to abstract concepts and roll with it.

Fast-forward to the present, where I am now engrossed with functional programming. This paradigm is far more closely related to the discipline of mathematics than traditional imperative programming techniques and theory (if any such thing exists). That functional programming is much more like math lends credence to my feeling that “clean code” applied to functional programming doesn’t carry long identifier names with it. Terseness is part of the paradigm. One can even argue that long variable names are an anti-pattern. I think it’s just the unfamiliarity of our Reverse Hungarian notation that led Adi to believe the principle still applies.

However, I feel I can lay aside the relationship to mathematics and its use of terse naming en route to understanding the underlying reason for my preference for short names. It all goes back to my attitude towards commenting code. The computer doesn’t care what you name things. The names we use matter only to the human reader of the code. They’re glorified comments. Names which are public and hence get included as part of an API and its respective documentation are certainly important and should carry meaning. Names which are internal to the code are mere comments attached to code artifacts. As I stated previously, I agree that the only useful comment is one that answers the question Why?. The name of an identifier never answers Why?. Rather, it always answers What? and often the context of the code has already answered this question, rendering the name redundant.

Perhaps the most classic example is the good ol’ loop iterator i:

for(int i=0; i<=10; i++) {
    System.out.println(i);
}

Developers worth their salt never use a verbose name like index for an iterating index such as this. The context and structure of the code already tell us that this is an index. By having the shorter name, the operative properties are more pronounced in the construct. That is, the start value 0, the stop condition <=10, and the increment expression i++ are quickly and easily identifiable. Longer names such as index, currentIndex, or indexOfTheLoop clutter this up and distract from these three vital pieces of information. Given that these three dictate what the code actually means versus what the author of the code thinks it should mean, I’m strongly inclined to prefer the more concise version.
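To make the comparison concrete, here is a minimal sketch that puts the two spellings side by side. The class and method names (LoopNames, sumTerse, sumVerbose) are made up for illustration, and summing is only a stand-in for whatever the loop body does. Both methods behave identically; the only difference is how much of each line is spent on the name.

```java
// Hypothetical names throughout; this only illustrates the naming contrast.
class LoopNames {
    // Terse: the start value, stop condition, and increment stand out.
    static int sumTerse() {
        int sum = 0;
        for (int i = 0; i <= 10; i++) {
            sum += i;
        }
        return sum;
    }

    // Verbose: same behavior, but the name crowds the three parts that matter.
    static int sumVerbose() {
        int sum = 0;
        for (int indexOfTheLoop = 0; indexOfTheLoop <= 10; indexOfTheLoop++) {
            sum += indexOfTheLoop;
        }
        return sum;
    }
}
```

In the second loop header, the name is repeated three times and takes up more characters than the bounds and the increment combined.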

In fact, functional languages like Scala take this even further. This construct is so common, we don’t even need as much clutter as the simple example above gives us. We compress it down further to the following.

for { i <- 1 to 10 } {
  println(i)
}

We see that the most important information in this construct is even more prominent: the start value 1 and the stop value 10. Incrementing by 1 is so common, it’s assumed unless you tell it otherwise. Don’t confuse this for more of my Scala evangelism for the Java oppressed. I’m pointing out that the functional paradigm’s embrace of terseness behooves us to reconsider what is regarded as clean code. I believe this results in those who have not adopted this paradigm erroneously believing the best practices transcend. The principles certainly transcend, with code readability being the principle in this discussion, but not necessarily the practices. I’ll go even further and argue that shorter names work well for imperative programming too, just as the for loop above suggests.

So what are my underlying principles with naming?

  • I believe that name length should be roughly proportional to scope size. The i in the above examples is a good specimen of a name with a tiny scope, and hence a tiny length is appropriate.
  • The name length should be inversely proportional to its frequency of use. A name that is repeated constantly through code needs to be short lest we introduce an epidemic of carpal tunnel. A frequently used name even if terse will be well understood. A great example of this is Lift’s S.
  • Don’t repeat the construct in the name. This is why we needn’t include “index” in the above name, neither should “accumulator” be used in a fold or reduce expression.
  • Don’t clutter the name with the context of the name. If you have a class named BankAccount, don’t use accountBalance. “Account” is already there, just let it be balance. Or if you have an actor, don’t name its internal state currentState unless you are distinguishing it from other states such as previousState. Just let it be state.
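The last two principles can be sketched in a few lines of Java. Everything here is an assumption for illustration: the BankAccount members, the depositAll method, and the choice of long amounts are not from any real codebase. The field is balance rather than accountBalance because the class name already supplies "account", and the reduce lambda uses short parameter names because the call itself already says that something is being accumulated.

```java
import java.util.List;

// Hypothetical class, used only to illustrate the naming principles.
final class BankAccount {
    // Not accountBalance: the enclosing class already says "account".
    private long balance;

    BankAccount(long balance) {
        this.balance = balance;
    }

    long balance() {
        return balance;
    }

    // reduce already tells the reader we are accumulating, so the lambda
    // parameters stay short instead of being named "accumulator".
    void depositAll(List<Long> deposits) {
        balance += deposits.stream().reduce(0L, (sum, d) -> sum + d);
    }
}
```

A caller reads this as account.balance(), which repeats nothing, whereas account.accountBalance() would say "account" twice.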

Chances are good that I’m wrong that longer names are detrimental to readability. At least that was the consensus on Twitter. If you take nothing else away from this post, let’s at least agree that longer isn’t always better.

Leave a reply below, or send me a tweet.


Olde Comments
  1. I like using short, abstract names for abstract things, and more descriptive ones for concrete things. Thus:

    def scalarMult(v: (Double,Double), x: Double) = (v._1 * x, v._2 * x)

    but

    def momentum(velocity: (Double,Double), mass: Double) = scalarMult(velocity, mass)

  2. adibolb says:

    Hi Joe, thanks for the article you wrote on this subject.

    As Justin mentioned in the previous message, there is a big difference between naming just technical variables and having good names for structures that come from the domain. So for me it is very important to have the code readable as a newspaper article.

    If we want to achieve this we need to use names that are as short as possible, but come from the domain name. This is important because code is more often read than written. If someone new comes to your team you want that person to be able to be productive as fast as possible. And this means that the code is very easy to read. Even if that means having longer variable names or longer function names.

    The code is a bit longer, but reading it is a lot easier. So I am not trying to evangelize programmers who do functional programming to have long names, I rather want to see functional programs written with short and descriptive names. I prefer velocity to v, because it tells me what it is doing. But between i and iterator, I could choose whichever, as it is not that important.

    So the word “velocity” comes from the domain of the problem. It should be used instead of “v” for the variable name. But having “i” or “iterator” is not important as long as “iterator” does not have a specific domain context attached to it like “massIterator”.

    To conclude, I think that short names are useful and all items should have a short name, but at the same time they should have the following characteristics: derive from domain (where applicable), pronounceable, easy to read (l, |, and I are too alike).

    • barnesjd says:

      Adi,

      Thanks for taking the time to read and reply! I think we mostly agree. While I would have likely used `v` instead of `velocity`, the latter doesn’t hurt my feelings. What bothers me is `velocityArg` and things of that sort that explain what should be clear about the context. Your example of `massIterator` would be distracting, but `masses` would be more appropriate.

      Perhaps our stance is best summarized as “concise and descriptive”.

  3. Erik Weibust says:

    Nice post. You know my thoughts on the subject ;) but I will say I can get behind a philosophy based around length being tied to scope size. That was good.

    • barnesjd says:

      Hey Erik! Thanks for stopping by. I wish I could recall who first told me that one so I could give proper credit. :)

  4. Russ Abbott says:

    Nice post. Thanks for thinking it out. I find the comments above helpful also. I would use “velocity” rather than “v” in the example above. The trouble I have with “v” is that you have to translate it into velocity in your mind. If you’re immersed in the code, that’s easy. But when you see it the first time or come back to it after a long time, it’s nice for the code to tell you what it means rather than force you to figure it out.

    • barnesjd says:

      I really doubt that I’ll get far with my preference for “v” in this situation. That’s the sort of thing I’ve conceded on my team at work. I believe the extreme of single letter abbreviations is my formal math background coming out. You’re correct that it requires immersion, and right to suggest code shouldn’t require that for reading. Immersion is the only way I got through the math courses, so it’s my default mode of operation when thinking a problem through.

      Thanks for taking time to read, comment, and guide me to a realization of my own thought process. :)

  5. Adam says:

    To be honest, I was mostly in disagreement with you until the bullet points in the end. I’m definitely in the camp of “whole words, but as few as absolutely necessary”.

    What’s funny, though, is that your example about math classes seems counterintuitive to me. Maybe if we had more descriptive variable names in math, it would be far less cryptic and easier to teach! (Quick guys, let’s come up with a nice two-word identifier for pi!)

Tagged with: scala (41), functional-programming (31), java (22), software-development (4), mathematics (3)