prose :: and :: conz


…Or Java if you’re not into the whole brevity thing

Since giving my talk on functional programming in Scala at Mentor Graphics, I have received the more specific question “Why should I use Scala instead of Java?” I gave some reasons for Scala in that presentation, but never directly addressed why it’s preferable to Java. In this blog post, I want to elaborate on one reason: Scala’s terseness and clarity compared to Java. I will use a rather extreme case that is unfortunately a common one in Java (“case” is a pun here, btw). Special thanks to Ryan Delucchi’s avatar for inspiring this post’s title. Even more special thanks to El Duderino himself for inspiring Ryan.

Let’s consider a 2D vector for our case study. Just to be clear, I don’t mean a “vector” as in a data structure, but a “vector” as in an element of a vector space… the mathematical concept of vector you learned about in linear algebra. I know how math examples can seem a little contrived, but if you think that vectors are impractical, I’ll let you deal with mad programming scientists like Toby. Suppose we want to create a little library for working with 2D vectors. This is what the typical Java developer would likely write.

public class Vector2DJava {
    private final double x, y;
    
    public Vector2DJava(double x, double y) {
        this.x = x;
        this.y = y;
    }

    public double getX() {
        return x;
    }
    
    public double getY() {
        return y;
    }
}

It has all the usual suspects. Private members. Constructor. Getters. I’ve left out the setters and made the fields final. I’ll get back to why these should be final in a moment.

But this is a lot of code, right? All we have is the notion of a vector which has two double members named “x” and “y” respectively. Everything else you see here is boilerplate. Don’t believe me? Compare to the idiomatic Scala approach.

class Vector2DScala(val x:Double, val y:Double)

One line. That is all. Now you may say that I’m cheating because I didn’t include getters. But why bother with that? In Java we feel obliged to provide them because they’re idiomatic. That’s not the case in Scala. Furthermore, a val behaves like a final field in Java: it can’t be reassigned, so there’s no worry that it will get manipulated from the outside world. And when was the last time you had a getter that did anything other than return the private member? Just to be fair, we’ll break the rules and omit the getters in our Java source too.

public class Vector2DJava {
    public final double x, y;
    
    public Vector2DJava(double x, double y) {
        this.x = x;
        this.y = y;
    }
}

Even being unusually terse in Java, we’re still pretty far behind the Scala implementation just from declaring our class. However, let’s continue. Before we begin to tackle operations on our new class, I suggest we first establish equality for our new data class. As it stands, both versions will only be tested for referential equality (i.e. two references are equal only if they point to the exact same instance in memory), not equality in the mathematical sense, where two of our Vector2D objects are equal if and only if their respective x and y components are equal. Let’s define equals for our Java class. Heck, let’s take the easy and thorough route and allow Eclipse to generate it for us.

@Override
public boolean equals(Object obj) {
    if (this == obj)
        return true;
    if (obj == null)
        return false;
    if (getClass() != obj.getClass())
        return false;
    Vector2DJava other = (Vector2DJava) obj;
    if (Double.doubleToLongBits(x) != Double.doubleToLongBits(other.x))
        return false;
    if (Double.doubleToLongBits(y) != Double.doubleToLongBits(other.y))
        return false;
    return true;
}

Yeah… 13 lines of code and a scroll bar on my blog. But it’s OK, right? Eclipse produced it for us. Yeah, well… now we’re tasked with maintaining it. If another important field were added later (perhaps the vector’s unit would be a good enhancement), we have to remember to update equals. That’s more burden on the error-prone developer. I don’t know about you, but I feel that if an IDE can know exactly what to do, then why can’t the language do it instead? It seems that something is missing from the language if a machine can take care of the task. It turns out that Scala fills this gap. Let’s now add equals to our Vector2DScala class.

case class Vector2DScala(val x:Double, val y:Double)

If you blinked, you missed it. I added five ASCII characters to the beginning of the declaration. It’s a case class. While there are many benefits of prepending this keyword to your class declarations, at this point you should just know that it implements an equals exactly as you expect… equivalent to the Eclipse-generated equals for our Java class. (EDIT: The val can be dropped from the x and y: case class Vector2DScala(x:Double, y:Double) Thanks to @predef1 for the twitter response)
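A side note on that generated Java equals: it compares the doubles through Double.doubleToLongBits rather than ==, and the two disagree on a couple of edge cases. Here is a minimal sketch of the difference, using only standard java.lang calls. The class name DoubleEqualityDemo and the helper bitsEqual are made up for illustration.

```java
// Hypothetical demo class; the names are illustrative only.
class DoubleEqualityDemo {

    // The comparison style used by the Eclipse-generated equals.
    static boolean bitsEqual(double a, double b) {
        return Double.doubleToLongBits(a) == Double.doubleToLongBits(b);
    }

    public static void main(String[] args) {
        // NaN is never == to anything, including itself,
        // but every NaN maps to the same canonical bit pattern.
        System.out.println(Double.NaN == Double.NaN);           // false
        System.out.println(bitsEqual(Double.NaN, Double.NaN));  // true

        // Positive and negative zero are ==, but their sign bits differ.
        System.out.println(0.0 == -0.0);                        // true
        System.out.println(bitsEqual(0.0, -0.0));               // false
    }
}
```

So under the generated equals, a vector with a NaN component still equals itself, while (0.0, 0.0) and (-0.0, 0.0) count as different vectors. How a case class treats those same edge cases is worth checking before relying on the two being interchangeable.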

Now when we generated equals, we also got hashCode. Defining hashCode properly is paramount, most notably if you ever imagine your objects being inserted into a HashSet or a HashMap. If you don’t, then two Vector2DJava instances which are equal may not have the same hash code. As a result, you cannot reliably locate instances of your data in your hashed collections. This is also why it was imperative (another pun) that we declare our fields final. If they change, the hash code is no longer consistent and the object gets lost in the table. Just to be thorough, let’s now show the hashCode implementation.

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    long temp;
    temp = Double.doubleToLongBits(x);
    result = prime * result + (int) (temp ^ (temp >>> 32));
    temp = Double.doubleToLongBits(y);
    result = prime * result + (int) (temp ^ (temp >>> 32));
    return result;
}

Nine more lines of code. Scala? Just as equals is implemented automatically, hashCode is also free for case classes. If you’re an experienced Java programmer, you’re probably well-acquainted with the necessity of equals and hashCode for your data objects. For those less familiar with Java, I want to emphasize how common this is. It’s so common that Eclipse has an option to generate them, as I already mentioned. It’s right next to the options to generate getters and setters.
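To make the hash table point concrete, here is a rough sketch of both failure modes. Everything in it is hypothetical: HashContractDemo, EqualsOnly, and MutablePoint are stand-ins invented for illustration rather than the post’s Vector2DJava, and they use Double.compare and Objects.hash as shorthand for the generated code.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical demo class; the nested types are illustrative stand-ins.
class HashContractDemo {

    // Overrides equals but inherits hashCode from Object.
    static final class EqualsOnly {
        final double x, y;
        EqualsOnly(double x, double y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof EqualsOnly)) return false;
            EqualsOnly other = (EqualsOnly) obj;
            return Double.compare(x, other.x) == 0 && Double.compare(y, other.y) == 0;
        }
    }

    // Overrides both, but leaves the fields mutable.
    static final class MutablePoint {
        double x, y;
        MutablePoint(double x, double y) { this.x = x; this.y = y; }

        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof MutablePoint)) return false;
            MutablePoint other = (MutablePoint) obj;
            return Double.compare(x, other.x) == 0 && Double.compare(y, other.y) == 0;
        }

        @Override
        public int hashCode() { return Objects.hash(x, y); }
    }

    // Looks up an equal but distinct instance. Without a matching hashCode
    // the lookup almost always probes with the wrong hash and misses.
    static boolean equalInstanceFound() {
        Set<EqualsOnly> set = new HashSet<>();
        set.add(new EqualsOnly(3, 4));
        return set.contains(new EqualsOnly(3, 4));
    }

    // The same object is found as long as its fields are untouched.
    static boolean foundBeforeMutation() {
        Set<MutablePoint> set = new HashSet<>();
        MutablePoint p = new MutablePoint(1, 2);
        set.add(p);
        return set.contains(p);
    }

    // Changing a field after insertion changes the hash code,
    // so the set can no longer find the very object it holds.
    static boolean foundAfterMutation() {
        Set<MutablePoint> set = new HashSet<>();
        MutablePoint p = new MutablePoint(1, 2);
        set.add(p);
        p.x = 5;
        return set.contains(p);
    }

    public static void main(String[] args) {
        System.out.println("equal instance found: " + equalInstanceFound());
        System.out.println("found before mutation: " + foundBeforeMutation()); // true
        System.out.println("found after mutation: " + foundAfterMutation());   // false
    }
}
```

The first lookup is left without an expected value because Object’s default hash codes are not specified; in practice it prints false. The last two lines are the argument for final fields in miniature.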

Oh, and a little more bad news for the Java apologist. A Scala case class also defines that other method that Eclipse will generate for you: toString. I’ll spare you the generated Java code for now.

We now have 41 lines in our Java class file versus one in Scala. ONE! This is the most extreme example of Scala’s terseness over Java. But take note that I have yet to utilize any functional programming techniques. I’m examining purely object-oriented features. Scala is beating Java very badly at its own game. This is only one facet. Let’s continue with our example a little more to show you that it doesn’t stop with class declarations.

We’ve yet to do anything useful with our Vector2D classes. Perhaps the first operation that comes to mind is the magnitude of the vector. Let’s take a look at an implementation in Java.

public double mag() {
    return Math.sqrt(x*x + y*y);
}

Ah, finally one where Java doesn’t completely suck. Sticking with proper Scala style, it’s a bit shorter because we don’t need the extraneous curly braces.

case class Vector2DScala(val x:Double, val y:Double) {
  def mag = Math.sqrt(x*x + y*y)
}

I’ve included the entire class source since this is the first line of code which required us to add curly braces to enclose the class body. We can make it look a little neater thanks to Scala’s imports by adding Math’s handy functions to our scope since we’re likely to do a little math in this problem.

import Math._

case class Vector2DScala(val x:Double, val y:Double) {
  def mag = sqrt(x*x + y*y)
}

Well, there is still a catch with our implementations. Remember how I called magnitude an “operation”? Well, I’d argue it’s really just a property of the vector. Since our x and y values are fixed, there is no need to calculate the magnitude every time it is asked for. Instead, we perhaps should just calculate it once and return the same value every time. After all, square root is a relatively expensive operation. Let’s update our Java.

public class Vector2DJava {
    public final double x, y;
    private final double mag;
     
    public Vector2DJava(double x, double y) {
        this.x = x;
        this.y = y;
        mag = Math.sqrt(x*x + y*y);
    }
    
    public double mag() {
        return mag;
    }
// ...

So you’ll notice I had to do some rearranging to make this work. I had to now declare a field named mag and move the calculation over to the constructor. No big deal, we do this all the time. It’s just code, they say. Let’s do it in Scala.

case class Vector2DScala(val x:Double, val y:Double) {
  val mag = sqrt(x*x + y*y)
}

You blinked again, didn’t you? Look closely. The only thing I did was change def to val. That’s it! I changed my declaration from one that creates a function to one that creates a value. No rearranging.

But… this is still not good. There is certainly no guarantee that every vector created will be asked for its magnitude. Many instantiations will have to pay the penalty of calculating the magnitude without ever needing it. That isn’t as efficient as it could be for our users. So let’s only calculate it if called upon, but keep it around for later invocations.

public class Vector2DJava {
    public final double x, y;
    private Double mag = null;
     
    public Vector2DJava(double x, double y) {
        this.x = x;
        this.y = y;
    }
    
    public double mag() {
        if(mag == null) mag = Math.sqrt(x*x + y*y);
        return mag;
    }
//...

More rearranging and notable changes. As expected, I evicted the calculation of mag from the constructor. I changed the type of the mag field to Double so I can have the possibility of null (another approach is to have a boolean that keeps track of whether or not mag has been initialized). Then I have a good ol’ null check in my getter. But you’re a good Java developer, and you’re used to this sort of thing. Just water under the bridge, right? Well let’s look at how we implement this lazy evaluation in Scala.

case class Vector2DScala(val x:Double, val y:Double) {
  lazy val mag = sqrt(x*x + y*y)
}

Look dude, you gotta quit blinking when the Scala code rolls round. Once again, I needed merely five ASCII characters to implement lazy evaluation. How does it work? Well when you declare an identifier in Scala, you use either val, def, lazy val, or var. Firstly, the var is what you think it is. A variable. It can be changed. It is generally evil and should be avoided. The other three however are all immutable. What’s the difference? A val is evaluated exactly once when it is first encountered (in the case of val mag, during object construction). A def is evaluated EVERY time it is accessed, hence like a function. Finally, a lazy val is evaluated only the first time it is accessed. Hence it is either never evaluated or exactly once. Awesome, right?
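If it helps to see the three strategies side by side, here is a rough Java rendering that counts how often the calculation actually runs. It is only a sketch: EvaluationDemo, Eager, OnDemand, and Memoized are names invented for illustration, and the Memoized class mirrors the null-check approach from earlier rather than whatever the Scala compiler generates.

```java
// Hypothetical demo; all names here are illustrative, not from the post.
class EvaluationDemo {
    static int calls = 0; // how many times the "expensive" computation ran

    static double expensive(double x, double y) {
        calls++;
        return Math.sqrt(x * x + y * y);
    }

    // Roughly Scala's `val mag`: computed once, during construction.
    static final class Eager {
        private final double mag;
        Eager(double x, double y) { this.mag = expensive(x, y); }
        double mag() { return mag; }
    }

    // Roughly `def mag`: recomputed on every access.
    static final class OnDemand {
        private final double x, y;
        OnDemand(double x, double y) { this.x = x; this.y = y; }
        double mag() { return expensive(x, y); }
    }

    // Roughly `lazy val mag`: computed on first access, then cached.
    // Unlike a Scala lazy val, this sketch is not thread-safe.
    static final class Memoized {
        private final double x, y;
        private Double mag; // null until first requested
        Memoized(double x, double y) { this.x = x; this.y = y; }
        double mag() {
            if (mag == null) mag = expensive(x, y);
            return mag;
        }
    }

    // Each helper returns {evaluations after construction, evaluations after two reads}.
    static int[] eagerCounts() {
        calls = 0;
        Eager v = new Eager(3, 4);
        int afterNew = calls;
        v.mag(); v.mag();
        return new int[] { afterNew, calls };
    }

    static int[] onDemandCounts() {
        calls = 0;
        OnDemand v = new OnDemand(3, 4);
        int afterNew = calls;
        v.mag(); v.mag();
        return new int[] { afterNew, calls };
    }

    static int[] memoizedCounts() {
        calls = 0;
        Memoized v = new Memoized(3, 4);
        int afterNew = calls;
        v.mag(); v.mag();
        return new int[] { afterNew, calls };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(eagerCounts()));    // [1, 1]
        System.out.println(java.util.Arrays.toString(onDemandCounts())); // [0, 2]
        System.out.println(java.util.Arrays.toString(memoizedCounts())); // [0, 1]
    }
}
```

The eager version pays up front whether or not anyone asks, the on-demand version pays on every read, and the memoized version pays at most once.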

So what’s the score? Including package declarations, imports, and style-appropriate newlines, it’s Java 49, Scala 7. That’s the kind of domination you can expect when Nick Saban brings the Crimson Tide to your stadium, but in this sport the win goes to the team with fewer points. Before we call it a game, I’d like to explore one more operation. Let’s implement vector addition. The logic is very straightforward. Just add the x and y components together. The Java implementation is likewise straightforward.

public Vector2DJava add(Vector2DJava that) {
    return new Vector2DJava(this.x + that.x, this.y + that.y);
}

For this enhancement, let’s take a look at some code which calls our operations.

Vector2DJava u = new Vector2DJava(3, 4);
Vector2DJava v = new Vector2DJava(4, 5);
Vector2DJava s = u.add(v);
System.out.println(s.mag());

No surprises there. Well remember back when Java was the new kid on the block and all of the C++ guys mocked us for not supporting operator overloading? Well…

def +(that:Vector2DScala) = 
  Vector2DScala(this.x + that.x, this.y + that.y)

How ’bout them apples? I defined a function named “+”. Before we look more at the operator, notice also that I didn’t need the new keyword to instantiate a Vector2DScala because it is a case class. Again I didn’t need the curly braces because it’s a one-liner. Also notice the omission of the return type thanks to type inference. Let’s look at calling our new fancy operator.

val u = Vector2DScala(3, 4)
val v = Vector2DScala(4, 5)
val s = u.+(v)
Console.println(s.mag)

Well that kinda sucks. Nice try, Scala… Right? I have good news. This code works too:

val u = Vector2DScala(3, 4)
val v = Vector2DScala(4, 5)
val s = u + v
Console.println(s.mag)

Under certain circumstances, Scala allows you to drop the . and the parentheses. What they’ve also done is reduce the restrictions on naming from alphanumeric characters to just about anything on the keyboard, save a few reserved symbols such as period, curlies, parentheses, etc. It doesn’t reduce the code size, but certainly enhances the clarity and readability.

This operator was essentially a wash in terms of our score, but let’s tally it up. Here’s the full Java source.

package pac;

public class Vector2DJava {
    public final double x, y;
    private Double mag = null;
     
    public Vector2DJava(double x, double y) {
        this.x = x;
        this.y = y;
    }
    
    public double mag() {
        if(mag == null) mag = Math.sqrt(x*x + y*y);
        return mag;
    }
    
    public Vector2DJava add(Vector2DJava that) {
        return new Vector2DJava(this.x + that.x, this.y + that.y);
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        long temp;
        temp = Double.doubleToLongBits(x);
        result = prime * result + (int) (temp ^ (temp >>> 32));
        temp = Double.doubleToLongBits(y);
        result = prime * result + (int) (temp ^ (temp >>> 32));
        return result;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj)
            return true;
        if (obj == null)
            return false;
        if (getClass() != obj.getClass())
            return false;
        Vector2DJava other = (Vector2DJava) obj;
        if (Double.doubleToLongBits(x) != Double.doubleToLongBits(other.x))
            return false;
        if (Double.doubleToLongBits(y) != Double.doubleToLongBits(other.y))
            return false;
        return true;
    }

    @Override
    public String toString() {
        return "Vector2DJava [x=" + x + ", y=" + y + "]";
    }
}

And the full Scala source, including the cosmetic import.

package pac

import Math._

case class Vector2DScala(val x:Double, val y:Double) {
  lazy val mag = sqrt(x*x + y*y)

  def +(that:Vector2DScala) = 
    Vector2DScala(this.x + that.x, this.y + that.y)
}

Final score: Java 53, Scala 10. For this exercise, Scala required between a fifth and a sixth of the lines of source. Maybe you’re fine with writing and maintaining more code, but I really can’t think of any reason anyone would prefer Java in this instance.

One last thing I would like to mention about code size… what about the amount of JVM bytecode produced? Well Vector2DJava.class is 1,418 bytes. I don’t know why, but Scala actually produces two class files for my case class (EDIT: It creates a companion object. Thanks for the explanation below from Senia). The files and sizes are Vector2DScala$.class at 1,669 bytes and Vector2DScala.class at 4,916 bytes. So Scala produced considerably more bytecode. That is not necessarily a good thing or a bad thing. However, it does make Java look like assembly compared to Scala. I’ll gladly let my compiler produce more bytecode if the language will do all of the hard work for me and produce a program that runs in comparable time. I always believe in letting the machine do the work. That’s the whole point, right?


Olde Comments
  1. > I don’t know why, but Scala actually produces two class files for my case class.

    `Vector2DScala$` is the class of the companion object for class `Vector2DScala`. It contains `apply` (“I didn’t need the new keyword”) and `unapply` methods.

    For a `case class`, the compiler implicitly creates a companion object.

    • barnesjd says:

      Ah, that makes a lot of sense. Explains why the ‘new’ keyword isn’t needed. Thanks for the feedback!

  2. From a business side, a smaller code base requires less resources to maintain and manage along with a laundry list of benefits, but where is the data? I would love to see a company that is going through a Scala adoption show some actual data for improvements in performance, personnel, required resources, etc. Academia might have the capability for a study of this nature, but if a privately owned or publicly traded company showed how moving to Scala or Typesafe put coin to the bottom line I would be ecstatic.

    • barnesjd says:

      That would indeed be interesting data to see. I bet it is quite difficult to isolate and measure, tho.

  3. Adam Holden says:

    Consider me convinced. That is some super cool stuff.

    • barnesjd says:

      Great! I know my Scala posts are pedestrian compared to the folks who have been into it for a while, but it’s well worth the effort if I’m able to spread the awesome.

  4. Dobbs says:

    Why not just use a Point2D from the Java standard library? All the work is done for you there.

    http://docs.oracle.com/javase/7/docs/api/java/awt/geom/Point2D.html

    • barnesjd says:

      I used 2D vector for illustrative purposes, not to suggest that one should reinvent the wheel. Certainly the best solution for either language is to use an existing library.

  5. Dobbs says:

    It is awesome BTW. I just hope Java introduces an equivalent language construct for common bean/properties/cases like this. Calling Java-9, you there, you listening!!

  6. Stuart says:

    Great post. I see you with France connections, involved in the Tokyo meetup and now something with nz – well, looking at the domain name. Thanks for passing the awesomeness far and wide.

    I imagine there are a few like myself who started picking up Java and after a few months realised that, with work and family, the boilerplate muscle memory was going to take too much time due to the verbosity of Java, and came across Scala as a fantastic alternative.

    Thanks for the post – keep them coming!

  7. Stuart says:

    Right, the nz url mystery is solved … http://proseand.co.nz/2013/05/03/more-to-come/

    – Thanks Joe

    • barnesjd says:

      Glad you enjoyed the post, Stuart! Yeah, I’m world wide. LOL I’m hailing out of the great state of Alabama in the US. I might run around with folks around the cyber world, but I don’t get out of my geographical footprint too much.

Tagged with: scala (41), functional-programming (31), java (22), mathematics (3)