Writing an equals method - how hard can it be?

2018-01-21

If you’ve ever had to write or test an equals method, you may have gotten a feel for how complex this can get. This post will explain a number of things that can go wrong, offer solutions, and explain how a library called EqualsVerifier can help you prevent unexpected behavior regarding object equality testing.

Why override the default equals method anyway?

By default, every Java object has an equals(Object o) method, which is inherited from the Object class. This implementation compares references (it behaves like the == operator), meaning that two objects are only considered equal if they actually point to the exact same memory location and are thus really one and the same object.

@Test
public void test() {
    Object object1 = new Object();
    Object sameObject = object1;
    Object object2 = new Object();

    assertTrue(object1.equals(sameObject)); // this succeeds
    assertTrue(object1.equals(object2)); // this fails
}

If you want to define equality in such a way that two objects can be considered equal even if they don’t point to the exact same memory location, you will need a custom equals implementation.

The requirements for a good equals method

  • Reflexivity: every object is equal to itself
  • Symmetry: if a is equal to b, then b is also equal to a
  • Transitivity: if a is equal to b and b is equal to c, then a is also equal to c
  • Consistency: if a is equal to b right now, then a is always equal to b as long as none of their state that is used in the equals method has been modified
  • Non-nullity: an actual object is never equal to null
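As a rough illustration, these properties can be spot-checked for a handful of sample objects. The helper below (EqualsContractCheck and holdsFor are names made up for this sketch) can only find violations among the samples it is given, so a passing check is not a proof. It uses String simply because its equals method is known to compare contents, and it assumes that none of the samples is null.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for this sketch; it is not part of the JDK or of any library.
class EqualsContractCheck {

    // Returns true if no violation of the five properties shows up among the samples.
    // Assumes the list contains no null elements.
    static boolean holdsFor(List<?> samples) {
        for (Object a : samples) {
            if (!a.equals(a)) return false;                  // reflexivity
            if (a.equals(null)) return false;                // non-nullity
            for (Object b : samples) {
                boolean ab = a.equals(b);
                if (ab != b.equals(a)) return false;         // symmetry
                if (ab != a.equals(b)) return false;         // consistency: same answer when asked again
                for (Object c : samples) {
                    if (ab && b.equals(c) && !a.equals(c)) {
                        return false;                        // transitivity
                    }
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("a", new String("a"), "b");
        System.out.println(holdsFor(samples)); // prints true
    }
}
```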

Introducing the Point class

The Point class is the class we will be using as an example throughout this post. It is a simple class representing a point on a two-dimensional grid by means of an x coordinate and a y coordinate.

public class Point {
    private int x;
    private int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // getters and setters for x and y here
}

We want to consider two Point objects to be equal if and only if they have the same x coordinate and the same y coordinate. Therefore, we will attempt to write an equals method that accomplishes this.

The journey to a “perfect” equals method

Attempt #1

Well, our class is simple, so let’s write a simple equals method. We add this method to our Point class:

public boolean equals(Point other) {
    return (this.x == other.x && this.y == other.y);
}

Seems simple enough. Now, let’s test our equals method:

@Test
public void test() {
    Point point1 = new Point(1, 1);
    Point point2 = new Point(1, 1);
    List<Point> points = Arrays.asList(point1);
                
    assertTrue(point1.equals(point2)); // this succeeds
    assertTrue(points.contains(point2)); // this fails
}

What happened? Even though the List internally calls equals to check equality, it somehow doesn’t consider point1 and point2 to be equal.

One important thing to note is that the contains method takes an Object as its argument, which means that point2 is passed as an Object. The following test shows that our current equals method doesn’t handle this very well.

@Test
public void test() {
    Point point1 = new Point(1, 1);
    Object pointObject = new Point(1, 1);
                
    assertTrue(point1.equals(pointObject)); // this fails
    assertTrue(pointObject.equals(point1)); // also fails
}

Short aside: method overloading and overriding rules in Java

When calling methods, Java determines the exact method to call in a way that can be confusing at first. There are basically two steps:

  1. At compile time, the number and compile-time types of the arguments are used to determine the exact signature of the method that will be invoked.
  2. At run time, if the method to be invoked is an instance method, the actual method to invoke will be determined using dynamic method lookup based on the actual run-time type of the object and the structure of the inheritance hierarchy.

For more info, check the post on Java overloading, overriding and method hiding.

In the code above, we have two classes: Object, which has a method equals(Object), and the class Point, which has a method equals(Point) and also inherits the equals(Object) method from Object. What happens in the code is the following:

  • In the first assertion, we are calling a method with signature equals(Object) on an object with compile-time type Point. As Point does not implement a method with that signature, the best match is the equals(Object) method inherited from Object.
  • In the second assertion, we are calling equals with an argument of compile-time type Point on an object with compile-time type Object. As Object does not have an equals(Point) method, the best match at compile time is its equals(Object) method. And, because Point (the run-time type of pointObject) does not override that method, the actual implementation that gets called is still the one defined in Object.

In both cases, Object’s equals(Object o) method tells us that point1 and pointObject are not equal because they do not point to the exact same memory location.
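The two steps can be made visible with a small self-contained sketch. OverloadDemo and OverloadedPoint are hypothetical names used only here; the class mirrors the Point from attempt #1 in that it overloads equals rather than overriding it.

```java
// Hypothetical names throughout; this only mirrors the situation from attempt #1.
class OverloadDemo {

    static class OverloadedPoint {
        private final int x;
        private final int y;

        OverloadedPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        // Overload, not override: the parameter type differs from Object.equals(Object).
        public boolean equals(OverloadedPoint other) {
            return this.x == other.x && this.y == other.y;
        }
    }

    static boolean argumentTypedAsPoint() {
        OverloadedPoint a = new OverloadedPoint(1, 1);
        OverloadedPoint b = new OverloadedPoint(1, 1);
        return a.equals(b);   // compiler selects equals(OverloadedPoint)
    }

    static boolean argumentTypedAsObject() {
        OverloadedPoint a = new OverloadedPoint(1, 1);
        Object b = new OverloadedPoint(1, 1);
        return a.equals(b);   // compiler selects equals(Object), inherited from Object
    }

    static boolean receiverTypedAsObject() {
        Object a = new OverloadedPoint(1, 1);
        OverloadedPoint b = new OverloadedPoint(1, 1);
        return a.equals(b);   // Object only declares equals(Object), and nothing overrides it
    }

    public static void main(String[] args) {
        System.out.println(argumentTypedAsPoint());   // true
        System.out.println(argumentTypedAsObject());  // false
        System.out.println(receiverTypedAsObject());  // false
    }
}
```

Putting an @Override annotation on the equals(OverloadedPoint) method would turn this mistake into a compile error, because that method does not override anything.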

Attempt #2: actually overriding the default equals(Object o) method

Ok, so let’s properly override Object’s equals(Object) method:

@Override
public boolean equals(Object o) {
    if (o == null || o.getClass() != this.getClass()) {
        return false;
    }
    
    Point other = (Point) o;
    return (this.x == other.x && this.y == other.y);
}

Our tests from the previous attempt will now succeed. However, a new issue arises:

@Test
public void test() {        
    Point point1 = new Point(1, 1);
    Point point2 = new Point(1, 1);
    Set<Point> points = new HashSet<Point>();
    points.add(point1);
            
    assertTrue(points.contains(point2)); // this fails
}

The issue here is that, while we did override the default equals method, we didn’t override the default hashCode method as well. When our HashSet looks for point2, it only looks in the hash bucket that corresponds to point2’s hash code. Therefore, if two objects are considered equal, we must guarantee that their hash codes are also the same (hashCode needs to be consistent with equals). Note that it is ok for two different objects to have the same hash code, although it is better to avoid this as it can negatively impact the performance of data structures that rely on hash codes.
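To see why the hash code matters, here is a toy model of a hash-based set. It is a deliberately simplified sketch and not how java.util.HashSet is actually implemented; TinyHashSet and Key are hypothetical names. Key lets the caller pick the hash code, which makes the effect deterministic, and it intentionally breaks the rule that equal objects must have equal hash codes.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model for this sketch only; java.util.HashSet is more sophisticated.
class BucketLookupDemo {

    // Hypothetical key: equality looks at 'id' only, the hash code is chosen by the caller.
    static final class Key {
        final int id;
        final int hash;

        Key(int id, int hash) {
            this.id = id;
            this.hash = hash;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == this.id;
        }

        @Override
        public int hashCode() {
            return hash;
        }
    }

    static final class TinyHashSet {
        private final List<List<Object>> buckets = new ArrayList<>();

        TinyHashSet(int bucketCount) {
            for (int i = 0; i < bucketCount; i++) {
                buckets.add(new ArrayList<>());
            }
        }

        private List<Object> bucketFor(Object o) {
            return buckets.get(Math.floorMod(o.hashCode(), buckets.size()));
        }

        void add(Object o) {
            if (!contains(o)) {
                bucketFor(o).add(o);
            }
        }

        boolean contains(Object o) {
            return bucketFor(o).contains(o);   // only this one bucket is searched
        }
    }

    public static void main(String[] args) {
        TinyHashSet set = new TinyHashSet(16);
        set.add(new Key(1, 3));
        System.out.println(set.contains(new Key(1, 3)));  // true: equal, same bucket
        System.out.println(set.contains(new Key(1, 4)));  // false: equal, but a different bucket
    }
}
```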

Attempt #3: overriding hashCode as well

Ok, let’s add a hashCode method that is consistent with our equals method (this method was actually generated automatically by my IDE):

@Override
public int hashCode() {
    final int prime = 31;
    int result = 1;
    result = prime * result + x;
    result = prime * result + y;
    return result;
}

The previous tests now pass, but we are still not quite there:

@Test
public void test() {    
    Point point1 = new Point(1, 1);
    Set<Point> points = new HashSet<Point>();
    points.add(point1);
    
    point1.setX(2);
            
    assertTrue(points.contains(point1)); // this fails
}

This means that, although point1 is the actual object we put in the set, the set doesn’t seem to contain point1 anymore. When we added point1 to the set, it got assigned to a hash bucket based on its hash code. However, by changing the point’s x coordinate, we have also changed its hash code. The contains method looks in the bucket corresponding to the new hash code and will not find our point there because it sits in the bucket corresponding to its original hash code.
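The following sketch shows that the point never actually leaves the set: it can still be found by iterating, and the hash lookup works again once the original state is restored. MutableKeyDemo and MutablePoint are hypothetical names; the class is a stand-in for the mutable Point of attempt #3 with the same equals and hashCode logic.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MutableKeyDemo {

    // Hypothetical stand-in for the mutable Point of attempt #3.
    static class MutablePoint {
        private int x;
        private int y;

        MutablePoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        void setX(int x) {
            this.x = x;
        }

        @Override
        public boolean equals(Object o) {
            if (o == null || o.getClass() != this.getClass()) {
                return false;
            }
            MutablePoint other = (MutablePoint) o;
            return this.x == other.x && this.y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * (31 + x) + y;   // same result as the generated hashCode above
        }
    }

    // Returns {found by hash lookup after mutation, found by iteration, found after restoring x}.
    static boolean[] run() {
        MutablePoint p = new MutablePoint(1, 1);
        Set<MutablePoint> points = new HashSet<>();
        points.add(p);

        p.setX(2);                                    // hash code changes from 993 to 1024
        boolean byHash = points.contains(p);

        boolean byIteration = false;
        for (MutablePoint candidate : points) {
            if (candidate == p) {
                byIteration = true;                   // the object is still stored in the set
            }
        }

        p.setX(1);                                    // back to the original hash code
        boolean afterRestore = points.contains(p);

        return new boolean[] { byHash, byIteration, afterRestore };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(run()));   // [false, true, true]
    }
}
```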

Attempt #4: making instance variables final

Ok, let’s solve the previous issue by making the x and y coordinate final. This yields the following definition for our Point class:

public class Point {
    private final int x;
    private final int y;

    public Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // getters for x and y here
    
    @Override
    public boolean equals(Object o) {
        if (o == null || o.getClass() != this.getClass()) {
            return false;
        }
        
        Point other = (Point) o;
        return (this.x == other.x && this.y == other.y);
    }

    @Override
    public int hashCode() {
        final int prime = 31;
        int result = 1;
        result = prime * result + x;
        result = prime * result + y;
        return result;
    }    
}

Our current equals method is functionally equivalent to the one that my IDE generates automatically (using the default settings) and our hashCode method was already generated by my IDE. Therefore, if I let my IDE do the work for me, this is what I’m going to get by default. But is it enough?

Well, it is enough if we don’t have to care about subclasses. If there are subclasses involved, things get a bit more interesting.

@Test
public void test() {    
    Point point1 = new Point(1, 1);
    Point point2 = new Point(1, 1) {};

    assertTrue(point1.equals(point2)); // this fails
}

In this test, point2 is an instance of an anonymous subclass of Point that adds no additional behavior or state. Here, point2 has the exact same x and y coordinate as point1 (it actually even has exactly identical state and behavior). However, they are not considered to be equal at all. This violates the contract for our equals method on Point, which we defined as “two Points are equal if and only if they have the same x coordinate and the same y coordinate”.

The reason why this test fails is that the equals method uses getClass() to verify if both objects belong to the same class and getClass() will actually return a different class for point1 and point2.
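A minimal sketch of that difference, using a hypothetical class Base in place of Point since only the class identity matters here:

```java
class GetClassDemo {

    // Hypothetical stand-in for Point; it needs no state for this illustration.
    static class Base { }

    static boolean sameRuntimeClass() {
        Base plain = new Base();
        Base anonymous = new Base() { };   // anonymous subclass without extra state or behavior
        return plain.getClass() == anonymous.getClass();
    }

    static boolean anonymousIsInstanceOfBase() {
        Object anonymous = new Base() { };
        return anonymous instanceof Base;
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());            // false: two distinct classes
        System.out.println(anonymousIsInstanceOfBase());   // true: a subclass instance is a Base
    }
}
```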

Although you will probably not create a lot of trivial anonymous subclasses in real life, you may sometimes want to subclass a class that you defined a custom equals method for and allow objects of the subclass to equal objects of the superclass. Fortunately, we can provide this behavior by using instanceof instead of getClass().

Attempt #5: using instanceof instead of getClass()

@Override
public boolean equals(Object o) {
    if (!(o instanceof Point)) {
        return false;
    }
    
    Point other = (Point) o;
    return (this.x == other.x && this.y == other.y);
}

This implementation’s behavior is equivalent to that of the implementation generated by my IDE if I choose the option to use instanceof instead of getClass(). It passes all of our previous tests. In fact, as long as no subclass of Point ever overrides our equals (or hashCode) method, this will work just fine. This means that, when letting my IDE generate my equals and hashCode methods for me, I actually get a good implementation as long as I choose the right options.

One additional thing that the version generated by my IDE does is that it starts by checking for actual identity. This is a very inexpensive test, making this a good optimization if it is common for equal objects to also be identical.

@Override
public boolean equals(Object o) {
    if (this == o) {
        return true;
    }

    if (!(o instanceof Point)) {
        return false;
    }
    
    Point other = (Point) o;
    return (this.x == other.x && this.y == other.y);
}

What if a subclass needs to include additional state in equals?

Things get more complicated if a subclass is going to add state and we want to include this state in its equals method. For example, let’s assume that we have an enum called Color and we create a class ColorPoint that extends the Point class with a specific color for a point.

public enum Color {
    BLUE, RED, YELLOW, GREEN;
}

public class ColorPoint extends Point {
    private final Color color;

    public ColorPoint(int x, int y, Color color) {
        super(x, y);
        this.color = color;
    }
    
    // getter for color
}

Now, what if we want to include the color in the equals method so that a ColorPoint(1, 1, Color.RED) is not equal to a ColorPoint(1, 1, Color.BLUE)? Well, there is a way to accomplish this. It is also described in this article.

An important remark is that, in this solution, a Point will never be able to be equal to a ColorPoint. The reason for this is that we need our equals method to be transitive. If we followed the contract we envisioned for the equals method of Point (two Points are equal if and only if they have the same x coordinate and the same y coordinate), this would mean that a Point(1, 1) is equal to a ColorPoint(1, 1, Color.RED) and to a ColorPoint(1, 1, Color.BLUE). However, transitivity would then imply that a ColorPoint(1, 1, Color.RED) and a ColorPoint(1, 1, Color.BLUE) must be equal to each other, which is exactly what we didn’t want.
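The transitivity problem can be reproduced with a deliberately naive pair of classes. This is not the solution discussed in this post but a sketch of what goes wrong if a colored point ignores its color whenever it is compared to a plain point. NaivePoint and NaiveColorPoint are hypothetical names.

```java
class TransitivityDemo {

    enum Color { RED, BLUE }

    // Hypothetical classes illustrating the pitfall; do not use this design.
    static class NaivePoint {
        final int x;
        final int y;

        NaivePoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NaivePoint)) {
                return false;
            }
            NaivePoint other = (NaivePoint) o;
            return this.x == other.x && this.y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * (31 + x) + y;
        }
    }

    static class NaiveColorPoint extends NaivePoint {
        final Color color;

        NaiveColorPoint(int x, int y, Color color) {
            super(x, y);
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof NaiveColorPoint)) {
                return super.equals(o);   // plain point: compare coordinates, ignore color
            }
            return super.equals(o) && this.color == ((NaiveColorPoint) o).color;
        }
    }

    public static void main(String[] args) {
        NaivePoint plain = new NaivePoint(1, 1);
        NaivePoint red = new NaiveColorPoint(1, 1, Color.RED);
        NaivePoint blue = new NaiveColorPoint(1, 1, Color.BLUE);

        System.out.println(red.equals(plain));   // true
        System.out.println(plain.equals(blue));  // true
        System.out.println(red.equals(blue));    // false, so transitivity is violated
    }
}
```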

Because of this, this solution could cause unexpected behavior in code that depends on the contract that “two Points are equal if and only if they have the same x coordinate and the same y coordinate”.

The solution involves introducing a canEqual method and letting custom equals methods call that method on the other object.

public class Point {
    // ...
    
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) {
            return false;
        }
        
        Point other = (Point) o;
        
        if (!other.canEqual(this)) {
            return false;
        }
        
        return (this.x == other.x && this.y == other.y);
    }
    
    public boolean canEqual(Object o) {
        return (o instanceof Point);
    }    
    
    // ...
}

public class ColorPoint extends Point {
    // ...

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ColorPoint)) {
            return false;
        }
        
        ColorPoint other = (ColorPoint) o;
        
        if (!other.canEqual(this)) {
            return false;
        }
        
        return (this.color == other.color 
                && super.equals(other));
    }
    
    @Override
    public boolean canEqual(Object o) {
        return (o instanceof ColorPoint);
    }
    
    // ...
}

The Point and ColorPoint classes both satisfy all of the previous tests. If we created a new subclass of Point or ColorPoint without overriding equals, canEqual or hashCode, instances of the subclass could still be equal to instances of the superclass. If we want to create a new subclass of Point or ColorPoint that adds additional state and includes this state in its equals method, we need to override both equals and canEqual.

As stated before, the only big drawback of this approach is the fact that it breaks our original contract saying that “two Points are equal if and only if they have the same x coordinate and the same y coordinate”. A piece of code operating on Point instances can no longer make the assumption that, if two Points are not equal to each other, there must be some difference in their x or y coordinates. Indeed, this assumption no longer holds if some of the instances are ColorPoint instances. This is essentially a violation of the Liskov substitution principle, although the article linked above doesn’t seem to agree.

@Test
public void test() {
    Point point1 = new Point(1, 1);
    Point point2 = new ColorPoint(1, 1, Color.BLUE);
            
    assertTrue(point1.getX() == point2.getX());
    assertTrue(point1.getY() == point2.getY());
    assertTrue(point1.equals(point2)); // this fails
}

A simpler solution for subclasses that include additional state in equals

The previous approach is relatively complex, mostly because we wanted to allow subclass objects to be equal to superclass objects as long as they don’t need to include additional state in their equals method.

If we’re okay with subclass objects never being equal to superclass objects, we can just go ahead and use the getClass() approach.
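A minimal sketch of what that could look like for our two classes follows. The wrapper class GetClassEqualsDemo exists only to keep the example in one file, and the ColorPoint hashCode is one possible choice rather than a required one.

```java
import java.util.Objects;

// Wrapper class used only to keep this sketch self-contained.
class GetClassEqualsDemo {

    enum Color { BLUE, RED, YELLOW, GREEN }

    static class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (o == null || o.getClass() != this.getClass()) {
                return false;   // a Point and a ColorPoint are never equal
            }
            Point other = (Point) o;
            return this.x == other.x && this.y == other.y;
        }

        @Override
        public int hashCode() {
            return 31 * (31 + x) + y;
        }
    }

    static class ColorPoint extends Point {
        private final Color color;

        ColorPoint(int x, int y, Color color) {
            super(x, y);
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            if (!super.equals(o)) {
                return false;   // super.equals has also verified that o is a ColorPoint
            }
            return this.color == ((ColorPoint) o).color;
        }

        @Override
        public int hashCode() {
            return 31 * super.hashCode() + Objects.hashCode(color);
        }
    }

    public static void main(String[] args) {
        Point plain = new Point(1, 1);
        Point red = new ColorPoint(1, 1, Color.RED);
        Point blue = new ColorPoint(1, 1, Color.BLUE);

        System.out.println(red.equals(new ColorPoint(1, 1, Color.RED)));  // true
        System.out.println(red.equals(blue));                             // false
        System.out.println(plain.equals(red));                            // false
        System.out.println(red.equals(plain));                            // false
    }
}
```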

How to handle this in practice

In practice, the approach that you’ll typically want to follow is this:

  1. Let your IDE generate your equals (and hashCode) methods for you, using instanceof instead of getClass().
  2. Either make your class final or make your equals and hashCode methods final.

Note that the two options outlined in step 2 have different effects:

  • Making your class final prevents any issues with subclasses by simply not allowing subclasses for your class.
  • Making your equals and hashCode methods final prevents subclasses from overriding your equals and hashCode methods and including additional state in them.

In cases where this is not sufficient (you want subclasses to include additional state in their equals method), consider using the solution involving the canEqual method or the simpler solution if you’re ok with subclass instances never being equal to superclass instances.
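For the common case, the two steps might come together as in the sketch below. FinalPoint is a hypothetical name, and Objects.hash is used here as a compact alternative to the IDE-generated hashCode shown earlier; either is fine as long as it uses the same fields as equals.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class FinalPointDemo {

    // Hypothetical class name; the class is final, so no subclass can alter equals or hashCode.
    static final class FinalPoint {
        private final int x;
        private final int y;

        FinalPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof FinalPoint)) {
                return false;   // instanceof is false for null as well
            }
            FinalPoint other = (FinalPoint) o;
            return this.x == other.x && this.y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);   // built from the same fields that equals compares
        }
    }

    public static void main(String[] args) {
        Set<FinalPoint> points = new HashSet<>();
        points.add(new FinalPoint(1, 1));

        System.out.println(points.contains(new FinalPoint(1, 1)));  // true
        System.out.println(points.contains(new FinalPoint(2, 1)));  // false
    }
}
```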

Testing your equals methods

Testing an equals method by hand is a tedious task that will likely lead to pages and pages of error-prone testing code. Fortunately, there is a better solution: the EqualsVerifier library by Jan Ouwens. Using it is simple:

@Test
public void equalsContract() {
    EqualsVerifier.forClass(Point.class).verify();
}

It uses reflection to inspect your class and test its equals and hashCode methods with 100% coverage. It recognizes all of the possible issues that were outlined in this article (and some others as well). If you’re confused by an error message it produces, have a look at this overview. If you understand why EqualsVerifier complains about a certain issue but you need it to be less restrictive, you can pass it an additional option to make it ignore that issue. This library should be able to make hand-written equals tests a thing of the past.

Resources