Blog

Overriding ==, eql? and hash

Icône flèche bleue vers la gauche
Retour au blog
Overriding ==, eql? and hash

Overriding ==, eql? and hash

February 7, 2013

It is quite usual in object-oriented programming to redefine the criteria of equivalence between two instances of a class. For example, let’s write a class representing a specific point in a three-dimensional space:

class Point
attr_reader :x, :y, :z
def initialize(x, y, z)
@x, @y, @z = x, y, z
end
end

Given two points p1 and p2 with the same x, y and z coordinates, it seems natural that p1 be considered equivalent to p2, whether p1 and p2 are the same instance (i.e. the same memory object) or not. In Ruby, the method used to check the equivalence of two objects is ==. Its default implementation (on the Object class) checks for instance identity, so we have to override it in our class to satisfy our domain specific notion of equivalence:

def ==(other)
x == other.x && y == other.y && z == other.z
end

Let’s try it:

irb(main):001:0> Point.new(0, 0, 0) == Point.new(0, 0, 0)
=> true

So far, so good.

Using our objects as hash keys

Now what if we want to use our newly defined objects as a key in a Hash?

irb(main):001:0> labels = { Point.new(0, 0, 0) => “Origin” }
irb(main):002:0> labels[Point.new(0, 0, 0)]
=> nil

Not exactly what we want. This is because Hash does not use the == method when it looks for a key. Rather, it uses the eql? method as well as the hash method, which are also defined on Object. So, let’s first redefine eql?. As is often the case, it is perfectly sensible in our example to make eql? and == synonymous:

def eql?(other)
self == other
end

Now hash. This method returns an integer representing a kind of address where the key should be stored or found in a hashtable, called the hash value of the object. According to its specification, if object1.eql?(object2) is true then object1.hash MUST be equal to object2.hash (i.e. if a key matches another, they must be found at the same address). The reverse does not have to be true: object1.hash can be equal to object2.hash even when object1.eql?(object2) is false, this situation being called a collision. However, lookup performances tend to drop as the number of collisions rises, so a good implementation should try to minimize their probability (a property of the hash function called uniformity, see reference 4). There are many possible ways to compute a hash value for our Point instances (this topic deserves its own blogpost), but here is a simple one (assuming that the implementation provided by Array is a good one):

def hash
[ x, y, z ].hash
end

Now if we try again, we get:

irb(main):001:0> labels = { Point.new(0, 0, 0) => “Origin” }
irb(main):004:0> labels[Point.new(0, 0, 0)]
=> “Origin”

Other equality/equivalence related methods

Object also defines two other methods that are related to equality/equivalence: equal? determines object identity (i.e if a is the same instance as b) and should never be overridden. On the other hand, case statements use the === method to select the appropriate when, so here again you can override it to provide class specific equivalence for a switch statement (Range is a good example of this, see the documentation).

References and further reading:

  1. Specification of ==, eql? and equal?
  2. Specification of hash
  3. A more complete blogpost on basic concepts that can be implemented/overridden in a class
  4. Definition and properties of hash values
  5. In depth view of Hash implementation in MRI
  6. A blogpost on the implementation of a hashtable (here called a dictionary) It’s in Python, but the principles are the same.

Prêt à créer votre produit logiciel ? Contactez-nous !