Introduction to Spatial Hashes

Daniel Pope

One of the things that bites the beginner games programmer earliest is that detecting intersections between objects can get slow very quickly.

For example, say there are 100 actors in a level - I use actor to mean an object that interacts or moves as separate from scenery objects or tile maps. Which ones are colliding with which other ones? A naive answer requires around 5,000 tests per frame. This gets dramatically worse as the number of actors increases - 20,000 tests for 200 objects, 45,000 for 300 objects, and so on (it is O(n²) for asymptotic complexity fans - but you already knew that).

Fortunately there are well-established techniques for making this process faster: indexes. If you are familiar with databases, the principle is exactly the same. As well as maintaining the positions of the objects, we maintain a datastructure that lets you quickly narrow down the set of objects you're interested in.

It's difficult to actually narrow down the set of objects to those that really are intersecting, but we can narrow it down to a much smaller set that potentially intersect. In collision detection, this is called the broad phase. Having rapidly retrieved that narrowed down collection, we can more quickly scan for objects that are really intersecting. Unsurprisingly, the latter operation is called the narrow phase.

A spatial hash is one way of indexing objects in space. As Python programmers, we should perhaps call them spatial dicts, but let's go with the literature on this one. Like a dict, a spatial hash has O(1) properties.

The basic principle is to split space into an infinite number of cells - each cell can contain an arbitrary number of objects. A cell that is empty simply isn't stored. In Python, this is just a dict, where the keys are the coordinates of the cell in space, and the values are collections of objects. Let's briefly look at an implementation and how we populate it:

class SpatialHash(object):
    def __init__(self, cell_size=10.0):
        self.cell_size = float(cell_size)
        self.d = {}

    def _add(self, cell_coord, o):
        """Add the object o to the cell at cell_coord."""
        try:
            self.d.setdefault(cell_coord, set()).add(o)
        except KeyError:
            self.d[cell_coord] = set((o,))

    def _cells_for_rect(self, r):
        """Return a set of the cells into which r extends."""
        cells = set()
        cy = floor(r.y1 / self.cell_size)
        while (cy * self.cell_size) &lt;= r.y2:
            cx = floor(r.x1 / self.cell_size)
            while (cx * self.cell_size) &lt;= r.x2:
                cells.add((int(cx), int(cy)))
                cx += 1.0
            cy += 1.0
        return cells

    def add_rect(self, r, obj):
        """Add an object obj with bounds r."""
        cells = self._cells_for_rect(r)
        for c in cells:
            self._add(c, obj)

So this is easy - each object extends into one or more cells. To add the object to the spatial hash, we just add it to the dictionary with each cell coordinate as a key. A set to conatin the objects in each cell is created if it doesn't exist.

Removing an object is just the reverse process:

def _remove(self, cell_coord, o):
    """Remove the object o from the cell at cell_coord."""
    cell = self.d[cell_coord]
    cell.remove(o)

    # Delete the cell from the hash if it is empty.
    if not cell:
        del(self.d[cell_coord])

def remove_rect(self, r, obj):
    """Remove an object obj which had bounds r."""
    cells = self._cells_for_rect(r)
    for c in cells:
        self._remove(c, obj)

Then testing for a potential set of objects with a single object is just this:

def potential_collisions(self, r, obj):
    """Get a set of all objects that potentially intersect obj."""
    cells = self._cells_for_rect(r)
    potentials = set()
    for c in cells:
        potentials.update(self.d.get(c, set()))
    potentials.discard(obj) # obj cannot intersect itself
    return potentials

Testing for all potential collision pairs is also relatively easy - we'd loop through the keys of the spatial hash collecting all combinations of the objects in a cell wherever there are 2 or more objects in that cell.

Another use is in viewport culling - a spatial hash would let you easily retrieve a potentially visible set (PVS) of objects.

The asymptotic lookup complexity of O(1) for insertion, deletion, lookup, etc makes spatial hashes seem faster than they really is. In reality, a well-implemented quadtree - lookup complexity O(log n) - would typically perform faster until the space gets vast. This is because hashing is slow and trees are fast. But spatial hashes are easy to implement, especially in Python. A drawback is the need to tune the cell size. The performance of spatial hashing will suffer if moving objects cover too many cells or there are too many objects per cell. Cells should be a reasonable match for the sizes of objects in them.

The full code for this spatial hash is available, but this implementation is for reference and no doubt faster, more fully featured implementations are available.

Comments