Félix Saparelli

(about me)

a.k.a. passcod

Storing associations of values

Posted on Apr 10 ‘17

This post is also tagged as ,

This is a description of an algorithm for a particular pattern I’ve encountered several times throughout my programming and systems designing experience. It’s fairly simple, but nevertheless:

The pattern is this: you have a set of values, let’s call them vertices, and you want to associate them together, let’s call that association an edge, such that later on, given a single vertex you are able to retrieve the edge and all vertices it contains.

The prerequisites are a hash function, let’s call it H(), and a key-value store with two functions: KV.get(key) → value and KV.set(key, value).

To store the edge containing the vertices v1, v2, v3, you:

  1. Compute the hashes of vertices:
    • h1 = H(v1)
    • h2 = H(v2)
    • h3 = H(v3)
  2. Concatenate these hashes and compute the hash of that:
    • h_all = H(v1 . v2 . v3)
  3. Store the vertices under that common key:
    • KV.set(h_all, ntuple<v1, v2, v3>)
  4. Store the common key under the hashes of vertices:
    • KV.set(h1, h_all)
    • KV.set(h2, h_all)
    • KV.set(h3, h_all)

Then, to retrieve the edge given a vertex v, you:

  1. Compute the hash of the vertex:
    • h = H(v)
  2. Retrieve the common key:
    • h_all = KV.get(h)
  3. Retrieve the edge:
    • ntuple<v…> = KV.get(h_all)

That’s it.

Time requirements are thus (n + 1) * (H time + KV.set time) for set and 2 * KV.get + 1 * H for get, where n is the number of vertices to be associated.

And space requirements are (n + 1) KV entries, one storing exactly only the combined size of vertices, and n storing exactly only a fixed-size hash.

The datastructure described can be called a “Hypermap”, and is a non-overlapping, non-directed, non-multi Hypergraph. An overlapping Hypergraph (but still non-directed and non-multi) can be constructed trivially from this. A Hypermap is also a generalisation of a Bimap, which can be described as a Hypermap where edges only contain exactly two vertices.