a.k.a. passcod

# Storing associations of values

Posted on Apr 10 ‘17

This post is also tagged as ,

This is a description of an algorithm for a particular pattern I’ve encountered several times throughout my programming and systems designing experience. It’s fairly simple, but nevertheless:

The pattern is this: you have a set of values, let’s call them vertices, and you want to associate them together, let’s call that association an edge, such that later on, given a single vertex you are able to retrieve the edge and all vertices it contains.

The prerequisites are a hash function, let’s call it `H()`, and a key-value store with two functions: `KV.get(key) → value` and `KV.set(key, value)`.

To store the edge containing the vertices `v1`, `v2`, `v3`, you:

1. Compute the hashes of vertices:
• `h1 = H(v1)`
• `h2 = H(v2)`
• `h3 = H(v3)`
2. Concatenate these hashes and compute the hash of that:
• `h_all = H(v1 . v2 . v3)`
3. Store the vertices under that common key:
• `KV.set(h_all, ntuple<v1, v2, v3>)`
4. Store the common key under the hashes of vertices:
• `KV.set(h1, h_all)`
• `KV.set(h2, h_all)`
• `KV.set(h3, h_all)`

Then, to retrieve the edge given a vertex `v`, you:

1. Compute the hash of the vertex:
• `h = H(v)`
2. Retrieve the common key:
• `h_all = KV.get(h)`
3. Retrieve the edge:
• `ntuple<v…> = KV.get(h_all)`

That’s it.

Time requirements are thus `(n + 1) * (H time + KV.set time)` for set and `2 * KV.get + 1 * H` for get, where `n` is the number of vertices to be associated.

And space requirements are `(n + 1)` KV entries, one storing exactly only the combined size of vertices, and `n` storing exactly only a fixed-size hash.

The datastructure described can be called a “Hypermap”, and is a non-overlapping, non-directed, non-multi Hypergraph. An overlapping Hypergraph (but still non-directed and non-multi) can be constructed trivially from this. A Hypermap is also a generalisation of a Bimap, which can be described as a Hypermap where edges only contain exactly two vertices.