Extending Vessel's DAG with a Per-Event Clock

#Recap

We compared Merkle DAGs to Vessel’s DAG, to find that although they are constructed very differently, they offer very much the same guarantees: both provide logical ordering, and this ordering is deterministic across replicas.
The next article then explored the needs of CRDTs for logical clocks, and concluded that the clock provided by Vessel’s DAG is no worse than the clock of a Merkle DAG. The main problem both have is that while ordering is deterministic, it does not take into account the effects of applying one CRDT operation before vs after another.
We then explored Wyrd’s usage of the Vessel DAG with the conclusion that the DAG provides too coarse-grained an ordering.

This final entry in the mini-series now focuses on how to address that issue.

#Problems

I made much ado in the last entry that time synchronization is difficult to achieve in DTN, because DTN hates round trips, and time synchronization relies on them. That is, of course, just a TL;DR – if you’d like to read a paper on the issue, Space Network Time Distribution and Synchronization Protocol Development for Mars Proximity Link¹ has got you covered².

The problems with using Vessel’s DAG as a logical clock are twofold:

Exactly as Merkle CRDTs, Vessel’s DAG provide logical ordering. This is not identical to clock-based ordering, and cannot take into account whether ordering operations (A, B) or (B, A) has the same effect.
Additionally, Vessel lumps multiple CRDT operations into a single extent. This saves overhead, but means that the absolute order of events created in parallel on two distinct nodes cannot really be determined. Either all of the first node’s operations come first, or all of the second node’s.

The ordering is still deterministic and will be consistent across all nodes. But perhaps we can find something that better reflects reality?

#Generations

A side-effect of Vessel’s DAG is that it’s actually easy to determine which extents belong to the same logical generation.

Figure: Vessel’s DAG

This graphic is taken from the Vessel specification. It illustrates that the origin extent A is the sole member of the first generation. It’s descendants prefixed with B are the second generation. The third consists of extents prefixed with C, D and E – while the last generation has prefixes F and G.

A single node will not create more than one sibling in a generation. That is, B1, B2 and B3 are all created in parallel by three different nodes. The implication of this is that all three nodes must have synchronized their ancestor A.

Vessel contains a simple algorithm for choosing the ideal ancestor extent. This is designed to, ideally, avoid situations as in the above graphic. Given complete synchronization, each node should pick the same extent as the ideal ancestor.

The details of that are in the specs, but the important part is: this example implies that the node that created extent G cannot have known extent D at that time. Otherwise it, too, would have made D the ancestor.

From this, we can derive the following:

Statement: A generation’s degree of synchronization is inversely proportional to the number of ancestors in the generation. A perfectly synchronized generation shares a single ancestor.

What if we could leverage this property?

Specifically, what if we could achieve absolute event ordering in a perfectly synchronized resource (i.e. one in which all generations are also perfectly synchronized)?

From the point of view of achieving eventual consistency, this seems like a perfectly adequate definition! Eventual consistency assumes that things may not be consistent if synchronization is not perfect, but can be made consistent when it is.

So how do we go about that?

#Good Enough™ Timekeeping

I’m sorry, but we’re going to have to talk about clocks again in a while. More than one might like.

But let’s start lightly for a moment and let’s mash up that idea of a perfectly synchronized generation with something mentioned in the last post: monotonic clocks.

See, if all extents in a perfectly synchronized generation are siblings of each other, that means their shared ancestor effectively represents a synchronization point in the Vessel DAG’s logical clock.

So what if we recorded events with monotonic clock offsets in relation to that synchronization point?

It doesn’t really matter which node records events at what wall time. What matters for absolute event ordering is that all nodes agree whether an offset A is larger or smaller than an offset B from some shared point.

That in turn is guaranteed by the definition of the fundamental time keeping unit, the second. The definition is based on the Caesium standard guarantees that two observers, each observing a clock in front of them, will observe the same time interval between one second and the next.³

So basing event ordering on the monotonic clock offset of a shared synchronization point is absolutely valid.

So to summarize:

We’d like to leverage the fact that in a perfectly synchronized resource, the Vessel DAG’s generations imply that a shared ancestor represents a logical time point.
Having a logical – and shared – time point, it allows us to base finer-grained times off that frame of reference.
Monotonic clocks allow us to accurately determine elapsed time from some initial measurement, so if we record events with monotonic clock offsets from such a logical time point, we arrive at a synchronized logical time stamp for each event.
Finally, due to Vessel’s DAG-based approach, at any degree of synchronization, the order of events will be the same across all nodes, but only perfect synchronization will yield a perfect order.

That’s the general approach. Sounds good? It does, doesn’t it?

The devil, however, is in the details. So while that is the approach we’re taking, unfortunately we’ll need to discuss the nitty-gritty a bit…

… in the next post.

Simon Woo, Jay Gao and David Mills. 2010. Space Network Time Distribution and Synchronization Protocol Development for Mars Proximity Link. SpaceOps 2010 Conference.
https://doi.org/10.2514/6.2010-2360
See also: References ↩︎
Space link protocols include timestamps, and the protocol leverages those for a clock skew adjustment algorithm based on NTP. ↩︎
Time dilation exists! But it describes the effect of one observer observing two clocks; one stationary/in front of them, and the other moving. The moving clock will appear to be ticking at a different speed from the stationary one. ↩︎