On (De-)Centralized Communications: Part 3

#Recap

Part 1 focused on Baran’s terminology on this topic. Part 2 then briefly explored how graph theory treats network centralization, noting that multiple definitions exist, some of which align better with Baran’s terminology than others. Each definition of centrality has its own rationale; for that reason, analysis frameworks often include several of them.

For the most part, “betweenness” seems to be a measure that matches Baran’s thoughts rather well. In short, it measures how often a node sits on the shortest path between other nodes, thus becoming a likely centralization point the more often that is the case.

A distinguishing factor in centrality functions is that some are better at describing local effects, others global impact.

Figure: Networks with end, edge and core nodes

If we consider the fully coloured diagram from the first part again, this becomes apparent when one looks at the purple nodes. Just the three purple nodes in the upper left corner already show some interesting characteristics.

On the one hand, their betweenness is relatively low globally: they can only affect paths that involve the orange endpoints they are directly connected to, and paths between the vast majority of other nodes bypass them entirely.

However, they have crucial betweenness locally, precisely because they are edge nodes that connect some end nodes to the rest of the network. In other words, betweenness centrality makes for a good measure both locally and globally, and a full network analysis should likely include both points of view – similar to how we did intuitively in the first part.
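This local-versus-global distinction can be made concrete with a brute-force betweenness computation. The sketch below uses only the Python standard library; the toy topology – two end nodes behind a single edge node, attached to a small meshed core – is invented for illustration and is not the diagram from part 1.

```python
from collections import deque
from itertools import combinations

def all_shortest_paths(graph, s, t):
    """Enumerate all shortest paths from s to t: a BFS records every
    predecessor on a shortest path, then we unwind them recursively."""
    dist, preds, q = {s: 0}, {s: []}, deque([s])
    while q:
        u = q.popleft()
        for v in graph[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                preds[v] = [u]
                q.append(v)
            elif dist[v] == dist[u] + 1:
                preds[v].append(u)

    def unwind(v):
        if v == s:
            return [[s]]
        return [p + [v] for u in preds[v] for p in unwind(u)]

    return unwind(t) if t in dist else []

def betweenness(graph):
    """Unnormalized betweenness: for each node, the (fractional) count
    of shortest paths between other nodes that pass through it."""
    score = {v: 0.0 for v in graph}
    for s, t in combinations(graph, 2):
        paths = all_shortest_paths(graph, s, t)
        for path in paths:
            for v in path[1:-1]:           # endpoints don't count
                score[v] += 1 / len(paths)
    return score

# Hypothetical toy topology: two end nodes behind one edge node.
graph = {
    "end1": ["edge"], "end2": ["edge"],
    "edge": ["end1", "end2", "coreA"],
    "coreA": ["edge", "coreB", "coreC"],
    "coreB": ["coreA", "coreC"],
    "coreC": ["coreA", "coreB"],
}
print(betweenness(graph))
# edge scores 7.0, coreA 6.0, everything else 0.0
```

The edge node ends up with the highest score even in this global measure, because it carries every path to or from its end nodes – exactly the "crucial local betweenness" described above.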

However, this part of the mini-series isn’t really focused on centrality itself – it rather focuses on a different dimension to the discussion about centrality that we didn’t cover so far.

It may be crucial to understanding why people so often describe the same network so differently, though, so let’s dive in.

#Layer Cake

When we’re discussing the Internet, we very often discuss it in terms of layering. Most often, we refer to the ISO/OSI seven-layer reference model. A different if comparable model underlies the Internet Protocol, sometimes called the IP or Internet model.

In fact, when I made a series of videos describing the Interpeer Architecture, I made extensive use of this latter model in the hourglass shape it’s sometimes shown in, indicating that the narrow waist is pivotal to the definition of the architecture.

Figure: The Internet Model in hourglass shape

It’s worth noting that this hourglass-based description of the Internet is not without controversy. But it also isn’t without alternatives.

The book The Real Internet Architecture argues that each “layer” we think of in either of the above models is actually composed of multiple functions, and that such a group of functions together provides the functionality we expect of the layer.

The authors, Pamela Zave and Jennifer Rexford, further describe how most of those functions exist at most layers, albeit in different form. They then convincingly lay out two points that distinguish the “real” architecture from the ISO model, namely

  • that there are many more such groups of functions than the traditional seven in any real-world network, and…
  • … this is shown particularly well whenever we talk about e.g. “layer 2.5” or similar shims that simply shouldn’t exist according to the ISO model!

Part of this perspective is based on much earlier work: an alternative architecture called RINA (Recursive InterNetwork Architecture), which eschews formal layering altogether! Its development continues at the Pouzin Society. RINA not only defines functions such as the above, but also implements a single protocol that can be stacked ad infinitum – recursively, as it were. Whether this protocol is a “transport” or “link layer” protocol depends solely on where in the stack it sits!

Cool stuff! But how does this relate to the topic of this post?

In the video series I mentioned above, I picked up on one characteristic of those function groups that ties several functions together. It’s not often explicitly discussed, but maybe that is because it is rather obvious: each layer uses different addresses.

| Layer | Addressed Target | Address |
|-------|------------------|---------|
| 7     | Resource         | URL     |
| 4     | Socket           | IP:Port |
| 3     | Interface        | IP      |
| 2     | Hardware Port    | MAC     |

Short of repeating the entire book, this goes a long way towards illustrating what kinds of functions are going to exist in various groups: if different layers use different addresses, we’ll need address resolution as well as protocols that disseminate the necessary information, i.e. routing.
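Walking one request down this table makes the resolution chain visible. The sketch below is a deliberately naive model: the lookup tables stand in for DNS and an ARP cache, every name and address is invented (the IP comes from the 203.0.113.0/24 documentation range), and the URL parsing is crude.

```python
# Hypothetical lookup tables standing in for DNS and the ARP cache.
dns = {"example.org": "203.0.113.7"}               # layer 7 -> 3: name to IP
arp_cache = {"203.0.113.7": "02:00:5e:10:00:01"}   # layer 3 -> 2: IP to MAC

def resolve(url):
    """Walk one address down the stack: URL -> IP:Port -> IP -> MAC."""
    host = url.split("/")[2]          # crude URL parsing, fine for the sketch
    ip = dns[host]                    # name resolution (layer 7 -> 3)
    socket_addr = (ip, 443)           # transport endpoint (layer 4)
    mac = arp_cache[ip]               # ARP-style resolution (layer 3 -> 2)
    return {"resource": url, "socket": socket_addr,
            "interface": ip, "hw_port": mac}

print(resolve("https://example.org/posts/3"))
```

Each step corresponds to one row of the table above, and each needs a protocol to keep its lookup table populated – which is precisely where the dissemination functions come in.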

For the purposes of this post, however, we’re not so much concerned with addresses – rather we care about what is being addressed.

#Address Targets and Networks

The second layer in the OSI diagram is the so-called “data link layer”1. It typically exists in two forms, at least in the networks you are most likely to encounter: one is Ethernet, and the other is Ethernet… but wireless.

It’s safe to ignore the wireless part here. That is because wireless Ethernet is modelled so closely on its wired counterpart that for our purposes it behaves in the same way. We can pretend the data link layer is wired Ethernet2.

Wired Ethernet has had a long history – the earliest incarnations were, more or less, buses. That is, devices were connected along shared cables, and when it was one device’s turn, it’d effectively broadcast its packet on the cable. Every other device could read the information so sent.

Back when I was a kid, we used coaxial cables to connect the devices. At each end, a terminator was necessary. If either terminator or any of the devices were faulty or not connected properly, communication along the entire network was interrupted. We did this because the twisted pair cabling that is now standard was still too expensive and reserved for “professional” use.

The point is that in principle the “bus” analogy is still correct. If you stick your RJ45 plug into an Ethernet “hub”, it’ll still cheerily broadcast any signal to all of its connected devices. A more intelligent “switch” should only send signals along cables where it thinks a device with the target MAC address is connected. If your switch does IP layer forwarding, all bets are off.
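The hub-versus-switch difference fits in a few lines of code. This is a simplified model for illustration only – no VLANs, no table aging, and an invented frame representation – not real switching logic.

```python
class LearningSwitch:
    """A switch learns which MAC address lives behind which port, and
    only floods like a hub while the target is still unknown."""
    def __init__(self, ports):
        self.ports = ports
        self.mac_table = {}                 # MAC address -> port

    def forward(self, in_port, src_mac, dst_mac):
        self.mac_table[src_mac] = in_port   # learn the sender's port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]          # unicast: star pattern
        return [p for p in self.ports if p != in_port]  # flood: bus behaviour

sw = LearningSwitch(ports=[1, 2, 3, 4])
print(sw.forward(1, "aa:aa", "bb:bb"))   # target unknown: flood -> [2, 3, 4]
print(sw.forward(2, "bb:bb", "aa:aa"))   # sender was learned: unicast -> [1]
```

A hub is simply the flood branch with no MAC table at all – which is why, logically, it is still a bus no matter how the cables are laid out.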

We didn’t discuss this in the first part, but in terms of network layout, a “bus”-type system gives every node high centrality: all of them have to function in order for any node to be able to communicate. This isn’t much different when you physically cable devices in a star pattern to a “hub”. Only a switch creates a real star pattern3.

In the first post, we identified the star as the most centralized network – but perhaps the bus should have that dubious honour?

#Internet Layer

The next layer, at which the IP protocol sits, is supposed to provide the inter-networking functionality. By giving each device (network card) a globally unique Internet Protocol address, we can build a resilient, fully distributed network by simply sticking more Ethernet-carrying cables into a node.

Those additional cables might belong to entirely different Ethernet-style buses. The node’s role in each bus may be centralized or not – but connecting multiple of them likely makes it centralized from the IP routing perspective!

Note that in terms of layering and recursiveness, this not only requires a translation from MAC to IP addresses and vice versa – it also requires protocols (ARP, NDP, …) to disseminate the information necessary to perform this resolution.

At any rate, this is the layer most people have in mind when they declare that the Internet is highly decentralized (or distributed). But again, locality matters! A faulty switch at the edge can cut off several end nodes, making it highly central to their specific needs.

#Transport Layer

The transport layer sits on top of IP, and one might imagine that it doesn’t actually add much to the equation – at least not in terms of network structure and resilience.

But keep in mind that the transport layer addresses include port numbers, and in order to have a port number, one also needs a running process. Arguably, then, the transport layer is concerned with connecting processes to each other.

It likely won’t do much for the resilience of a network overall, but it’s worth noting that “node failure” in this case can be something as simple as a computer just not running a particular process. Perhaps it wasn’t part of the startup scripts, or it keeps crashing – or the IP node was never meant to run it.
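This kind of transport-layer “node failure” is easy to observe: the IP node can be perfectly reachable while nobody owns the port. A throwaway sketch using the loopback interface, assuming nothing else grabs the port in between:

```python
import socket

def port_reachable(host, port, timeout=0.5):
    """A transport-layer liveness probe: attempt a TCP connection.
    Failure here can mean the process simply isn't running, even
    though the IP node itself is up."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# A listening socket stands in for a running process that owns a port.
listener = socket.socket()
listener.bind(("127.0.0.1", 0))        # let the OS pick a free port
listener.listen(1)
port = listener.getsockname()[1]
print(port_reachable("127.0.0.1", port))   # True: a process owns the port

# Closing it makes the layer-4 "node" vanish, while the IP node
# (127.0.0.1) remains perfectly reachable.
listener.close()
print(port_reachable("127.0.0.1", port))   # False: connection refused
```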

Again, we see divergence in the centrality between two layers: what is true for IP may not be true for the transport and vice versa.

#Application Layer

At the application layer we address, well, application specific targets. In the case of HTTP those are resources, which are fully addressed via URLs. Resources, in turn, are “things” that are managed by processes.

The most interesting type of resource is not a file, however, but one that doesn’t actually exist, or at least not on the server that is receiving a request. That this is permitted is one of the great strengths of the REST architecture, and its defining principle: representation.

Whether your “resource” exists as a (series of) database entries somewhere else, or on another microservice that the client cannot reach directly, in either case the resource path as understood by the receiving server functions as a proxy of sorts for the real resource in the place where it is managed. That place, in turn, is often a completely different physical machine, itself connected to the proxying process via the lower layers of the stack.

In that sense, the proxy routes requests elsewhere, and has transformed from an end node to a node somewhere along the path – a middlebox of sorts.
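A routing table for such a proxy can be as simple as a prefix map. This is a minimal sketch of the idea, not any real proxy’s configuration format; the backend service names are invented.

```python
# Hypothetical application-layer "routing table": resource path
# prefixes mapped to the backends that actually manage the resources.
routes = {
    "/users/": "http://user-service.internal:8080",
    "/posts/": "http://post-service.internal:8080",
}

def route(path):
    """Pick the backend managing the requested resource. The front
    server doing this is no longer an end node: it sits on the path,
    a middlebox of sorts."""
    for prefix, backend in routes.items():
        if path.startswith(prefix):
            return backend + path
    return None  # no mapping: serve locally, or respond 404

print(route("/posts/2025/hello"))
```

Real-world equivalents of this table live in reverse proxy configurations, API gateways, and service meshes – which is exactly why the next paragraph’s liveness tracking matters.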

Quite a lot of effort has been expended on making this kind of resource routing responsive to changes in the network. HTTP proxy configurations usually track the liveness of their backends. We use database clusters to ensure that the resources themselves are reachable even if a single machine fails4. Automatically adding or removing containers to a cluster which itself may be distributed seems to be the culmination of all this.

All of this implies that the end node in an HTTP-based system is actually the abstract resource identified by a URL, not the server or servers responding to requests made to that URL.

HTTP, unfortunately, contains an asymmetry here: while resources can be viewed as end nodes (on the server side of things), the client side has no URL identifier. The identifier for a client-side end node therefore varies depending on how you look at it: it’s either the underlying IP:Port address, or perhaps some authentication identity.

Despite the asymmetry in addressing and the lack of an “HTTP routing” standard, it’s important to emphasize that at this layer of the stack, processes take on the responsibility of routers, and so can no longer really be seen as end nodes. At minimum, they move to the edge, if not closer to the network core.

#Layer Summary

What this excursion into layering should have demonstrated is that a network’s structure – its degree of centralization – depends entirely on the layer you are considering at the moment, and therefore also on what is being addressed at each layer.

In the HTTP example above, it is very difficult to describe how centralized a specific HTTP server is. Depending on how it is set up and which other servers exist, it may be on the only path to any particular resource, or merely one of many.

#The Human Layer

There is another layer to the cake, and in terms of the human-centric nature of this project’s R&D efforts, it’s arguably the most important. It also happens to be the layer at which the fediverse addresses things5.

Well, mostly. A fediverse address does not, strictly speaking, have to identify a human being. It could be a group, or a bot, and such a bot could be arbitrarily complex and route Activities to services hidden behind it.

But in the social networking use case, it is at least somewhat assumed that fediverse actors are actually human. This matters for two reasons:

  • First, the social networking use case is the most prominent one, so it tends to flavour all fediverse discussions.
  • Some of the centralization discussions mentioned in the first part happened on and about the fediverse.

Between the two, it makes sense to at least conceptually treat the “human layer” as a layer that facilitates human-to-human communications, i.e. its addresses can often be assumed to identify real-life people. In this, at least, the fediverse model is roughly equivalent to the human-centric networking model.

In discussing the fediverse as a human-to-human network, it quickly becomes apparent that federation is nothing more than an alternative term for the routing function that we’ve stumbled across on multiple layers in this post.

How it functions is not as important as what it does: it transmits the fundamental unit of communication (activities) to and from the fundamental communication endpoints (humans) which are addressed with this layer’s “native” address, i.e. fediverse account identifiers.

#Fediverse Centralization

When I enter any discussion on fediverse centralization and claim it is extremely centralized, this is all context that I carry.

Now that we’ve examined the terminology, the maths, and the layers, the argument should be nearly self-explanatory: in terms of betweenness, an instance is an indispensable, single-homed edge node for the account (end node). Its betweenness is paramount to every account served by the instance.

The fediverse also has a secondary characteristic that amplifies this: its routing is extremely shallow. While in principle all of its instances are connected to each other (provided lower-layer connectivity is given), at this layer that connectivity does not matter very much.

That is because these instances do not fulfil one of the network’s main functions: forwarding. No (regular) instance A will forward activities from an account on instance B to an account on instance C. Put differently, there is only ever a single path from one end node to another, and it always leads to one or two edge nodes: one if the accounts exist on the same instance, and two otherwise.
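The no-forwarding rule makes path computation trivial. The sketch below is a hypothetical model of activity delivery, not how any ActivityPub implementation is written; the account names are invented.

```python
def fediverse_path(sender, recipient):
    """Hypothetical path model: an activity travels from the sender,
    via the sender's home instance and (if different) the recipient's
    instance, to the recipient. There are no alternative routes."""
    s_instance = sender.split("@")[1]
    r_instance = recipient.split("@")[1]
    if s_instance == r_instance:
        return [sender, s_instance, recipient]            # one edge node
    return [sender, s_instance, r_instance, recipient]    # two edge nodes

print(fediverse_path("alice@a.example", "bob@b.example"))
# -> ['alice@a.example', 'a.example', 'b.example', 'bob@b.example']
print(fediverse_path("alice@a.example", "carol@a.example"))
# -> ['alice@a.example', 'a.example', 'carol@a.example']
```

Three or four nodes per path, always exactly one path: that is the whole routing table.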

In terms of betweenness centrality, every instance has maximum betweenness for every account it serves. From the point of view of any connecting instance, its betweenness is directly proportional to the number of accounts on it.

To make matters worse, its betweenness is also proportional to the number of accounts each account is connected to. Here, our mathematical approach in the previous post was a little too simple6, but we can fall back on our intuition to provide an easy-to-grasp model:

Assume an instance has N accounts. Across all N accounts there exist connections to M other accounts (uni- or bidirectional, we don’t care). Then each of the N accounts has, on average, M/N connections.

To get the sum of all paths from all N to all M, we then multiply the nodes N by their average connectedness, which yields N * M / N, or simply M.
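The arithmetic is easy to check numerically. The two instance profiles below are invented: one large instance of evenly connected accounts, and one much smaller instance hosting a single hypothetical outlier account.

```python
def paths_through(connections):
    """Sum of end-to-end paths crossing an instance, per the model
    above: N accounts averaging M/N connections each gives
    N * (M/N) = M paths in total."""
    n = len(connections)
    m = sum(connections)
    return n * (m / n)

big_instance = [5] * 5_000               # many accounts, few connections each
small_with_star = [5] * 999 + [50_000]   # one very popular outlier account

print(paths_through(big_instance))       # 25000.0
print(paths_through(small_with_star))    # the smaller instance scores higher
```

With these made-up numbers, the instance one-fifth the size ends up with more than twice the betweenness, purely because of the single outlier – the total connection count M is what matters, not the account count N.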

A little counter-intuitively, this implies that large instances aren’t as big a problem with regard to centralization as instances with popular accounts7. But that is a slightly twisted perspective, as a large M is far more likely with a large N. Still, an outlier account with a huge number of connections will significantly increase the betweenness centrality of an instance.

The upshot is that from the perspective of human-to-human end node connections via instance edge nodes, the fediverse provides the same kind of bus-type abstraction as early wired Ethernet: each of the three to four nodes on the path has maximum centrality for that path, because there are no alternatives.

#Summary

When we discuss (de-)centralization on the Internet, it is well worth understanding that Baran’s view on the terminology came from an understanding of network resilience. In the mathematical field of graph theory, many different approaches to calculating centralization exist, but betweenness centrality8 is a fairly close match to Baran’s arguments: it effectively measures the disruptive effect on end-to-end communications that a node’s failure might have.

Most of our discourse on the topic, however, seems to fail because we are prone to mixing layers. We rarely emphasise what is being addressed when we examine a network, and instead assume it has something to do with the machine protocols (IP, HTTP).

A core argument of the Interpeer Project is that we should make human layer communications as distributed as possible. This is to minimize disruption at that level of abstraction – and it turns out, it is actually easiest to arrive at if the lower layers share characteristics of this abstraction. That may be the topic of another (set of) posts, though.


  1. Note how that layer is missing in the Internet model, where instead the “media access” and “media format” layers exist. So much confusion! ↩︎

  2. While LTE isn’t strictly speaking considered a “data link layer” protocol – and consists of multiple layers, thus proving the recursiveness of real networks – from the point of view of your mobile device’s typical apps, they both sit at the same “layer”. We’ll ignore it here, except to note that in practice, a few things discussed in this section wouldn’t apply here. ↩︎

  3. So the physical layer of cables and the logical layer of signalling already have diverging centrality! ↩︎

  4. What we do not have is a properly standardized “HTTP routing” protocol. All of this effort is dependent on system administration choices. ↩︎

  5. I am well aware that I am mixing “the fediverse” and “ActivityPub” as one of several base protocols for the fediverse here, and additionally ignore that in some sense, fedi application protocols are defined by the specific activities that are in use. ↩︎

  6. Given account a@b, then for a@b the betweenness centrality of b is the number of accounts following a@b plus those followed by a@b. An argument could be made to de-duplicate this for mutual followings. Summed up across all possible a yields C(b). Normalize in some fashion. ↩︎

  7. Note that for resilience it makes a difference whether the accounts are both local or one is remote, but for the centrality function, it does not! ↩︎

  8. Though it is worth pointing out that the measure is not ideal, either. Resilience also exists if a longer path than through a failing node is possible. It’s good as a statistical representation of the impact its failure would have. ↩︎


Published on January 13, 2025