Consider an urn full of balls of different colors. In the usual sampling-without-replacement thought experiment, we choose a ball with probability p, remove it from the urn, and recalculate the probabilities for the remaining balls. In a sampling-with-addition process, rather than removing the ball, we return it to the urn and add another ball of the same color. Over time, the colors that are chosen more often are rewarded, and this positive feedback loop leads to great heterogeneity. That’s basically it.

This heterogeneity is philosophically explained by sociologist Robert Merton [33] as the ‘Matthew Effect’, from Matthew 25:29, “For to all those who have, more will be given, and they will have an abundance; but from those who have nothing, even what they have will be taken away.” The Matthew Effect is one of the earliest formulations of the so-called rich-get-richer phenomenon, in which connective advantage accrues cumulatively. In its modern sense, the mechanism as a whole was perhaps most accurately described by Derek de Solla Price in a trio of papers covering (1) the Pareto distribution in the scientific literature community [34], (2) the power-law degree distribution of the scientific collaboration network [35], and (3) the mechanism of ‘cumulative advantage’, in which network growth is determined preferentially by prior connections [36]. After the idea had developed for over a century, it was the mechanism’s explicit application to degree-based network evolution that solidified it in complex systems circles.

This rich-get-richer phenomenon can be described by a thought experiment known as the Pólya urn scheme. Imagine an urn of colored balls. Start with 5 balls, each a different color. Pull out a ball at random and note its color, then return the ball to the urn and add another ball of the same color. Colors that have been selected more often are more likely to be selected again, because each selection is rewarded with an extra ball of that color.
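The urn scheme is easy to simulate. The following is a minimal sketch, not a definitive implementation: the function name `polya_urn`, the number of draws, and the fixed seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def polya_urn(n_colors=5, n_draws=10_000, seed=0):
    """Simulate the urn: draw a ball, return it, and add one more of its color."""
    rng = random.Random(seed)      # fixed seed (an assumption) for repeatable runs
    urn = list(range(n_colors))    # start with one ball of each color
    for _ in range(n_draws):
        color = rng.choice(urn)    # uniform over balls, so proportional to color counts
        urn.append(color)          # the drawn ball stays in; one extra ball is added
    return Counter(urn)


counts = polya_urn()
total = sum(counts.values())
for color, n in counts.most_common():
    print(f"color {color}: {n} balls ({n / total:.1%})")
```

Although every color starts with the same single ball, the final shares are usually far from equal, and changing the seed changes which colors end up dominant. The early draws matter most, because they tilt every later draw.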

In fact, this result reflects a specific phenomenon. A particular class of networks, known as scale-free networks, is the realization of this effect. Scale-free networks have a highly heterogeneous topology, with very few well-connected hubs and many poorly connected nodes. The degree distribution follows a linear trend on log-log axes and is described by a power law of the form P(k) = Ck^-L, where L is typically found in the range 2 < L < 3.
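To see why a power law looks linear on log-log axes, here is a small sketch that evaluates P(k) = Ck^-L directly. The values C = 1.0 and L = 2.5 are assumptions picked for illustration, not estimates from any dataset.

```python
import math

C, L = 1.0, 2.5  # illustrative values only; L is chosen inside the 2 < L < 3 range


def p(k):
    """Power-law degree distribution P(k) = C * k^-L."""
    return C * k ** (-L)


ks = [1, 10, 100, 1000]

# Slope between successive points on log-log axes: constant and equal to -L.
slopes = [
    (math.log10(p(b)) - math.log10(p(a))) / (math.log10(b) - math.log10(a))
    for a, b in zip(ks, ks[1:])
]

# Ratio P(10k) / P(k): the same at every k, which is the "scale-free" property.
ratios = [p(10 * k) / p(k) for k in ks]

print(slopes)
print(ratios)
```

Every slope equals -L, and multiplying k by ten always shrinks P(k) by the same factor, 10^-L, whether k is 1 or 1,000.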

And that may be okay for something like a crime rate. Why is the logarithm different? The main reason is that scale-free networks, as the name suggests, are invariant to changes in size across orders of magnitude. The ratio of heterogeneity stays the same at different sizes, even on logarithmic scales, so the same structured entropy appears consistently at every scale. Scale-free networks are quite common in physical, biological and social systems, but they are a special and unique discovery in their own right.

We find multiple faults with the engagement rate as a measure: for example, the lack of any attempt at benchmarking via regression, parametrized or not, and the failure to control for irrelevant audience sizes. The main issue, however, is the non-linearity of the distribution. So at a minimum, you need a logarithmic correction to your data. Is that it? We just take the log and we’re fine? In a way, yes. Or at least much better off.

But this makes normal statistics effectively arbitrary. Any time anyone has used an engagement rate, an engagement ratio, or even something as simple as an average, they have based it on the wrong distribution. The resulting insight is practically irrelevant, and potentially dangerous. Without appreciating the complexity of the topology, the non-linear feedback, and the power-law distribution, analytical insight is unreliable at best.

Something has never tasted right about the engagement rate. Most social analysts actually know that, but are unable to put their finger on it. How is it fair to compare a company with 10 million likes to one with 10,000? Is it really as simple as dividing?

So we ask the same question we did earlier: is it really as simple as just taking the logs and dividing? At a minimum, this puts your estimates in the realm of valid statistics, at least to some degree.

It works because we transform the data into the world of logs, do our mathematics there (logarithmic binning, regression analysis, averages, etc.) and then bring the result back into the real world. Suppose we ask for the average size of a 10,000 profile and a 10,000,000 profile: the sum is 10,010,000, so the arithmetic mean is 5,005,000, which resembles neither profile. If we transform the sizes into base-10 logarithms, we see that the difference is in fact merely 4 to 7. While it’s a simple transformation, it’s doing something incredibly deep: it’s turning our additive mathematics into multiplication.
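As a minimal sketch of the round trip through log space, the Python below averages the two profile sizes from the example both ways. The variable names are illustrative, and the use of base-10 logarithms is an assumption; any base gives the same result once transformed back.

```python
import math

followers = [10_000, 10_000_000]  # the two profile sizes from the example

# Ordinary (additive) average in the real world.
arithmetic_mean = sum(followers) / len(followers)   # 5,005,000

# Step into log space, average there, then step back out.
logs = [math.log10(x) for x in followers]           # about 4 and 7
mean_log = sum(logs) / len(logs)                    # about 5.5
log_space_mean = 10 ** mean_log                     # about 316,228

# Averaging logs is the same as taking a geometric (multiplicative) mean.
geometric_mean = math.sqrt(followers[0] * followers[1])

print(arithmetic_mean, log_space_mean, geometric_mean)
```

The log-space average is the geometric mean, roughly 316,000. It sits midway between the two profiles on a multiplicative scale, whereas the arithmetic mean is dominated by the larger profile.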

This multiplicative character of the logarithms arises because preferential attachment is a positive feedback loop.