$\begingroup$

The friendship paradox is the somewhat well-known statement that "statistically speaking, your friends have more friends than you do".

To my mind, which is surely ignorant of any complexities of social sciences, it seems that this should translate into the following statement:

Friendship Paradox Theorem I. Let $G = (V,E)$ be an undirected graph. Then the average degree of a vertex sampled uniformly at random from the neighbourhood of a vertex sampled uniformly at random from $V$ is at least as large as the average degree of a vertex sampled uniformly at random from $V$, i.e., $$ \frac{1}{|V|} \sum_{v \in V} \frac{1}{\deg(v)} \sum_{u : uv \in E} \deg(u) \geq \frac{1}{|V|} \sum_{v \in V} \deg(v).\tag{1} $$

Hence, I was somewhat surprised to see that Wikipedia justifies the friendship paradox by a different inequality.

Friendship Paradox Theorem II. Let $G = (V,E)$ be an undirected graph. Then the average degree of a vertex sampled by choosing a random endpoint of an edge sampled uniformly at random is at least as large as the average degree of a vertex sampled uniformly at random from $V$, i.e., $$ \frac{1}{2|E|} \sum_{v \in V} \deg(v)^2 \geq \frac{1}{|V|} \sum_{v \in V} \deg(v). \tag{2} $$

Now, both inequalities are true, and the friendship paradox is an empirical observation, so there is not much of a problem. However, I would be grateful if someone could explain to me the intuitive appeal of (2) as a justification of said observation (right now, it seems to me that it is just obtained by choosing the distribution on $V$ so as to make computations easier). Of course, it could be the case that no such justification exists, in which case I would be grateful for references (and moral support) to edit the relevant Wikipedia page.
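For completeness, here is a brief sketch of why both inequalities hold. It assumes that $G$ is simple and has no isolated vertices, so that every $\deg(v) > 0$ and the left-hand side of (1) is defined. Both arguments use the handshake identity $\sum_{v \in V} \deg(v) = 2|E|$.

$$
\begin{aligned}
\text{(2):}\quad & \Big(\sum_{v \in V} \deg(v)\Big)^2 \le |V| \sum_{v \in V} \deg(v)^2 \quad \text{(Cauchy–Schwarz)}, \quad\text{so}\quad \frac{1}{2|E|}\sum_{v \in V} \deg(v)^2 \ge \frac{1}{|V|}\sum_{v \in V} \deg(v). \\
\text{(1):}\quad & \sum_{v \in V} \frac{1}{\deg(v)} \sum_{u : uv \in E} \deg(u) = \sum_{uv \in E} \left( \frac{\deg(u)}{\deg(v)} + \frac{\deg(v)}{\deg(u)} \right) \ge \sum_{uv \in E} 2 = 2|E| = \sum_{v \in V} \deg(v).
\end{aligned}
$$

The second line uses $x + 1/x \ge 2$ for $x > 0$; dividing both sides by $|V|$ gives (1).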

$\endgroup$

1 Answer

$\begingroup$

Wikipedia is in error here.

The formula quoted on Wikipedia is the one that Feld's paper gives for computing the quantity "mean number of friends of friends", which is defined in a footnote as "the total number of friends' friends divided by the number of friends".

Construing this quantity as an actual mean requires taking the uniform distribution over edges, which can be construed as choosing uniformly at random among all ties of friendship. This description on Wikipedia seems to be original; I didn't notice it when skimming through Feld's paper (although it is the natural description you'd arrive at if you wanted to treat the quantity as a random variable).

Earlier, the paper talks about, for each of the girls in the study, the "mean number of friends her friends have". In Table 1, an explicit formula is given for this quantity:

Mean Number of Friends of Her Friends ($\Sigma x_j / x_i$)

and then Table 1 also lists the sum and average of this quantity over all of the girls. That is, this table really is computing the quantity

$$ \frac{1}{|V|} \sum_v \frac{1}{\deg(v)} \sum_{u : uv \in E} \deg(u)$$

exactly as you describe. This is something different from the "mean number of friends of friends", as acknowledged later in the paper; e.g.

First, it is important to recognize yet another distribution and another mean. Refer again to the situation in figure 1 as summarized in table 1. The eight Marketville girls have a total of 20 friendships, with a mean of 2.5. The friends have a total of 60 friends, with a mean of 3.0. At the same time, each girl has a mean among her friends, and the means for all the girls have a mean of 2.99. This last mean differs from the mean number of friends of friends (only slightly in this case) because the two-step averaging process weights each of the friends differently: each of the means of Sue's four friends are averaged, and that average counts equally with Betty's average based on her only friend. Thus, the particular arrangement of the friendships affects this last average.
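To make the distinction between the three means concrete, here is a minimal Python sketch that computes them for a small graph. The function name `friendship_means` and the star-graph example are illustrative assumptions, not Feld's data; the graph is assumed to be simple and given as an edge list, so it has no isolated vertices.

```python
from fractions import Fraction


def friendship_means(edges):
    """Return (mean degree, mean of friends of friends, mean of per-vertex means).

    `edges` is a list of pairs (u, v) describing a simple undirected graph.
    Only vertices that appear in some edge are counted, so every degree is > 0.
    """
    # Build adjacency sets; using sets ignores repeated edges.
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    n = len(adj)
    total_degree = sum(deg.values())  # equals 2|E| by the handshake identity

    # Right-hand side of (1) and (2): the plain mean degree.
    mean_degree = Fraction(total_degree, n)

    # Left-hand side of (2): total number of friends' friends
    # divided by the total number of friends.
    friends_of_friends = Fraction(sum(d * d for d in deg.values()), total_degree)

    # Left-hand side of (1): average each vertex's neighbours first,
    # then average those per-vertex means over all vertices.
    per_vertex = [Fraction(sum(deg[u] for u in adj[v]), deg[v]) for v in adj]
    mean_of_means = sum(per_vertex) / n

    return mean_degree, friends_of_friends, mean_of_means


# Hypothetical example: a star with centre 0 and three leaves.
star = [(0, 1), (0, 2), (0, 3)]
print(friendship_means(star))
```

For the star, the three values are 3/2, 2 and 5/2: both friend-based means exceed the mean degree, but they differ from each other because the two-step average in (1) gives each leaf's single friend the same weight as the centre's three friends.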

$\endgroup$
