$\begingroup$

[Image: scatter plot of technology popularity in vue projects versus react projects]

This plot represents the popularity of technologies in two "tools", vue and react. The top-left corner holds technologies that are specific to vue but not to react, the top-right corner holds technologies that are popular in both tools, and so on.

For example, vue-loader (top left) is in 51% of projects with vue but in only 0.08% of projects with react.

Is there any specific name for this kind of data representation?

I'm asking because I want to find guides explaining how to read this kind of plot.


EDIT

This plot is quite similar to Correspondence Analysis, but the axes are made of two variables from my data:

            technology          vue        react

                 acorn 0.0006145023 0.0003786922
           animate.css 0.0143383859 0.0191870740
 apache-server-configs 0.0006145023 0.0001262307
              archiver 0.0012290045 0.0005049230
                assert 0.0020483408 0.0012623075
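
As a rough illustration of the "corners" reading, here is a minimal Python sketch that uses only the five rows shown above. It assumes the layout implied by the vue-loader example (react share on the horizontal axis, vue share on the vertical axis), so a point above the identity line leans towards vue. The helper name `side_of_diagonal` is hypothetical and not part of any library.

```python
# Rows copied from the table above:
# (technology, share of vue projects, share of react projects)
rows = [
    ("acorn", 0.0006145023, 0.0003786922),
    ("animate.css", 0.0143383859, 0.0191870740),
    ("apache-server-configs", 0.0006145023, 0.0001262307),
    ("archiver", 0.0012290045, 0.0005049230),
    ("assert", 0.0020483408, 0.0012623075),
]


def side_of_diagonal(vue, react):
    """Hypothetical helper: report which side of the line vue == react a point is on."""
    if vue > react:
        return "more vue-specific"
    if react > vue:
        return "more react-specific"
    return "on the diagonal"


labels = {name: side_of_diagonal(vue, react) for name, vue, react in rows}

for name, label in labels.items():
    print(f"{name:>22}: {label}")
```

This only captures which side of the diagonal a point is on; how far along the diagonal it sits (overall popularity) still has to be read from the plot.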

EDIT 2

This kind of plot can be explained by converting a slope plot

[Image: slope plot]

to a scatter plot

[Image: scatter plot]


Edit after final answer

Since I am unlikely to find any ready-made guide on how to read this kind of plot, I made my own explanation:

[Image: annotated explanation of how to read the plot]

$\endgroup$
  • $\begingroup$ What specific features, in your view, distinguish this scatter plot from other scatter plots? $\endgroup$ Commented Feb 22, 2018 at 11:32
  • $\begingroup$ @RubenvanBergen The square shape, the characteristic diagonal (the identity function), and the idea about corners that I was trying to describe. Also the fact that data on either side of the diagonal is more specific to either vue or react in my example. $\endgroup$
    – Everettss
    Commented Feb 22, 2018 at 11:38
  • $\begingroup$ @RubenvanBergen And each point is not from one paired observation. Could this suggest that it is not a scatter plot? stats.stackexchange.com/a/187773/196312 $\endgroup$
    – Everettss
    Commented Feb 22, 2018 at 13:07
  • $\begingroup$ I would say these are paired observations, because each dot represents two measurements for the same technology: its popularity in vue and its popularity in react. Unless I've misunderstood something? $\endgroup$ Commented Feb 22, 2018 at 14:40
  • $\begingroup$ These are paired in that each dot represents two values for the same technology, as @RubenvanBergen notes. Those 2 pieces of information are estimated separately, & in this case you can say something about their uncertainty, whereas in a more typical case (w/ 1 measure on each) you couldn't. If you wanted, you could put little horizontal & vertical error bars around the observed %ages based on the possibly differing $M$'s. You may not want to, b/c that would make the plot even busier, but you could. At any rate, these data are paired. $\endgroup$ Commented Feb 23, 2018 at 21:23
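
The error bars mentioned in the last comment could be sketched roughly as follows. This assumes that $M$ denotes the number of projects sampled for each tool and that a simple normal-approximation interval for a proportion is acceptable. The project counts `m_vue` and `m_react` are invented for illustration, since the thread does not give them. Only the 51% and 0.08% figures come from the question.

```python
import math


def proportion_standard_error(p, m):
    """Standard error of an observed proportion p from m projects (binomial model)."""
    return math.sqrt(p * (1.0 - p) / m)


# ASSUMPTION: these project counts are made up; the thread does not state them.
m_vue = 1000
m_react = 5000

# vue-loader shares quoted in the question: 51% of vue projects, 0.08% of react projects.
p_vue = 0.51
p_react = 0.0008

# Half-widths of approximate 95% intervals, usable as error-bar lengths.
vertical_half_width = 1.96 * proportion_standard_error(p_vue, m_vue)
horizontal_half_width = 1.96 * proportion_standard_error(p_react, m_react)

print(f"vue share:   {p_vue:.4f} +/- {vertical_half_width:.4f}")
print(f"react share: {p_react:.4f} +/- {horizontal_half_width:.4f}")
```

The normal approximation is crude for very small shares such as 0.08%, so an interval designed for small proportions (for example a Wilson interval) would likely be a better choice there.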

1 Answer

$\begingroup$

This is ultimately just a scatterplot. I don't think there is a special name for it. I don't see this as meaningfully related to correspondence analysis except in that you can make a scatterplot of the results from a correspondence analysis, and this is also a scatterplot. Notably, this does not have much to do with a biplot.

I do see a couple features here that are somewhat different from the way scatterplots are most commonly used. First, in scientific contexts, scatterplots are typically used to help us think about relationships between two variables. In your case, you seem to be more interested in thinking about the individual observations and how they rank (in a multidimensional sense) relative to each other and the joint distribution. Hence, you have your points prominently labeled. That is, given a default data matrix with variables in columns and observations in rows, usually a scatterplot is made to help you think about the columns; here you seem to be making the plot to help you think about the rows.

Second, there is a subtle distinction in statistics between correlation and agreement (cf., Does Spearman's $r=0.38$ indicate agreement?). Most commonly people look at scatterplots from a correlation-ish frame of mind; your scatterplot seems to connote an agreement-ish perspective. For example, your two variables are naturally on the same scale and you have a prominent diagonal line marking perfect agreement. Your "slope plot", which you think of as analogous, is also presenting a kind of agreement-ish information (i.e., the stability of the rankings over time, coupled with the consistency of the increase).

Under the assumption that this is what you hope to glean from your visualization, the current plot is not optimal. It is harder by nature to compare positions to a diagonal—the human visual system just isn't designed for that. Instead, it would be better to 'turn' the plot so that points can be compared to a horizontal line (and also possibly their right to left horizontal position). That is what Nolan did in another plot you find similar. This line of reasoning leads us to a different plot, which, while also a type of scatterplot, does have a special name, viz., a Bland-Altman plot (also called 'Tukey's mean-difference plot'; see Wikipedia and Creating and interpreting Bland-Altman plot). You can still label your points in a BA plot. In that version of your plot, the vertical position tells you if its use is dominated by vue (above the midline) or react (below). The left to right position will provide a sense of how commonly that technology is used overall. Furthermore, if you wondered about the explicit agreement attributes of vue and react as variables (e.g., is there a bias towards one or the other), those could be incorporated naturally.
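
To make the Bland-Altman idea concrete, here is a minimal Python sketch that computes the plot coordinates for the five rows listed in the question. It takes the difference as vue minus react, so that points above the midline lean towards vue, as described above. The bias and the conventional limits of agreement (bias ± 1.96 standard deviations of the differences) are included only as an illustration. With five rows they mean very little, and the variable names are my own.

```python
from statistics import mean, stdev

# Rows copied from the question:
# (technology, share of vue projects, share of react projects)
rows = [
    ("acorn", 0.0006145023, 0.0003786922),
    ("animate.css", 0.0143383859, 0.0191870740),
    ("apache-server-configs", 0.0006145023, 0.0001262307),
    ("archiver", 0.0012290045, 0.0005049230),
    ("assert", 0.0020483408, 0.0012623075),
]

# Bland-Altman coordinates per technology:
#   horizontal = mean of the two shares (overall prevalence)
#   vertical   = difference vue - react (positive means it leans towards vue)
ba_points = {
    name: ((vue + react) / 2.0, vue - react)
    for name, vue, react in rows
}

differences = [diff for _, diff in ba_points.values()]
bias = mean(differences)      # average difference between vue and react
spread = stdev(differences)   # sample standard deviation of the differences
limits_of_agreement = (bias - 1.96 * spread, bias + 1.96 * spread)

for name, (avg, diff) in ba_points.items():
    side = "vue" if diff > 0 else "react"
    print(f"{name:>22}: mean={avg:.6f} diff={diff:+.6f} (leans {side})")
print(f"bias={bias:+.6f}")
print(f"limits of agreement=({limits_of_agreement[0]:+.6f}, {limits_of_agreement[1]:+.6f})")
```

Plotting `ba_points` with any plotting library, with a horizontal line at zero (and optionally at the bias), gives the rotated view of the original scatterplot.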

$\endgroup$
  • $\begingroup$ The Bland-Altman plot looks promising! I will definitely try it. I'm just wondering whether Bland-Altman or my way of presentation will be easier to explain to a "non-statistics-oriented audience". My final presentation is not focused on a detailed analysis, just on getting a grasp of trends. By the way, the quality of your answer surpasses everything I've experienced on the stackexchange network. $\endgroup$
    – Everettss
    Commented Feb 23, 2018 at 21:57
  • $\begingroup$ You're welcome, @Everettss. I'm not sure that it will be easier or harder. I suppose it might be a little more involved than a standard scatterplot, if people are completely statistically naive. It will depend on your audience. Note that if vue or react occurs more often than the other, you could do a weighted average on the horizontal axis to better measure overall prevalence of the technology. $\endgroup$ Commented Feb 23, 2018 at 22:04
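
The weighted average suggested in the last comment might look like the sketch below. It assumes the weights are the numbers of vue and react projects in the data. Those counts (`n_vue`, `n_react`) are invented here, because the thread does not state them, and the function name `weighted_prevalence` is hypothetical.

```python
def weighted_prevalence(p_vue, p_react, n_vue, n_react):
    """Overall share of projects using a technology, weighting each tool by its project count."""
    return (n_vue * p_vue + n_react * p_react) / (n_vue + n_react)


# ASSUMPTION: these project counts are made up; the thread does not state them.
n_vue = 1000
n_react = 5000

# animate.css shares from the table in the question.
p_vue = 0.0143383859
p_react = 0.0191870740

overall = weighted_prevalence(p_vue, p_react, n_vue, n_react)
unweighted = (p_vue + p_react) / 2.0

print(f"unweighted mean: {unweighted:.6f}")
print(f"weighted mean:   {overall:.6f}")
```

With more react projects than vue projects, the weighted value is pulled towards the react share, which is the effect the comment describes for the horizontal axis.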
