18 events
when what by license comment
Nov 10, 2020 at 20:16 comment added Mario Carneiro @user3570982 As for scale invariance, that's not a requirement of the Benford distribution, it is a property of the Benford distribution. It doesn't really make sense to say that an RV is itself scale invariant, because a given RV has a mean and that mean changes if you scale the variable. But if it is Benford distributed then a scaled version of the variable will also be Benford distributed.
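The scale-invariance property described in the comment above can be checked numerically. What follows is a minimal sketch, not something posted in the thread: it assumes a log-uniform sample spanning exactly six decades (one common way to generate Benford-distributed leading digits) and an arbitrary scale factor of 3.28084 (metres to feet). The helper names `leading_digit` and `digit_frequencies` are made up for illustration.

```python
import math
import random


def leading_digit(x):
    """First significant decimal digit of a positive number."""
    # Scientific notation always starts with the leading digit.
    return int(f"{x:.15e}"[0])


def digit_frequencies(values):
    """Relative frequency of leading digits 1..9, as a list of 9 floats."""
    counts = [0] * 10
    for v in values:
        counts[leading_digit(v)] += 1
    return [counts[d] / len(values) for d in range(1, 10)]


# Benford's law: P(d) = log10(1 + 1/d) for d = 1..9.
benford = [math.log10(1 + 1 / d) for d in range(1, 10)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: log-uniform sample over six whole decades.
sample = [10 ** rng.uniform(0, 6) for _ in range(100_000)]

# Assumption: rescale by an arbitrary unit conversion (metres -> feet).
scaled = [3.28084 * v for v in sample]

max_dev_original = max(
    abs(f - b) for f, b in zip(digit_frequencies(sample), benford)
)
max_dev_scaled = max(
    abs(f - b) for f, b in zip(digit_frequencies(scaled), benford)
)

print("max deviation from Benford, original:", max_dev_original)
print("max deviation from Benford, rescaled:", max_dev_scaled)
```

Both deviations should be small and of similar size, which is the sense in which the distribution, rather than any individual variable, is scale invariant.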
Nov 10, 2020 at 20:10 comment added Mario Carneiro What Rick said. And moreover, it's not just whether voting in general has a tight distribution, but whether votes in Chicago districts are tightly distributed, where there is apparently a good reason for them to be so, namely that the districts are all about the same size (and also politics is not uniformly distributed in any sense). I would stand by the claim that multiple magnitudes are a necessary condition for a Benfordian distribution, or at least, I see no a priori reason to believe that a tightly distributed RV should be Benfordian.
Nov 9, 2020 at 21:20 comment added Eph @user3570982 Not spanning at least an order of magnitude means that it was restricted. Maybe that restriction is natural (the number of kids people have is not Benfordian, but the data certainly hasn't been tampered with). The number of lentils in a soup bowl has also not been tampered with; it's just a natural consequence of people filling the bowl to roughly the same place. If a set of numbers naturally has a tight distribution (6 sigma spanning less than an order of magnitude), it will not be Benfordian. Then the question remains of whether voting naturally has a tight distribution.
Nov 9, 2020 at 17:59 comment added user3570982 Look, I'm not saying that, because Benford's Law isn't always observable with Biden, that fraud took place, or even that Benford's Law properly applies to precinct totals. What I've pushed back on is the misapplication of rules that aren't always applicable. Benford's Law does not require scale invariance or multiple magnitude spans for datasets being analyzed. Insisting that those are requirements is just flat out wrong. I get that, for many, this is really a political argument, but doubling down on falsehoods is certainly outside of what I'd consider skeptical reasoning. Anyway, gotta work.
Nov 9, 2020 at 17:47 comment added user3570982 @Mario Carneiro I agree. If you constrain the lentil servings to be "same size-ish" then you shouldn't see a distribution consistent with Benford's Law... because you constrained the serving size. Any time you add constraints like that, you're necessarily skewing the dataset. If you open servings sizes to sizes that are outside of "same size-ish," Benford's Law should emerge. Similarly, if you constrain precinct vote tallies, you'll also see something other than Benford's Law... which is the point. Note, I'm not saying that deviating from Benford's Law proves a constraint (fraud).
Nov 9, 2020 at 17:29 comment added Mario Carneiro @user3570982 Your claim is within striking distance of one that is verifiably incorrect. If we specify Hagen's example a bit more to impose that "same-size-ish" means that it is normally distributed about a mean with a standard deviation of, say, 10% (which seems reasonable for eyeballed pasta), then the vast majority of the distribution will be in one order of magnitude, and you will probably be able to see the specific mean on the distribution. There will be a bell curve around some number (perhaps you can argue that this mean is itself benford distributed, but it is where it is).
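The normally distributed servings described in the comment above can be simulated directly. This is a rough sketch under stated assumptions: the comment gives only the 10% standard deviation, so the mean of 500 lentils per plate is a hypothetical value chosen for illustration, as is the helper name `leading_digit`.

```python
import random


def leading_digit(x):
    """First significant decimal digit of a positive number."""
    return int(f"{x:.15e}"[0])


rng = random.Random(0)  # fixed seed so the run is repeatable

MEAN = 500.0      # hypothetical mean serving size (not from the thread)
SD = 0.10 * MEAN  # 10% standard deviation, as in the comment

draws = [rng.gauss(MEAN, SD) for _ in range(100_000)]
# Guard against non-positive draws (they would be 10 sigma away).
servings = [x for x in draws if x > 0]

counts = {d: 0 for d in range(1, 10)}
for x in servings:
    counts[leading_digit(x)] += 1

freq = {d: counts[d] / len(servings) for d in counts}

for d in range(1, 10):
    print(d, round(freq[d], 4))
```

With these assumptions nearly all of the mass lands on leading digits 4 and 5, and digit 1 is essentially absent, so the histogram reveals the mean rather than following Benford's Law.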
Nov 9, 2020 at 2:47 comment added user3570982 @Hagen von Eitzen That's not accurate: scale invariance is sufficient for Benford's Law to apply, but it is not necessary. Also, your example actually seems to illustrate the general contention. If you don't constrain the amount of lentils per plate, you'll see a Benford distribution; if you do, you won't. Incidentally, that's exactly why Benford's Law is claimed to suggest election fraud: if you don't constrain the amount of votes a candidate receives, you should see a Benford distribution; if you do constrain (that's the fraud part), then the distribution won't match Benford's Law.
Nov 8, 2020 at 21:54 comment added Hagen von Eitzen @user3570982 While spanning multiple orders may not be required, scale invariance is (as in measuring in feet vs. meters when the units are unrelated to the observables). With precinct sizes roughly the same, we are much closer to a Bernoulli situation. If you cook a large pot of lentil soup and distribute it to same-size-ish plates, then the number of lentils per plate will not be Benfordian.
Nov 8, 2020 at 16:37 comment added David K @user3570982 I think Henry's point is that while data not spanning multiple orders of magnitude can follow Benford's Law--as the heights would have, if a few between 160 and 190 m had been left out of the list--there is no reason to expect them to. The main claim we are discussing here is the predictive power of Benford's Law to election results. It is not looking good for that.
Nov 8, 2020 at 16:17 comment added user3570982 @David K This illustrates a potential context dependency of applying Benford's Law itself, which is, I believe, the primary thrust of what Henry originally wrote. However, that still does not mean that the contention that multiple spans of magnitude in the dataset being analyzed is a requirement for Benford's Law, is accurate. It is not. It's possible that applying Benford's Law to voter precincts may not be applicable for contextual reasons, but the primary reason given to reject its application (magnitude spans) is not actually disqualifying.
Nov 8, 2020 at 16:10 comment added user3570982 @David K I should have taken the time to fully read what Henry was writing, rather than focus on the partial quote. My bad. What you've written, though, doesn't "disprove" the application of Benford's Law to the example dataset given. From mathworld.wolfram.com/BenfordsLaw.html , "Benford's Law applies to data that are not dimensionless, so the numerical values of the data depend on the units." That is, scale invariance is not a requirement for applying the analysis, but when unit dimensions are included, the specific context matters. (continued)...
Nov 8, 2020 at 15:46 comment added David K @user3570982 You have quoted the article accurately, but that part of the article is simply wrong. The prevalence of the leading digit 1 is dependent on the unit of measurement. For some choices of a unit (e.g. feet, meters) 1 is most common; for other possible choices of a unit, it is not. Henry gave a counterexample disproving the article's claim. The claim is "almost true", because this particular set of data span almost exactly one order of magnitude and are mostly evenly distributed logarithmically (though with a notable peak around 175 m).
Nov 8, 2020 at 14:11 comment added user3570982 @user3570982 What you've said is literally the opposite of what is written in the reference article. I'll fully quote: "Examining a list of the heights of the 58 tallest structures in the world by category shows that 1 is by far the most common leading digit, irrespective of the unit of measurement (cf. "scale invariance", below)". Instead, you've illustrated exactly what is true: requiring a span across multiple magnitudes tends toward accuracy, but that's a rule of thumb that is context dependent.
Nov 8, 2020 at 14:04 comment added Henry @user3570982 - except that that example does not fit Benford's law, since the pattern of heights in metres does not match the pattern of heights in feet. "1 is by far the most common leading digit" may be true in that particular example in metres and feet, but it would not have been true on, for example, a scale of half-metres (3 would appear more often as the first digit than 1); the overall Benford distribution does not match that data at any scale.
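The half-metre point in the comment above comes down to simple arithmetic. As a small illustration, assuming a handful of made-up heights between 160 m and 190 m (hypothetical values, not the 58 structures from the article):

```python
def leading_digit(n):
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


# Hypothetical heights in metres, clustered in a narrow range.
heights_m = [160, 165, 170, 175, 180, 185, 190]

# The same heights expressed in half-metres (1 m = 2 half-metres).
heights_half_m = [2 * h for h in heights_m]

digits_m = [leading_digit(h) for h in heights_m]
digits_half_m = [leading_digit(h) for h in heights_half_m]

print("metres:     ", digits_m)
print("half-metres:", digits_half_m)
```

Every value leads with 1 in metres and with 3 in half-metres, so for narrowly clustered data the dominant leading digit depends on the choice of unit.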
Nov 8, 2020 at 13:33 comment added user3570982 @SomeoneSomewhereSupportsMonica That's not really true. Look at the example provided in the Wikipedia article referenced: en.wikipedia.org/wiki/Benford%27s_law#Example . Regardless of units used, building heights in that example differ by only one order of magnitude, and in the case of meters, only 5/58 buildings are less than 100m, yet Benford's Law applies -- it's the example used, after all. Spanning multiple magnitudes is not a requirement, it's a rule-of-thumb for judging accuracy, but it's a rule that is highly context dependent.
Nov 8, 2020 at 11:33 comment added SomeoneSomewhereSupportsMonica @user3570982 Not having multiple orders of magnitude is a (soft) bounding of possible values in itself.
Nov 8, 2020 at 4:01 comment added user3570982 That's a good explanation, though not entirely accurate: There is no requirement for spanning several orders of magnitude, and Benford's Law can be observable even when there is not a wide span of magnitudes. If there is a wide span, Benford's Law tends to apply more accurately, but it's not a requirement. What's required is that there not be a cutoff of possible leading digits (a bounding requirement).
Nov 8, 2020 at 2:47 history answered Henry CC BY-SA 4.0