$\begingroup$

I'm aware that when a quantile is mentioned, it may mean a point or it may mean a group. I'm not asking about quantile points here.

In the case where it means a group, there seem to be two meanings.

I'd like to confirm that there are these two meanings for a quantile group.

In one, the groups overlap, with smaller groups nested within larger ones; in the other, they don't overlap and are of equal size. (I'll explain what I mean.)

Note that I understand that with percentiles, if talking about points, you'd have 99 points (excluding the min/max, i.e. 0% and 100%) and 101 points (including them).

I notice that the Wikipedia page on percentiles, https://en.wikipedia.org/wiki/Percentile, says "every score is in the 100th percentile".

So that seems to be a definition of a percentile group that goes from 0 all the way up to that point, which means smaller percentile groups exist within larger ones.

Another definition of a quantile group is used when one refers to the "lower quartile" and "upper quartile". In that instance, you have two quartile groups of equal size (25% each), plus another two quartile groups, the lower-middle quartile and the upper-middle quartile.

By contrast, Wikipedia describes the 100th percentile (group) as being not 1% in size, but 100% in size.

Am I correct here that there are these two different definitions of quantile group?

As a side note, I suppose that if you have equal-sized groups you'd have 100 groups, whereas if you have groups going from 0 up to whatever point, you'd have 101 groups. So that would be another difference depending on which definition of quantile group is in use.

$\endgroup$

1 Answer

$\begingroup$

Are you reading or writing? It is not clear whether your concern is to understand confusing references to 'quantile' in the literature, or whether you want to avoid causing confusion by what you are about to write.

  • If the former, there are indeed various and inconsistent uses of 'quantile'--both in terms of point values in a sample and in terms of intervals. If the definition is not clear from the immediate statement, you might look back for clues in the earlier context.

  • If the latter, you can take care to state explicitly what definition of 'quantile' you are using.
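
As a rough illustration of the two interval readings raised in the question, here is a minimal R sketch. The scores 1 to 20 are made up for illustration, the names breaks, bins, and cumulative are my own, and the cut points use R's default quantile type.

```r
# Hypothetical data: 20 distinct scores (an assumption for illustration only)
scores <- 1:20

# Quartile cut points at 0%, 25%, 50%, 75%, 100% (R's default, type = 7)
breaks <- quantile(scores, probs = seq(0, 1, by = 0.25))

# Reading 1: non-overlapping groups of equal size
bins <- cut(scores, breaks = breaks, include.lowest = TRUE,
            labels = c("Q1", "Q2", "Q3", "Q4"))
bin_sizes <- table(bins)

# Reading 2: nested groups, "everything at or below the k-th cut point"
cumulative <- sapply(breaks[-1], function(b) sum(scores <= b))

print(bin_sizes)    # each group holds 5 of the 20 scores
print(cumulative)   # nested group sizes: 5, 10, 15, 20
```

Under the second reading the top group contains every score, which matches the "every score is in the 100th percentile" wording quoted in the question.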

Different versions of point quantiles. You should be aware that there are several particular formulas for finding the point value in a sample that matches a particular quantile, such as the 25th percentile. R defines nine types, uses type=7 as the default, and briefly discusses several of the nine in the documentation for its quantile function. The R code further below, beginning with set.seed(810), generates a sample of size $n=90$ rounded to one decimal place, and then lists nine versions (types) of the lower quartile.
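
To show what one of these formulas does, here is a minimal sketch of the type 7 rule computed by hand and checked against quantile. The helper name quantile_type7 and the six data values are hypothetical. The rule places the $p$-th quantile at position $h = (n-1)p + 1$ in the sorted sample and interpolates linearly between the neighboring order statistics.

```r
# Hand-rolled version of R's default (type = 7) sample quantile.
# quantile_type7 is a hypothetical helper name, not part of R.
quantile_type7 <- function(x, p) {
  xs <- sort(x)
  n  <- length(xs)
  h  <- (n - 1) * p + 1        # fractional position in the sorted sample
  lo <- floor(h)
  hi <- min(lo + 1, n)         # guard against running past the maximum
  xs[lo] + (h - lo) * (xs[hi] - xs[lo])
}

toy <- c(1, 3, 4, 8, 10, 15)   # made-up data for illustration

by_hand  <- quantile_type7(toy, 0.25)
built_in <- unname(quantile(toy, probs = 0.25, type = 7))

print(c(by_hand = by_hand, built_in = built_in))   # both are 3.25
```

The other continuous types differ mainly in how the position is defined (type 6, for example, uses $(n+1)p$), which is why the nine results are close but not identical.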

Consequences. Points as estimates: For what it's worth, the 25th percentile of the population (normal with mean 100 and standard deviation 15) is 89.88265, so as estimates of that population percentile, none of the nine results is really bad.

qnorm(.25, 100, 15)
[1] 89.88265

However, if you know that the population is normal, then the best way to estimate the 25th percentile is to use sufficient statistics to estimate $\mu$ and $\sigma,$ and then to find the 25th percentile of the estimated normal distribution.

qnorm(.25, mean(x), sd(x))
[1] 89.46862

Intervals: In this example, the nine types lead to a couple of different versions of which observations constitute the lower quartile of the sample. Some versions include the first 22 sorted observations and some include the first 23.

# Reproducible sample: n = 90 draws from Normal(mean 100, SD 15), rounded to one decimal place
set.seed(810)
x = round(rnorm(90,100,15),1)
sort(x)
 [1]  66.1  68.6  70.7  72.1  74.9  77.4  77.5  80.3  80.9  81.5
[11]  81.9  82.2  82.5  83.0  83.6  83.6  85.2  86.3  87.5  87.8
[21]  87.9  88.9  89.9  90.2  90.4  91.0  91.0  91.2  91.7  92.4
[31]  93.0  93.4  94.1  94.4  94.6  95.5  95.6  96.9  97.2  97.4
[41]  98.1  98.1  98.3  98.3  98.8  98.9  99.2  99.2  99.4 100.7
[51] 101.1 102.4 103.0 103.1 103.6 103.6 103.9 104.0 105.0 105.7
[61] 106.1 106.4 107.5 107.8 108.3 109.6 110.5 111.8 113.0 114.0
[71] 114.3 115.0 115.7 115.9 117.0 117.9 118.0 118.1 118.3 118.9
[81] 119.1 127.3 128.2 128.3 129.6 130.3 130.3 130.7 136.5 141.7

# Lower quartile (p = 0.25) under each of R's nine quantile types
q.25 = numeric(9)
for(i in 1:9) {
 q.25[i] = quantile(x, probs=.25, type=i) }

q.25
[1] 89.90000 89.90000 88.90000 89.40000 89.90000 89.65000
[7] 89.97500 89.81667 89.83750
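
To make the Intervals point concrete, the sketch below counts how many observations fall at or below each of the nine lower-quartile values. So that it runs on its own, it re-enters the sorted sample printed above rather than relying on the random number generator; the names xs and n.below are my own.

```r
# Sorted sample copied from the listing above (n = 90)
xs <- c( 66.1,  68.6,  70.7,  72.1,  74.9,  77.4,  77.5,  80.3,  80.9,  81.5,
         81.9,  82.2,  82.5,  83.0,  83.6,  83.6,  85.2,  86.3,  87.5,  87.8,
         87.9,  88.9,  89.9,  90.2,  90.4,  91.0,  91.0,  91.2,  91.7,  92.4,
         93.0,  93.4,  94.1,  94.4,  94.6,  95.5,  95.6,  96.9,  97.2,  97.4,
         98.1,  98.1,  98.3,  98.3,  98.8,  98.9,  99.2,  99.2,  99.4, 100.7,
        101.1, 102.4, 103.0, 103.1, 103.6, 103.6, 103.9, 104.0, 105.0, 105.7,
        106.1, 106.4, 107.5, 107.8, 108.3, 109.6, 110.5, 111.8, 113.0, 114.0,
        114.3, 115.0, 115.7, 115.9, 117.0, 117.9, 118.0, 118.1, 118.3, 118.9,
        119.1, 127.3, 128.2, 128.3, 129.6, 130.3, 130.3, 130.7, 136.5, 141.7)
stopifnot(length(xs) == 90)

# Lower quartile under each of R's nine types
q.25 <- sapply(1:9, function(i) unname(quantile(xs, probs = 0.25, type = i)))

# Number of observations at or below each version of the lower quartile
n.below <- sapply(q.25, function(q) sum(xs <= q))
print(n.below)
```

Types 1, 2, 5, and 7 put 23 observations at or below the cut point and the other five types put 22, which is the 22-versus-23 split mentioned under Intervals.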
$\endgroup$
  • $\begingroup$ Thanks. On a related note (and please let me know if I should make a separate question for this): if an item of data is at percentile 98.99999%, is it in the 98th percentile or the 99th? And if you say the 99th, then if an item is at percentile 99.9999%, does that mean it is in the 100th percentile (which is technically the 101st percentile)? $\endgroup$
    – barlop
    Commented Oct 9, 2019 at 4:03
  • 1
    $\begingroup$ Technically, I'd say 'close', but not 'within'. However, you'd have to be looking at a really huge sample for this issue to arise. What specific kind of application is prompting your questions about quantiles? Quantiles are often used for descriptive summaries of data where many of these borderline issues aren't relevant. // Notice that in very large samples, the differences among the various types of quantile definitions don't make a practical difference. $\endgroup$
    – BruceET
    Commented Oct 9, 2019 at 6:24
  • 1
    $\begingroup$ Thanks. It would also apply to a smaller sample, e.g. suppose an item is at percentile 98.9%; but OK, it's still in the 98th percentile. (It could also apply to deciles and a smaller sample, though no doubt they aren't used so much.) So you answered that. I look at theoretical cases, including edge cases, so as to understand the concept more lucidly. $\endgroup$
    – barlop
    Commented Oct 9, 2019 at 17:22
