$\begingroup$

I have the following set of numbers: {20, 20, 20, 22, 22, 22, 22, 22, 23, 23, 23, 23, 23, 23, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 24, 25, 25, 25, 25, 25, 25, 25, 25, 25, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 26, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 27, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 28, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 29, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 30, 31, 31, 31, 31, 32, 32, 34, 35, 36, 36, 39, 40} and each number has a different value associated with it which I wanted to place on an XY plot.

When I calculate the deciles I come up with: {1st decile: 23; 2nd decile: 24; 3rd decile: 26; 4th decile: 26; 5th decile: 26.5; 6th decile: 27.5; 7th decile: 28.5; 8th decile: 29; 9th decile: 30.5}

Deciles 3 and 4 are both equal to 26 and decile 5 is 26.5.

Should I switch to quintiles or some other grouping? Since deciles 3 and 4 are equal, and no numbers fall between 26 and 26.5, the deciles do not give distinct values.

$\endgroup$
  • $\begingroup$ More information would help: what do these numbers represent, what are the values of Y, and what does Y represent? Deciles could be one way to go. Identical values for two deciles may or may not be a problem, and likewise a decile of 26.5 when all the numbers are integers. In many cases, a chart showing each individual X, the frequency of X, and the value of Y works well. $\endgroup$ Commented Jan 5, 2017 at 4:00
  • $\begingroup$ Why are deciles relevant to graphing the data? How are you planning on using them in making the plot? $\endgroup$
    – whuber
    Commented Jan 5, 2017 at 15:44
  • $\begingroup$ To elaborate, each number has a true/false event attached to it. I wanted to split the data into deciles to get a more detailed graph: I would take the %True for each decile and plot it (with Y = %True and X = the range of numbers above), so that each decile would be a point on the graph. The issue with deciles is that deciles 3 and 4 both equal 26 on the X-axis. Should I switch to something where each point has a unique X value (say quartiles), or should I just merge the results of deciles 3 and 4? I hope this helps! Thanks in advance for your help! $\endgroup$
    – obsoleet
    Commented Jan 5, 2017 at 16:25
  • $\begingroup$ Your explanation of your real aim arrived too late for my answer. I'd suggest plotting fraction true as a smoothed function of whatever these data are, or as a function of the individual values that actually occur. Division into deciles is arbitrary here. I see in some fields an apparent belief that e.g. quintiles or deciles are natural classes with some kind of extra magic. If so, why not sextiles, septiles, octiles, noniles? Meta-tip: don't be coy about asking the underlying question! $\endgroup$
    – Nick Cox
    Commented Jan 5, 2017 at 16:31
  • $\begingroup$ Decile 5 is naturally the median and so the mean of the two middlemost values, the comedians, as some say. For an even number of values (here 124), it's no surprise then that it can be a half-integer, even for integer data. $\endgroup$
    – Nick Cox
    Commented Jan 5, 2017 at 16:39

1 Answer

$\begingroup$

For graphing the data, a quantile plot often works well, as it shows both the general form of the data and their detailed structure, including level, spread, shape, ties, gaps and outliers. Here, as is common in statistical graphics, a quantile plot (by default) means plotting all the quantiles (or, as some might prefer, all the order statistics). The deciles don't lie: there are many ties. But in what sense is that a problem?

[Figure: quantile plot of the data against fraction of the data]

There is thus no obvious need here at all to reduce the data first to particular quantiles.

Naturally other kinds of graph may be helpful too, and some would want to emphasize that (empirical cumulative) distribution plots, survival function plots and other relatives show precisely the same information. Whether to show data points as such, data points with connecting lines, or just the connecting lines is a matter of convention and so of choice. For a small sample as here, I often prefer to see points; for a large sample, the points often blur together anyway.

What is labelled here as fraction of the data is, in this graph, a plotting position, $(\text{rank} - 0.5) / \text{sample size}$; other flavours of plotting positions are possible too.
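As a rough illustration of that plotting position, here is a minimal Python sketch. The `counts` table is my own tally of the list in the question (it sums to 124, matching the sample size mentioned in the comments), and the variable names are purely illustrative; this is not the code used to draw the graph above.

```python
# Tally of the values listed in the question (value -> how often it occurs).
# This tally is transcribed by hand and is an assumption of this sketch.
counts = {20: 3, 22: 5, 23: 6, 24: 12, 25: 9, 26: 27, 27: 12, 28: 13,
          29: 14, 30: 11, 31: 4, 32: 2, 34: 1, 35: 1, 36: 2, 39: 1, 40: 1}

# Expand the tally into the sorted sample (the order statistics).
values = sorted(v for v, k in counts.items() for _ in range(k))
n = len(values)

# Plotting position for the value of rank r (r = 1, ..., n): (r - 0.5) / n.
positions = [(rank - 0.5) / n for rank in range(1, n + 1)]

# Each (position, value) pair is one point of the quantile plot.
points = list(zip(positions, values))
```

Passing `points` to any plotting library, with the position on one axis and the value on the other, gives a quantile plot in the sense used here; ties show up as runs of points at the same value.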

It's not central to the question, but there are many different ways to calculate deciles as summary statistics too.
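To make that concrete, the sketch below computes deciles under two of the conventions offered by Python's standard `statistics.quantiles` function (`method="exclusive"` and `method="inclusive"`). The `counts` table is again my own tally of the list in the question, and neither convention is claimed to be the one the OP used.

```python
import statistics

# Tally of the values listed in the question (value -> how often it occurs).
# This tally is transcribed by hand and is an assumption of this sketch.
counts = {20: 3, 22: 5, 23: 6, 24: 12, 25: 9, 26: 27, 27: 12, 28: 13,
          29: 14, 30: 11, 31: 4, 32: 2, 34: 1, 35: 1, 36: 2, 39: 1, 40: 1}
values = sorted(v for v, k in counts.items() for _ in range(k))

# n=10 asks for the 9 cut points that split the sample into 10 groups.
deciles_exclusive = statistics.quantiles(values, n=10, method="exclusive")
deciles_inclusive = statistics.quantiles(values, n=10, method="inclusive")

for i, (ex, inc) in enumerate(zip(deciles_exclusive, deciles_inclusive), start=1):
    print(f"decile {i}: exclusive={ex}, inclusive={inc}")
```

Under both conventions deciles 3 and 4 fall inside the long run of 26s, so the tie is a feature of the data rather than of any one formula, whereas other deciles can differ between conventions.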

This thread has good references on quantile plots.

EDIT: "on an XY plot" is (as I write) the only hint in the question of what a later comment from the OP reveals as being the true purpose here. For the moment I am letting my answer stand.

$\endgroup$
