1
$\begingroup$

I have the following vectors:

vec_1=c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
vec_2=c(1,1,1,1,1,1,1,1,1)

from which I compute the corresponding sums:

> print(sum(vec_1))
[1] 18
> print(sum(vec_2))
[1] 9

Is there a way to test if these two sums are statistically different?

$\endgroup$
6
  • 1
    $\begingroup$ Well... clearly they are different, because the vectors have different lengths: if they were the same length, the sums would be identical, since your data contain only 1's. Right now you are effectively asking whether the number of occurrences differs statistically between these two vectors. Is this what you want? Give us some context on the actual problem. $\endgroup$ Commented Jan 29 at 10:53
  • 3
    $\begingroup$ Detail: Don't be misled by the function name used in R. The test you're referring to was proposed by Wilcoxon. $\endgroup$
    – Nick Cox
    Commented Jan 29 at 11:34
  • 2
    $\begingroup$ Please tell us more about the data-generating process. Why is the number of ones different? Is the number of observations in both groups identical, and sometimes you observe an event and sometimes you don't? $\endgroup$
    – Brandmaier
    Commented Jan 29 at 11:58
  • 1
    $\begingroup$ Yes. Assume we have two time series of days from 1 to 100. I observe the 1s on specific days within each time series, and the days on which a 1 is observed differ between the two time series (represented here as vectors). $\endgroup$
    – aaaaa
    Commented Jan 29 at 12:03
  • 1
    $\begingroup$ So, you really have 18 ones and 82 zeros in one series, and 9 ones and 91 zeros in the other series. With that, you can test whether the proportion of ones is different between the two, e.g., in R prop.test(c(18,9),rep(100,2)). Is this what you are looking for? $\endgroup$ Commented Jan 29 at 12:36

1 Answer

4
$\begingroup$

In your comment, you suggest that you might have wanted to use the Wilcoxon rank sum test or the t test. This implies that you are interested in some aspect of location. The t test compares means, while the Wilcoxon test addresses the

null hypothesis that, for randomly selected values X and Y from two populations, the probability of X being greater than Y is equal to the probability of Y being greater than X.

Your locations are identical: Both sets have mean = 1, median = 1 and no variation. So, no test of any hypothesis about location is needed or reasonable.
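
A quick check in R illustrates the point. This is only a sketch using the two vectors from the question; `summary_stats` is a hypothetical helper introduced here for illustration, not something from the original post.

```r
# Data from the question: 18 ones and 9 ones
vec_1 <- rep(1, 18)
vec_2 <- rep(1, 9)

# Hypothetical helper: location and spread of one vector
summary_stats <- function(x) {
  c(mean = mean(x), median = median(x), variance = var(x))
}

stats_1 <- summary_stats(vec_1)
stats_2 <- summary_stats(vec_2)
print(rbind(vec_1 = stats_1, vec_2 = stats_2))

# Both groups have zero variance, so t.test() cannot form a test
# statistic and stops with an error. tryCatch() captures the error
# message as a string instead of halting the script.
t_result <- tryCatch(
  t.test(vec_1, vec_2),
  error = function(e) conditionMessage(e)
)
print(t_result)
```

Both rows of the printed table are the same, and the t test refuses to run because there is no variation to work with.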

But in your question, you ask about sums. Well, of course 18 is greater than 9, but for that comparison, what are your sample, your population, and your hypothesis? For a t test or similar, each item in each vector is an observation, so you have 27 observations. For sums, each vector is a single observation, so you have only 2 observations. There is not much you can do with that, at least not without a lot more information.
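
If the extra information given in the comments holds, namely that each series covers 100 days and a 1 marks a day with an event, then the sums become counts of events out of a known number of trials, and comparing proportions is one option. The following is a sketch under that assumption only; `n_days` and the names `series_1` and `series_2` are illustrative, not from the question.

```r
# Assumption (from the comment thread, not the question itself):
# each series covers n_days = 100 days, with a 1 on event days
# and a 0 on all other days.
n_days <- 100
events <- c(series_1 = sum(rep(1, 18)), series_2 = sum(rep(1, 9)))

# Two-sample test of equal proportions: 18/100 versus 9/100
res <- prop.test(x = events, n = rep(n_days, 2))
print(res)

# Fisher's exact test on the corresponding 2x2 table,
# as a check that does not rely on a large-sample approximation
tab <- rbind(events = events, non_events = n_days - events)
print(fisher.test(tab))
```

Whether this is appropriate depends on the days being reasonably treated as independent trials, which may not hold for time series.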

$\endgroup$
