$\begingroup$

I want to compare the daily average revenue of a promotion period (7 days) of a business with the daily average of the rest of the year.

So, sample 1 has 7 data points, whereas sample 2 has 300 data points (300 days). Sample 1 is a small sample and sample 2 is a right-skewed distribution because of seasonality.

My goal is to create an evaluation method of the sample 1 average, based on sample 2 average, for which:

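  • If the sample 1 average is within 1 deviation of the sample 2 average (sample 2 average ± 1 deviation), it went okay.
  • If it is more than 1 deviation below the sample 2 average, it went poorly.
  • If it is more than 1 deviation above the sample 2 average, it went well.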

So far, I have tried comparing the mean of sample 1 with the mean of sample 2 ± 1 standard deviation, and comparing the median of sample 1 with the median of sample 2 ± 1 median absolute deviation (MAD). I also tried bootstrapping sample 1 and applying the central limit theorem to sample 2. The CLT method doesn't reflect reality. The first two methods seem reasonable, but I don't feel the evaluation is precise.
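To make the first two rules concrete, here is a minimal sketch using only Python's standard library. The function names (`mad`, `classify`, `evaluate`) and the revenue figures are hypothetical, chosen only for illustration, and the MAD is left unscaled, which is an assumption about what "1 median absolute deviation" means here.

```python
import statistics


def mad(values):
    """Unscaled median absolute deviation around the median."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def classify(value, center, spread):
    """Apply the three-way rule: below, inside, or above center ± spread."""
    if value < center - spread:
        return "poorly"
    if value > center + spread:
        return "well"
    return "okay"


def evaluate(promo, baseline):
    """Classify the promotion period under both rules for comparison."""
    return {
        # Rule 1: promo mean vs. baseline mean ± 1 standard deviation
        "mean_sd": classify(
            statistics.mean(promo),
            statistics.mean(baseline),
            statistics.stdev(baseline),
        ),
        # Rule 2: promo median vs. baseline median ± 1 MAD
        "median_mad": classify(
            statistics.median(promo),
            statistics.median(baseline),
            mad(baseline),
        ),
    }


# Made-up daily revenues: one extreme day mimics the right skew.
baseline = [90, 100, 110, 100, 95, 105, 100, 400]
promo = [130] * 7
print(evaluate(promo, baseline))
# {'mean_sd': 'okay', 'median_mad': 'well'}
```

With these made-up numbers the two rules disagree: the single extreme day inflates the standard deviation so the mean-based band is very wide, while the median and MAD band stays narrow.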

Since I want to compare the average revenue of a sample of just 7 days with a highly right-skewed distribution, are the mean and standard deviation, or the median and MAD, a good choice?

Is there a better alternative?

$\endgroup$
  • $\begingroup$ This can be as simple or as complicated as you want. You can calculate the mean and median for your promotion period, compare them with the mean and median for the rest of the year, and stop there. A more challenging problem is to compare results for your promotion period with what you would expect anyway given the time of year and seasonality. Judging how effective the promotion was needs some kind of analysis bringing in (e.g.) the effects of other promotion periods in the past. Put otherwise, if seasonality is the backdrop, it needs to be brought into your analysis. $\endgroup$
    – Nick Cox
    Commented Dec 4, 2022 at 10:08
  • $\begingroup$ Just wanted to mention that looking at percentiles might be more appropriate than symmetric intervals about the mean/median if the right skew is substantial. If you plan on doing this for each week (like, was Jan 1-7 different? Was Jan 8-15 different? ...), you should take a look at control charts as well. $\endgroup$ Commented Dec 4, 2022 at 13:10
  • $\begingroup$ Thank you both! @nick yeah, about seasonality and promotion: my idea was to remove the outliers and define a "normal" range through deviations, but the analysis should still consider the period of the year, otherwise the data alone doesn't mean a lot. john, thank you for the suggestion of control charts, definitely worth considering for an ongoing analysis. Among everything I tried, the approaches that satisfy me most are taking the IQR and using the mean ± 1 standard deviation to define the normal range, or using the median and median absolute deviation of the original distribution, which leads to a narrower range. $\endgroup$
    – andstat
    Commented Dec 4, 2022 at 15:42
  • $\begingroup$ Watch out that "normal range" is your own terminology here. Whatever works for your purposes ... but watch out that two kinds of explanation may be needed: one for less technical people, to explain what you're doing, and another, in different terms, for more technical people. $\endgroup$
    – Nick Cox
    Commented Dec 4, 2022 at 16:21
  • $\begingroup$ If I were a manager [no way that is going to happen] I imagine I would most want to see a graph of what happens in a promotion period and immediately afterwards within a context of before and after. An entire year's worth of data might be too much but you can compromise. You might need to smooth or aggregate the data depending on how noisy they are. Comparisons between promotion periods should also be relevant. $\endgroup$
    – Nick Cox
    Commented Dec 4, 2022 at 17:54
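
The percentile suggestion in the comments could be sketched as follows: rather than a symmetric band, rank the promotion week's mean among the means of all 7-day windows in the rest of the year. This is only one possible reading of that suggestion. The function names and the 25th/75th percentile cutoffs are assumptions, the baseline is assumed to be in calendar order without gaps, and nothing here adjusts for seasonality.

```python
import statistics


def rolling_means(values, window=7):
    """Means of every run of `window` consecutive days."""
    return [
        statistics.mean(values[i:i + window])
        for i in range(len(values) - window + 1)
    ]


def percentile_rank(value, reference):
    """Percentage of reference values that are <= value."""
    return 100 * sum(r <= value for r in reference) / len(reference)


def classify_by_percentile(promo, baseline, low=25, high=75):
    """Rank the promo mean among same-length baseline window means."""
    window_means = rolling_means(baseline, window=len(promo))
    rank = percentile_rank(statistics.mean(promo), window_means)
    if rank < low:
        return "poorly"
    if rank > high:
        return "well"
    return "okay"
```

Comparing a 7-day mean with other 7-day means keeps like with like, and percentile cutoffs do not assume a symmetric distribution. The overlapping windows are not independent, so the rank is a descriptive summary rather than a p-value.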
