
We know that, formally, the assumptions of a test cannot simply be tested first: if we choose which test to use based on the result of such a check, the resulting composite procedure has unknown properties (Type I and II error rates). I think this is one of the reasons why "Six Sigma"-style approaches to statistics (follow a decision tree of test results to decide which test to use) get a bad rap among professional statisticians.

However, with real-world data we often get samples for which the classical assumptions may not hold, so we need to check them one way or another. So what do you actually do in your job/research? Perform an informal check, for example looking at the distribution of the data and using a t-test when the empirical distribution doesn't seem too skewed? This is what I see done most of the time. However, as long as we make a decision based on the result of this "informal test", we still affect the test's properties; and if we don't use the check to make a decision, then the check is useless and we shouldn't waste precious time doing it. Of course, you could reply that formal test properties are overrated and that in practice we needn't be religious about them. This is why I'm interested in what you do in practice, not just in the theoretical background.
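
To make the concern concrete, here is a rough simulation sketch of the two-stage procedure I have in mind (Python with numpy/scipy; the Shapiro-Wilk pretest at 0.05 and the skewed data are just illustrative choices of mine, not anything standard). It lets one compare the empirical rejection rate of the composite procedure to the nominal level:

```python
# Rough sketch of the "composite" procedure: pretest normality, then pick the test.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, n_sim, alpha = 20, 5000, 0.05
reject = 0

for _ in range(n_sim):
    # Null is true: both groups come from the same (here skewed) distribution.
    x = rng.exponential(scale=1.0, size=n)
    y = rng.exponential(scale=1.0, size=n)

    # Informal/pretest step: check normality of each sample with Shapiro-Wilk.
    normal_enough = (stats.shapiro(x).pvalue > 0.05) and (stats.shapiro(y).pvalue > 0.05)

    # Choose the final test based on the pretest result.
    if normal_enough:
        p = stats.ttest_ind(x, y, equal_var=False).pvalue
    else:
        p = stats.mannwhitneyu(x, y, alternative="two-sided").pvalue

    reject += (p < alpha)

print("empirical Type I error of the two-stage procedure:", reject / n_sim)
```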

Another approach would be to always use the test with the fewest assumptions. I've usually seen this framed as preferring nonparametric tests over parametric ones, since the former don't assume that the data come from a family of distributions indexed by a finite-dimensional parameter vector and should therefore be more robust (fewer assumptions). Is this true in general? With this approach, don't we risk using underpowered tests in some cases? I'm not sure. Is there a useful (possibly simple) reference for applied statistics that lists tests/models to use as better alternatives to the classical tests (t-test, Chi-square, etc.), and when to use them?
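
The power question, at least, can be quantified by simulation in specific scenarios. Here is a quick sketch comparing the t-test with the Mann-Whitney test under normality (Python with numpy/scipy; sample size, effect size, and alpha are arbitrary illustrative values):

```python
# Compare empirical power of t-test vs. Mann-Whitney when normality actually holds.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, n_sim, alpha, shift = 25, 5000, 0.05, 0.5

power_t = power_u = 0
for _ in range(n_sim):
    x = rng.normal(0.0, 1.0, n)      # control group
    y = rng.normal(shift, 1.0, n)    # treated group, true mean shift of 0.5 SD

    power_t += stats.ttest_ind(x, y).pvalue < alpha
    power_u += stats.mannwhitneyu(x, y, alternative="two-sided").pvalue < alpha

print("t-test power:      ", power_t / n_sim)
print("Mann-Whitney power:", power_u / n_sim)
```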

  • Six Sigma methods are designed for processes that have been and will be run over and over, e.g., in manufacturing. They have little or nothing to say about data that are custom and ad hoc, ex novo, or completely novel. This means that real knowledge discovery is inherently risky and requires replication for confirmation.
    – user78229
    Commented Sep 2, 2016 at 16:09

2 Answers


What I have seen done most often (and would tend to do myself) is to look at several sets of historical data from the same area, for the same variables, and use those as a basis for deciding what is appropriate. In doing so, one should of course keep in mind that mild deviations from, e.g., normality of the regression residuals are generally not much of an issue given sufficiently large sample sizes in the planned application. By looking at independent data, one avoids compromising test properties such as Type I error control (which are very important in some areas, like confirmatory clinical trials for regulatory purposes). The reason for using parametric approaches (when appropriate) is, as you say, efficiency, as well as the ability to easily adjust for predictive covariates, such as a pre-experiment assessment of your main variable, and to get effect size estimates that are easier to interpret than, say, the Hodges-Lehmann estimator.
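
For instance, a minimal sketch of that kind of covariate adjustment (an ANCOVA on a baseline measurement, in Python with statsmodels; the variable names and simulated data are just placeholders) could look like this:

```python
# ANCOVA sketch: adjust the treatment effect for a pre-experiment baseline measurement.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 100
baseline = rng.normal(50, 10, n)           # pre-experiment assessment of the main variable
treatment = rng.integers(0, 2, n)          # 0 = control, 1 = treated
outcome = 5 + 0.8 * baseline + 3 * treatment + rng.normal(0, 5, n)

df = pd.DataFrame({"outcome": outcome, "treatment": treatment, "baseline": baseline})

# The coefficient on 'treatment' is an adjusted mean difference on the original
# scale -- easier to interpret than a rank-based estimate.
fit = smf.ols("outcome ~ treatment + baseline", data=df).fit()
print(fit.params["treatment"])
print(fit.conf_int().loc["treatment"])
```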

  • Interesting - if I had more sets of data, I'd try to aggregate them to gain power, but not aggregating and instead reserving the historical data for assumption checks is an interesting alternative idea. Reviewing the literature may also help. I definitely agree that effect size estimates from parametric approaches are easier to interpret.
    – DeltaIV
    Commented Sep 3, 2016 at 15:47
  • I guess that, coming from the pharmaceutical industry, I was thinking about trials of different drugs. If strict Type I error rate control is not necessary and the analysis is more for internal decision making, one might also use previous trials of other drugs to obtain a prior for the control group, but the focus is usually on analyzing a new trial of a new drug. That may explain my particular perspective.
    – Björn
    Commented Sep 4, 2016 at 5:20

Personally, I like to run a parametric test and its non-parametric equivalent, and test the assumptions of each, all at once. If the assumptions of the parametric test aren't massively violated, or if I get similar results with the non-parametric test, I use the parametric one. Even if the parametric assumptions are violated, significant results are something you can be fairly confident in, because the violation weakened the test. Plus, let's be honest, it's hard to make a meaningful interpretation of results like "group A had a mean rank score that was 12 higher than the mean rank score of group B."
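
In practice that side-by-side check looks something like this (a rough Python/scipy sketch with simulated placeholder data standing in for two real groups; I'd eyeball all of these outputs together before deciding which result to report):

```python
# Run the parametric test, its non-parametric counterpart, and assumption checks together.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
group_a = rng.normal(10.0, 2.0, 30)
group_b = rng.normal(11.0, 2.0, 30)

# Parametric test and its non-parametric equivalent on the same data.
t_res = stats.ttest_ind(group_a, group_b, equal_var=False)
u_res = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Quick assumption checks for the parametric test.
shapiro_a = stats.shapiro(group_a)
shapiro_b = stats.shapiro(group_b)
levene = stats.levene(group_a, group_b)

print("Welch t-test:     p =", t_res.pvalue)
print("Mann-Whitney U:   p =", u_res.pvalue)
print("Shapiro-Wilk A/B: p =", shapiro_a.pvalue, shapiro_b.pvalue)
print("Levene variances: p =", levene.pvalue)
```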

  • If you test the assumptions of the parametric test, use the nonparametric one when those assumptions are violated, and otherwise revert to the parametric one, then you're effectively using a composite test with unknown properties. Do you think this is not an important issue? I agree about the difficulty of interpreting the results of some nonparametric tests - for example, in the Mann-Whitney-Wilcoxon test, location and scale get confounded, which certainly doesn't simplify interpretation.
    – DeltaIV
    Commented Sep 3, 2016 at 15:41
  • Honestly, I hadn't thought of it that way; it's a good point. Ultimately, though, I think that, at least for the work I do, clearly understandable results that don't massively violate test assumptions are the biggest concern. People tend to have a hard enough time understanding statistics anyway.
    – JRF1111
    Commented Sep 3, 2016 at 16:23
