Significance tests for streaming data

Question

I want to compare multiple classifiers on multiple data streams. For a stream of length $n$, I evaluate a single classifier every $t/c$ time steps on a dedicated (hold-out) subset of my data and calculate the AUC.

How can I apply a Friedman test with a post-hoc Nemenyi test as described in "Statistical Comparisons of Classifiers over Multiple Data Sets" (Demšar, 2006)?
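For reference, these are the two quantities that procedure relies on, written here from memory of that paper, so please check them against the original. $N$ is the number of data sets (streams, in my case), $k$ the number of classifiers, $R_j$ the average rank of classifier $j$ over the $N$ data sets, and $q_\alpha$ the critical value tabulated in the paper for the Nemenyi test:

$$\chi_F^2 = \frac{12N}{k(k+1)}\left[\sum_{j=1}^{k} R_j^2 - \frac{k(k+1)^2}{4}\right], \qquad CD = q_\alpha \sqrt{\frac{k(k+1)}{6N}}$$

Two classifiers are declared significantly different when their average ranks differ by at least the critical difference $CD$.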

Let's say I have 3 streams and 5 classifiers, and I calculate 100 AUCs per classifier on each stream.

I suppose I have to average the AUCs of each classifier on each stream rather than treat each AUC as the result of a separate experiment. The reason I think the individual AUCs cannot be used is this passage from the paper:

In our examples we have used AUCs measured and averaged over repetitions of training/testing episodes. For instance, each cell in Table 6 represents an average over five-fold cross validation. Could we also consider the variance, or even the results of individual folds? There are variations of the ANOVA and the Friedman test which can consider multiple observations per cell provided that the observations are independent (Zar, 1998). This is not the case here, since training data in multiple random samples overlaps. We are not aware of any statistical test that could take this into account.

In the case of my setup above, there is obviously overlap within a stream.
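To make the averaging idea concrete, here is a minimal sketch of what I have in mind, using only the Python standard library. The AUC values are randomly generated placeholders rather than real results, the function names are my own, and `Q_ALPHA_K5 = 2.728` is the $\alpha = 0.05$ Nemenyi critical value for 5 classifiers as I read it from the paper's table (an assumption to double-check). Each stream contributes a single averaged AUC per classifier, so the test sees a 3 x 5 table.

```python
import math
import random


def average_ranks(scores):
    """scores[i][j] is the mean AUC of classifier j on stream i.

    Returns the average rank of each classifier over all streams,
    where rank 1 is the highest AUC and ties share the mean rank.
    """
    n_streams, k = len(scores), len(scores[0])
    totals = [0.0] * k
    for row in scores:
        order = sorted(range(k), key=lambda j: -row[j])  # best AUC first
        pos = 0
        while pos < k:
            end = pos
            while end + 1 < k and row[order[end + 1]] == row[order[pos]]:
                end += 1
            shared = (pos + end) / 2 + 1  # mean of 1-based ranks pos+1 .. end+1
            for j in order[pos:end + 1]:
                totals[j] += shared
            pos = end + 1
    return [t / n_streams for t in totals]


def friedman_statistic(avg_ranks, n_streams):
    """Friedman chi-square statistic computed from the average ranks."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    return 12 * n_streams / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4)


def nemenyi_cd(k, n_streams, q_alpha):
    """Critical difference in average rank for the Nemenyi post-hoc test."""
    return q_alpha * math.sqrt(k * (k + 1) / (6 * n_streams))


# Placeholder data: 3 streams x 5 classifiers x 100 AUCs (made up, not real).
random.seed(0)
N_STREAMS, N_CLASSIFIERS, N_AUCS = 3, 5, 100
base_auc = [0.70, 0.74, 0.78, 0.82, 0.86]  # hypothetical per-classifier level
aucs = [
    [
        [min(1.0, max(0.0, random.gauss(base_auc[j], 0.02))) for _ in range(N_AUCS)]
        for j in range(N_CLASSIFIERS)
    ]
    for _ in range(N_STREAMS)
]

# One averaged AUC per (stream, classifier) cell, as in the paper's tables.
mean_auc = [[sum(cell) / len(cell) for cell in stream] for stream in aucs]

Q_ALPHA_K5 = 2.728  # assumed alpha = 0.05 critical value for k = 5
ranks = average_ranks(mean_auc)
chi2_f = friedman_statistic(ranks, N_STREAMS)
cd = nemenyi_cd(N_CLASSIFIERS, N_STREAMS, Q_ALPHA_K5)

# Pairs of classifiers whose average ranks differ by at least CD.
different = [
    (a, b)
    for a in range(N_CLASSIFIERS)
    for b in range(a + 1, N_CLASSIFIERS)
    if abs(ranks[a] - ranks[b]) >= cd
]

print("average ranks:", ranks)
print("Friedman statistic:", chi2_f)
print("critical difference:", cd)
print("significantly different pairs:", different)
```

With only 3 streams the critical difference is large relative to the possible rank range of 1 to 5, so this setup could only separate the very best classifier from the very worst. As far as I understand, the chi-square approximation for the Friedman statistic is also unreliable for so few data sets, and exact critical values should be used instead.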

Community · Accepted Answer · 2020-06-11 14:32:37Z

I suggest you first decide whether and how to handle "concept drift", and what kind of inference you need on streaming (possibly big) data.

There are a number of excellent papers, but this one, co-authored by Prof. David Blei, is among the top few.

"The Population Posterior and Bayesian Inference on Streams" https://arxiv.org/pdf/1507.05253.pdf

The abstract gives good hints about how this relates to "standard probabilistic modeling approaches":

Many modern data analysis problems involve inferences from streaming data. However, streaming data is not easily amenable to the standard probabilistic modeling approaches, which assume that we condition on finite data.

We develop population variational Bayes, a new approach for using Bayesian modeling to analyze streams of data. It approximates a new type of distribution, the population posterior, which combines the notion of a population distribution of the data with Bayesian inference in a probabilistic model.

We study our method with latent Dirichlet allocation and Dirichlet process mixtures on several large-scale data sets.

A few others:

"A Comparison on How Statistical Tests Deal with Concept Drifts" http://worldcomp-proceedings.com/proc/p2012/ICA2334.pdf
"A Survey of Classification Methods in Data Streams" http://charuaggarwal.net/streambook.pdf
"A Survey of Stream Classification Algorithms" (Charu C. Aggarwal) http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.675.1122&rep=rep1&type=pdf

Since this is not really a full answer, I will wait for feedback before going further in this direction.
