$\begingroup$

Suppose I have two data samples of the same size N, generated from two separate distributions, and I am then given a new number. How can I determine which distribution this new number came from? What should I do if the sample sizes were different?

My first thought is to compare the empirical probabilities based on these two samples, but I am not sure whether that is the optimal solution. I would be very grateful for any help.

$\endgroup$
  • $\begingroup$ What do you know (or are willing to assume) about these distributions? $\endgroup$
    – Annika
    Commented Dec 26, 2021 at 2:03
  • $\begingroup$ @Bay I was only given two data samples. $\endgroup$
    – www
    Commented Dec 26, 2021 at 15:35

1 Answer

$\begingroup$

Given the new observation $x$, we compare the posterior probabilities $P(D_1|x)$ and $P(D_2|x)$ and choose the distribution with the higher value.

To compute these values though, some assumptions and prior knowledge are required.

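By Bayes' rule, $P(D_i|x) = P(D_i)P(x|D_i)/P(x)$, and the denominator $P(x)$ is the same for both distributions, so this is equivalent to comparing $P(D_1)P(x|D_1)$ and $P(D_2)P(x|D_2)$. The likelihoods $P(x|D_i)$ have to be estimated from the samples, and we have to make an assumption about the priors $P(D_i)$.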
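
As a rough illustration of that comparison, here is a minimal Python sketch. Everything in it is an assumption layered on top of the data: the likelihood $P(x|D_i)$ is estimated with a Gaussian kernel density estimate, the bandwidth comes from the normal reference rule of thumb, the prior defaults to $P(D_1) = P(D_2) = 0.5$, and the function names and sample values are made up for the example. Other density estimates or priors would be equally defensible and may give a different decision.

```python
import math

def rule_of_thumb_bandwidth(sample):
    """Normal reference rule for a Gaussian kernel (an assumed choice).

    Requires at least two points that are not all identical.
    """
    n = len(sample)
    mean = sum(sample) / n
    var = sum((v - mean) ** 2 for v in sample) / (n - 1)
    return 1.06 * math.sqrt(var) * n ** (-1 / 5)

def kde(sample, x, bandwidth=None):
    """Gaussian kernel density estimate of P(x | D) from one sample."""
    h = bandwidth if bandwidth is not None else rule_of_thumb_bandwidth(sample)
    norm = len(sample) * h * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / h) ** 2) for v in sample) / norm

def posterior(x, sample_1, sample_2, prior_1=0.5):
    """Return (P(D_1 | x), P(D_2 | x)) under the assumed prior P(D_1) = prior_1."""
    w1 = prior_1 * kde(sample_1, x)
    w2 = (1.0 - prior_1) * kde(sample_2, x)
    total = w1 + w2
    if total == 0.0:
        # x is too far from both samples to tell; fall back to the prior.
        return prior_1, 1.0 - prior_1
    return w1 / total, w2 / total

# Made-up samples: one centred near 0, the other near 5.
sample_1 = [0.1, -0.4, 0.3, 0.8, -0.2, 0.0, 0.5, -0.7]
sample_2 = [4.9, 5.3, 4.6, 5.8, 5.1, 4.4, 5.5, 5.0]

p1, p2 = posterior(0.2, sample_1, sample_2)
print("D_1" if p1 > p2 else "D_2")
```

The density estimate normalises by each sample's own size, so nothing here requires the two samples to have the same length. Whether different sample sizes should also change the prior (for example, setting `prior_1` to the fraction of all points that belong to the first sample) is itself an assumption and cannot be read off the data.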

$\endgroup$
  • $\begingroup$ +1. That was what I was getting at in my comment too: the data alone is not enough. $\endgroup$
    – Annika
    Commented Dec 26, 2021 at 15:48
