
For example, suppose X is a uniformly random integer from 1 to 16. Now I receive a piece of information: X is 3, 5, 9, or 14. This gives 2 bits of information about X, since it narrows 16 equally likely options down to 4. But if the list of options is arbitrary enough, I'd have to spell out all four integers to describe this knowledge, which takes far more than 2 bits. It may take even more if the distribution over the options is not uniform and I have to attach a probability to each of them.
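Here is a minimal sketch of the arithmetic in Python (the variable names are mine, purely for illustration):

```python
from math import log2, comb

# 16 equally likely options narrowed down to 4: 2 bits gained about X.
info_gained = log2(16) - log2(4)          # 2.0 bits

# But merely identifying WHICH 4-element subset of {1, ..., 16} was
# named already takes log2(C(16, 4)) bits, far more than 2.
subset_description = log2(comb(16, 4))    # ~10.8 bits

print(info_gained, subset_description)
```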

Is there a term for this phenomenon?

In a mathematical context, this is basically just the observation that, for a random variable X, there are infinitely many random variables Yi with I(X;Yi) equal to a given value, so if Y represents the pair <i, Yi>, where i is itself a random variable independent of everything else, then I(X;Y) = I(X;Yi), but H(Y) can be made arbitrarily high. Or, for simplicity, let's ignore the first part and just say that for random variables X and Y, H(Y) can be much higher than I(X;Y). This is a rather dull observation in mathematics, because the two quantities are clearly different, we use different symbols to refer to them, and it is usually unambiguous which one we are talking about. But in philosophy we could easily say "some knowledge about X", and that could refer to either of them: the real information about X, or the information about X and about the approach used to find X, taken together. So I think some term is needed to address this problem.

In the example, the approach could be to partition the 16 values into 4 arbitrary lists of 4 elements each and report which list X belongs to. The report carries more information if we consider the approach not fixed in advance but chosen using an extra random variable.
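A small self-contained sketch of this construction in Python (my own illustration; the number of partitions K and the random seed are arbitrary). Y reports both which partition was used and which list X falls into, so I(X;Y) stays at 2 bits while H(Y) grows with the number of possible partitions:

```python
from math import log2
from collections import defaultdict
import random

random.seed(0)
values = list(range(1, 17))

# K arbitrary "approaches": each partitions the 16 values into 4 lists of 4.
K = 8
partitions = []
for _ in range(K):
    shuffled = random.sample(values, 16)
    partitions.append([shuffled[j:j + 4] for j in range(0, 16, 4)])

# Exact joint distribution of (X, Y): X uniform on 1..16, the approach i
# uniform on 0..K-1 and independent of X, and Y = (i, index of the list
# containing X under partition i).
p_xy = defaultdict(float)
for x in values:
    for i, part in enumerate(partitions):
        y = (i, next(j for j, lst in enumerate(part) if x in lst))
        p_xy[(x, y)] += (1 / 16) * (1 / K)

def entropy(dist):
    return -sum(p * log2(p) for p in dist.values() if p > 0)

p_x = defaultdict(float)
p_y = defaultdict(float)
for (x, y), p in p_xy.items():
    p_x[x] += p
    p_y[y] += p

H_Y = entropy(p_y)
I_XY = entropy(p_x) + H_Y - entropy(p_xy)
print(f"H(Y)   = {H_Y:.2f} bits")    # = log2(K) + 2 = 5 bits here
print(f"I(X;Y) = {I_XY:.2f} bits")   # = 2 bits
```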

This is especially important in a philosophical context because, when we discuss the concept of knowledge, the set of possible approaches is usually open-ended, and there is no default one if nothing is specified.

  • Are you talking about needing a prior distribution to get the (posterior) distribution after conditioning the prior on the "event itself"? How is the amount of space needed to record the options and their probabilities relevant here?
    – Conifold
    Commented Oct 16, 2023 at 13:48
  • What do you mean when you say "X is 3, 5, 9, or 14" has two bits of information? Commented Oct 16, 2023 at 14:42
  • @DavidGudeman It has reduced the 16 options to only 4 options.
    – user23013
    Commented Oct 17, 2023 at 1:33
  • 1
    This just sounds like Shannon information aka surprisal or information content of some event x consisting of several outcomes within a probability distribution, defined as -log P(x). In your second case with a greater range of the distribution the probability of the same said event would most likely decrease, ergo the surprisal now becomes larger… Commented Oct 17, 2023 at 7:05
  • 3
    I’m voting to close this question because it's not about philosophy but about math.
    – Olivier5
    Commented Oct 17, 2023 at 10:23

1 Answer


No. There are two misunderstandings about information entropy in the OP.

A sample from a random process does not have information (entropy), so there is no entropy/information defined for X itself, which is defined as "a random integer from 1 to 16" in the OP. Information is defined for the distribution. A uniform distribution over 16 states does indeed have 4 bits of information. If you really insist on assigning an entropy to a population of events consisting of a single instance, the information in it would be 0.
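A quick illustrative sketch of this point (assuming the standard Shannon entropy in bits):

```python
from math import log2

def entropy(probs):
    # Shannon entropy (in bits) of a probability distribution.
    return -sum(p * log2(p) for p in probs if p > 0)

print(entropy([1 / 16] * 16))  # uniform over 16 states -> 4.0 bits
print(entropy([1.0]))          # a single certain outcome -> 0.0 bits
```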

The labels of the states do not matter in the definition of the entropy. The fact that you chose to label the states in the restricted set with the particular symbols 3, 5, 9, 14 does not affect the entropy of the distribution over these 4 states.
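A tiny sketch of the relabeling point (illustrative only): the entropy depends only on the probabilities, so any four equiprobable labels give 2 bits.

```python
from math import log2

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

# Only the probabilities matter, not which symbols label the states.
dist_a = {3: 0.25, 5: 0.25, 9: 0.25, 14: 0.25}
dist_b = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
print(entropy(dist_a.values()), entropy(dist_b.values()))  # 2.0 2.0
```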
