
Questions tagged [information-theory]

A branch of mathematics/statistics used to determine the information-carrying capacity of a channel, whether used for communication or defined in an abstract sense. Entropy is one of the measures by which information theorists quantify the uncertainty involved in predicting the value of a random variable.

0 votes
0 answers
12 views

Mutual Information of nonadjacent nodes in Bayesian Network

How do you compute the mutual information of two non-adjacent nodes in a Bayesian network? In this case, what would $I(D;A)$ be? Would I need to take the conditional probabilities of all intermediate ...
asked by phylosopher
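For questions like the one above, one concrete route is to marginalise out every intermediate node and apply the mutual-information formula to the resulting joint. A minimal sketch, assuming a hypothetical chain A → B → D with made-up CPTs:

```python
# Minimal sketch (hypothetical CPTs): I(D;A) in a chain A -> B -> D,
# obtained by summing out the intermediate node B.
import numpy as np

p_a = np.array([0.6, 0.4])                     # P(A)
p_b_given_a = np.array([[0.9, 0.1],            # P(B|A=0)
                        [0.3, 0.7]])           # P(B|A=1)
p_d_given_b = np.array([[0.8, 0.2],            # P(D|B=0)
                        [0.25, 0.75]])         # P(D|B=1)

# Joint P(A, D) = sum_b P(A) P(B|A) P(D|B)
p_ad = np.einsum('a,ab,bd->ad', p_a, p_b_given_a, p_d_given_b)
p_d = p_ad.sum(axis=0)

# I(D;A) = sum_{a,d} P(a,d) log2 [ P(a,d) / (P(a) P(d)) ]
mi = np.sum(p_ad * np.log2(p_ad / np.outer(p_a, p_d)))
print(f"I(D;A) = {mi:.4f} bits")
```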
0 votes
0 answers
20 views

Differential entropy for comparing distributions

I want to use differential entropy to compare the outcome of Bayesian updating (multidimensional probability distributions) for different datasets. My parameters are different physical parameters i.e. ...
asked by Sobol
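One hedged way to make such comparisons concrete is a Gaussian approximation of the differential entropy from posterior samples, h ≈ 0.5 log((2πe)^d det Σ). The sketch below assumes each Bayesian update is summarised by an (n × d) array of draws; all numbers are synthetic:

```python
# Minimal sketch: Gaussian approximation of differential entropy (in nats)
# from the sample covariance of posterior draws.
import numpy as np

def gaussian_diff_entropy(samples):
    """samples: (n, d) array of posterior draws for one dataset."""
    d = samples.shape[1]
    cov = np.atleast_2d(np.cov(samples, rowvar=False))
    _, logdet = np.linalg.slogdet(cov)
    return 0.5 * (d * np.log(2 * np.pi * np.e) + logdet)

rng = np.random.default_rng(0)
post_a = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 0.5]], size=5000)
post_b = rng.multivariate_normal([0, 0], [[0.2, 0.0], [0.0, 0.2]], size=5000)
print(gaussian_diff_entropy(post_a), gaussian_diff_entropy(post_b))  # lower = tighter posterior
```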
2 votes
3 answers
104 views

Is feature-extraction and dimensionality-reduction a kind of compression?

I'm struggling to understand what these terms have in common: feature extraction, feature selection, compression, dimensionality reduction. Relatedly, the information / entropy in our data should always ...
asked by sueszli
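A small illustration of the compression reading of dimensionality reduction: PCA treated as lossy compression, judged by retained variance and reconstruction error. The data and component count below are arbitrary assumptions, not a general recipe:

```python
# Minimal sketch: dimensionality reduction as lossy compression.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2)) @ rng.normal(size=(2, 10))  # 10-D data lying near a 2-D subspace
X += 0.05 * rng.normal(size=X.shape)                      # plus a little noise

pca = PCA(n_components=2).fit(X)
X_compressed = pca.transform(X)            # 10 numbers per row -> 2 numbers per row
X_restored = pca.inverse_transform(X_compressed)

print("retained variance:", pca.explained_variance_ratio_.sum())
print("mean reconstruction error:", np.mean((X - X_restored) ** 2))
```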
1 vote
0 answers
27 views

Fourier transform in information transfer in biological neural network

Principles of Neural Design by Peter Sterling and Simon Laughlin describes a use of information theory in calculating the rate of information transfer in the brain. ...when successive signal states ...
asked by Leo Juhlin
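This is not the book's exact derivation, but the kind of quantity such calculations build on is the Shannon-Hartley rate of a band-limited Gaussian channel, R = B log2(1 + SNR). A minimal sketch with hypothetical numbers:

```python
# Minimal sketch (hypothetical bandwidth and SNR): band-limited Gaussian
# channel rate R = B * log2(1 + SNR).
import numpy as np

bandwidth_hz = 50.0      # hypothetical signal bandwidth
snr = 4.0                # hypothetical signal-to-noise (power) ratio
rate_bits_per_s = bandwidth_hz * np.log2(1.0 + snr)
print(f"{rate_bits_per_s:.1f} bits/s")
```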
0 votes
0 answers
4 views

Sample Complexity of BHT with varying degrees of (large) compression

In Communication-constrained hypothesis testing: Optimality, robustness, and reverse data processing inequalities, the following (up to some mild editing to highlight my question) is established. ...
asked by Mark Schultz-Wu
0 votes
0 answers
18 views

Interpretation of time series spectral entropy values wrt forecastability by a general neural network

I recently started using spectral entropy to analyze time series (already windowed). I'm having difficulty interpreting the results: the entropy of the last 25% of a series is 0.19, and the ...
asked by Marco
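A minimal sketch of one common definition of spectral entropy, which also explains values like 0.19: the Shannon entropy of the normalised power spectrum divided by the log of the number of frequency bins, so the result lies in [0, 1] (near 0 = concentrated spectrum, near 1 = flat, noise-like spectrum). The signals below are illustrative only:

```python
# Minimal sketch: normalised spectral entropy of a window.
import numpy as np
from scipy.signal import periodogram

def spectral_entropy(x, fs=1.0):
    _, psd = periodogram(x, fs=fs)
    p = psd / psd.sum()          # normalise the power spectrum to a distribution
    p = p[p > 0]
    return -np.sum(p * np.log(p)) / np.log(psd.size)   # scaled to [0, 1]

t = np.arange(1024)
print(spectral_entropy(np.sin(0.1 * t)))                              # low: one dominant frequency
print(spectral_entropy(np.random.default_rng(0).normal(size=1024)))   # high: white noise
```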
2 votes
0 answers
41 views

Shannon source coding theorem and differential entropy

Loosely speaking, Shannon's source coding theorem says that there is an encoder with rate at least $H(X)$ such that $n$ repetitions of the source can be mapped to at least $nH(X)$ bits of binary ...
asked by nervxxx
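A quick numeric reading of the theorem for a discrete source (this does not touch the differential-entropy part of the question): n i.i.d. Bernoulli(p) symbols compress to roughly nH(X) bits.

```python
# Minimal numeric illustration of the source coding theorem for a Bernoulli(p) source.
import numpy as np

p = 0.1
H = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))   # entropy in bits/symbol
n = 10_000
print(f"H(X) = {H:.3f} bits/symbol -> about {n * H:.0f} bits for {n} symbols")
```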
1 vote
1 answer
59 views

Chain rule conditional entropy

A textbook I am reading states that$$H(X,Y)=H(X)+H(Y|X)$$where $H(X,Y)$ is the joint entropy of random variables $X,Y$, $H(X)$ the entropy of $X$, and $H(Y|X)$ is conditional entropy. It then states ...
asked by user124910
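The identity in that question can be checked numerically on any small joint table; a minimal sketch with a made-up 2 × 2 joint distribution:

```python
# Minimal numeric check of the chain rule H(X,Y) = H(X) + H(Y|X).
import numpy as np

p_xy = np.array([[0.30, 0.10],
                 [0.20, 0.40]])            # rows: x, columns: y (sums to 1)

def H(p):                                  # Shannon entropy in bits
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

p_x = p_xy.sum(axis=1)
H_xy = H(p_xy.ravel())
H_y_given_x = sum(p_x[i] * H(p_xy[i] / p_x[i]) for i in range(len(p_x)))
print(H_xy, H(p_x) + H_y_given_x)          # the two numbers agree
```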
0 votes
0 answers
21 views

How do you choose to put a distribution on the right or left of KL divergence? [duplicate]

I always thought of KL divergence as a distance metric between distributions, much like the Earth Mover's distance. But I can no longer ignore the asymmetry. A real distance metric is symmetric. How should ...
asked by profPlum
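The asymmetry is easy to see numerically; a minimal sketch with two arbitrary discrete distributions (scipy's entropy(p, q) computes D_KL(P||Q) in nats):

```python
# Minimal sketch: D_KL(P||Q) and D_KL(Q||P) on the same pair of distributions differ.
from scipy.stats import entropy

p = [0.80, 0.15, 0.05]
q = [0.40, 0.40, 0.20]
print("D(P||Q) =", entropy(p, q))   # expectation taken under P
print("D(Q||P) =", entropy(q, p))   # expectation taken under Q
```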
1 vote
0 answers
28 views

Mutual Information decay

Consider $m$ channels indexed by $i$ with $1 \leq i \leq m$. The input alphabets are from the same finite set $\mathcal{X}$. Let $\pi$ denote a probability distribution on $\mathcal{X}$. Define the ...
asked by Sushant Vijayan
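This is not the poster's exact setup, but the flavour of mutual-information decay can be sketched by cascading binary symmetric channels, in line with the data processing inequality; all channel parameters below are made up:

```python
# Minimal sketch: I(X; output) shrinks as more binary symmetric channels are composed.
import numpy as np

def mutual_info(p_x, channel):             # channel[x, y] = P(Y=y | X=x)
    p_xy = p_x[:, None] * channel
    p_y = p_xy.sum(axis=0)
    mask = p_xy > 0
    return np.sum(p_xy[mask] * np.log2(p_xy[mask] / np.outer(p_x, p_y)[mask]))

eps = 0.1                                  # crossover probability of each BSC
bsc = np.array([[1 - eps, eps], [eps, 1 - eps]])
p_x = np.array([0.5, 0.5])

channel = np.eye(2)
for i in range(1, 6):
    channel = channel @ bsc                # compose i copies of the channel
    print(f"after {i} channels: I = {mutual_info(p_x, channel):.4f} bits")
```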
3 votes
0 answers
283 views

Minimum Description Length, Normalized Maximum Likelihood, and Maximum A Posteriori Estimation

TL;DR: I believe MDL using NML is a special case of the joint MAP of model and parameters, and I need to verify this and find sources that have acknowledged this. This is how I understand Minimum ...
asked by Feri
0 votes
1 answer
43 views

Is there any work linking information channel theory to statistical inference?

I wonder what the theoretical limit of a statistical inference problem is. For example, we have a model with many parameters, and we can sample many data points from the model. This can be viewed as a ...
asked by yuanyi_thu
1 vote
0 answers
28 views

Choosing the number of lags AND the model form for an Augmented Dickey-Fuller test

Before running an Augmented Dickey-Fuller (ADF) test, one has to answer two questions: how many lags p to include in the model, and which model to choose among the following: no constant, no trend ...
asked by cp123456
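A minimal sketch of how both choices are commonly handled with statsmodels: let autolag pick the lag order by an information criterion, and run the test under each candidate deterministic specification via the regression argument. The series here is a synthetic random walk, purely for illustration:

```python
# Minimal sketch: ADF test with information-criterion lag selection and
# different deterministic terms ('n' = none, 'c' = constant, 'ct' = constant + trend).
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=500))        # a random walk, for illustration

for reg in ("n", "c", "ct"):
    stat, pvalue, usedlag, nobs, crit, icbest = adfuller(y, regression=reg, autolag="AIC")
    print(f"regression={reg!r}: lag chosen={usedlag}, ADF stat={stat:.3f}, p={pvalue:.3f}")
```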
1 vote
1 answer
44 views

Minimum entropy decomposition of probability distributions

Say you want to decompose a probability distribution (a PDF) into a mixture of distributions in such a way as to minimize the mean entropy of the component distributions. I have an idea that this is ...
asked by zonofzin
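A hedged sketch of one candidate decomposition (not a proven minimiser): fit a Gaussian mixture and evaluate the weight-averaged differential entropy of its components, 0.5 log(2πeσ_k²). The data and component count are arbitrary:

```python
# Minimal sketch: mean component differential entropy of a fitted Gaussian mixture.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 0.5, 1000), rng.normal(3, 1.0, 1000)])[:, None]

gm = GaussianMixture(n_components=2, random_state=0).fit(x)
sigma2 = gm.covariances_.ravel()                       # component variances (1-D data)
component_entropy = 0.5 * np.log(2 * np.pi * np.e * sigma2)
mean_entropy = np.sum(gm.weights_ * component_entropy)
print("mean component entropy (nats):", mean_entropy)
```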
1 vote
0 answers
16 views

Effect on entropy when we scale Bernoulli plus Gaussian

Question: Given $X\sim\text{Bernoulli}(\alpha)$, $Y\sim\mathcal{N}(0,1)$, and non-random positive constants $C,\epsilon>0$. Let $H(\cdot)$ be the differential entropy. Is it true that $$ H((C+\...
asked by Resu
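One property such scaling questions usually turn on is H(cZ) = H(Z) + log c for differential entropy and a constant c > 0. A closed-form illustration for a Gaussian Z (the Bernoulli-plus-Gaussian mixture obeys the same scaling rule, though its entropy has no simple closed form):

```python
# Minimal sketch: scaling property of differential entropy, H(cZ) = H(Z) + log c.
import numpy as np

def h_gaussian(var):                       # differential entropy of N(0, var), in nats
    return 0.5 * np.log(2 * np.pi * np.e * var)

c = 2.5
print(h_gaussian(c**2 * 1.0))              # H(cZ) for Z ~ N(0, 1)
print(h_gaussian(1.0) + np.log(c))         # H(Z) + log c  -> same number
```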
