24

I am fascinated with information theory, as put together by Claude Shannon in the 1940s. It is amazing to me that this concept arose from analysing letters in the alphabet and then was later abstracted to black holes. But what I find lacking is the definition of what information actually is.

Wikipedia's page on information theory gives me this very early on:

A key measure in information theory is entropy. Entropy quantifies the amount of uncertainty involved in the value of a random variable or the outcome of a random process.

It seems to me that the definition of information already moves the goalpost further at a very early point. It establishes that information has to do with entropy: this other thing. The very first line on Wikipedia's page about information itself states:

Information can be thought of as the resolution of uncertainty; [...]

The "can be thought of" there throws me off a bit.

From the comments on this other PSE question I gathered that information theory's entropy correlates with entropy in thermodynamics. Which is great, but it doesn't define what information is, just that it abides by a law similar to the one governing thermodynamic systems.

Another comment stated that information is physical (leading to this empty Wikipedia page) and that it may be analogous to energy (SEP).

So it seems to me that information and energy are related concepts. From Wikipedia's page on energy we get this very precise, very physical definition of what it is:

In physics, energy is the quantitative property that must be transferred to a body or physical system to perform work on the object, or to heat it. Energy is a conserved quantity; the law of conservation of energy states that energy can be converted in form, but not created or destroyed.

However, another PSE answer states something that seems contradictory:

Information is a non-physical concept, [...]

Thus it sounds like we have a pretty good grasp of what energy is and that it is affected by entropy. However, we do not seem to have defined information except for how it is also affected by entropy.

Entropy is most visible when a system changes from one state to another. Entropy in information theory also arises when something is communicated from a sender to a receiver. But that sounds to me like defining water as "the thing that goes through a pipe".

Thus the question is: what is information?

Is it a quantity, like the number "2" in "the two apples on the table"? Or is it a quality, like the roundness and sweetness of the fruit that makes it an apple? Or is it the apple itself (either as a Kantian "apple-in-itself", inaccessible to us, or a particular approximation of the apple)?

A follow-up question then is: is there more contemporary work on defining it?

Moreover: shouldn't there be now a Philosophy of Information as a field of enquiry?


Edit 1:

It has been noted in the answers and comments that defining information, or defining energy, is fruitless. To summarise and quote the argument: “the more we investigate nature, the more we fail to get anything but abstract math.”

There has also been discussion of the correlation between Shannon entropy and Boltzmann entropy, the latter arising from the transformation of a thermodynamic system from state A to state B and from the relation between the micro and macro states of the system in states A and B.

So perhaps a more refined question would then be: if Boltzmann entropy happens when heat or pressure is transformed in a thermodynamic system, what is being transformed when Shannon entropy arises?


Edit 2:

Just to reiterate, I'm not looking for the meaning of the word "information". I'm looking for the phenomenological study on information as "a thing" that exists in the universe.

It has also been suggested that it is the reduction of uncertainty in a symbolic system. Examples were given using a deck of cards or dice to illustrate the point, and it has been raised that the uncertainty in those systems is subjective. If we don't know the sequence of the cards on the deck, there's more uncertainty there. However, this is too narrow an approach. Say I came from a planet where we store decks of cards in the precise order that earthlings call random. I would then have more information about the deck than the earthling, which shows that information is subjective. But it is only subjective because decks of cards and dice are things that earthlings make!

Contrast that with the debate over whether information is lost at the event horizon of a black hole. Is that information subjective too? The no-hair theorem postulates that only mass, electric charge and angular momentum are preserved when a body falls into a black hole's event horizon. Is angular momentum subjective too?

So it seems to me that information is not subjective. It is subjective when we apply it to things that are particular to us. But there's strong indication that it is a "thing" that "happens" in spite of us. What is this "thing"?

I think this is a question worthy of philosophical exploration.


Note 1: I'm not looking for a semantic definition of information, I'm looking for the epistemological definition of the concept.

Note 2: I'm a long-time lurker, but first-time asker, and not a trained philosopher, so please correct any mistakes in my question.

10
  • Comments are not for extended discussion; this conversation has been moved to chat.
    – Geoffrey Thomas
    Commented Jun 7, 2021 at 18:34
  • Regarding your first edit: If you are not looking for the meaning of the word "information" you should not ask "What is information?". Seriously. Secondly, if you studied the "phenomenology" of a thing you'll have a pretty good idea what the word means, so you actually are looking for the meaning by another name. Commented Jun 7, 2021 at 21:11
  • I feel this should be a 'protected question', for those who can change that. Just wanted to say as well, really well asked question - good work @BellAppLab!
    – CriglCragl
    Commented Jun 8, 2021 at 0:26
  • Could you take that back and absorb the Edits and Notes into the original Question, please? Commented Jun 8, 2021 at 0:45
  • @BellAppLab It's not really an answer, but I do know Luciano Floridi of Oxford has a couple of books on the subject; his page is at philosophyofinformation.net/about . Meanwhile, see the wiki page for more on Information Science!: en.wikipedia.org/wiki/Information_science Commented Jun 8, 2021 at 11:34

14 Answers

18

First, to speak very broadly, information (in the mathematical sense of information theory) is a quantity calculated from a probability distribution.

In the Bayesian interpretation of probability, probability is a person's subjective degree of belief in a proposition. The consequence is that information is also subjective, and is relative to an individual's prior beliefs. A message may convey a lot of information to a person who was unaware of it, while the same message conveys less information to a person who already knew of it.

Information is measured as some number of bits. This does not necessarily correspond to the actual number of physical ones and zeros in a message. It is instead a theoretical minimum number of bits needed. Specifically, it is the minimum number of bits necessary, on average, to communicate which of a set of random outcomes actually occurred.

On the one hand we talk about a probability distribution involving many outcomes with a different message sent for each outcome, and on the other hand we talk about information in an individual message. For this we need to make a distinction between "entropy" and "surprisal." First we have the information entropy - the Shannon entropy. This is a quantity defined as - ∑_i pᵢ log pᵢ, where pᵢ is the probability of the i'th outcome, and the logarithm is taken as base 2 (if we want a result measured in bits). As mentioned, this is the minimum number of bits necessary on average to tell someone which outcome occurred. Entropy is a property of the whole probability distribution.

Then we have the "surprisal" (also called self-information), which in many ways is closer to a person's intuition about the information contained in a single message. The surprisal is defined for a single outcome i, as -log pᵢ. Outcomes of very low probability have very high surprisal. In an optimal code for the whole distribution, we would expect each message to require about -log pᵢ bits to communicate that particular message. The optimal code would assign short code words to common outcomes, and would assign longer code words to less common, or more "surprising," outcomes.
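
To make these two quantities concrete, here is a small sketch with an assumed toy distribution (my own illustration, not anything from the answer above) that computes the entropy of a distribution and the surprisal of each outcome in bits:

```python
# Toy sketch: Shannon entropy of a distribution and the surprisal of each
# outcome, both in bits (log base 2). The distribution is an assumed example.
import math

p = {"sunny": 0.5, "cloudy": 0.25, "rain": 0.125, "snow": 0.125}

def surprisal_bits(prob):
    return -math.log2(prob)          # -log2 p: the "surprise" of one outcome

entropy_bits = sum(prob * surprisal_bits(prob) for prob in p.values())

for outcome, prob in p.items():
    print(f"{outcome:6s} p={prob:5.3f}  surprisal={surprisal_bits(prob):.1f} bits")
print(f"entropy (average surprisal) = {entropy_bits:.2f} bits")

# An optimal code would spend about -log2 p bits per outcome:
# sunny -> 1 bit, cloudy -> 2 bits, rain and snow -> 3 bits each,
# for an average of 0.5*1 + 0.25*2 + 0.125*3 + 0.125*3 = 1.75 bits = the entropy.
```

The same message ("rain", say) costs 3 bits for someone holding this distribution, but fewer for someone whose prior already made rain likely, which is the subjectivity discussed next.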

Naturally, the surprisal of a message depends on who is hearing it. If you already expected the message, then in your internal probability distribution the probability of that message is high, so to you the surprisal (and the psychological surprise) will be low. But if it is something astounding to you, then you assign a low probability to that outcome, so the surprisal is high.

The connection to thermodynamic entropy is simply that we use probability in thermodynamics, so the same concepts apply. We can look at a box of gas molecules and ask what is the probability that each molecule will have a particular velocity and position (up to a certain, predefined precision). The collection of all these velocities and positions for every molecule in the box is called a "microstate," and a microstate is one of those outcomes, i, that were mentioned earlier. Now, if we know that the box is at 50 degrees Celsius and 1.5 atmospheres, then we expect a certain distribution of positions and velocities for the gas molecules. This distribution gives us the probability pᵢ for each microstate. Then we can calculate the thermodynamic entropy - which is just the Shannon entropy of the distribution.

(Well, with some technicalities. It's actually - k_B ∑_i pᵢ ln pᵢ, where k_B is Boltzmann's constant and ln is base e instead of base 2. This is the Shannon entropy multiplied by k_B ln 2, which gives it units and makes it more convenient for physicists).

This entropy tells us, essentially, the minimum number of bits we would need to describe the entire state of every molecule in the box, up to some predefined level of precision, to someone who already knows that the box is at 50C and 1.5 atmospheres.

This description, "50C and 1.5 atmospheres," is what's called a "macrostate." We have described some large-scale properties of the box, without being specific about where each molecule is. The macrostate yields a probability distribution over microstates, and this allows us to determine the entropy of the macrostate.
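
As a toy illustration of the microstate/macrostate picture (my own sketch, with an artificial two-state "gas" standing in for the box of molecules):

```python
# Toy "gas" of N = 100 two-state molecules. A macrostate fixes only how many
# molecules (n) are in the excited state; the C(N, n) compatible microstates
# are treated as equally probable, so S = k_B ln Ω, which is k_B ln 2 times
# the Shannon entropy (in bits) of the uniform distribution over microstates.
import math

k_B = 1.380649e-23  # J/K
N = 100

for n in (0, 10, 50):
    omega = math.comb(N, n)              # microstates compatible with the macrostate
    bits = math.log2(omega)              # bits needed to single out one microstate
    S = k_B * math.log(omega)
    print(f"n={n:3d}: Ω={omega:.3e}  needs {bits:5.1f} bits  S={S:.2e} J/K")

# The n=50 macrostate is compatible with by far the most microstates, so it
# has the highest entropy: pinning down its exact microstate costs the most bits.
```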

One last thing. Doesn't the Bayesian interpretation of probability mean that thermodynamic entropy is subjective too? The answer is yes. Different observers, with different probability distributions over the microstates of a system, may calculate different thermodynamic entropies for the same system. They may even use this to extract different amounts of mechanical work from the system. Essentially, to one person, a system may seem very random, and high entropy, and therefore they can't extract much work from it. To another person, who knows more about the system, the system is less random, and they can use the patterns they see to extract useful work from it. Maxwell's demon is an example of this. A machine that uses extra information to extract thermodynamic work is called an "information engine." See for example, https://phys.org/news/2018-01-efficiency.html . Note, however, that the requirement to gather information about the system consumes more energy than the information engine can extract.

15
  • I think I understand your explanation. But let me try a thought experiment. If Ei is the total energy of a system at its initial state and Ef its final total energy after some transformation, we should expect that Ef = Ei - entropy (please correct me if I'm wrong). We know that between Ei and Ef something changed (say the temperature has reached equilibrium between a hot and a cold glass of water). It is observable that something has been transformed. But what is it that has been transformed or transferred when we incur Shannon entropy? Commented Jun 6, 2021 at 13:03
  • @BellAppLab - That equation cannot be right since thermodynamic entropy does not have units of energy, rather it has units of energy/temperature. The only minor change I would suggest to causative's answer is the statement "the thermodynamic entropy - which is just the Shannon entropy of the distribution"--the thermodynamic entropy is actually the Shannon entropy times Boltzmann's constant, and that's why it has dimensions of energy/temperature (Shannon entropy is just a number with no units, what physicists would call 'dimensionless')
    – Hypnosifl
    Commented Jun 6, 2021 at 15:50
  • It is true that if a system goes through a transformation that involves an exchange of energy with an external thermal reservoir but no other changes to the system's macro-variables (variables like volume and particle number are held constant), then the change in internal energy dU is equal to T*dS, where T is the temperature and dS is the system's change in entropy, see the thermodynamic identity. In this case dS = dU/T, where dU is the heat that flowed between the system and the reservoir.
    – Hypnosifl
    Commented Jun 6, 2021 at 15:57
  • @Hypnosifl Yes, that’s what I mean. My equation is wrong and very crude but what I’m getting at is: take the total temperature of the closed system before mixing cold and hot water and calculate that in terms of energy. Then mix the two, and when the system reaches equilibrium, the total energy remains the same, but the overall temperature is lower and the difference between the two is lost due to entropy. (Right?) If that’s right, then what is the transformation that is happening when Shannon entropy occurs? Commented Jun 6, 2021 at 19:11
  • @BellAppLab: It's conventional to say Shannon entropy is about the conversion of signal into noise. It's a nice concretisation of the 2nd law of thermodynamics to give to free-energy freaks, to look at signal recovery/error correction, and 2nd-law-analogous: en.wikipedia.org/wiki/Noisy-channel_coding_theorem You can't get more signal out than goes in (following the resolution to Maxwell's demon, to get 'more signal' would cost measurements & calculations that outweigh the benefit). Heat is like molecular noise.
    – CriglCragl
    Commented Jun 7, 2021 at 23:08
9

You wrote:

Thus it sounds like we have a pretty good grasp of what energy is

Actually, according to Richard Feynman we do not know what energy is:

It is important to realize that in physics today, we have no knowledge of what energy is. We do not have a picture that energy comes in little blobs of a definite amount. It is not that way. However, there are formulas for calculating some numerical quantity, and when we add it all together it gives “28”—always the same number. It is an abstract thing in that it does not tell us the mechanism or the reasons for the various formulas.

It seems to me that what Feynman says about Energy applies to Shannon entropy as well.

That problem applies to other concepts as well. Check out this beautiful video of Feynman explaining that we do not know what magnetic and electrical force "really" are:

https://www.youtube.com/watch?v=Q1lL-hXO27Q

Speaking of electrical and magnetic forces of attraction he says:

I can't explain that attraction in terms of anything else that's familiar to you. For example, if I said the magnets attract like as if they were connected by rubber bands, I would be cheating you. Because they're not connected by rubber bands ... and if you were curious enough, you'd ask me why rubber bands tend to pull back together again, and I would end up explaining that in terms of electrical forces, which are the very things that I'm trying to use the rubber bands to explain, so I have cheated very badly, you see.

10
  • Thank you for transcribing that excerpt from the video. That is a very powerful anecdote - I didn't expect any less from Feynman. It is said that science deals with the "hows" but not the "whys" and I understand that. However, isn't it the job of philosophy to delve in the "whys"? In other words, can it be the case that we are lacking a Philosophy of Information as a field of enquiry? Commented Jun 6, 2021 at 9:33
  • Feynman argues against the general idea that physicists should try to figure out what any physics concept "really is" in non-mathematical terms, or that we need a conceptualizable "mechanism" to explain a mathematical law, see ch. 2 of The Character of Physical Law, "The Relation of Mathematics to Physics." He gives an example of a failed attempt to "explain" Newton's law in terms of little particles filling space which are at lower pressure between massive bodies, shows why it doesn't work, and then says that the more we investigate nature the more we fail to get anything but abstract math.
    – Hypnosifl
    Commented Jun 6, 2021 at 16:08
  • @BellAppLab If philosophy ventures to explain the "whys" that physicists can't answer, it fully enters the domain of many meaningless, though possibly beautiful words, imho. A philosopher has no more information than scientists have, and if they enter the realm of the untestable (which every good scientist is strict to avoid), the result is invariably hot air at best, and about as valuable as Donald Trump claiming something to be a fact. What they say might be wrong, or it might even be right, but the fact that they say it is not correlated with whether it's true or not. Commented Jun 7, 2021 at 20:18
  • It seems to me that science has stumbled upon - more than once - this large blob of stuff and called it information. It’s very opaque and we have no clue what it does. If philosophy is a toolbox, then we might as well pull it out and start probing. We mess around with it, conjecture, poke it with a philosophical stick to see what it does. Maybe we never arrive at what exactly it is (most of science and philosophy never seem to “arrive” anywhere), but we learn something and make some progress. Commented Jun 8, 2021 at 11:13
  • @CriglCragl - I don't take Feynman to be attacking all philosophy in those quotes, just a type of philosophy which seeks to find some "intrinsic nature" to physical entities independent of their mathematical relations given by the laws of physics. Structural realism is a philosophical view which rejects any such notion of intrinsic non-relational essences, and Sean Carroll seems favorably inclined to it, see his discussion with James Ladyman who advocates this view.
    – Hypnosifl
    Commented Jun 8, 2021 at 16:11
3

Allow me to indulge myself and baldly theorize...

Information (as I see it) is a field of relationship. The two necessary conditions of having information are:

  1. The ability to separate and distinguish one object/event from another object/event
  2. The ability to establish a relation between these objects/events (e.g., causation, correlation, opposition, equilibration, similarity, difference, etc.).

For instance, in order to say that an electron has a negative charge, we must first distinguish the electron from some other particle (e.g., a proton, neutron, positron, etc), and then note that the other particle has a different charge. If we only have two electrons (or one electron and one anti-proton), we might be able to distinguish them from each other on other grounds (mass, location, velocity, etc), but we have no information about their charge. There would be nothing to compare charge to, and thus no way of measuring it.

I think of information as a 'field' of relationship because it's clear to me that information is not a property of any specific object/event, but something that lies between objects/events. It's wiser to talk about the potential for information that lies in a defined context than to talk about information as though it were a concrete object that could be accessed independently. By analogy, this is a bit like gravity: it makes no sense to talk about gravity if all we have is one point mass; we must instead talk about the potential for gravitational attraction that would occur if we were to introduce a second point mass. That potential is determined by laws of physics. Information is the same way: we have no information about an object/event unless we start to consider the field of potential relationships it might have to other objects/events.

4
  • I like this answer because it’s the first that attempts to define information as “something”. It doesn’t involve an observer with knowledge of what a system is expected to do and thus capable of identifying “surprises”. If a pair of matter/antimatter particles spontaneously appears at the event horizon of a black hole, with one particle falling into it, the black hole doesn’t go “oh, that’s a surprise! I’m going to emit radiation because I’m surprised”. I’m being cheeky, of course. Commented Jun 7, 2021 at 9:27
  • Although the “surprise” factor is relevant in understanding information theory, it only exists if an observer exists. In other words, when we get a surprise in some outcome, it is the observer’s surprise we are talking about, not the universe’s “surprise”. This is my biased, realist view though, that information may be a phenomenon (“a thing”) that exists a priori. Perhaps it isn’t, but isn’t it the job of philosophy to undertake these kinds of questions? Commented Jun 7, 2021 at 9:36
  • It is important to note entropy is always relative, between two situations - before & after doing work, different heat-reservoirs, either end of a channel etc. It may be impossible to define the entropy of the universe. @BellAppLab: We could say the 2nd Law is one limiting 'surprise'. A black hole will decay by Hawking radiation, exactly because in an open state with a chunk of space, a net less-surprising state is possible. You could frame Shannon entropy in terms of 'surprise' too.
    – CriglCragl
    Commented Jun 8, 2021 at 0:11
  • @CriglCragl - Defining the entropy doesn't require two situations at different times, although of course most useful applications of physical entropy involve statements about how it changes (or doesn't change) over some period of time. But the Boltzmann entropy just depends on coarse-graining the set of all possible current microstates of a system into macrostates, while the Gibbs/Von Neumann entropies just require that you have a probability distribution on current classical microstates/quantum states.
    – Hypnosifl
    Commented Jun 8, 2021 at 16:26
3

Perhaps we might constructively refer to the original source! Shannon's Mathematical Theory of Communication (1948) starts with the following:

The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point ... The significant aspect is that the actual message is one selected from a set of possible messages. The system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design.

There are two concepts here of a particularly mathematical character that are doing work: The set of "possible messages", and the point at which messages are "selected" and "reproduced"

Shannon outlines his five-component system to understand how a Source produces messages that are sent along a channel, via a transmitter and a receiver, to a Destination. In this way, he explores how communication can be affected by the introduction of noise to the system, and that the connection from Source to Destination is not always easy, but that strategies of compression and error correction can produce a maximum reliability given the capacity of the channel.

In his discussion of the nature of Sources, Shannon discusses messages of various "types". Sequences of letters, functions of time, light intensity, multi-dimensional colour representations in a vector field. All of these things could still be understood as having potential values in an underlying set, and these sets can have more or less structure depending on the needs of the communication - we might think of them as sets of varying degrees of algebraic richness - and the discussion that follows seems to apply to all equally well.

Shannon acknowledges the work of abstraction in a short comment:

To do this it is first necessary to represent the various elements involved as mathematical entities, suitably idealized from their physical counterparts.

He argues that sources can be modelled as stochastic processes - specifically, he proposes that they are ergodic stochastic Markov processes, well defined in their behaviour over time and with well defined limits of sequences - and this helps him extract interesting properties for his measure of the information quantity of a message - it is how much choice or uncertainty there is in receiving this message rather than any other.

This is where the Entropy measure comes into play, and Shannon doesn't argue for this measure because of its connection to thermodynamics (though he does note the interesting parallel), but simply because it meets the criteria he outlines for what uncertainty should mean; the function is continuous, it is monotonically increasing with the number of equally likely events, and the uncertainty of a compound choice should be a weighted sum of the uncertainty of its individual component values. The measure is specifically just what is needed to satisfy these desiderata.
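
As a quick numerical check of that compound-choice condition (my own sketch, using the classic decomposition of a three-way choice with probabilities 1/2, 1/3, 1/6 into a fair binary choice followed, half the time, by a 2/3 vs 1/3 choice):

```python
# The grouping/compound-choice property the entropy measure must satisfy:
# H(1/2, 1/3, 1/6) should equal H(1/2, 1/2) + (1/2) * H(2/3, 1/3).
import math

def H(*probs):  # Shannon entropy in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

direct = H(1/2, 1/3, 1/6)
two_step = H(1/2, 1/2) + 0.5 * H(2/3, 1/3)

print(direct, two_step)                  # both ≈ 1.459 bits
assert math.isclose(direct, two_step)
```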

You might find it helpful to think of it as a concept of "probabilistic distance" in a topology of the states in a message-generating process - those states that are less likely have greater uncertainty, further distance, to travel, which is what makes it "more interesting" to arrive there.

What this measure is is just an abstract class of mathematical measure function defined over the mathematical objects that we take to represent some underlying phenomenon. If you want to understand this in terms of the thing you are representing, what part of the system is "the measure" has to be understood in the semantics of what is represented. Shannon acknowledges this in his paper, but that's not what he's trying to achieve by extrapolating the interesting mathematical structures.

What might help you progress is that in mathematical representation, the set membership criterion for the possible messages is usually about making sense of the idea of "sameness" of possible messages - we tend to use sets in maths because we want to say that one and the same mathematical operation is involved in membership. Representation theory likes sets because they are "concrete" - the notion of identity at work is simple and basic to the discipline in the form of "extensionality". When we come to apply such representations, we're often doing so to ground the basic distinctness notions so we can then use our more revealing algebraic, abstract methods in generality.

Semantically, then, we might want to say there are interesting things about the sameness and distinctness of messages beyond simply the extensional understanding of set theory. What does it mean to identify two messages as "the same" or as "replicating" a message in practice? Human communication is filled with very conceptually deep notions of the distinctness and identity of messages, individuals, images, signs and so on - this is why the actual practice of Library and Information Studies is really connected to the human scale, of the work of people in communities and their associated psychological tools and behaviours.

For Shannon, this is part of the engineering, part of the application of the maths to the real world, rather than the understanding of information as such, and it's still an interesting venture to understand how to interpret this topology and its features in different conceptual and practical settings. But it doesn't really factor in to the validity of his mathematical work, or even the "substance" of what is being worked on - that is for people to afterwards apply, no more a problem for information than it is for number or set.

2
  • This is an outstanding summary of information theory; is this trichotomy of information, representation, and, let's say, human-centric semiotics your insight or did you glean it elsewhere?
    – J D
    Commented Jun 11, 2021 at 18:11
  • @JD, thank you! I attended lectures by Cambridge computer scientist John Daugman ( cst.cam.ac.uk/people/jgd1000 ) in both information theory and computer vision as an undergraduate, so I'm sure this particular read of Shannon probably has some sort of genesis in his work. Commented Jun 14, 2021 at 16:21
2

Information means different things in different contexts.

Shannon information theory is a mathematical edifice and therefore defines information and its manipulation in mathematical terms. However it tells you nothing about what the information might mean.

The meaning or semantics of an item of information is sometimes described as its context. For example, if the answer is 2, it helps to know that the context is the number of wheels on a bicycle. (Others would argue with that definition of meaning, but it serves here.)

Avoiding such issues of meaning is a necessary part of any rigorous mathematical system, such as information theory. But since we, as cognitive thinking machines, inherently require information to be "about" something, Shannon's approach can never wholly satisfy our philosophical side.

1

Cloning

I assert that at the most fundamental level, information is the necessary and sufficient substance needed to make one blob of reality identical to another blob of reality (assuming you have an appropriate amount of spacetime + energy available for the operation). That is, if you take two volumes of spacetime with the same energy, the only difference between them is the information encoded therein.

Unfortunately, quantum mechanics tells us that it is physically impossible to clone a quantum state, so my definition is not realizable in practice (which some pedants might use to claim that it invalidates the definition in theory).

Separability

Now, in order to tell whether we have successfully cloned a bit of spacetime-energy, we need to be able to tell whether two such blobs are distinct or identical. Again, we are at a loss to realize this operation physically because of quantum uncertainty. And yet, we have equations which describe non-identical physical states, and we behave as if those equations correspond to reality. Thus, distinguishable configurations of energy correspond to entropic/thermodynamic microstates, and form the foundation of information. Bits, then, are simply a way of counting and keeping track of these microstates.
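
To put a rough number on "bits as a way of counting microstates" (a sketch of my own, with an arbitrary microstate count):

```python
# If a blob of spacetime-energy has Ω distinguishable microstates, then fully
# specifying which one it is in takes about log2(Ω) bits. Ω here is arbitrary.
import math

omega = 10**30                          # assumed number of distinguishable microstates
bits = math.ceil(math.log2(omega))
print(bits)                             # 100 bits are enough to label every microstate uniquely
```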

Because the microstate encodes everything about the particles and waves in a particular spacetime-energy blob, one could argue that information is actually the most fundamental physical entity, and that physics is just a bunch of bits sloshing around under a particular set of rules. This idea is what led Edward Fredkin to claim that the universe is just a big computation.

Hierarchy

Of course, we overload the word "information" to mean lots of things, which is why there are lots of answers (even within the answers). The fact is that humans cannot usefully deal with quantum microstates, and so we summarize them as macrostates at various levels of description. Each level blurs the details at the lower levels, and focuses on emergent properties which are only relevant at higher levels. There's no such thing as a "peach electron" or an "alligator phospholipid". At the level of electrons, you can't really describe a peach, and at the level of individual biomolecules, you can't really describe an alligator, let alone a predator/prey network.

And yet, peaches and alligators are real objects, even if we don't have quantum (or chemical, or even genomic) equations to describe them precisely. Peaches only emerge at the level of macrobiology, and so, we cannot simply say: "information is a bunch of quantum states" and call it a day. Because information is ultimately hierarchical, and is perfectly happy with this blurring of microstates into macrostates, turning them into the microstates of the next higher level.

At the same time, there are no universally unique labels for these macrostates, which means there are an unbounded number of equivalent descriptions for the emergent entities. And since different information processors can label these macrostates differently, we have all the fuzziness of language and knowledge to contend with. But the important point is that high-level fuzzy knowledge is, IMO, just as fundamental as the quantum microstates. It's just the non-uniqueness of the labelling which makes it feel derived.

Is the pattern of photons on a screen information? Yes. Is the bond angle of hydrogen in a water molecule information? Yes. Is the conformation of a protein in a cellular nucleus information? Yes. Is the metabolic history of a tree information? Yes. Is the time evolution of a galaxy information? Yes. It's all information, because it all describes what a particular universe looks like, and how it is different from a universe which does not look exactly like that.

We have tools to measure information at many levels, because information exists at many levels. Does Bayesian reasoning yield information? Yes. Does frequentist reasoning yield information? Yes. Does mathematical logic yield information? Yes. Does sociological observation yield information? Yes. These are all sources of information, because they all give data to separate this universe from a counterfactual one which is slightly different.

1

One answer would be to give up the notion of an exact definition, and just say that when we call something "information", it belongs to a family of measures with certain characteristics.

That is in short the approach taken by the book 'The Mathematical Theory of Information' by Jan Kåhre, stating that "any measure [of information] is acceptable if it does not violate the Law of Diminishing Information".

It is relatively recent (2002), but does not seem to be widely known. Whether that is merited or not I am not in a position to judge, all I can say is that I found it inspiring when I read up on information a couple of years ago.


P.S. There is a nice quote from Karl Popper I like to keep in mind whenever the question of defining something comes up:

The view that the precision of science and of scientific language depends upon the precision of its terms is certainly very plausible, but it is none the less, I believe, a mere prejudice. The precision of a language depends, rather, just upon the fact that it takes care not to burden its terms with the task of being precise.

From 'The Open Society and its Enemies', chapter 11.

2
  • I think you should have given the book title and author as the hyperlink, so I am going to give it here: 'The Mathematical Theory of Information' by Jan Kåhre. The blurb sounds fine, sure, but there is suspiciously little biographical info about him online for a (we assume) academic, and no reviews of the book in journals. So it suggests it's not a serious reference text.
    – CriglCragl
    Commented Jun 7, 2021 at 23:48
  • Thanks, added that to the post. I agree with you that there is too little academic support to call it a serious reference text. Kåhre is however not an academic, which is normally also a warning sign, but there were enough connections to "serious" people (like Hans Christian von Baeyer) for me to be intrigued.
    – Erik
    Commented Jun 8, 2021 at 7:53
1

It’s a bit odd that no one has pointed out that there is no expressible definition of information which does not assume the presence of information (as language). You are using information to define information. Problems are to be expected.

One dodge is to claim that information is a set of facts that differentiates one state from another; “A is hotter than B” differentiates A from B, and hence the measurement is information. Information allows you to create partial orderings of things. For example, information can be used to partition true things from false things. But these are all dodges; the first sentence uses the word fact, and what is a fact? It’s a piece of information we attach to statements to partition them into true and false. But then what are true and false? And you never actually hit bottom.

So I think a handwave is going to be necessary. You can say that information is some statement about reality that can be used to judge a course of action. Information you can’t know or don’t know how to act on isn’t information, at least in your frame of reference, in this view. Of course all that does is bury the notion under a concept of decision making, which implies sentience, which is a bigger can of worms. And nothing screams “we don’t or can’t know” like tucking a definition under the rock of someone’s “frame of reference.”

3
  • I think this is a cop out. We don’t assume that a theory of breathing is plagued by circularity because human observers depend on it as a prior to theorising. Now you could introduce Godel/Turing/Tarski’s ideas into this to suggest that there is a worry about “completeness” in our theorising, but even then, it is possible for suitably modest and informed theorists to present and be informed by descriptive models that do not collapse into inconsistency.
    – Paul Ross
    Commented Jun 9, 2021 at 11:38
  • Can't you see those kind of recursive cohesiveness-based definitions as, emergent? It just puts me in mind of strange loops.
    – CriglCragl
    Commented Jun 9, 2021 at 12:22
  • +1 for noting impredicativity. However, we can iterate on a definition by adding or subtracting to or from it.
    – J D
    Commented Apr 29, 2023 at 17:04
1

Information is the ability to distinguish possibilities.

e.g., the more information you have about a person, the better you can distinguish them from other people. The more information you have about a phenomenon, the better you can distinguish it from other phenomena.


I find it illuminating to compare information with energy:

Energy is the ability to do work (or produce heat).

1

Information is a purely mathematical concept, usually a characteristic of uncertainty (of a probability distribution function), but it can be interpreted in different ways. In the simplest form it is introduced in information theory as a difference between the uncertainties of two distributions, with uncertainty being the logarithm of the number of possible equally-probable states of a discrete random variable. For a continuous distribution, it can be introduced as the logarithm of an integral. Sometimes a proper information is introduced - a quantity which differs from the negative entropy only by a constant independent of the distribution (this constant can be taken as zero).

Thus information is a difference of proper information (difference of negative entropy) of two states. The states are represented by probability distribution functions, thus information is a formal operator of two functions.

For continuous distributions (of which the discrete case is a variant) the proper information of a distribution $w(x)$ is

$$I[w]=-H(w)=\int_{-\infty}^{+\infty}w(x)\log(w(x))\,dx$$

and the relative information of $w_2$ compared to $w_1$ is

$$I[w_2,w_1]=H(w_1)-H(w_2)=I(w_2)-I(w_1)$$

or

$$I[w_2,w_1]=\int_{-\infty}^{+\infty}\log \left(\frac{w_2(x)^{w_2(x)}}{w_1(x)^{w_1(x)}}\right)dx$$

This operator is not much different from norm or angle in vector spaces. It is just one measure, attributed to members of the space.

Compare this with the definition of norm:

$$||w||=\sqrt{\int_{-\infty}^{+\infty}w(x)^2dx}$$

distance

$$D[w_1,w_2]=||w_1-w_2||=\sqrt{\int_{-\infty}^{+\infty}(w_1(x)-w_2(x))^2dx}$$

angle

$$\cos\theta[w_1,w_2]=\frac{\int_{-\infty}^{+\infty}w_1(x)\,w_2(x)\,dx}{||w_1||\,||w_2||}$$

So think about information as a mathematical quantity similar to an angle.
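
For concreteness, here is a rough numerical sketch of these quantities for two Gaussian densities (my own illustration; natural logarithms are assumed, and a different base only rescales the information measures):

```python
# Proper information, relative information, norm, distance and angle for two
# Gaussian densities w1 (sigma = 2) and w2 (sigma = 1), by numerical integration.
import numpy as np

x = np.linspace(-10, 10, 20001)
dx = x[1] - x[0]

def gaussian(mu, sigma):
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

w1 = gaussian(0.0, 2.0)   # broader density: more uncertainty
w2 = gaussian(0.0, 1.0)   # narrower density: less uncertainty

def H(w):                 # differential entropy H(w) = -∫ w log w dx
    return -np.sum(w * np.log(w + 1e-300)) * dx

def I(w):                 # proper information I[w] = -H(w)
    return -H(w)

def norm(w):
    return np.sqrt(np.sum(w**2) * dx)

print("I[w1]     =", I(w1))            # ≈ -2.11 (Gaussian: -0.5*ln(2*pi*e*sigma^2))
print("I[w2]     =", I(w2))            # ≈ -1.42
print("I[w2, w1] =", H(w1) - H(w2))    # ≈ 0.693 = ln 2: halving sigma gains ln 2 nats
print("D[w1, w2] =", norm(w1 - w2))
cos_theta = np.sum(w1 * w2) * dx / (norm(w1) * norm(w2))
print("angle     =", np.arccos(np.clip(cos_theta, -1.0, 1.0)), "radians")
```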

0

Thus the question is: what is information?

Like many concepts in the sciences, such as energy, space, time etc., we know the word but we don't necessarily really understand what the word is supposed to be referring to. However, we should at least be able to articulate what we mean with the words we use. The problem exists in part because most concepts in science recycle old words to give them new definitions. You could say that the problem is that scientists are lazy and that this is what causes the confusion. The word information is definitely typical in that respect. However, the definitions you quoted provide, I think, a very good starting point:

A key measure in information theory is entropy. Entropy quantifies the amount of uncertainty involved in the value of a random variable or the outcome of a random process.

Information can be thought of as the resolution of uncertainty

I would say myself that information is exactly a reduction of uncertainty.

Suppose we start with the statement "There is something on the table". I think we would all agree that this statement clearly provides a certain quantity of information, even though it doesn't even specify what it is exactly which is supposed to be on the table. Now suppose we are given a second statement: "There are two apples on the table". This second statement clearly provides more information. It is also clear that it reduces our uncertainty about the world. It also seems that it is doing nothing else. We should therefore conclude that information is nothing but a reduction of uncertainty.

This is not enough to resolve all the confusion but it seems enough to answer the question. One additional problem is that of the truth of the statement. False statements don't actually reduce our uncertainty. And true statements don't necessarily reduce our uncertainty. Statements don't reduce our uncertainty whenever we don't believe that they are true. We can try to circumvent this problem by saying that what matters is not the statement itself but what we believe about the world. True beliefs reduce uncertainty. But we then have a new problem which is that false beliefs reduce our uncertainty about the world in exactly the same measure. If so, information can be false information and false information is still information.

To move beyond this difficulty, we have to accept that information is a property of the message, not a property of what the message is supposed to be about. When we receive a message, we can tell what the message is. We may not be able to tell if the message is truthful or not, but we can tell how many characters it contains and how many possibilities of alternative messages there are.

Thus, a message "01", somehow constrained to use only zeroes and ones, will contain a certain quantity of information. This quantity is determined not by the reduction of uncertainty about what the message may be about, but by the reduction of uncertainty about the message itself. There are four alternative possibilities: "00", "01", "10" and "11". Thus, receiving the message "01" reduces the number of possibilities from four to just one. Thus, our uncertainty has been reduced fourfold.

However, it is our uncertainty about the message itself which has been reduced four fold. We could have received four possible messages, "00", "01", "10" and "11" but we have only received "01". Still, the message is part of the real world so that a reduction of our uncertainty about the message is a reduction of uncertainty about the world, but not a reduction of our uncertainty about what the message is understood to be about.

It has to be funny that information is so elusive. The notion of information actually relies on the way our mind works. In particular, it is crucial here that we should be able to think of "01" as one of four possibilities. Without this property of our mental representation of the world, if this is what it is, there would be no information.

And then receiving a message "01" is not fundamentally different from the situation where we throw a die and get, for example, a 6. The reduction of uncertainty is only possible between before and after because we have a representation of the world whereby, before throwing a die, we believe there are six possible outcomes. The reduction in uncertainty comes from the fact that only one outcome will usually obtain. Thus, information only exists because we have a mental representation of the world, a representation which will initially contain six values from 1 to 6, all regarded as possible outcomes, but will contain only one value, for example 6, once the die has been cast. And once the die has been cast, our representation of the world says that there is now only one value possible, namely 6.
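
The two examples above can be put into numbers in a few lines (my own sketch):

```python
# Uncertainty, in bits, before and after receiving the message "01",
# and before and after casting a die.
import math

messages = ["00", "01", "10", "11"]          # four equally likely possibilities
before = math.log2(len(messages))            # 2 bits of uncertainty about the message
after = math.log2(1)                         # "01" received: one possibility left
print(before - after)                        # 2 bits of information gained

print(math.log2(6) - math.log2(1))           # an ordinary die: ≈ 2.585 bits per throw

# A trick die with a 6 on every face: only one outcome was ever possible,
# so the throw resolves nothing and carries no information.
print(math.log2(1) - math.log2(1))           # 0 bits
```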

So information is a reduction of uncertainty, and uncertainty in this context is the number of possible states of some part of the world as represented in our mind. Of course, this interpretation applies to the data stored inside a computer or to some theoretical model of something.

22
  • Thank you for your thorough answer. But I believe your explanation stems from the colloquial definition of information. In that, a sentence would convey more information if it is more accurate. However, that is at odds with Shannon's definition: "An event with probability 100% is perfectly unsurprising and yields no information." source Commented Jun 6, 2021 at 9:19
  • So the system containing an apple on a table and the sentence "there is an apple on a table" has less information, not more. Commented Jun 6, 2021 at 9:20
  • @BellAppLab No, you completely misunderstood my answer. I carefully explained that the information content of a message is not related to what the message is understood to be about. Commented Jun 6, 2021 at 15:10
  • @BellAppLab "An event with probability 100% is perfectly unsurprising and yields no information" This is exactly what I explained. If you throw a die which has only 6's on its six sides, you will expect a 6 and you'll get a 6. Casting the die brings no information. Commented Jun 6, 2021 at 15:13
  • That’s the thing though. As per my understanding of information theory, if you roll a dice and it yields 6, there is no information there regardless of the dice having 6s on all sides or not. In other words, picking the number 6 from a set of numbers that already contains a 6 only tells us that the set contained a 6 before. If the dice yielded the letter “A”, then we’d have information. Commented Jun 6, 2021 at 19:33
0

That is a great assessment of the problem, and it would seem that the goalpost is moved at every opportunity one may take to define it, as every answer given would include the 'thing-in-and-of-itself' that is in question; no matter the approach, no answer could be given without it. Perhaps it is not an object per se, as some have pointed out, and is a term that is purely conventional. The question would remain nonetheless as to why that would be the case, if true!

Removing the observer from the equation altogether is an ideal (imaginary) scenario at best that does not nullify the fact of the matter that they are involved in the process of gathering whatever 'information' there may be. Perhaps it is like a mirage. The conditions for one to occur are there (in the desert) without an observer present, but one would need to be there in order to know, since visual phototransduction is a faculty of the observer that is required in order for it to emerge (the same would be true for rainbows).

6
  • I do not wish to move the goalpost. I have been addressing what I perceive as gaps, or as overly specific definitions of a concept that seems to be much much broader. This answer does address it in a more elucidative way, mentioning that it may just be a convention, but as far as I can tell, we don't know yet. Commented Jun 7, 2021 at 18:24
  • However, although some information may emerge from an observer observing something, in some other contexts information is treated like a conserved quantity. Are these two separate concepts that use the same name? Or are we talking about the same "thing"? Commented Jun 7, 2021 at 18:28
  • Indeed. It is as perplexing as the notions of "mind", "time" etc. which can be considered as concomitants to the notion of "information" and do suffer similar if not the same problem(s) in the various definitions given of them.
    – Somnis
    Commented Jun 7, 2021 at 18:37
  • That is a good question. It would appear that they are two separate concepts that fall upon a scale or gradation of one grand "thing" but I certainly would not know. If information is not a, for lack of a better idea, 'unified substrate' that is "beneath" every event from a mirage to a blackhole then what exactly is it that one is alluding to? Whatever it may be it enables our ability to assess what more could come from it.
    – Somnis
    Commented Jun 7, 2021 at 18:49
  • @BellAppLab -- When physicists talk about the idea that information should be conserved in quantum physics (at least for an isolated system) and this leads to the black hole information paradox, as mentioned in this article by physicist Juan Maldacena they are talking specifically about the von Neumann entropy, not other thermodynamic notions of entropy or non-thermodynamic notions of information.
    – Hypnosifl
    Commented Jun 7, 2021 at 22:44
0

Answer

This is a question central to a nascent discipline self-labeled the philosophy of information, though clearly it has roots in earlier projects of theoretical unification, such as General System Theory by Ludwig von Bertalanffy. As such, what 'information' and 'energy' are is somewhat open to philosophical interpretation. Perhaps the philosophical term most relevant to your later edits is Das Ding an sich, which roughly holds that the thing in itself (here, the physical state) exists but is unknowable.

Broadly, information is anything which provides belief or knowledge, and energy is anything that causes work or changes in matter, both of which are context-dependent. In such a broad light, the questions are highly metaphysical, which is to say that they involve largely epistemological and ontological questions. We'll set aside the broader questions related to existence and knowledge and focus on the technically popular definitions.

For a lack of better terms, let's stereotype the popular technical definitions with Shannon's and Einstein's work. Shannon coined the term bit and began looking at all information as corresponding to a mathematical model (quantization entails measurement) and Einstein demonstrated the fundamental equivalence of matter and energy. To paraphrase Shannon, all knowledge can be stated or understood as a mathematical entity represented by binary numbers, and Einstein said that while mass and energy are neither created nor destroyed, they can be interconverted. The former is foundational to understanding digital information and communication technology, and the latter to understanding fusion and fission and nuclear technologies such as the atom bomb.

A good introduction into contemporary thinking about what exactly information and energy are is The Philosophy of Information by Luciano Floridi (though the prose is a bit florid, IMNSHO). It might be best to start with a notion of information that suggests that knowledge about the physical world that cannot be observed directly occupies a middle ground between the physical and the mental, if one observes the basic dichotomy of naive realism. If one seeks to know, understand, and explain the physical, and one begins using technology to tackle open problems, one is inevitably left with the challenge of reconciling maps and territories and deciding what is knowable and what is not. A simple example is the Mach-Zehnder interferometer, which creates questions about information, energy, and action at a distance. This is where people start getting mired down between what is physical and what is mental.

Properly speaking, that is to say, on a literal reading under widely accepted precepts of analytical philosophy, information is an experience of an observer and is phenomenological, whereas mass is held to be noumenological. Information is mental (rocks don't have information, but can be described by it), and mass inheres to the physical. If you have any knowledge of the history of philosophy, you know that the relationship between the physical and the mental is metaphysically contentious. Two very famous positions on this are Cartesian dualism and a rejection of it.

Now, while there are some thinkers who make radical claims like "the universe is made of information" and "particles are bits", this is often viewed with a heavy empirical skepticism by the establishment. What the orthodoxy does accept is that energy seems to be both, because, on the one hand, it's subject to mathematical calculation, but on the other hand, it has a direct effect on matter. An example to clarify:

Let's take gravitation. Can you sense mass? Touch and pressure are directly accessible by consciousness. A falling body demonstrates itself by EMR perceived by the eye. But, when I lift that mass and place it on a leaning tower, can I detect gravitational potential energy? I can calculate it (and accurately so), but what empirical clues give it away? The whole question of the development of the concept 'inertia' is an interesting study of how energy can be highly detectable (think any kinetic energy) or difficult to perceive (think any potential energy).

-2

Well, information is precisely just that: information.

Think about it, what does this word mean? Information is a condition when something is being informed.

So there is some kind of substance/medium and then some form is applied to it, where the "in-" prefix means that after applying the form the substance still remains itself, as if being caught in the form (contrast that with "en-" where the substance would have attained different qualities that it did not possess before).

So when you have such a substance and observe the process of information of that substance (the substance is being informed) that results in information of that substance, the substance getting organized into forms, it just happens that there are certain characteristic points to that process.

These points are very well described in the aforementioned Information Theory. Simply speaking there are two extreme conditions: zero information of the substance and absolute information when the entropy is locally maximized. And then there's everything in between -- that's usually where meaningful things are to be found, between the emptiness and the chaos.

The Information Theory and its probabilistic interpretations deal with the "subjective" perception of such an informed substance: when entropy is maximized that means that the substance had been superinformed to a point where sampling its state in a linear fashion won't result in anything but noise, i.e. all possible informations (states) of the sets of quantas of the substance would occur with the same probability (even distribution). Such a state maxes out substance elasticity and is very characteristic of the substance itself.

However, in subjective perception, non-linear sampling functions might still be able to "extract" information from such a state if it was artificially created. That's what archiving (compression) programs are all about. You can try to feel what it would be like if you were such a program: https://play.google.com/store/apps/details?id=poly.sphere.puzzle That's where the "information can be thought of as resolution of uncertainty" comes from -- you are either able to extract the encoded information in those puzzles or you are not. If not -- you will never know that there was any information encoded in those flying polygons. Either you become informed of an apple, bird, fish or whatever, or not. So whether you received information or not depends on your personal success in the resolution of the uncertainty.
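
One way to see the "compressed data looks like noise" point concretely (a rough sketch of my own, using the standard library's zlib):

```python
# Compare the per-byte empirical entropy of a repetitive text with that of its
# zlib-compressed form; the compressed bytes are spread far more evenly over
# byte values, i.e. they look much more like noise when sampled "linearly".
import math
import zlib
from collections import Counter

def byte_entropy(data: bytes) -> float:     # empirical entropy in bits per byte
    counts = Counter(data)
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

text = ("the quick brown fox jumps over the lazy dog " * 200).encode()
packed = zlib.compress(text, 9)

print(f"plain:      {byte_entropy(text):.2f} bits/byte over {len(text)} bytes")
print(f"compressed: {byte_entropy(packed):.2f} bits/byte over {len(packed)} bytes")

# A decompressor (the right "non-linear sampler") still recovers the text
# exactly from the noise-like compressed bytes.
```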

So information is a subject of perception of how a particular sampler would interpret a quantified substance in one of its states. It is not surprising that this covers anything from letters to black holes, since it's simply a way of saying that something has been arranged in a meaningful way from a certain viewpoint. And that there are degrees of that arrangement that are exposed in interactions, while an interaction itself is also perceived as a continuum/substance/medium.

This is a very cursory overview but I hope it shares some thought.

PS: it's kind of ironically funny that in this question on Philosophy SE you underline that you are not looking for a semantic definition of information, while there is nothing but a semantic definition to it, because information is itself a semantic definition.
