It is not an argument at all. At least, not a scientific argument.
The concept of "bits that you want or don't want" or "that you understand or do not understand" is at least very vague: is the author suggesting an arbitrary subjective definition of information or of entropy, depending on what different persons want or understand?
Even worse, there is no physical definition of a bit. The bit as the minimum unit of information, as taught in an introductory course on information theory, is fine. But what is a bit in the real world, according to the author? In any case, information theory never identifies information with entropy. Without a definition of what a bit represents in the physical world, the claim that bits can never be destroyed is pure nonsense. Is the author able to reconstruct the content of his hard drive after melting it in a furnace? And how do we know that it is possible to create bits from nothing? Because, if new bits cannot be created, entropy, by his definition, should stop increasing at some point.
Leaving aside these unjustified claims, there are a few things that can be said on this subject which may help explain why this "derivation" is nonsense.
There are many different concepts which (unfortunately) have all been named "entropy". Those immediately relevant to the second law of thermodynamics are:
- the thermodynamic definition introduced by Clausius:
$$S(B) = S(A) + \int_A^B \frac{dQ_{rev}}{T};$$
- the statistical mechanics definition by Boltzmann/Planck/Gibbs, which can be expressed in many ways, depending on the set of state variables one chooses to describe a macroscopic state;
- the information theory definition by Shannon:
$$S = -k \sum_i p_i \log p_i.$$
In Shannon's formula, a generic system (even one without any thermodynamic behavior, like a deck of cards) is supposed to be found in its $i$-th state with probability $p_i$. It is natural to associate with each state a quantity named *information*, namely $\log(1/p_i)$. Thus a value of the Shannon entropy can be assigned to any system characterized by a given probability distribution over its states. It is evident that Shannon entropy is not the same thing as information, but it can be seen as the average information embodied in a probability distribution.
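As a minimal worked example (a uniform distribution chosen purely for illustration): a single card drawn from a well-shuffled deck of 52 has $p_i = 1/52$ for every card, so each outcome carries the same information $\log(1/p_i) = \log 52$, and the average is
$$S = -k \sum_{i=1}^{52} \frac{1}{52}\log\frac{1}{52} = k\log 52,$$
i.e. about $5.7$ bits per draw if the logarithm is taken in base $2$ and $k=1$.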
It is also interesting to notice that different probability distributions can be assigned to the same physical system, according to different ways of listing its (micro)states. A different entropy will correspond to each such distribution.
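A trivial illustration of this point (mine, not part of the quoted argument): describe an ordinary die either by its six faces or only by the parity of the outcome. With equal probabilities,
$$S_{\text{faces}} = k\log 6, \qquad S_{\text{parity}} = k\log 2:$$
same physical object, two ways of listing its states, two different Shannon entropies.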
Last but not least, nothing is said, at the level of information theory, about the time evolution of the probabilities. So, in order to make contact with the second law, something more must be added.
The conceptual chain of links between the three entropies goes as follows:
- Shannon entropy reduces to the statistical-mechanics expressions for the entropy in the different ensembles if the probability distribution used in the information entropy coincides with the probability distribution of the relevant ensemble (a minimal sketch follows this list).
- The formulae for the entropy in the different ensembles are not always equivalent, but the corresponding entropies per particle or per unit volume coincide after taking the thermodynamic limit, and only in that limit do they have all the properties of the Clausius thermodynamic entropy.
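As a minimal sketch of the first point, take the microcanonical case: a system with $\Omega$ equally probable microstates has $p_i = 1/\Omega$, so Shannon's formula gives
$$S = -k \sum_{i=1}^{\Omega} \frac{1}{\Omega}\log\frac{1}{\Omega} = k\log\Omega,$$
which is exactly the Boltzmann/Planck statistical-mechanics entropy for that ensemble; analogous identifications hold for the canonical and grand canonical distributions.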
In conclusion, only for an infinite system governed by the Boltzmann-Gibbs probability distribution is it possible to establish a sound scientific link between information and the second law. Unfortunately, such a clean conceptual connection is very often ignored or misinterpreted, even in textbooks, even though the whole picture was already clear in the fifties, when Brillouin's book *Science and Information Theory* was written.