American electronics researcher Ralph Hartley's 1928 paper "Transmission of Information", in which he explained how the logarithm, in the form x = y log z, specifically the logarithm of the "number of possible symbol sequences", is the best "practical measure of information", in regard to a telegraph operator sending 1s (HIs) and 0s (LOs) in a telegraph transmission; a model later used by American electrical engineer Claude Shannon in 1948 to found the science of information theory. [8]
“The two—information theoretic ideas and thermodynamic entropy—have been repeatedly confused since the time of von Neumann.”
“When physicists, chemists, biologists, neuroscientists, and psychologists adopt ‘information’ as an explanatory term, they often do so based on a confusion of categories, and a misunderstanding of its function in communications theory.”
“Shannon’s work roots back, as von Neumann has pointed out, to Boltzmann’s observations, in some of his work on statistical physics (1894), that entropy is related to ‘missing information’, inasmuch as it is related to the number of alternatives which remain possible to a physical system after all the macroscopically observable information concerning it has been recorded. Leo Szilard (Zeitschrift fur Physik, Vol. 53, 1925) extended this idea to a general discussion of information in physics, and von Neumann (Mathematical Foundation of Quantum Mechanics, Berlin, 1932, Chap V) treated information in quantum mechanics and particle physics. Shannon’s work connects more directly with certain ideas developed some twenty years ago by Harry Nyquist and Ralph Hartley, both of Bell Laboratories; and Shannon has himself emphasized that communication theory owes a great debt to Norbert Wiener for much of its basic philosophy [cybernetics]. Wiener, on the other hand, points out that Shannon’s early work on switching and mathematical logic antedated his own interest in this field; and generously adds that Shannon certainly deserves credit for independent development of such fundamental aspects of the theory as the introduction of entropic ideas. Shannon has naturally been specially concerned to push the applications to engineering communication, while Wiener has been more concerned with biological applications (central nervous system phenomena, etc.).”
“Boltzmann himself saw later that statistical entropy could be interpreted as a measure of missing information.”
H = n log S
“What we have done is to take as our practical measure of information the logarithm of the number of possible symbol sequences.”
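As a sketch of Hartley's measure (the function and the example values below are illustrative, not taken from the paper), the following Python snippet computes H = n log S for a telegraph alphabet of S symbols and a message n symbols long, and confirms it equals the logarithm of the number of possible symbol sequences, S^n:

```python
import math

def hartley_information(S: int, n: int, base: float = 10.0) -> float:
    """Hartley measure H = n log S for an alphabet of S symbols and a
    message n symbols long (the base is arbitrary; Hartley worked in base 10)."""
    return n * math.log(S, base)

# Hypothetical example: a binary telegraph alphabet (HI/LO), 8 symbols per message
S, n = 2, 8
H = hartley_information(S, n)

# Same value, computed directly as the log of the number of possible sequences S^n
H_direct = math.log(S ** n, 10)
print(H, H_direct)  # both about 2.408
```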
“This was the point of departure chosen by Szilard (1929), who laid the groundwork for establishing a conversion factor between physical entropy and information.”
S = k ln 2
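As a hedged numerical aside (the code below is illustrative and not drawn from Szilard's paper), this conversion factor assigns a physical entropy of k ln 2, where k is Boltzmann's constant, to each binary alternative, i.e. to one bit of information:

```python
import math

k = 1.380649e-23  # Boltzmann's constant in J/K

# Entropy corresponding to one binary alternative (one bit),
# per the Szilard-type conversion factor S = k ln 2
S_per_bit = k * math.log(2)
print(S_per_bit)  # about 9.57e-24 J/K
```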
“Gain in entropy always means the loss of information, and nothing more.”
“It was not easy for a person brought up in the ways of classical thermodynamics to come around to the idea that gain of entropy eventually is nothing more nor less than loss of information.”
See main: Neumann-Shannon anecdote
In 1939-1940, Hungarian chemical engineer John Neumann suggested to American electrical engineer Claude Shannon that he should call information by the name ‘entropy’, reasoning that the equations are similar (both are logarithmic in form) and that, since nobody knows what entropy really is, in a debate he would always have the advantage. In 1948, Shannon took Neumann’s advice and, in his famous paper "A Mathematical Theory of Communication", credited this derivation by Hartley as the point at which the logarithmic function became the natural choice for a measure of information; in the same paper, to the ire of many thermodynamicists, he equated Hartley's 1928 telegraph "system" model with Clausius' 1865 heat engine "system" model. In short, Shannon, using a similar formulation to that above, declared that H, being a measure of information, choice, and uncertainty, is the same H as used in statistical mechanics, specifically the H in Boltzmann's famous H-theorem, concluding with: [10]
“We shall call H the entropy of the set of probabilities.”
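A minimal sketch, assuming Shannon's standard formula H = -Σ p_i log p_i (the helper name below is hypothetical, not from the paper), showing that for equiprobable symbols his H reduces to Hartley's logarithmic measure:

```python
import math

def shannon_entropy(probs, base: float = 2.0) -> float:
    """Shannon's H = -sum(p_i log p_i) over a set of probabilities."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

# For S equiprobable symbols, H per symbol reduces to Hartley's log S
S = 2
print(shannon_entropy([1.0 / S] * S))  # = log2(2) = 1 bit per symbol

# A biased source carries less information per symbol
print(shannon_entropy([0.9, 0.1]))  # about 0.469 bits per symbol
```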
“Stored information varies inversely with entropy; lowered entropy means a higher capacity to store information.”