Four versions of a signal sequence (high: 1, low: 0, or no signal) from a typical telegraph message, at increasing lengths of cable transmission, A being the shortest transmission and D the longest.

Overview

To illustrate his derivation, Hartley considers a hand-operated submarine telegraph cable system in which an oscillographic recorder traces the received message, or rather the “information”, on photosensitive tape. The sending operator has at his or her disposal three positions of a sending key, which correspond to a high voltage, a low voltage, and no applied voltage. The figure shows four traces of a given transmission: A shows the sequence of the key positions as they were sent, and B, C, and D are traces made by the recorder when receiving over an artificial cable of progressively increasing length. Trace B shows a signal from which the original sequence can be readily reconstructed, whereas C shows that more care is needed to reconstruct the original message, and D shows a hopelessly indistinguishable message.

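To put this information transmission into a formula, Hartley explains that at each point in the reading of the recorded tape of the transmitted signal, the reader must select one of three possible symbols (high, no-signal, low). If the reader makes two successive selections, there are 3² = 9 possible symbol sequences; in general, with s symbols available at each selection and n successive selections, the number of possible sequences is sⁿ.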

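He then notes that if the number of possible sequences were itself taken as the measure, the amount of information transmitted would increase exponentially with the number of selections, whereas a practical measure should be proportional to the number of selections. On this basis, Hartley states that the value ‘H’ is the amount of information associated with n selections: H = n log s = log sⁿ, where s is the number of symbols available at each selection (here three) and n is the number of selections.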

Hartley then comments:

“What we have done is to take as our practical measure of information the logarithm of the number of possible symbol sequences.”
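As a rough illustration of the logarithmic measure quoted above, the following sketch enumerates the possible symbol sequences for the three-position key and takes the logarithm of their number. The function names, the symbol labels, and the choice of base 2 for the logarithm are assumptions made for this example; they are not Hartley's notation.

```python
import math
from itertools import product


def sequence_count(s: int, n: int) -> int:
    """Number of distinct sequences of n selections from s symbols."""
    return s ** n


def hartley_information(s: int, n: int, base: float = 2.0) -> float:
    """Logarithm of the number of possible sequences, i.e. n * log(s)."""
    return n * math.log(s, base)


# Three key positions (labels are illustrative only).
symbols = ("high", "none", "low")

# Enumerate every sequence of two successive selections.
pairs = list(product(symbols, repeat=2))
print(len(pairs), "possible two-selection sequences")

# The count grows exponentially with n; its logarithm grows linearly.
for n in (1, 2, 3, 4):
    print(n, sequence_count(3, n), round(hartley_information(3, n), 3))
```

The loop shows the point of the logarithm: the sequence count multiplies by three with each added selection, while the measure increases by the same fixed amount each time.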

Information theory

In 1948, American engineer Claude Shannon, in his famous paper "A Mathematical Theory of Communication", credited this derivation by Hartley as being the point at which the logarithmic function became the natural choice for measuring information. In the same paper, to the ire of many thermodynamicists, he equated Hartley's 1928 telegraph "system" model with Clausius' 1865 heat engine "system" model. In short, Shannon, using a formulation similar to that above, declared that H, being a measure of information, choice, and uncertainty, is the same H as used in statistical mechanics, specifically the H in Boltzmann's famous H theorem, concluding with:

“We shall call H the entropy of the set of probabilities.”

Ever since, countless information theory scientists have taken any and all types of information, "which is a very elastic term, ... whether being conducted by wire, direct speech, writing, or any other method", in Hartley's words, as being a direct equivalent to thermodynamic entropy, as derived from the study of the steam engine. In his paper, Shannon goes on to define the entropy H of the source in units of “bits per symbol”. This corresponds to s = 2 in the Hartley derivation, that is, to a source that can only send two voltage or current levels, high or low; hence Shannon’s entropy is measured in binary digits per symbol.
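To make the “bits per symbol” unit concrete, the sketch below computes Shannon's entropy, H = −Σ p log₂ p, for a few sources and compares the equiprobable cases with the logarithm of the number of symbols. The function name and the example probabilities are assumptions chosen for illustration only.

```python
import math


def shannon_entropy(probs):
    """Entropy in bits per symbol: -sum(p * log2(p)) over nonzero p."""
    if not math.isclose(sum(probs), 1.0):
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Two equally likely levels (high or low): the s = 2 case.
binary_uniform = shannon_entropy([0.5, 0.5])

# Three equally likely key positions: matches log2(3) per selection.
ternary_uniform = shannon_entropy([1 / 3, 1 / 3, 1 / 3])

# Unequal probabilities (illustrative values) give less than 1 bit.
biased = shannon_entropy([0.9, 0.1])

print(binary_uniform, ternary_uniform, biased)
```

When every symbol is equally likely, the entropy per symbol equals log₂ s, which is the per-selection value of the logarithmic measure with base 2; unequal probabilities lower it.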

References

1. Shannon, Claude E. (1948). “A Mathematical Theory of Communication” (bit, pg. 1).

2. Hartley, R. V. L. (1928). “Transmission of Information”.