We will find a lower limit for the length (bits) of a coded symbol from a source characterized by symbol probabilities pi.
If symbol i is binary-coded using li bits, for instance, then the
average code length is
.
is representative of the information content of the source.
The smaller it is, the less information the source can be said to
contain.
We introduce the notion of a Prefix Code. This is a code where no code word is a prefix of any other.
In order for a code to be unambiguous or uniquely decipherable, it is sufficient if the code is a prefix code. (However, it is not necessary - Why? Think of a non-prefix code that is also unambiguous. Prefix codes have the added advantage that the code is also instantaneous i.e. a receiver can decode symbols on the fly rather than wait for disambiguating information.)
If a code is laid out in the form of a binary tree where every node has a 1-branch and a 0-branch and the leaves are labeled with the complete path from the root leading to them, then the leaves are the codewords in a prefix code. Work out an example.