Previous | ToC | Next Labs: Error Correction and Compression. Part 2. Math Alive

Lempel-Ziv Algorithm: Coding


The LZ-coded version of the chain in the example on the previous page is then formed by writing all the coded strings (from the last column in the table) in one long chain.
In our example, the original string of:
If_one_doctor_doctors_another_doctor,_does_the_doctor_who_doctors_the_doctor_ doctor_the_doctor_the_way_the_doctor_he_is_doctoring_doctors?_Or_does_he_doctor_ the_doctor_the_way_the_doctor_who_doctors_doctors?
becomes:
0I0f0_0o0n0e3d4c0t4r7o0c9o0r0s3a5o9h6r11c13r0,11e15_18e20t10_0w0h4_0d8t10s3t29e26o 14_31o12t27t35_38c21_25_28a0y34h6_42t27h48i24d32o14i5g36r15?3O37d4e24h48d53r47e56_ 44w0a46_44d63_28h30d63s56s0?

As you can see, the coded chain is a little bit shorter than the original chain. This is because our original had many repeating pieces. For a chain this short the encoded chain is usually longer than the original. For files on the order of millions of bits, the algorithm generally compresses to amounts less than 50%. How much compression you obtain depends on the file. If the file contains just random bits, then you won't get much compression, if any. On the other hand, if the file contains many systematic patterns, then you get a lot of compression.


To summarize, we have 3 steps:
  1. parsing
  2. coding individual strings
  3. joining individual coded strings to a chain.

Can you code the following sequence? As you code, pressing return in the coded text window will tell you the correct part of your answer so far.

Practice
Lempel-Ziv Algorithm: Coding

In the next page we explain what to do if the text contains digits.


Previous | ToC | Next Last Modified: August 2008