CS 432 Project 4: Subband Coding *

In this project, you will experiment with a method for image compression called subband coding. Subband coding involves three stages: 1) analysis; 2) quantization; and 3) synthesis. In the analysis stage, the image is decomposed into a set of sampled bandpass images. Because the bandpass images typically have a smaller range of grey values than the original image, reconstructions of comparable image quality can be achieved using fewer grey levels. Furthermore, because many grey values in the bandpass image are quite small, the process of quantization creates long runs of zeros which can be efficiently compressed using lossless methods. The final stage in subband coding is termed synthesis, and involves inverting the analysis stage by reconstructing the unfiltered image from its bandpass components. The various schemes for subband coding differ in the particular methods employed in each of the three stages. In this project, you will use a Laplacian pyramid transform for analysis and its inverse for synthesis.

Laplacian Pyramids

Laplacian pyramids were introduced in the early 1980s by Peter Burt** and Ted Adelson. The basic idea is quite simple. First, convolve the input image, V0, with a Gaussian lowpass filter and then downsample it to create an image with half the number of rows and columns, V1 (this is called the reduce operation). Second, upsample and interpolate the values of V1 to create an image with the same number of rows and columns as V0 (this is called the project operation). Third, subtract this image from V0. The resulting image, L1, represents the difference (or error) between the original image and the downsampled lowpass filtered image. L1 is equivalent to V0 convolved with a difference of Gaussians bandpass filter.*** This analysis process can be repeated to produce V2 and L2 from V1 (and so on). The original image and its successively lowpass filtered and downsampled versions (i.e., V0, V1, V2, V3 and V4) form the levels of a Gaussian (or lowpass) pyramid. The images representing the differences between adjacent levels (i.e., L1, L2, L3, and L4) form the levels of a Laplacian (or bandpass) pyramid.
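The reduce and project operations, and one level of the analysis, can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not the kroutines you are asked to write: the 5-tap binomial kernel, the mirrored border handling, the zero-insertion upsampling with a gain of 4, and the function names (reduce_image, project_image, laplacian_level) are all choices made for this sketch. Images are lists of rows with an even number of rows and columns.

```python
# Assumed 5-tap binomial approximation to a Gaussian lowpass filter.
KERNEL = [1 / 16, 4 / 16, 6 / 16, 4 / 16, 1 / 16]

def _reflect(i, n):
    """Mirror an out-of-range index back into 0..n-1 (needs n >= 3)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i

def _blur_rows(img):
    """Convolve every row with KERNEL, mirroring at the borders."""
    n = len(img[0])
    return [[sum(w * row[_reflect(x + k - 2, n)] for k, w in enumerate(KERNEL))
             for x in range(n)] for row in img]

def _transpose(img):
    return [list(col) for col in zip(*img)]

def blur(img):
    """Separable 2-D lowpass filter: rows first, then columns."""
    return _transpose(_blur_rows(_transpose(_blur_rows(img))))

def reduce_image(img):
    """Lowpass filter, then keep every other row and column."""
    return [row[::2] for row in blur(img)[::2]]

def project_image(img):
    """Double the size by zero insertion, then interpolate by filtering.
    The factor 4 compensates for the three zeros added per sample."""
    rows, cols = len(img), len(img[0])
    up = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for y in range(rows):
        for x in range(cols):
            up[2 * y][2 * x] = 4.0 * img[y][x]
    return blur(up)

def laplacian_level(v):
    """Return (next Gaussian level, Laplacian level) computed from v."""
    v_next = reduce_image(v)
    predicted = project_image(v_next)
    lap = [[a - b for a, b in zip(row_v, row_p)]
           for row_v, row_p in zip(v, predicted)]
    return v_next, lap
```

Calling laplacian_level(V0) yields V1 and L1; calling it again on V1 yields V2 and L2, and so on. For a constant image every Laplacian value is zero, which is the behavior expected of a bandpass image.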

[Figure: Gaussian pyramid levels V0, V1, V2, V3, V4]

[Figure: Laplacian pyramid levels L1, L2, L3, L4]

The original image, V0, can be reconstructed (without error) solely from the Laplacian pyramid levels (i.e., L1, L2, L3, and L4) and the final level of the Gaussian pyramid (i.e., V4). First, the values of V4 are upsampled and interpolated to form an image with the same number of rows and columns as L4. This image is added to L4 to reconstruct V3. This synthesis process is then repeated until V0 is reconstructed.
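The reconstruction is exact because each Laplacian level stores precisely what the project operation fails to predict, so adding it back cancels the error regardless of which filters are used. The sketch below illustrates this with deliberately simple stand-ins (2x2 block averaging for reduce, pixel replication for project) and integer arithmetic; these stand-ins and all function names are assumptions of the sketch, not the Gaussian-based operations described above.

```python
def reduce_simple(img):
    """Stand-in reduce: integer average of each 2x2 block."""
    return [[(img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) // 4
             for x in range(0, len(img[0]), 2)]
            for y in range(0, len(img), 2)]

def project_simple(img):
    """Stand-in project: replicate each pixel into a 2x2 block."""
    out = []
    for row in img:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def subtract(a, b):
    return [[p - q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def analyze(v0, levels=4):
    """Return ([L1, ..., Ln], Vn) for an image whose sides are divisible by 2**levels."""
    laplacian, v = [], v0
    for _ in range(levels):
        v_next = reduce_simple(v)
        laplacian.append(subtract(v, project_simple(v_next)))
        v = v_next
    return laplacian, v

def synthesize(laplacian, v_top):
    """Invert analyze: start from the top Gaussian level and work down to V0."""
    v = v_top
    for lap in reversed(laplacian):
        v = add(lap, project_simple(v))
    return v

# A small synthetic 16x16 test image.
image = [[(7 * x + 13 * y) % 256 for x in range(16)] for y in range(16)]
levels, top = analyze(image)
reconstructed = synthesize(levels, top)
```

Here reconstructed equals image exactly. Once the Laplacian levels are quantized, this is no longer true, and the reconstruction error depends on how coarse the quantization is.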

Image Compression

By itself, the Laplacian pyramid is not an image compression scheme. Indeed, the amount of space required to store the Laplacian pyramid is 33% greater than the amount of space required to store the original image. However, because of the smaller range of grey values in the bandpass images, we can use fewer grey levels and yet achieve reconstructions of comparable image quality. In addition, because many of the grey values are very small, quantization produces many zero values. This results in further space savings if a standard lossless compression scheme is used after quantization.
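The 33% figure follows from the geometric series 1 + 1/4 + 1/16 + ... = 4/3, since each level has one quarter as many pixels as the level before it. A short calculation for a four-level pyramid makes this concrete; the 256 x 256 image size and the function name are assumptions chosen for illustration.

```python
def pyramid_pixel_counts(rows, cols, levels=4):
    """Pixel counts of L1..Ln followed by the final Gaussian level Vn."""
    counts = []
    for _ in range(levels):
        # Each Laplacian level has the size of the Gaussian level it was computed from.
        counts.append(rows * cols)
        rows, cols = rows // 2, cols // 2
    counts.append(rows * cols)  # final Gaussian level
    return counts

counts = pyramid_pixel_counts(256, 256)
overhead = sum(counts) / counts[0] - 1.0
print(counts, f"{overhead:.1%}")
```

With four levels the overhead is already about 33%, and adding further levels changes it very little.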

Project Outline

You will begin by writing two kroutines called Reduce and Project. These kroutines will in turn be used as components in two encapsulated workspaces called Laplacian Pyramid and Invert Laplacian Pyramid.

After you have finished and tested your implementation of the Laplacian pyramid transform, you will estimate the compression ratios which can be achieved by using different levels of transform coefficient quantization. To do this, you will need to implement three encapsulated workspaces, Sigmoid, Quantize, and Sigmoid^-1.
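One possible shape for the Sigmoid, Quantize, and Sigmoid^-1 chain is sketched below for a single coefficient. The handout does not fix the form of the sigmoid, so everything here is an assumption: the use of tanh, the scale parameter, the choice of 15 levels, and the helper dequantize (which maps a stored code back into the sigmoid's range before inversion).

```python
import math

def sigmoid(x, scale=32.0):
    """Compress a signed coefficient into (-1, 1). `scale` is an assumed tuning parameter."""
    return math.tanh(x / scale)

def quantize(s, levels=15):
    """Map s in [-1, 1] to an integer code in 0..levels-1 (fits in an unsigned byte)."""
    return int(round((s + 1.0) / 2.0 * (levels - 1)))

def dequantize(code, levels=15):
    """Hypothetical helper: map a stored code back into [-1, 1]."""
    return code / (levels - 1) * 2.0 - 1.0

def sigmoid_inverse(s, scale=32.0):
    """Undo sigmoid. The clamp keeps atanh finite at the two extreme codes."""
    s = max(-0.999999, min(0.999999, s))
    return scale * math.atanh(s)

def roundtrip(x, levels=15, scale=32.0):
    """Coefficient -> code -> approximate coefficient."""
    code = quantize(sigmoid(x, scale), levels)
    return sigmoid_inverse(dequantize(code, levels), scale)
```

With an odd number of levels, a coefficient of zero maps to the middle code and is recovered exactly, and all sufficiently small coefficients share that code. Using fewer levels widens the set of coefficients that collapse to it, which trades reconstruction quality for compressibility.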

Your Cantata workspace will look something like the workspace shown below:

Note that Sigmoid is not applied to the final Gaussian pyramid level (i.e., V4), only to the four Laplacian pyramid levels (i.e., L1, L2, L3, and L4). The UNSIGNED BYTE outputs of the five Quantize glyphs can be stored to disk in .pnm format. We define the compression ratio to be the ratio of the amount of space required to store the .pnm file representing the original image to the amount of space required to store the five .pnm files representing the Laplacian pyramid and the final Gaussian pyramid level (where all files are compressed with the UNIX compress utility).
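The compression ratio defined above can be estimated with a few lines of Python. Note the substitution: this sketch uses zlib (DEFLATE) from the Python standard library as a stand-in for the UNIX compress utility (LZW), so its numbers will differ from those you should report. The function names and the synthetic byte strings are assumptions for illustration; in practice the inputs would be the contents of your .pnm files.

```python
import zlib

def compressed_size(data):
    """Size in bytes after lossless compression (zlib here, not UNIX compress)."""
    return len(zlib.compress(data))

def compression_ratio(original, pyramid_files):
    """Compressed size of the original divided by the total compressed size
    of the quantized pyramid files."""
    total = sum(compressed_size(f) for f in pyramid_files)
    return compressed_size(original) / total

# Synthetic stand-ins: a busy "original" and five files dominated by one code value.
original = bytes((37 * i + (i * i) % 251) % 256 for i in range(4096))
pyramid = [bytes([7]) * n for n in (4096, 1024, 256, 64, 16)]
ratio = compression_ratio(original, pyramid)
```

Long runs of identical codes compress very well, which is why coarser quantization of the Laplacian levels raises the ratio.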

Hints

What You Should Hand In

You should hand in a concise (fewer than 10 pages) research report describing your project. The following should be included either in the body of the report (where appropriate) or in an appendix:

When You Should Hand It In

The above should be handed in at the beginning of class, Mon. May 1.

Of Possible Interest

* This webpage is located at http://cs.unm.edu/~williams/cs432/project4s00.html
** A fellow U. Mass. Amherst alumnus!
*** The difference of Gaussians is a very good approximation to the Laplacian of Gaussian.