Bandwidth is a limited resource, and digital media (audio, images, video) is information dense, so there is high value in conserving bandwidth when serving it. Digital media also has a high degree of redundancy, and by exploiting that redundancy we can save both bandwidth and storage. This paper explores some basic concepts in how digital media is represented within computers. First, we look at two popular color representation schemes: RGB and YCbCr. Then, we look at two simple sampling schemes, subsampling (downscaling) and upsampling (upscaling): subsampling reduces the storage requirements of digital media, and upsampling restores the original dimensions for display. While not compression proper, these methods are the foundation for digital media representation. The goals of the paper are:
- Learn how to read and display images in Matlab.
- Learn the RGB and YCbCr color spaces.
- Learn subsampling, upsampling, interpolation, replication methods and MSE computation in Matlab.
Our first task was to read the JPEG image into Matlab. Matlab represents an image as an array of pixels in which each pixel holds an R, a G and a B value. We therefore get an h × w × 3 array (where h is the height of the image in rows and w is the width in columns), that is, a 3D array. To display a single color band, we simply replace the values of the other two bands with zeros (Figure 1). We then explore the YCbCr color space, which can be obtained with the rgb2ycbcr command (Figure 2). This gives us three different bands: Y (luma, or intensity), Cb and Cr (the blue-difference and red-difference chroma). Together they describe the same image as the RGB bands but separate intensity from color, so that information about human vision can be exploited to reduce redundancy.
Figure 1: RGB color bands
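The band display described above can be sketched in a few lines. This is a minimal illustration in plain Python rather than the Matlab used in the lab; the function name isolate_band and the 2x2 test image are made up for the example and are not part of the original work.

```python
def isolate_band(image, band):
    """Return a copy of `image` with every channel except `band` set to zero.

    `image` is a list of rows; each row is a list of (R, G, B) tuples.
    `band` is 0 for red, 1 for green, 2 for blue.
    """
    return [
        [tuple(value if i == band else 0 for i, value in enumerate(pixel))
         for pixel in row]
        for row in image
    ]


# A tiny 2x2 stand-in image (made-up values, not the lab photograph).
image = [
    [(200, 30, 10), (0, 255, 128)],
    [(17, 17, 17), (90, 60, 30)],
]
red_only = isolate_band(image, 0)
```

Displaying red_only shows the image in shades of red only, because the green and blue contributions have been zeroed.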
Figure 2: YCbCr color bands
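To make the color-space change concrete, the following Python sketch converts a single 8-bit RGB pixel to YCbCr. It assumes the ITU-R BT.601 studio-range coefficients, which we believe correspond to what rgb2ycbcr applies to 8-bit input; that correspondence is an assumption of this sketch, and the function name rgb_to_ycbcr is ours.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Assumes BT.601 studio-range scaling: Y lies in 16..235 and
    Cb, Cr lie in 16..240, with 128 meaning "no color difference".
    """
    y = 16 + (65.481 * r + 128.553 * g + 24.966 * b) / 255
    cb = 128 + (-37.797 * r - 74.203 * g + 112.0 * b) / 255
    cr = 128 + (112.0 * r - 93.786 * g - 18.214 * b) / 255
    return round(y), round(cb), round(cr)


white = rgb_to_ycbcr(255, 255, 255)
black = rgb_to_ycbcr(0, 0, 0)
red = rgb_to_ycbcr(255, 0, 0)
```

Note that gray pixels (white and black here) carry all of their information in Y, with both chroma values sitting at the neutral 128. This is the separation of intensity from color that the sampling experiments rely on.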
We then move to the part of the lab where we experiment with sampling. First, we subsample the chroma bands of the YCbCr image. We use a ratio of 4:2:0, which removes every other row and every other column from the Cb and Cr bands. We then use two different upsampling schemes to restore the image. The first technique, replication, simply copies an adjacent retained value into the place of each pixel removed in the subsampling step. This is the simplest possible technique, and it can be improved upon by the second: linear interpolation. Using linear interpolation, we replace each missing pixel with the average of its adjacent retained pixels. When we compare the two upsampled images with the original, linear interpolation provides smoother color gradients in the reconstruction.
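The subsampling step and the first restoration scheme can be sketched as follows. This is a plain-Python illustration under the description above (drop every other row and column, then copy the kept value back); the function names and the 4x4 test plane are hypothetical.

```python
def subsample_420(plane):
    """Keep every other row and every other column of a chroma plane."""
    return [row[::2] for row in plane[::2]]


def upsample_replicate(small, height, width):
    """Rebuild a height x width plane by copying each kept sample
    into the positions that were dropped next to it."""
    return [
        [small[i // 2][j // 2] for j in range(width)]
        for i in range(height)
    ]


# Made-up 4x4 chroma plane with a smooth gradient.
cb = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
small = subsample_420(cb)
restored = upsample_replicate(small, 4, 4)
```

Here small holds only 4 of the original 16 values, and restored repeats each of them over a 2x2 block, which is why replication produces visibly blocky gradients on smooth regions.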
Figure 3: Subsampled Cb and Cr bands
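The second restoration scheme, linear interpolation, can be sketched in the same style. This Python version assumes the kept samples sit at the even rows and columns and reuses the last kept sample at the right and bottom edges, where there is only one neighbor to average; the edge handling and the name upsample_linear are our choices, not something stated in the lab.

```python
def upsample_linear(small, height, width):
    """Rebuild a height x width plane from samples kept at even rows/columns.

    Each missing value is the average of the nearest kept samples around it.
    At the right and bottom edges the last kept sample is reused.
    """
    rows, cols = len(small), len(small[0])

    def sample(i, j):
        # Clamp indices so edge positions reuse the last kept sample.
        return small[min(i, rows - 1)][min(j, cols - 1)]

    out = []
    for y in range(height):
        i, fy = divmod(y, 2)      # fy is 1 on a dropped row
        row = []
        for x in range(width):
            j, fx = divmod(x, 2)  # fx is 1 on a dropped column
            total = (sample(i, j) + sample(i, j + fx)
                     + sample(i + fy, j) + sample(i + fy, j + fx))
            row.append(total / 4)
        out.append(row)
    return out


# Samples kept from a made-up 4x4 gradient plane.
small = [[10, 30], [90, 110]]
smooth = upsample_linear(small, 4, 4)
```

On a smooth gradient the interior values are recovered exactly (20 between 10 and 30, for instance), and only the clamped edge column and row differ from the original.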
We then perform some analysis on the reconstructed images. Note that the upsampled image is rebuilt from half as many samples as the original (each of the Cb and Cr bands keeps only one quarter of its samples), yet the two are almost indistinguishable (Figure 5). This is because the YCbCr color space lets us exploit a weakness in human vision: the eye is most sensitive to changes in intensity, not to changes in color. RGB mixes intensity into all three of its bands, whereas YCbCr isolates it in the Y band, which we leave untouched. So, even though 4:2:0 means 1/2 resolution in both the horizontal and vertical directions for the chroma bands, the difference is nearly imperceptible even with the naive upsampling method of simple replication.
The differences between the original and upscaled versions can be quantified by the mean squared error equation:
Figure 4: MSE equation for an image
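For reference, the mean squared error of one band is the average of the squared per-pixel differences: MSE = (1 / (h * w)) * sum over all pixels (i, j) of (A(i, j) - B(i, j))^2, where A is the original band and B the reconstructed one. A minimal Python sketch of that computation, with a made-up 2x2 example rather than the lab data:

```python
def mse(original, reconstructed):
    """Mean squared error between two equally sized 2-D planes."""
    height, width = len(original), len(original[0])
    total = sum(
        (a - b) ** 2
        for row_a, row_b in zip(original, reconstructed)
        for a, b in zip(row_a, row_b)
    )
    return total / (height * width)


original = [[10, 20], [30, 40]]
replicated = [[10, 10], [10, 10]]   # top-left sample copied everywhere
error = mse(original, replicated)
```

A band that was never subsampled compares equal to itself pixel for pixel, so its MSE is exactly zero.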
Band | MSE
Y | 0.000000
Cb | 0.291327
Cr | 25.119344
Table 1: MSE across YCbCr bands
As displayed in Table 1, there is no error between the Y band of the original and that of the 4:2:0 upsampled image, because we did not subsample anything from that band. There is a relatively large error in the Cr band; however, it is within our allowable margin of error, because the overall effect the Cr band has on the human eye's perception of the image is negligible compared to that of the Y and Cb bands. See the figure below for an example:
Figure 5: The difference between the interpolation methods is almost imperceptible.
We then compare the storage required by the 4:2:0 image with that of the original. The compression ratio is simply the ratio of the number of samples stored for the original to the number stored for the subsampled image:
original/subsampled = 2
or a 2:1 compression ratio: a modest compression with essentially no quality loss.
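The factor of 2 follows from counting stored samples: the full image keeps three values per pixel, while 4:2:0 keeps one Y value per pixel plus one Cb and one Cr value per 2x2 block, or 1.5 values per pixel. A short Python check of that arithmetic, using a hypothetical 640x480 image (the dimensions of the lab image are not given):

```python
def samples_stored(width, height, chroma_step):
    """Count stored samples when each chroma band keeps every
    `chroma_step`-th row and column (1 = no subsampling, 2 = 4:2:0)."""
    luma = width * height
    chroma_w = -(-width // chroma_step)    # ceiling division
    chroma_h = -(-height // chroma_step)
    return luma + 2 * chroma_w * chroma_h


full = samples_stored(640, 480, 1)   # 3 samples per pixel
sub = samples_stored(640, 480, 2)    # 1.5 samples per pixel
ratio = full / sub
```

The count covers raw samples only; it says nothing about the size of the JPEG file on disk, which also depends on the entropy coding applied afterwards.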
It is clear from this presentation that by use of naive methods such as chroma subsampling followed by interpolated upsampling, we can achieve a reasonable degree of data compression with minimal quality loss. Given this fact, it would be prudent to explore more aggressive lossy compression methods, in which a higher degree of compression is achieved at the cost of a greater loss of quality.