Image representation

The most fundamental aspect of digital image processing is probably the representation of digital images. An image is represented by an M x N matrix of pixel values, each of which indicates the light intensity of the picture at a particular point. Now say we want to take a real picture and represent it using this matrix of pixels. Take the example of a binary image, where each pixel can have only two values: 0 representing no illumination (i.e. black) and 1 representing maximum illumination (i.e. white). A threshold illumination value is chosen to decide when a pixel value goes from 0 to 1. The pixel values carry no information about intermediate illumination levels, and hence this results in a poor representation of the image.

To improve on this, more bits must be used to represent each pixel value; the number of bits can be chosen according to the number of illumination levels required. The grayscale representation, as it is called, that we used in our implementations assigned 8 bits to each pixel, which allowed the intensity level of each pixel to range from 0 to 255 (2^8 levels). By increasing the number of gray levels, the quality of the image is greatly enhanced and the transitions between different levels of illumination become more subtle (i.e. closer to the original image).
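As a small illustration of the idea, here is a pure-Python sketch of binary thresholding and n-bit quantization; the threshold of 128 and the sample pixel row are made-up values for illustration:

```python
def to_binary(pixel, threshold=128):
    """Map an 8-bit intensity to 0 (black) or 1 (white)."""
    return 1 if pixel >= threshold else 0

def quantize(pixel, bits):
    """Keep only the top `bits` bits of an 8-bit intensity,
    giving 2**bits distinct gray levels."""
    return pixel >> (8 - bits)

row = [0, 30, 90, 127, 128, 200, 255]      # one row of 8-bit pixels
binary = [to_binary(p) for p in row]       # 2 levels: loses all mid-tones
gray4 = [quantize(p, 4) for p in row]      # 16 levels: smoother transitions
print(binary)   # [0, 0, 0, 0, 1, 1, 1]
print(gray4)    # [0, 1, 5, 7, 8, 12, 15]
```

Note how the binary row collapses every intensity below the threshold to 0, while even 4 bits already preserve the ordering of the mid-tones.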

Operations on images

Once the image is converted into a matrix of pixels, a number of operations such as multiplication, addition, subtraction and convolution can be performed on it. One of the most important operations utilized in digital image processing, however, is convolution. An example [2] of a 3 x 3 convolution is shown below:

 _          _        _          _       _          _
|a    b    c |      |+    -    + |     |            |
|            |      |            |     |            |
|d    e    f |   X  |-    +    + |  =  |     e'     |
|            |      |            |     |            | 
|g    h    i |      |+    +    - |     |            |
|_          _|      |_          _|     |_          _|

 where:

   e' = + a - b + c - d + e + f + g + h - i

In a convolution like the one above, the new value of the center pixel is computed from the pixel values at every location covered by the mask (in the case above, all 9 locations are used). The mask is then shifted by one pixel and the process is repeated until the entire output image matrix is generated [2].
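The procedure can be sketched in pure Python. The mask is the +/- pattern from the example above; note that, as described, the mask is applied without flipping, which is strictly a cross-correlation, though image-processing texts commonly call it convolution:

```python
def apply_3x3_mask(image, mask):
    """Slide a 3x3 mask over the image; each output pixel is the sum of
    products of the mask with the 9 pixels around it.  Border pixels,
    which lack a full 3x3 neighborhood, are left at 0 in this sketch."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            acc = 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    acc += mask[di + 1][dj + 1] * image[i + di][j + dj]
            out[i][j] = acc
    return out

# The +/- mask from the example above, applied to a made-up 3x3 image.
mask = [[ 1, -1,  1],
        [-1,  1,  1],
        [ 1,  1, -1]]
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
# Center pixel: +1 -2 +3 -4 +5 +6 +7 +8 -9 = 15
print(apply_3x3_mask(image, mask)[1][1])  # 15
```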

Filtering images

Filtering images is very similar to filtering signals. A low pass filter, in the context of images, is one that removes or reduces sharp changes between the values of adjacent pixels (so it essentially reduces noise in an image). High pass filters and blurs have analogous interpretations: a high pass filter accentuates sharp transitions such as edges, while a blur smears intensity across neighboring pixels. Below is a list of examples of very useful convolution masks that can be used to perform various operations on images [2]:


1) Low pass filters (coefficients are positive and are normalized) :

 _            _        _             _      
|0.1  0.1  0.1 |      |1/16  1/8  1/16|
|              |      |               |     
|0.1  0.2  0.1 |      |1/8   1/4  1/8 |    
|              |      |               |                  
|0.1  0.1  0.1 |      |1/16  1/8  1/16|                 
|_            _|      |_             _|               


2) High pass filters (coefficients sum to zero) :


 _            _        _             _      
| 0    -1    0 |      | 1    -2     1 |    
|              |      |               |     
|-1     4   -1 |      |-2     4    -2 |    
|              |      |               |                  
| 0    -1    0 |      | 1    -2     1 |                 
|_            _|      |_             _|     


3) Blurs :

   Horizontal            Vertical             Diagonal
 _            _        _            _       _           _
|  0    0    0 |      |  0    1    0 |     | 1    0    0 |
|              |      |              |     |             |
|  1    1    1 |      |  0    1    0 |     | 0    1    0 |
|              |      |              |     |             | 
|  0    0    0 |      |  0    1    0 |     | 0    0    1 |
|_            _|      |_            _|     |_           _|


4) Laplacian filters (used for edge enhancement, coefficients sum to zero) : 

 _            _        _            _                 
|  0   -1    0 |      |  1   -2    1 |                 
|              |      |              |                 
| -1    4   -1 |      | -2    4   -2 |                 
|              |      |              |                  
|  0   -1    0 |      |  1   -2    1 |                 
|_            _|      |_            _|       
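Any of the masks above can be tried with the same sliding-window computation. In the pure-Python sketch below (the flat 5 x 5 test image is made up), a normalized low pass mask leaves a flat region unchanged, while a Laplacian mask, whose coefficients sum to zero, maps it to zero:

```python
def apply_3x3_mask(image, mask):
    """Sum of products of the mask with each interior 3x3 neighborhood;
    border pixels are left at 0 in this sketch."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = sum(mask[di + 1][dj + 1] * image[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
    return out

lowpass = [[1/16, 1/8, 1/16],    # coefficients sum to 1
           [1/8,  1/4, 1/8 ],
           [1/16, 1/8, 1/16]]
laplacian = [[0, -1, 0],         # coefficients sum to 0
             [-1, 4, -1],
             [0, -1, 0]]

flat = [[10] * 5 for _ in range(5)]          # a flat gray region
print(apply_3x3_mask(flat, lowpass)[2][2])   # 10.0 (brightness preserved)
print(apply_3x3_mask(flat, laplacian)[2][2]) # 0    (no edges to enhance)
```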

With this basic knowledge one can start to process images, as we did in MATLAB. One optimization (in terms of time) that can be used to process images (especially large ones) is to take the 2 dimensional FFTs of the pixel matrix and the operator matrix, multiply them pointwise, and then take the 2 dimensional inverse FFT. This is a lot faster when images are large, as direct convolution of large matrices takes a very long time. Note that it is important when taking the FFTs to zero pad both matrices to at least the size of the full linear convolution (the sum of the two sizes minus one in each dimension; padding to twice the size of the larger matrix is a common, sufficient choice) to avoid getting a circular convolution. If this is done, the extra positions introduced by the padding need to be removed after the inverse FFT is taken.
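The FFT approach can be sketched as follows. Since plain Python has no built-in FFT, a naive 2-D DFT stands in for it here (the convolution theorem is the same, though none of the speed benefit survives at this toy size); in MATLAB the corresponding steps would use fft2, pointwise multiplication, and ifft2. The matrices here are padded to exactly the size of the full linear convolution, so nothing needs to be discarded afterwards:

```python
import cmath

def dft2(x, sign):
    """Naive 2-D discrete Fourier transform (sign=-1) or its
    unnormalized inverse (sign=+1)."""
    M, N = len(x), len(x[0])
    out = [[0j] * N for _ in range(M)]
    for u in range(M):
        for v in range(N):
            out[u][v] = sum(x[m][n] *
                            cmath.exp(sign * 2j * cmath.pi * (u * m / M + v * n / N))
                            for m in range(M) for n in range(N))
    return out

def pad(x, M, N):
    """Zero pad a matrix out to M x N."""
    out = [[0] * N for _ in range(M)]
    for i, row in enumerate(x):
        for j, val in enumerate(row):
            out[i][j] = val
    return out

def fft_convolve(a, b):
    """Linear 2-D convolution via the convolution theorem: pad both
    matrices to the full output size, transform, multiply pointwise,
    and transform back."""
    M = len(a) + len(b) - 1          # rows of the linear convolution
    N = len(a[0]) + len(b[0]) - 1    # cols of the linear convolution
    A = dft2(pad(a, M, N), -1)
    B = dft2(pad(b, M, N), -1)
    C = [[A[u][v] * B[u][v] for v in range(N)] for u in range(M)]
    c = dft2(C, +1)
    # Apply the 1/(M*N) inverse-DFT factor and drop floating round-off:
    return [[round((c[u][v] / (M * N)).real) for v in range(N)] for u in range(M)]

a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
print(fft_convolve(a, b))  # [[0, 1, 2], [1, 5, 4], [3, 4, 0]]
```

Padding any less than the sizes used here would make the pointwise product correspond to a circular convolution, wrapping values from one edge of the image around to the other.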


Last modified: Thu Dec 18 23:37:38 CST 1997