Step 1: Parsing the image

About the image

We should note first that we made a few assumptions about the scanned image, so that the problem stays focused on image recognition rather than on fixing everything that could go wrong. We assumed that the scan contained no extraneous noise, that its resolution was high enough to show detail (e.g. the numbers were not 2 pixels tall), and that the zip code was the five right-most characters on the last line. These assumptions are reasonable, since most envelopes tend to satisfy them anyway. Here is an envelope we used for testing, which we will also use to illustrate our function.

Finding the last line

The first thing we do is read in the file using Matlab's built-in imread function. After converting every pixel to either black or white and inverting the image (Matlab considers white data and black non-data, while humans tend to work the other way), we have a very large array in which the ink is the data.
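The black-or-white conversion and the inversion can be folded into a single thresholding step. The following is a minimal sketch in Python rather than Matlab, and it is not our original function: the name binarize_and_invert and the threshold of 128 are assumptions made only for illustration, and the image is a plain list of rows of grayscale values.

```python
def binarize_and_invert(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 0/1, with ink = 1.

    Pixels darker than `threshold` count as ink. The threshold value is an
    assumption for illustration, not taken from the original Matlab code.
    """
    return [[1 if pixel < threshold else 0 for pixel in row] for row in gray]


# Tiny 3x4 "scan": 255 is white paper, low values are dark ink.
gray = [
    [255, 255, 255, 255],
    [255,  10,  30, 255],
    [255, 255, 255, 255],
]
binary = binarize_and_invert(gray)
```

After this step the dark ink pixels are 1 and the paper is 0, so summing a row or column counts its ink pixels.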

We then sum each pixel row, which essentially counts the data pixels in that row. Rows with a zero sum have no data in them. We start at the bottom of the image and move up, skipping any zero-sum rows, until we find a row with data in it; that row is the bottom of the last line. Then we keep moving up until we find a row with no data in it; the row just below it is the top of the last line. Our code also handles special cases, such as the bottom row of the image containing data. Now we have defined the top and bottom of the last line on the envelope, as such:


We save the highlighted area into a separate matrix, just to be clever.
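The bottom-up row scan described above can be sketched as follows. This is a Python illustration under stated assumptions, not our Matlab code: find_last_line is a hypothetical name, and the image is assumed to be a list of rows of 0/1 values with ink stored as 1.

```python
def find_last_line(binary):
    """Return (top, bottom) row indices, inclusive, of the lowest text line.

    `binary` is a list of rows of 0/1 values with ink = 1.
    Returns None if the image contains no ink at all.
    """
    row_sums = [sum(row) for row in binary]

    # Walk up from the bottom, skipping blank rows.
    bottom = len(row_sums) - 1
    while bottom >= 0 and row_sums[bottom] == 0:
        bottom -= 1
    if bottom < 0:
        return None  # completely blank image

    # Keep walking up while the row above still contains ink.
    top = bottom
    while top > 0 and row_sums[top - 1] > 0:
        top -= 1
    return top, bottom


binary = [
    [0, 1, 1, 0, 0],  # an earlier line of text
    [0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1],  # last line, first row
    [1, 0, 1, 0, 1],  # last line, second row
    [0, 0, 0, 0, 0],  # blank margin at the bottom
]
top, bottom = find_last_line(binary)
last_line = binary[top:bottom + 1]  # the "separate matrix"
```

Because the loops check the image bounds, the same code covers the cases where the bottom row of the image already contains data or where the last line runs up to the top edge.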

Finding the zip code digits

Finding the zip code once we have the last line is remarkably similar: we sum each column, then search for areas of data bounded by areas with no data, each time saving the digit we peel off into its own matrix. It's that easy.
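The column scan can be sketched the same way. Again this is a Python illustration rather than our Matlab function; split_into_glyphs is a hypothetical name, and the line is assumed to be a list of rows of 0/1 values with ink stored as 1.

```python
def split_into_glyphs(line):
    """Cut a text line (rows of 0/1, ink = 1) into one matrix per glyph.

    A glyph is a run of adjacent columns that contain ink, bounded by
    blank columns or by the edge of the image.
    """
    if not line:
        return []
    width = len(line[0])
    col_sums = [sum(row[c] for row in line) for c in range(width)]

    glyphs = []
    start = None  # first column of the current run, if we are inside one
    # Iterate one step past the last column so a run touching the right
    # edge is still closed and saved.
    for c in range(width + 1):
        has_ink = c < width and col_sums[c] > 0
        if has_ink and start is None:
            start = c
        elif not has_ink and start is not None:
            glyphs.append([row[start:c] for row in line])
            start = None
    return glyphs


line = [
    [1, 0, 1, 1, 0, 0, 1],
    [1, 0, 1, 0, 0, 0, 1],
]
glyphs = split_into_glyphs(line)
zip_digits = glyphs[-5:]  # the five right-most characters, if present
```

Since we assume the zip code is the five right-most characters, taking the last five glyphs is enough. Note that this kind of split relies on at least one blank column between characters, so digits that touch would come out as a single glyph.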

Here is how our parse function would parse the above line:


For good measure, here is our Matlab function which accomplished all of these amazing feats.

