Cleaning the Skeleton

The thinned image produced by Matlab's bwmorph function has several problems. The most common one is short stubs (spurs) sticking off the main skeleton, as seen in the picture below. Another common problem is slightly incomplete skeletonization. For example, we sometimes see something like this:

0000
1110
0010
0010

The corner 1 can be erased without losing connectivity, since its two neighbors still touch diagonally (8-connectivity). Other one-pixel defects are also possible, and they are corrected as well. Skeleton spurs, and lines that stand all by themselves, are erased if they are shorter than a critical length that is passed to cleanskel.c. The latter action has very good de-noising qualities, since any specks that were left within the rectangle of the object are erased.
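
The corner rule above can be sketched in C as follows. This is only an illustration under stated assumptions, not the code in cleanskel.c: the function name remove_corner_pixels, the row-major 0/1 image layout, and the exact template (two perpendicular 4-neighbors set, the opposite two 4-neighbors and the diagonal between them clear) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Read pixel (r, c) of a row-major 0/1 image; out-of-bounds reads as 0. */
static int px(const unsigned char *img, int rows, int cols, int r, int c)
{
    if (r < 0 || r >= rows || c < 0 || c >= cols)
        return 0;
    return img[r * cols + c] != 0;
}

/* Hypothetical helper, not taken from cleanskel.c.
 * Erases every set pixel that has two perpendicular 4-neighbours set while
 * the other two 4-neighbours and the diagonal between them are clear.
 * The two set neighbours stay diagonally connected, so 8-connectivity is
 * preserved. Returns the number of pixels erased. */
int remove_corner_pixels(unsigned char *img, int rows, int cols)
{
    /* 4-neighbour offsets in clockwise order: N, E, S, W. */
    static const int dr[4] = {-1, 0, 1, 0};
    static const int dc[4] = {0, 1, 0, -1};
    int r, c, k, removed = 0;

    assert(img != NULL);
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            if (!img[r * cols + c])
                continue;
            for (k = 0; k < 4; k++) {
                int a = k, b = (k + 1) % 4;             /* arms of the corner */
                int oa = (k + 2) % 4, ob = (k + 3) % 4; /* opposite side */
                if (px(img, rows, cols, r + dr[a], c + dc[a]) &&
                    px(img, rows, cols, r + dr[b], c + dc[b]) &&
                    !px(img, rows, cols, r + dr[oa], c + dc[oa]) &&
                    !px(img, rows, cols, r + dr[ob], c + dc[ob]) &&
                    !px(img, rows, cols, r + dr[oa] + dr[ob],
                        c + dc[oa] + dc[ob])) {
                    img[r * cols + c] = 0; /* redundant corner pixel */
                    removed++;
                    break;
                }
            }
        }
    }
    return removed;
}
```

Run on the 4x4 pattern shown above, this sketch erases only the corner pixel in the second row, third column, and leaves the rest of the stroke untouched.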

The cleaning function is rather effective, but it does have some problems. The tail-length parameter, which the function uses to decide whether a tail is short enough to be chopped, is a rather arbitrary number. Of course, for large characters (lots of pixels) the tail length can be longer, since we can afford to chop off more. For very thick characters (especially printed fonts), the tail-length parameter must increase significantly to clean the skeleton well. But here is the catch: if the tail length is too long, the function chops off parts of the actual skeleton, not just the spurs! This is really obvious in the case of a Z with a cross bar: the bars can easily be erased with a "normal" tail-length parameter of 0.25*(height of the image). A hand-written Z can look like a 2, so confusion can arise if the bar is erased.
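
To make the tail-length trade-off concrete, here is a minimal C sketch of length-limited spur removal. It is not the algorithm in cleanskel.c; the names prune_spurs, is_tip and max_len, and the "peel from the tip" strategy, are assumptions chosen for illustration. It assumes a row-major image holding only 0 and 1, already thinned to a one-pixel-wide skeleton. Starting at each tip, it peels pixels one at a time until it reaches a pixel that is no longer a tip (a junction) or runs out of pixels (an isolated line). If more than max_len pixels were peeled, the branch is put back.

```c
#include <assert.h>
#include <stdlib.h>

/* Clockwise 8-neighbourhood offsets: N, NE, E, SE, S, SW, W, NW. */
static const int DR[8] = {-1, -1, 0, 1, 1, 1, 0, -1};
static const int DC[8] = {0, 1, 1, 1, 0, -1, -1, -1};

/* Read pixel (r, c); out-of-bounds reads as 0. */
static int at(const unsigned char *img, int rows, int cols, int r, int c)
{
    if (r < 0 || r >= rows || c < 0 || c >= cols)
        return 0;
    return img[r * cols + c] != 0;
}

/* A "tip" (assumed definition): a set pixel whose set neighbours form at
 * most one contiguous arc of at most three pixels around it. */
static int is_tip(const unsigned char *img, int rows, int cols, int r, int c)
{
    int k, count = 0, arcs = 0;

    if (!at(img, rows, cols, r, c))
        return 0;
    for (k = 0; k < 8; k++) {
        int cur = at(img, rows, cols, r + DR[k], c + DC[k]);
        int nxt = at(img, rows, cols, r + DR[(k + 1) % 8], c + DC[(k + 1) % 8]);
        count += cur;
        arcs += (!cur && nxt); /* count 0 -> 1 transitions around the ring */
    }
    return count <= 3 && arcs <= 1;
}

/* Hypothetical sketch: erase branches and isolated lines of at most
 * max_len pixels. Returns the number of pixels erased, or -1 if the
 * scratch buffer cannot be allocated. */
int prune_spurs(unsigned char *img, int rows, int cols, int max_len)
{
    int *path, r, c, erased = 0;

    assert(img != NULL && max_len >= 0);
    path = malloc(((size_t)max_len + 1) * sizeof *path);
    if (path == NULL)
        return -1;
    for (r = 0; r < rows; r++) {
        for (c = 0; c < cols; c++) {
            int len = 0, cr = r, cc = c, k;
            int found = is_tip(img, rows, cols, r, c);

            while (found && len <= max_len) {
                img[cr * cols + cc] = 0;     /* tentatively peel the tip */
                path[len++] = cr * cols + cc;
                found = 0;
                for (k = 0; k < 8 && !found; k++) {
                    if (is_tip(img, rows, cols, cr + DR[k], cc + DC[k])) {
                        cr += DR[k];         /* the peel exposed a new tip */
                        cc += DC[k];
                        found = 1;
                    }
                }
            }
            if (len > max_len) {
                while (len > 0)              /* too long: put the branch back */
                    img[path[--len]] = 1;
            } else {
                erased += len;               /* short enough: keep it erased */
            }
        }
    }
    free(path);
    return erased;
}
```

With the rule quoted above, a caller would pass something like max_len = (int)(0.25 * rows). The catch is easy to reproduce with this sketch: on a 10-pixel horizontal stroke carrying a 2-pixel stub, max_len = 3 removes only the stub, while max_len = 12 erases the stub and then the whole stroke, because the stroke itself is now an isolated line shorter than the limit.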