| 
Methods
Basic Structure of Algorithm Used
 Our goal is to segment a 512x512 pixel image
into three classes: text, picture, and background.  The image is divided
into 256 blocks of 32x32 pixels, and each block is classified.  If
a 32x32 block cannot be classified, the block size is changed to achieve
accurate segmentation.
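 The tiling step can be sketched as follows.  This is a minimal illustration, not our actual implementation: the function name split_into_blocks and the list-of-rows image representation are assumptions made for the example.

```python
def split_into_blocks(image, block_size=32):
    """Map (block_row, block_col) to a block_size x block_size tile.

    `image` is a list of rows of pixel values. Its height and width are
    assumed to be multiples of `block_size` (true for 512x512 and 32).
    """
    height, width = len(image), len(image[0])
    blocks = {}
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            blocks[(top // block_size, left // block_size)] = [
                row[left:left + block_size]
                for row in image[top:top + block_size]
            ]
    return blocks


image = [[0] * 512 for _ in range(512)]  # placeholder image
blocks = split_into_blocks(image)
print(len(blocks))  # 256
```

 Indexing the blocks by (row, column) makes it easy to look up the neighbors of a block later, when context information is needed.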
 For each 32x32 block, we apply a one-level Daubechies wavelet
transform.  We collect the wavelet coefficients in the LH and HL bands
into a vector and use it to classify the block.
If the variance of this vector is zero, the block is classified as background.
If the variance lies near either extreme of its range, the block is classified
as text or picture; otherwise, the block is marked as undetermined.
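 The variance test can be sketched as below, under several assumptions that are not part of the description above.  The sketch uses the Haar wavelet (the simplest member of the Daubechies family, db1) because the wavelet order is not stated here.  The two threshold values are hypothetical placeholders, and the mapping of low variance to picture and high variance to text is also an assumption made only for illustration.  The two directional detail bands are concatenated, so the LH/HL naming convention does not affect the result.

```python
from statistics import pvariance

# Hypothetical placeholder thresholds, not values from this project.
LOW_THRESHOLD = 1.0
HIGH_THRESHOLD = 1000.0


def haar_detail_bands(block):
    """One-level 2D Haar (db1) transform of a square block with even side.

    Returns the two directional detail bands as flat lists.
    """
    n = len(block)
    row_diff, col_diff = [], []
    for r in range(0, n, 2):
        for c in range(0, n, 2):
            a, b = block[r][c], block[r][c + 1]
            d, e = block[r + 1][c], block[r + 1][c + 1]
            # Difference between the two rows: responds to horizontal edges.
            row_diff.append((a + b - d - e) / 2.0)
            # Difference between the two columns: responds to vertical edges.
            col_diff.append((a - b + d - e) / 2.0)
    return row_diff, col_diff


def classify_block(block, low=LOW_THRESHOLD, high=HIGH_THRESHOLD):
    """Label a block from the variance of its directional detail coefficients."""
    row_diff, col_diff = haar_detail_bands(block)
    variance = pvariance(row_diff + col_diff)
    if variance == 0:
        return "background"
    if variance <= low:   # ASSUMPTION: low extreme means picture
        return "picture"
    if variance >= high:  # ASSUMPTION: high extreme means text
        return "text"
    return "undetermined"
```

 The same function can be applied unchanged to 16x16 subblocks or to a merged 96x96 block, since it only requires a square block with an even side length.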
 An undetermined block is then divided into
four 16x16 subblocks.  We try to classify each of these subblocks
individually as text, background, picture, or undetermined.
If a subblock is still undetermined, we use context information to classify
it.  If the subblock has both text and picture blocks adjacent to it,
it lies at a boundary.  Classification cannot be improved by
merging it with surrounding blocks, since the merged block would
contain mixed types.  In this case, we classify the subblock as either
text or picture.  If the subblock does not have both text and picture
blocks adjacent to it, we assume that it is unclassifiable because the sample
size is too small.  We therefore merge the eight surrounding 32x32 blocks with
the 32x32 block containing the subblock, producing a block of size 96x96.
The subblock is then classified according to the variance of the 96x96 block. |
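 The context rule for an undetermined subblock can be sketched as follows.  The function names resolve_strategy and merged_region are hypothetical.  The caller decides which neighbors count as adjacent and passes their labels in.  Clipping the merged region at the image border is an assumption, since border handling is not described above.

```python
def resolve_strategy(neighbor_labels):
    """Choose how to handle an undetermined subblock from its neighbors' labels."""
    labels = set(neighbor_labels)
    if "text" in labels and "picture" in labels:
        # Boundary case: merging would mix types, so the subblock has to be
        # assigned to text or picture directly.
        return "boundary"
    # Otherwise the sample is assumed too small: merge with the neighborhood.
    return "merge"


def merged_region(image, block_row, block_col, block_size=32):
    """Return the 3x3 neighborhood of blocks around (block_row, block_col).

    For an interior 32x32 block this is a 96x96 region. ASSUMPTION: at the
    image border the region is clipped to the image and is therefore smaller.
    """
    height, width = len(image), len(image[0])
    top = max(0, (block_row - 1) * block_size)
    bottom = min(height, (block_row + 2) * block_size)
    left = max(0, (block_col - 1) * block_size)
    right = min(width, (block_col + 2) * block_size)
    return [row[left:right] for row in image[top:bottom]]
```

 In the merge case, the variance computed over the returned region is what decides the label of the subblock.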