Method of Image Binarization Using Histogram Modeling


Gaussian Model-based Image Binarization

Technical Challenge:

Conversion of gray-scale and color document images to a binary representation for text extraction


This technology is a new algorithm for the binarization and resolution expansion of gray-scale images of text. The process models the distribution of gray-scale intensities with five different Gaussian models. These models are then used to determine the output pixel values of the resulting binary image. Binarization of text images is an essential preprocessing step for many automated document content extraction processes.
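
The description above does not spell out how the five Gaussian models are fitted or how they map to output pixels, so the following is only a minimal sketch of the general idea, not the patented method. It assumes an 8-bit gray-scale image, fits a five-component Gaussian mixture to the intensity histogram with plain expectation-maximization, and then applies a hypothetical decision rule: components whose mean lies below the midpoint of the darkest and brightest component means are treated as ink. The function names (`fit_histogram_gmm`, `binarize`) and that rule are illustrative assumptions, and resolution expansion is not covered.

```python
import numpy as np

LEVELS = np.arange(256, dtype=np.float64)  # all possible 8-bit gray levels


def _posteriors(weights, means, variances):
    """Posterior probability of each mixture component at every gray level."""
    diff = LEVELS[:, None] - means[None, :]
    log_pdf = -0.5 * (diff ** 2 / variances + np.log(2.0 * np.pi * variances))
    log_joint = np.log(weights) + log_pdf
    log_joint -= log_joint.max(axis=1, keepdims=True)  # numerical stability
    post = np.exp(log_joint)
    return post / post.sum(axis=1, keepdims=True)


def fit_histogram_gmm(image, k=5, iters=100):
    """Fit a k-component 1-D Gaussian mixture to the image histogram with EM."""
    counts = np.bincount(image.ravel(), minlength=256).astype(np.float64)
    total = counts.sum()
    # Start the means at evenly spaced quantiles of the histogram.
    cdf = np.cumsum(counts) / total
    means = np.searchsorted(cdf, (np.arange(k) + 0.5) / k).astype(np.float64)
    grand_mean = (LEVELS * counts).sum() / total
    grand_var = ((LEVELS - grand_mean) ** 2 * counts).sum() / total
    variances = np.full(k, max(grand_var, 1.0))
    weights = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: component posteriors per gray level, weighted by pixel counts.
        w = _posteriors(weights, means, variances) * counts[:, None]
        # M-step: update mixture weights, means, and variances.
        nk = w.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (w * LEVELS[:, None]).sum(axis=0) / nk
        variances = (w * (LEVELS[:, None] - means) ** 2).sum(axis=0) / nk
        variances = np.maximum(variances, 1.0)  # floor avoids collapsed components
    return weights, means, variances


def binarize(image, k=5, iters=100):
    """Return a 0/255 image: 0 for pixels judged to be ink, 255 for background."""
    image = np.asarray(image, dtype=np.uint8)
    weights, means, variances = fit_histogram_gmm(image, k, iters)
    post = _posteriors(weights, means, variances)
    # Assumed rule (not from the source): dark components count as ink.
    significant = weights > 1e-6
    split = 0.5 * (means[significant].min() + means[significant].max())
    ink_prob = post[:, means < split].sum(axis=1)
    lookup = np.where(ink_prob > 0.5, 0, 255).astype(np.uint8)
    return lookup[image]  # map every pixel through the 256-entry lookup table
```

Because the mixture is fitted to the 256-bin histogram rather than to individual pixels, the cost of each EM iteration does not depend on image size, and the final decision reduces to a lookup table indexed by gray level.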

Demonstration Capability:

A demonstration is available.

Potential Commercial Application(s):

This algorithm would serve as a preprocessing step for a number of automated document content extraction processes, such as script ID, language ID, Optical Character Recognition (OCR), signature block verification, and others. The OCR application probably represents the largest market.

Patent Status:

Issued - United States Patent Number 6,941,013.

Reference Number: 1235

If you are interested in exploring this technology further, please express your interest in writing to the:

National Security Agency
NSA Technology Transfer Program
9800 Savage Road, Suite 6541
Fort George G. Meade, Maryland 20755-6541


Date Posted: Jan 15, 2009 | Last Modified: Jan 15, 2009 | Last Reviewed: Jan 15, 2009