Aliases: Aladdin Triple-Pass Tagger, "AVDT", "Aladdin Visual Document Tagger"
Technical Challenge: The objective of this invention is to find and mark up various identifiers embedded in text, including culturally specific personal names from non-Western cultures, to match those identifiers against known identifiers when possible, and to mark such matches.
Description: This invention provides fast and accurate location and markup of various identifiers embedded in text. The Aladdin Tagger takes a knowledge-based approach to tagging objects in text, using lexical, contextual, and morphological information. The identifiers it handles include culturally specific personal names from non-Western cultures, where the native script is not the script of the text. When an identifier matches a known identifier, a visual indication of the match is given for convenient access to further information.
The Aladdin Tagger is very fast over sections of the input that do not include Arabic roots. Where roots exist, it recognizes them effectively and accurately because of the strengths of the Aladdin routines and embedded knowledge bases, and it applies linguistically appropriate rules to determine the bounds of the full-text name. It can then directly apply the Aladdin Name Matcher to determine which candidate names are known within Aladdin and are therefore likely to be of increased interest to the reader. The Aladdin Tagger is case independent and does not depend on a full parse of the background text or on much grammatical structure. Within the background, it allows largely arbitrary strings inside multipart (foreground) names.
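The flow described above (skip text without roots, expand around a recognized root to find the name bounds, then check the candidate against known names) can be illustrated with a minimal Python sketch. This is not the Aladdin implementation, which is not published here. The root list, the connector list, the known-name list, the boundary rule, and the function name tag_names are all hypothetical, and an exact-match lookup stands in for the Aladdin Name Matcher.

```python
import re

# Hypothetical toy knowledge bases (assumptions for illustration only).
NAME_ROOTS = {"ahmad", "hassan", "khalid", "mahmoud"}
CONNECTORS = {"al", "el", "bin", "ibn", "abu", "abd"}
KNOWN_NAMES = {("khalid", "bin", "ahmad")}


def _adjacent(text, left, right):
    """True if only spaces or hyphens separate two (word, start, end) tokens."""
    return text[left[2]:right[1]].strip(" -") == ""


def tag_names(text):
    """Return (name_text, start, end, is_known) for each candidate name."""
    tokens = [(m.group(), m.start(), m.end())
              for m in re.finditer(r"[A-Za-z]+", text)]
    parts = NAME_ROOTS | CONNECTORS
    results = []
    i = 0
    while i < len(tokens):
        # Fast path: tokens that are not roots are skipped with one lookup.
        if tokens[i][0].lower() not in NAME_ROOTS:
            i += 1
            continue
        # Extend left over connectors that directly precede the root.
        start = i
        while (start > 0
               and tokens[start - 1][0].lower() in CONNECTORS
               and _adjacent(text, tokens[start - 1], tokens[start])):
            start -= 1
        # Extend right over connectors and further roots.
        end = i
        while (end + 1 < len(tokens)
               and tokens[end + 1][0].lower() in parts
               and _adjacent(text, tokens[end], tokens[end + 1])):
            end += 1
        # A name should not end on a dangling connector.
        while end > i and tokens[end][0].lower() in CONNECTORS:
            end -= 1
        # Case-independent key for the known-name lookup.
        key = tuple(t[0].lower() for t in tokens[start:end + 1])
        span_start, span_end = tokens[start][1], tokens[end][2]
        results.append((text[span_start:span_end], span_start, span_end,
                        key in KNOWN_NAMES))
        i = end + 1
    return results


if __name__ == "__main__":
    sample = "Report: KHALID BIN AHMAD met Mahmoud al-Hassan, then left."
    for name, start, end, known in tag_names(sample):
        print(name, start, end, "known" if known else "unknown")
```

In this sketch, no parsing of the surrounding text is attempted: non-root tokens cost a single set lookup, and the more expensive boundary logic runs only around a recognized root, which mirrors the speed claim in spirit only.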
Demonstration Capability: Demonstration is available upon request.
Potential Commercial Application(s): Relevant general industry areas include Information Retrieval, Document Management, and Text Mining.
Patent Status: Issued: United States Patent Number 7,539,611 (Updated).
Reference Number: 1380
If you are interested in exploring this technology further, please express your interest in writing to the:
National Security Agency
Date Posted: Jan 15, 2009 | Last Modified: Jan 15, 2009 | Last Reviewed: Jan 15, 2009