Aladdin Name Matcher - Name Matching by Normalization of Both Query and Data
Aliases: Aladdin Arabic Name Matcher
Technical Challenge: To support individuals who must search effectively for transliterated names within large data sets. The Aladdin Arabic Name Matcher, based on the Arabic Names Dictionary, requires no linguistic expertise and does not depend on any query language.
Description: This matching method allows the best available linguistic rules to be applied both to the parsing of query names and to the representation of the names to be matched in databases. Because the names to be searched are pre-processed in the same way as the query name, a query can be matched directly against a normalized index into the data, without generating large numbers of spelling variations at query time. The use of linguistic knowledge enables individuals with little linguistic or cultural knowledge of the query name to search and to interpret the results correctly.
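The idea of normalizing data and query alike, then matching against a normalized index, can be sketched roughly as follows. This is a minimal illustration only, not the patented method: the VARIANTS table is a hypothetical stand-in for a names dictionary, and the function names normalize, build_index, and lookup are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical variant table standing in for a names dictionary:
# each transliterated spelling maps to one canonical form.
VARIANTS = {
    "mohammed": "muhammad",
    "mohamed": "muhammad",
    "muhammad": "muhammad",
    "hussein": "husayn",
    "hussain": "husayn",
    "husayn": "husayn",
}


def normalize(name):
    """Split a full name into elements and map each to its canonical form."""
    elements = name.lower().replace("-", " ").split()
    # Elements missing from the table are kept unchanged (an assumption).
    return [VARIANTS.get(element, element) for element in elements]


def build_index(records):
    """Map each canonical element to the ids of the records containing it."""
    index = defaultdict(set)
    for record_id, name in records.items():
        for element in normalize(name):
            index[element].add(record_id)
    return index


def lookup(index, query):
    """Return ids of records sharing at least one normalized element with the query."""
    hits = set()
    for element in normalize(query):
        hits |= index.get(element, set())
    return hits


# The data is normalized once, at load time.
records = {1: "Mohammed Hussein", 2: "Mohamed Ali", 3: "Omar Hussain"}
index = build_index(records)

# The query is normalized the same way, so no spelling variants are generated.
print(sorted(lookup(index, "Muhammad Husayn")))
```

Because both sides pass through the same normalize step, the cost of handling spelling variation is paid once when the data is loaded rather than on every query.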
The invention consists of a normalization and matching strategy for complete names, which in its first application used a government-owned 1994 Arabic Names Dictionary to normalize both data and queries in the same way. Matches of query names against loaded data are ranked in a conventional, easily explained information retrieval style: according to how many elements of the query name match the target, adjusted for how frequently each matched name element occurs within a sample data set. If an unusual part of a name matches, the hit is ranked higher than if a common element matches.
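The ranking described above resembles inverse-frequency weighting as used in information retrieval. The sketch below is one possible reading under stated assumptions: the source does not give a formula, so the logarithmic weight, the treatment of unseen elements, and the names element_weights, score, and rank are all hypothetical. Names are assumed to be already normalized into lists of elements.

```python
import math
from collections import Counter


def element_weights(sample_names):
    """Weight each name element by its rarity in a sample of normalized names."""
    total = len(sample_names)
    counts = Counter(e for name in sample_names for e in set(name))
    # Assumed formula: rarer elements receive a larger weight.
    return {e: math.log(1 + total / c) for e, c in counts.items()}


def score(query, target, weights, default_weight):
    """Sum the weights of the elements shared by query and target."""
    matched = set(query) & set(target)
    return sum(weights.get(e, default_weight) for e in matched)


def rank(query, targets, weights):
    """Return the matching targets, best score first."""
    # Assumption: an element unseen in the sample is treated as rare.
    default_weight = max(weights.values(), default=1.0)
    scored = [(score(query, t, weights, default_weight), t) for t in targets]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [t for s, t in scored if s > 0]


# Illustrative sample data, not taken from any real data set.
sample = [
    ["muhammad", "husayn"],
    ["muhammad", "ali"],
    ["muhammad", "umar"],
    ["umar", "zubayr"],
]
weights = element_weights(sample)

targets = [["muhammad", "ali"], ["umar", "zubayr"], ["ali", "umar"]]
ranked = rank(["muhammad", "zubayr"], targets, weights)
print(ranked)
```

In this toy sample "muhammad" is common and "zubayr" is rare, so the target that shares only "zubayr" with the query ranks above the target that shares only "muhammad", and the target sharing no element is dropped.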
Demonstration Capability: A demonstration is available upon request.
Potential Commercial Application(s): Information retrieval in large document sets; record matching in structured data; mining records and document collections for Arabic names.
Patent Status: Issued: United States Patent Number 7,761,286 (Updated).
Reference Number: 1378
If you are interested in exploring this technology further, please express your interest in writing to the:
National Security Agency
Date Posted: Jan 15, 2009 | Last Modified: Jan 15, 2009 | Last Reviewed: Jan 15, 2009