Summary Text Extract Association Metric (STEAM)
Aliases: Summary Extract Metric, Summary Association Measure.
Technical Challenge: Several methods currently exist for automatically summarizing a document, but there is no universally accepted measure for evaluating and comparing them. If a particular machine-generated summarization method is commercially regarded as the best, STEAM could be used to rank its competitors against that established industry standard.
Description: STEAM is a new statistical algorithm for determining the amount of "agreement" between two textual summaries, each generated by extracting specific sentences from the same full-text document. STEAM preserves both the position of each sentence in the document and the order in which the sentences appear in the summary. The metric provides an objective, correlation-like measure of agreement between the two summary extracts on a normalized scale of -1 to +1. This scale is easy to interpret and allows convenient comparisons between distinct machine-generated summarization systems, or between machine-generated and human-generated summaries.
STEAM avoids the major problem of penalizing the evaluation score when the two summaries do not contain exactly the same sentences. It also circumvents the problems of subjective inputs and the lack of a baseline, because its value does not depend on a summary's content or on expert "relevance judgments" of that content.
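The listing does not disclose how STEAM itself is computed, so the following is only an illustrative stand-in, not the STEAM algorithm. It assumes each extract is represented as a list of sentence positions in the source document, given in the order the sentences appear in the summary, and it uses an ordinary Pearson correlation to show what a correlation-like score on a -1 to +1 scale looks like for such inputs. The function name `pearson_agreement` and the equal-length restriction are assumptions made for this sketch.

```python
from math import sqrt


def pearson_agreement(extract_a, extract_b):
    """Pearson correlation between two extracts of equal length.

    Each extract is a list of sentence positions in the source document,
    listed in summary order. This is a generic stand-in for a
    correlation-like score, NOT the STEAM algorithm.
    """
    if len(extract_a) != len(extract_b) or len(extract_a) < 2:
        raise ValueError("extracts must have the same length (at least 2)")

    n = len(extract_a)
    mean_a = sum(extract_a) / n
    mean_b = sum(extract_b) / n

    # Covariance and variances (unnormalized; the 1/n factors cancel).
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(extract_a, extract_b))
    var_a = sum((a - mean_a) ** 2 for a in extract_a)
    var_b = sum((b - mean_b) ** 2 for b in extract_b)

    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant extract")

    return cov / sqrt(var_a * var_b)


# Two extracts that share no sentence but pick neighbouring positions
# in the same order still score close to +1.
near_match = pearson_agreement([1, 4, 9], [2, 5, 10])

# The same positions in reversed summary order score close to -1.
reversed_order = pearson_agreement([1, 2, 3], [3, 2, 1])
```

In this sketch the first pair has no sentence in common yet agrees strongly, which mirrors the stated goal of not penalizing summaries for lacking exact sentence matches. How STEAM handles extracts of different lengths, or weighs position against order, is not described in the listing and is not modeled here.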
Demonstration Capability: This technology can be demonstrated readily by running software developed on a UNIX platform.
Potential Commercial Application(s): The primary utility of this new approach would be in evaluating two (or multiple pairs of) competing machine-generated textual summaries of a single full-text document.
Patent Status: A patent application has been filed with the USPTO.
Reference Number: 1250
If you are interested in exploring this technology further, please express your interest in writing to the:
National Security Agency
Date Posted: Jan 15, 2009 | Last Modified: Jan 15, 2009 | Last Reviewed: Jan 15, 2009