Summarizing news
Ben Brown-Steiner
Issue date: 4/21/08 Section: News
News summaries are currently used in some papers as a table of contents of sorts, giving readers an idea of what an article is about before reading the paper. They are also used on online news sites, where readers choose what to read based on a blurb that can be only a few words long.
Right now, these summaries are either written by people or produced by an automatic system that extracts major sentences from the article. The existing automatic systems do not actually summarize the article, and there is much room for improvement.
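To make the idea of extracting major sentences concrete, here is a minimal sketch of one common extractive approach: score each sentence by how frequent its words are in the article, then keep the top-scoring sentences verbatim. This is an illustration only, not a description of any particular newspaper's system; the function names and the tiny stop-word list are assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it", "that", "on", "for"}

def split_sentences(text):
    # Naive split: break after ., ! or ? when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def content_words(text):
    # Lowercase words with stop words removed.
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOP_WORDS]

def extract_summary(text, n=1):
    """Return the n highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        ws = content_words(sentence)
        # Average word frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in ws) / len(ws) if ws else 0.0

    top = sorted(range(len(sentences)),
                 key=lambda i: score(sentences[i]),
                 reverse=True)[:n]
    return [sentences[i] for i in sorted(top)]
```

Because the output is copied straight from the article, a system like this selects text rather than rewriting it, which is why it does not truly summarize.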
One of the major problems with automatic summarization programs is that the English language is especially complicated, with delicate intricacies and subtleties that are difficult to get an artificial intelligence to understand.
Luckily, news articles are written in a style that reduces vagaries and follows a more formal structure, which makes them easier for an AI program to understand.
Fetterman's program used a ranking system to choose which words were more important than others, and therefore which were the most important to include in a summary. The program linked words based on grammar and frequency of use, and then combined them into a short summary. Ideally, the result would be coherent and follow the rules of English.
Sometimes the summary was coherent; other times it was less so. The program might produce a sentence like "The students opted." that follows the rules of English but doesn't make sense in common usage.
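The article does not say how the ranking and linking were actually implemented. As a rough sketch of the general idea, the example below links each word to its neighbors in a sentence (a crude stand-in for grammatical links, and an assumption of this example) and then ranks words with a PageRank-style iteration, so that frequently used, well-connected words score highest. All function names and parameters here are hypothetical.

```python
import re
from collections import defaultdict

def build_links(text):
    """Link each word to its neighbor in a sentence, counting how often."""
    links = defaultdict(lambda: defaultdict(int))
    for sentence in re.split(r"[.!?]+", text.lower()):
        tokens = re.findall(r"[a-z']+", sentence)
        for a, b in zip(tokens, tokens[1:]):
            links[a][b] += 1
            links[b][a] += 1  # assumption: links are undirected
    return links

def rank_words(links, damping=0.85, iterations=50):
    """PageRank-style scores: a word ranks high if it is linked to high-ranking words."""
    words = list(links)
    if not words:
        return {}
    rank = {w: 1.0 / len(words) for w in words}
    for _ in range(iterations):
        new_rank = {}
        for w in words:
            # Each neighbor v passes on its rank in proportion to link weight.
            incoming = sum(rank[v] * links[v][w] / sum(links[v].values())
                           for v in links[w])
            new_rank[w] = (1 - damping) / len(words) + damping * incoming
        rank = new_rank
    return rank

def top_words(text, n=3):
    """Return the n highest-ranked words in the text."""
    rank = rank_words(build_links(text))
    return sorted(rank, key=rank.get, reverse=True)[:n]
```

Stringing together only the top-ranked words is what can yield a grammatical but empty result such as "The students opted.": the ranking captures which words matter, not whether the combination says anything.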
To help refine the program, Fetterman created a web site that allows the user to tweak the algorithms of the summary system and see the results. The program produces a graph that shows the distribution of ranks in the article, as well as a visual aid for seeing which words are linked to other words.
This web site turned out to be a great tool for understanding how the program links words together. It emerged as a surprise product that can be used by others in their own work as automatic article summary programs are improved in the future.

