We are slowly making progress on a “tagger” that uses methods from computational linguistics to identify and tag named entities in arbitrary ancient texts. One can imagine various efforts at text visualization resulting from that initial stage of disambiguation.
From the Humanist list comes word of 10×10, a visualization service that picks out the most prominent words in top news stories and displays matching images by the hour, day, month, and year. How it works:
Every hour, 10×10 scans the RSS feeds of several leading international news sources, and performs an elaborate process of weighted linguistic analysis on the text contained in their top news stories. After this process, conclusions are automatically drawn about the hour’s most important words. The top 100 words are chosen, along with 100 corresponding images, culled from the source news stories. At the end of each day, month, and year, 10×10 looks back through its archives to conclude the top 100 words for the given time period. In this way, a constantly evolving record of our world is formed, based on prominent world events, without any human input.
Currently, 10×10 gathers its data from the following news sources:
* Reuters World News
* BBC World Edition
* New York Times International News
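
The description above does not say what the "weighted linguistic analysis" actually involves, so the following is only a rough sketch of the general idea, not 10×10's method. It assumes a plain weighted word count: the stopword list, the per-source `weights`, and the function names `score_words`, `top_n`, and `roll_up` are all hypothetical, and the headlines are passed in directly rather than fetched from RSS feeds.

```python
import re
from collections import Counter

# Hypothetical stopword list; the real service's filtering is not described.
STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "for",
             "as", "at", "by", "with", "is", "are"}

def score_words(headlines, weights=None):
    """Score words from (source, text) pairs by weighted frequency.

    `weights` maps a source name to a multiplier (assumed, default 1.0).
    """
    weights = weights or {}
    scores = Counter()
    for source, text in headlines:
        w = weights.get(source, 1.0)
        for word in re.findall(r"[a-z']+", text.lower()):
            # Skip stopwords and very short tokens.
            if word not in STOPWORDS and len(word) > 2:
                scores[word] += w
    return scores

def top_n(scores, n=100):
    """Return the n highest-scoring words, best first."""
    return [word for word, _ in scores.most_common(n)]

def roll_up(period_scores):
    """Combine hourly scores into one score table for a day, month, or year."""
    total = Counter()
    for scores in period_scores:
        total.update(scores)  # adds scores word by word
    return total

# Two made-up "hours" of headlines, for illustration only.
hour1 = score_words([
    ("Reuters", "Election results announced in Iraq"),
    ("BBC", "Iraq election turnout high"),
])
hour2 = score_words([
    ("NYT", "Earthquake strikes coast"),
    ("BBC", "Earthquake relief begins; election delayed"),
])
day = roll_up([hour1, hour2])
print(top_n(day, 3))
```

Keeping the hourly scores and summing them afterwards is one simple way to "look back through the archives" for a longer period; a real system would presumably also need to pair each word with an image from the source stories, which this sketch leaves out.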