Posted on behalf of Ben Gracy at the University of Denver: an article on an assisted transcription system that uses OCR. It sounds fascinating.
*Edit: elsewhere the article refers to “ancient documents and manuscripts”, which suggests that the system was developed for handwritten as well as printed materials, although the word “handwritten” itself doesn’t appear in the article.*
Traditional Optical Character Recognition (OCR) systems produce transcriptions with many errors that have to be corrected afterwards. State, by contrast, is a transcription system that integrates a series of tools for each stage of the work: the image is processed to remove noise and clean up the original, the page structure is detected, the text is recognised, and mistakes can be corrected quickly and easily with interactive tools such as an electronic pen applied directly to the text. Andrés Marzal, one of the researchers in the project, explains: “It is a practical solution to the problem of a supervised transcription, since it shortens the most time-consuming phase, that is, editing the automatic transcription so that it is true to the original”.
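To give a feel for the first two stages mentioned above (cleaning up the image and detecting page structure), here is a minimal Python sketch. It is not State's code, and the article doesn't say which algorithms State uses. The sketch assumes a 3x3 median filter for noise removal and a simple row-by-row ink count for finding text lines; the function names `median_denoise` and `find_text_lines` and the toy `page` are made up for illustration.

```python
from statistics import median

def median_denoise(img):
    """3x3 median filter on a grid of 0/1 pixels (1 = ink); border pixels are copied unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out

def find_text_lines(img, min_ink=1):
    """Return (first_row, last_row) bands of consecutive rows holding at least min_ink ink pixels."""
    bands, start = [], None
    for y, row in enumerate(img):
        has_ink = sum(row) >= min_ink
        if has_ink and start is None:
            start = y                      # a new band begins
        elif not has_ink and start is not None:
            bands.append((start, y - 1))   # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom of the page
        bands.append((start, len(img) - 1))
    return bands

# Toy "scan": two thick text lines plus one stray speck of noise on row 4.
page = [[int(c) for c in row] for row in [
    "00000000",
    "01111110",
    "01111110",
    "00000000",
    "00010000",
    "00000000",
    "01111110",
    "01111110",
    "00000000",
]]

clean = median_denoise(page)
print(find_text_lines(page))   # [(1, 2), (4, 4), (6, 7)]  the speck looks like a line
print(find_text_lines(clean))  # [(1, 2), (6, 7)]
```

Without the cleaning step the stray speck is mistaken for a third line of text, which is the kind of error that would otherwise have to be fixed by hand later. A real system working on manuscripts would need far more robust methods than this, but the order of the stages is the same as in the description above.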