Today Wolfgang Schibel (of the phenomenal project CAMENA – Lateinische Texte der Frühen Neuzeit) mentioned to me the Wolfenbüttel images of the Leiden Calepinus (ca. 1650). This is a fascinating work for many reasons, but my first reaction on seeing the plethora of character sets used in each entry was: what a nightmare for a data entry team! Here’s a sample to show what I mean:
Perhaps for projects like this one, and for the nomenclatores, one might set up a database of entries with a front end for entering each language separately. One member of the team would be the specialist who enters just the Hebrew text, another the Greek, ktl. Each team member thus works in only one or two encodings (the front end should make switching between them easy) and enters only a small part of each entry. Later, the database could be compiled into TEI-encoded XML.
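To make the idea a little more concrete, here is a minimal sketch in Python of what such a database and the later compilation step might look like. Everything in it is an assumption for illustration: the table names (`entry`, `gloss`), the helper functions, the team member names, and the sample words, which are ordinary dictionary equivalents and not a transcription from the Calepinus. The output uses element names from the TEI dictionary module (`entry`, `form`, `orth`, `cit`, `quote`), but it is a simplified fragment, not a validated TEI document.

```python
import sqlite3
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the attribute xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def make_db():
    """Create an in-memory database (hypothetical schema, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE entry (
            id INTEGER PRIMARY KEY,
            headword TEXT NOT NULL          -- the Latin lemma
        );
        CREATE TABLE gloss (
            entry_id INTEGER NOT NULL REFERENCES entry(id),
            lang TEXT NOT NULL,             -- language tag, e.g. 'grc', 'he'
            text TEXT NOT NULL,
            entered_by TEXT NOT NULL,       -- which specialist typed it
            PRIMARY KEY (entry_id, lang)    -- one gloss per language per entry
        );
    """)
    return conn


def add_gloss(conn, entry_id, lang, text, entered_by):
    """Record one specialist's contribution to one entry."""
    conn.execute(
        "INSERT INTO gloss (entry_id, lang, text, entered_by) VALUES (?, ?, ?, ?)",
        (entry_id, lang, text, entered_by),
    )


def compile_entry(conn, entry_id):
    """Assemble all glosses for one entry into a TEI-style <entry> string."""
    (headword,) = conn.execute(
        "SELECT headword FROM entry WHERE id = ?", (entry_id,)
    ).fetchone()
    entry = ET.Element("entry")
    form = ET.SubElement(entry, "form")
    ET.SubElement(form, "orth", {XML_LANG: "la"}).text = headword
    rows = conn.execute(
        "SELECT lang, text FROM gloss WHERE entry_id = ? ORDER BY lang",
        (entry_id,),
    )
    for lang, text in rows:
        cit = ET.SubElement(entry, "cit", {"type": "translation", XML_LANG: lang})
        ET.SubElement(cit, "quote").text = text
    return ET.tostring(entry, encoding="unicode")


# Illustrative data only: each specialist adds their own language separately.
conn = make_db()
conn.execute("INSERT INTO entry (id, headword) VALUES (1, 'liber')")
add_gloss(conn, 1, "grc", "βιβλίον", "greek_specialist")
add_gloss(conn, 1, "he", "ספר", "hebrew_specialist")
add_gloss(conn, 1, "de", "Buch", "german_specialist")
tei = compile_entry(conn, 1)
print(tei)
```

Keying each gloss on the entry and the language means the Hebrew and Greek specialists never touch each other's rows, and recording who entered what makes later proofreading by language straightforward. A real front end would sit on top of something like `add_gloss`, with one input form per language.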