June 22, 2006

Online tool searches early texts

University Library has received $39,000 in funding from its partners in the Committee on Institutional Cooperation (CIC), a consortium of major research libraries in the Midwest, to develop an online tool that will allow students to search texts published between 1500 and 1800 using modern English vocabulary. The project has also received funding from ProQuest’s Chadwyck-Healey publishing group, a leading producer of humanities databases.

The CIC CLI Virtual Modernization Tool, as it is called, aims to solve a fundamental problem associated with database searching. In general, a researcher conducting a keyword search can retrieve only text containing the exact word used in the query. This becomes a significant problem when the text dates from the mid-16th century and contains spellings no longer in use. For example:

Your duty is to lyue wel, to practyse good workes, to exercise al godlye actes, to lede a virtuous conuersacio…
(From A Christmas Bankette by Thomas Becon)

The tool being developed at Northwestern will allow the searcher to enter the keywords live, godly and conversation and pull up passages containing the early spellings lyue, godlye and conuersacio, among dozens or even hundreds of other variant spellings. Boolean and proximity searches combining terms will also be possible. The project involves mapping approximately 1 million early spellings to modern headwords and will eventually cover all spellings that occur more than once in full-text databases of early modern English. When complete, the Virtual Modernization Tool will allow students to search thousands of early texts more effectively with the English they use every day.
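The general idea of mapping early spellings to modern headwords can be illustrated with a minimal Python sketch. It is not the project's actual implementation. The function names, the three-entry sample map (taken from the spellings quoted above) and the simple Boolean AND matching are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical three-entry sample; the project describes a map of
# approximately 1 million early spellings to modern headwords.
VARIANT_TO_HEADWORD = {
    "lyue": "live",
    "godlye": "godly",
    "conuersacio": "conversation",
}


def build_headword_index(variant_to_headword):
    """Invert the map so each modern headword lists its early spellings."""
    index = defaultdict(set)
    for variant, headword in variant_to_headword.items():
        index[headword].add(variant)
    return index


def expand_query(keyword, index):
    """Return the keyword itself plus every early spelling mapped to it."""
    return {keyword} | index.get(keyword, set())


def search(passages, keywords, index):
    """Return passages that contain some spelling of every keyword (Boolean AND)."""
    hits = []
    for passage in passages:
        # Lowercase and split into alphabetic tokens for exact-word matching.
        tokens = set(re.findall(r"[a-z]+", passage.lower()))
        if all(tokens & expand_query(keyword, index) for keyword in keywords):
            hits.append(passage)
    return hits
```

In this sketch, calling search with the keywords live, godly and conversation would match the Becon passage quoted above, because each modern keyword is expanded to its early spellings before the text is compared.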

Proposed by Martin Mueller, professor of English and classics, the project is being developed in collaboration with Academic Technologies and University Library.