
Online Tool to Enable Search of Early Modern English Texts

May 9, 2006

EVANSTON, Ill. --- Northwestern University Library has received $39,000 in funding from its partners in the Committee on Institutional Cooperation (CIC), a consortium of major research libraries in the Midwest, to develop an online tool that will allow students to search texts published between 1500 and 1800 using modern English vocabulary. The project has also received funding from ProQuest's Chadwyck-Healey publishing group, a leading producer of humanities databases.

The CIC CLI Virtual Modernization Tool, as it is called, aims to solve a fundamental problem associated with database searching. In general, a keyword search retrieves only text that contains the exact word used in the query. This becomes a significant problem when the text dates from the mid-16th century and contains spellings no longer in use. For example:

Your duty is to lyue wel, to practyse good workes, to exercise al godlye actes, to lede a virtuous conuersacio…

(From A Christmas Bankette by Thomas Becon)

The tool being developed at Northwestern will allow the searcher to enter the keywords live, godly and conversation and pull up passages containing the early spellings lyue, godlye and conuersacio -- among dozens or even hundreds of other variant spellings. Boolean and proximity searches combining terms will also be possible. The project involves mapping approximately 1 million early spellings to modern headwords and will eventually cover all spellings that occur more than once in full-text databases of early modern English. When complete, the Virtual Modernization Tool will allow students to search thousands of early texts more effectively with the English they use every day.
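
The mapping of early spellings to modern headwords described above can be illustrated with a minimal Python sketch. It is not the project's actual implementation: the table VARIANT_TO_HEADWORD, the function names and the handful of sample entries are hypothetical, chosen only to show how a modern keyword might be expanded into its early variants before a Boolean AND search is run against the passage quoted earlier.

```python
import re
from collections import defaultdict

# Hypothetical sample of a spelling-to-headword table. The real project
# is described as mapping roughly 1 million early spellings.
VARIANT_TO_HEADWORD = {
    "lyue": "live",
    "godlye": "godly",
    "conuersacio": "conversation",
    "wel": "well",
    "workes": "works",
}


def build_headword_index(variant_to_headword):
    """Invert the table: modern headword -> set of spellings to search for."""
    index = defaultdict(set)
    for variant, headword in variant_to_headword.items():
        index[headword].add(variant)
        index[headword].add(headword)  # the modern spelling matches too
    return index


def expand(keyword, index):
    """Return every spelling to look for when the user types `keyword`."""
    key = keyword.lower()
    # Unknown keywords fall back to an exact-match search on themselves.
    return set(index.get(key, {key}))


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def matches_all(text, keywords, index):
    """Boolean AND: each keyword must match at least one spelling in the text."""
    tokens = set(tokenize(text))
    return all(tokens & expand(k, index) for k in keywords)


INDEX = build_headword_index(VARIANT_TO_HEADWORD)

# The passage as quoted in this article, including its trailing ellipsis.
PASSAGE = (
    "Your duty is to lyue wel, to practyse good workes, to exercise "
    "al godlye actes, to lede a virtuous conuersacio…"
)

if __name__ == "__main__":
    print(sorted(expand("live", INDEX)))
    print(matches_all(PASSAGE, ["live", "godly", "conversation"], INDEX))
```

Storing the table as variant-to-headword and inverting it once keeps lookups cheap at query time. A proximity search would need token positions rather than a set of tokens, which this sketch leaves out.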

Proposed by Martin Mueller, Northwestern professor of English and classics, the project is being developed in collaboration with the Academic Technologies unit of Northwestern University Information Technology and Northwestern University Library.

“Virtual orthographic standardization of early modern archives offers the tremendous service of letting non-expert readers search old books as if they were modern,” says Mueller. “The Northwestern-CIC project will lead to a very significant improvement in the documentary infrastructure for research in the humanities.”

A prototype of the Virtual Modernization Tool was released in January 2006. A substantially improved version will be available by September 2006. During the two-year pilot phase, access will be restricted to the 13 CIC partner institutions and to Chadwyck-Healey. The tool will later be released to the full research community for use at no charge via the Internet.