The DEEP project started on time in November last year. Our project plan has been finalised, and will shortly be available from our page on the JISC website.
As it promised to be, the Survey of English Place-Names (SEPN) is a complex and fascinating document. Produced by the English Place-Name Society (EPNS), the SEPN is a true community effort. Its 86 volumes, compiled by different place-name scholars over the years, document the names of some 40 English counties. A succession of different people has thus moulded the text to fit and reflect England’s ancient and rich toponymic landscape.
While this provides an unrivalled resource for the place-name scholar, the historian, the geographer and the linguist, it also makes digitising the text a challenge. Our aim is to put the place-name forms into a structured gazetteer, but the structure of the volumes varies from county to county. The basic hierarchy goes from large units, such as counties and hundreds, to smaller units, such as parishes, townships, settlements and minor names. Some conventions persist: parish names, for example, appear as headings, followed by townships and settlements. There are inevitable exceptions, however, which make tagging these sections of text complex. We do not wish to impose artificial structures on anomalous portions of text, since each will be anomalous for a reason.
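To make the hierarchy concrete, here is a minimal sketch of how such a gazetteer could be modelled. It is an illustration only, not the project's actual data model: the `Place` class, the `LEVELS` tuple and all the names in the example are hypothetical placeholders. The one design choice it shows is that levels may be skipped (a settlement sitting directly under a parish, say), so that anomalous sections do not have to be forced into an artificial structure.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple

# Assumed ordering of units, largest first, taken from the prose above.
LEVELS = ("county", "hundred", "parish", "township", "settlement", "minor_name")


@dataclass
class Place:
    """One node in a hypothetical gazetteer tree."""
    name: str
    level: str
    forms: List[str] = field(default_factory=list)       # attested spellings
    children: List["Place"] = field(default_factory=list)

    def add(self, child: "Place") -> "Place":
        # Levels may be skipped, but a larger unit may not sit under a smaller one.
        if LEVELS.index(child.level) <= LEVELS.index(self.level):
            raise ValueError(f"{child.level} cannot sit under {self.level}")
        self.children.append(child)
        return child

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "Place"]]:
        # Depth-first traversal, yielding each node with its depth in the tree.
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Placeholder names only; these are not taken from the SEPN.
county = Place("County A", "county")
hundred = county.add(Place("Hundred B", "hundred"))
parish = hundred.add(Place("Parish C", "parish"))
# An "exception": a settlement directly under a parish, with no township.
parish.add(Place("Settlement D", "settlement", forms=["placeholder spelling"]))

outline = [(depth, place.level, place.name) for depth, place in county.walk()]
for depth, level, name in outline:
    print("  " * depth + f"{level}: {name}")
```

Keeping the level as a label on each node, rather than fixing one class per level, is what lets the irregular cases be recorded as they appear in the text.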
OCRing the text is the responsibility of CDDA. This process has thrown up problems. In some cases, for example, matching Anglo-Saxon characters to their supported Unicode equivalents requires expert input from the team at Nottingham. Sometimes the Anglo-Saxon characters are simply hard to read because of printing issues; sometimes the Unicode code points themselves need correcting. For example, a character initially assigned Unicode E624 was misread and reassigned 01ED (?).
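The kind of correction described above can be sketched in a few lines. This is an illustration under stated assumptions, not the workflow CDDA or the Nottingham team actually use: the `CORRECTIONS` table and both function names are hypothetical, and its single entry simply mirrors the E624 to 01ED example, which the text itself marks as uncertain. E624 falls in Unicode's Private Use Area, so the sketch also flags any private-use characters still left in a string for expert review.

```python
import unicodedata
from typing import Dict, List, Tuple

# Hypothetical correction table: code point assigned at OCR -> reviewed value.
# The one entry mirrors the example in the text and is not a confirmed mapping.
CORRECTIONS: Dict[int, int] = {0xE624: 0x01ED}


def is_private_use(code_point: int) -> bool:
    # General category "Co" marks Private Use code points.
    return unicodedata.category(chr(code_point)) == "Co"


def apply_corrections(text: str, corrections: Dict[int, int] = CORRECTIONS) -> str:
    # str.translate accepts a dict mapping code points to code points.
    return text.translate(corrections)


def find_private_use(text: str) -> List[Tuple[int, str]]:
    # Report position and code point of every private-use character left over.
    return [
        (index, f"U+{ord(char):04X}")
        for index, char in enumerate(text)
        if is_private_use(ord(char))
    ]


sample = "b" + chr(0xE624) + "c"      # stand-in for a line of OCR output
corrected = apply_corrections(sample)

print(find_private_use(sample))       # characters needing review before correction
print(find_private_use(corrected))    # anything still unresolved afterwards
```

Characters that remain flagged after the table has been applied are the ones that would still need a decision from someone who can read the printed page.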
Cheshire is now complete, and work is underway on Shropshire.