WWP Newsletter Archive: Volume 2, Number 1

The Women Writers Project Textbase:
An SGML/TEI Textbase

by Julia Flanders, Textbase Editor

The Textbase

The Brown University Women Writers Project textbase at present contains approximately 200 texts, printed between 1546 and 1830. They range widely in genre, including drama, lyric poetry, satire in prose and verse, religious tracts and lyrics, moral essays, letters, memoirs, autobiographies, scientific writing, novels, miscellanies, periodical literature, and medical manuals. Rather than being treated as purely literary "works," each document is transcribed in full, including any subsidiary material, so as to be useful to as wide a range of users as possible: literary scholars, historians, linguists, sociologists, and scholars of material culture. Included with each text is a TEI header containing detailed information about the provenance and bibliographic status of the original source document, the transcription and encoding methodology, and a record of changes made to the document.

SGML and TEI

The Women Writers Project textbase is encoded using TEI-conformant SGML, with slight modifications necessitated by the idiosyncrasies of the texts in our corpus. Because the WWP is dealing with a body of material which in some cases departs dramatically from familiar models, the literary and conceptual expectations upon which the TEI Guidelines are founded sometimes do not apply. The WWP conducts ongoing research into the structure and printing of early modern women's writing, in order to articulate the areas in which the TEI Guidelines are not yet fully adequate for our corpus and others like it.

Texts and the Classroom

The WWP's immediate goal has been to increase the availability of early women's writing for teaching and research. Because the large quantities of women's writing which still survive are largely held in archives and rare book libraries, it has been difficult in the past to gain access to them for teaching purposes, and for the research which is so often the necessary prerequisite. One of the most visible achievements of the WWP's recent history is our progress in disseminating these hard-to-find texts in classrooms across the USA and in six foreign countries. We provide on-demand printed versions of all of our texts, and are also able to construct customized anthologies in collaboration with instructors developing special courses. Almost 200 instructors at over 85 universities have purchased our texts, a growing number of which are included in course packets which multiply the actual distribution figures. We estimate that between 5,000 and 10,000 texts have reached students in the past two years alone, with a significant impact on course curricula. At present the WWP distributes texts in hard copy while we undertake preparations for electronic delivery; within three to four years we expect to provide institutional licenses allowing full electronic access to the textbase.

Research and Other Uses

An SGML-encoded textbase forms a foundation for a wide range of activities and derived products. The wide usefulness of the WWP textbase derives from precisely this principle; rather than creating a specialized database of limited capabilities, we have conceptualized the project from the start to accommodate a variety of disciplinary and research interests. To begin with, our transcription does not emend the text or conflate it with any other; each transcription represents a single physical copy, and preserves all the idiosyncrasies of its source. In cases of printers' errors, our encoding preserves the original reading but also provides a corrected reading; the user can choose which to view. Texts transcribed in this way can be used as the basis for scholarly editions (and additional encoding can include information on variants), while they also remain reliable as witnesses of unique source documents. Furthermore, our encoding does not include elements which require literary interpretation for their application, such as identifications of theme, mood, or literary tropes. Our goal is to provide a text which will be raw material for research of this sort, or of any other sort, but not to prejudge the question by performing this work ourselves.

Based on this infrastructure, the WWP expects to provide users with a wide range of possibilities for research and other activities. Over the next three years we will be planning and developing an electronic delivery system which will allow access both to the source SGML and to a variety of viewing options. The former will enable users familiar with SGML to include WWP files in hypertext projects, to add specialized encoding, and to study the WWP's encoding methodology more closely. The latter will allow the user to perform such tasks as searching the text using sophisticated search algorithms, comparing multiple texts, viewing texts with all original printers' errors or in a corrected version, linking to glossaries and other databases, using and creating links within texts, and controlling variables such as size, formatting, and color to suit individual preference.
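
The choice between an original and a corrected reading can be illustrated with a small sketch. It is only an illustration under stated assumptions: the fragment is written as well-formed XML so that Python's standard library can parse it, the `sic` element with a `corr` attribute is assumed here for the example (the article does not say which elements the WWP actually uses), and the sample line and the `render` function are invented.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <sic corr="..."> markup and the sample text
# are assumptions made for this sketch, not taken from the WWP textbase.
SAMPLE = '<l>She <sic corr="writes">wrties</sic> with a steady hand.</l>'


def render(fragment: str, corrected: bool) -> str:
    """Return the plain text of a fragment, showing either the original
    readings or the corrected readings of any <sic> elements."""
    root = ET.fromstring(fragment)
    parts = [root.text or ""]
    for child in root:
        original = "".join(child.itertext())
        if child.tag == "sic" and corrected:
            # Fall back to the original reading if no correction is recorded.
            parts.append(child.get("corr", original))
        else:
            parts.append(original)
        # Text that follows the child element, up to the next element.
        parts.append(child.tail or "")
    return "".join(parts)


if __name__ == "__main__":
    print(render(SAMPLE, corrected=False))
    print(render(SAMPLE, corrected=True))
```

Because both readings are stored in the encoded text, the same file can serve a reader who wants the source exactly as printed and one who wants a corrected version; the decision is made at viewing time rather than at transcription time.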

The WWP in Partnership

The WWP works within a fast-growing community of humanities text encoding projects, and our cooperation with other projects is a crucial part of our growth and contribution to the general progress of the community. One way in which we hope to make our own research useful to others is by publishing our documentation, which describes our encoding methodology as it applies the TEI to early modern printed books, enabling similar projects to build on our preliminary work rather than repeat it. Another contribution we hope to make is in assisting new projects to design their approach and train their encoders; already we have had visits from researchers at several universities (Bologna, Humboldt, and the University of Pennsylvania) interested in developing similar textbases and in learning from our experience. We also plan to collaborate with projects which are developing browsing and search software. Specific partnerships on larger projects are also under way, most notably the liaison with the Oxford English Dictionary described in this newsletter. Finally, we have ongoing collaborations with Project Electra at the University of Nottingham, with the Deutsche Schriftstellerinen Projekt (a German-language women writers project), and with the Integrated History of Women's Writing in the British Isles, at the University of Alberta in Canada.
