Renaissance Women Online: A Final Report
Julia Flanders, Textbase Editor
Women Writers Project
February 1, 1999
Table of Contents
- RWO collection
- General design
- Selection criteria
- Encoding methods and delivery system
- Early user response
- Description and analysis of user survey
- Academic research and electronic publication
- Appendix A: Reviews and feedback
- Appendix B: Survey results on use of electronic resources
“Never before in the academic world,” asserts a recent article, “has there been so great a divide between technical capability and the actual culture of use” (Robinson and Taylor 1998, 283). According to this assessment, the average academic is still deeply reluctant to use digital resources for research, despite the growing number of such resources and their increasing accessibility and user-friendliness through the medium of the web. At the same time, within certain academic subcultures there is clearly a “culture of use” which encourages scholars to use digital resources and integrate them into their research (Hall 1998, 289). For publishers and producers of online materials, understanding academics’ attitudes towards these resources is crucial to predicting the nature of the future market and the kinds of products which may succeed, but it can be difficult to get an accurate picture of these attitudes.
There is a growing body of work investigating the research needs of scholars, mostly in the sciences but increasingly in the humanities as well. These studies have tended to focus on methods of retrieval, and on scholarly needs in the context of online bibliographies, indexes, and finding aids. Less studied is the question of scholarly usage of full-text resources: resources which provide direct access to materials rather than merely locating them. Such resources are only now entering the scholarly horizon, but already they raise distinct questions: How can such resources respond most efficiently to scholarly needs? How well do they fit in with scholarly habits and attitudes, and how well can they persuade the scholarly community of their desirability?
We can identify two contrasting views of the relationship between current scholarly practice—the “actual culture of use”—and the advent of digital resources:
- The view that online research is profoundly different from the activities which literary scholars currently regard as belonging to their discipline, and that changes in behavior will be slow in coming.
- The view that scholars already want and anticipate these functions, which in fact represent a more efficient or effective way of accomplishing what scholars already do (albeit more laboriously) in their existing research.
In some sense, this dichotomy recapitulates the familiar debate over whether technology creates new desires and new disciplines, or whether it acts only to enable what is already there in potential.
This report describes the research initiative entitled “Renaissance Women Online” (RWO), which was conducted by the Brown University Women Writers Project from September 1996 through August 1999. The aims of the initiative were, broadly, to gain an understanding of the intellectual and economic impact of electronic resources on scholarly research in the humanities. To this end, RWO included both a model digital resource project—a collection of primary source texts within the WWP’s online textbase, with added materials specific to RWO—and a user survey. For both of these initiatives, our assumption was that the second view articulated above (that current academic work relies on activities which can be more efficiently accomplished in the digital medium) was true, and constitutes in effect an entry point through which digital resources will become firmly rooted in humanities research and teaching. Research in more experimental uses of digital resources, though, seemed to us to indicate the truth of the first view’s suggestion that digital resources, once adopted, are likely to motivate far more profound changes in academic work over the long term. It is these changes which we consider most important to understand and assess, since they will represent the real conditions of academic work once digital resources are fully naturalized within the academic environment.
The Renaissance Women Online collection has already been recognized within the community of scholars and teachers as a premier resource for the study of early women’s writing. In a recent session entitled “Old Texts, New Strategies: Researching and Teaching Early Women Writers Online” at the Modern Language Association convention in Chicago (December 1999), three speakers described research and teaching projects which they had undertaken using RWO, and the audience was full of scholars who had also worked with RWO or contributed to its development. Thanks to the Mellon Foundation, and to the efforts of those who have participated in the RWO initiative, this resource is now well known and increasingly widely used, and is making possible a dramatic reshaping of the early modern curriculum. The discussion which follows will describe the general design of the RWO resource, the criteria by which texts were chosen for inclusion, the method of publication and how it was conceptualized, and the encoding methods and delivery system used.
The Renaissance Women Online collection was conceptualized as serving a variety of purposes, all of which contributed to the development of the collection. Above all, since it was intended to offer a comparison with print editions, it had to match the scholarly integrity, credibility, and general usefulness of such materials in order for the comparison to be meaningful. In addition, it was intended to support both teaching and research, and thus we had to consider a broad range of users and levels of expertise in designing the interface and pitching the contextual materials. Finally, in keeping with the broadly interdisciplinary scope of the Women Writers Project’s existing collection, we wanted to include a wide range of texts not limited to the traditional belles lettres bias of the male canon, but rather representing the literate culture of the period as comprehensively as possible.
To ensure that the collection would meet the highest standards of textual integrity and reflect current scholarly and pedagogical needs, we appointed a selection committee chaired by Professor Elizabeth H. Hageman of the University of New Hampshire, who has served on the Executive Committee of the Women Writers Project as its Renaissance specialist since its inception. The committee’s other members are Professor Boyd M. Berry, Professor of English at Virginia Commonwealth University, Georgianna Ziegler of the Folger Shakespeare Library, and Dr. John Lavagnino of King’s College, London. They compiled a list of 55 texts to be transcribed, which together with 45 more from the WWP’s existing collection would constitute the Renaissance Women Online collection of 100 texts.
In addition to these primary texts, RWO was designed to include a brief introductory essay for each text, describing its production and cultural context, and a set of topic essays discussing broad themes which are of importance for the entire collection. Aimed at readers unfamiliar with Renaissance women’s writing—whether students first encountering the period, or faculty from other areas—these essays resist offering overly evaluative or interpretive readings, but provide essential background which for many of these rare authors is difficult to uncover. They also suggest links with other texts and authors, thus providing strands of continuity which serve to bind the collection together conceptually. Paul Caton, the WWP’s Electronic Publications Editor, worked with Professor Hageman to identify and contact scholars with expertise on particular authors, and to ask them to contribute to this collection of essays. All of the essays were written on a volunteer basis.
Selecting a group of 100 texts to represent the wide expanse of Renaissance women’s writing was a difficult task, because this number is such a small fraction of the total material extant. RWO is the largest anthology or collection of 16th- and 17th-century English women’s writing ever to be assembled, but it is only a sample of a much larger potential archive (currently estimated at over 500, with more discovered every year). In addition, because of current limitations in the available text encoding methods, we were unable to include any manuscript materials, thereby eliminating a significant sector of early women’s writing from consideration.
Within these constraints, the selection committee attempted to achieve a balanced collection on a number of counts. They chose a mix of now well-known writers such as Katherine Philips and Margaret Cavendish, together with lesser-known women such as Elizabeth Poole. They included both “original” work and translations, as well as writing under probable female pseudonyms like “Jane Anger” and written representations of an otherwise lost female voice (as in the accounts of the Flower witchcraft). They selected a range of genres, including drama, poetry, religious tracts, philosophical writings, cookbooks, trial narratives, works in translation, and introductions to translated works. They also tried to include as many women from “lower” socio-economic classes as possible, and provided a range of religious stances (Anglican, Fifth Monarchist, Quaker, Roman Catholic, etc.) from this religiously fraught period. The materials are chronologically balanced across the period covered; because less writing by 16th-century women survives, the committee selected a larger percentage of it than of 17th-century writing, where more choices were necessary.
As a general rule the committee chose first editions of works, using as source texts exemplars with no known defects. On occasion (for instance, Elizabeth Cary’s life of Edward II) when there are large and interesting variations between editions, they included more than one. With a few exceptions, the texts were transcribed in full, including all frontmatter and backmatter, and including textual segments by writers thought or known to be men (for instance, a preface by a male publisher or compiler). The exceptions to the full transcription rule were texts mostly by a male author containing a small portion by a female author (for instance, Mary Sidney’s “The Doleful Lay of the Fair Clorinda”, excerpted from Edmund Spenser’s “Astrophel” in Colin Clout’s Come Home Again, 1595). Similarly, in cases of long works in translation, the translator’s preface was included but the translation itself omitted for the time being. The Women Writers Project may in the future transcribe these texts in full.
In publishing our collection, the WWP has faced two challenges. The first is technical: although the WWP has always assumed that we would publish our collection online, the key has been to find a system that would marry ease of access with powerful functionality. The advent of the web in 1993 has made broad access possible, but the web (which relies on HTML, a very simple encoding system) does not yet provide any easy way to exploit full TEI/SGML markup, which we felt was necessary in order to make our effort both worthwhile and long-lasting. The second challenge was more sociological: who would in fact “publish” the WWP’s collection, and what sort of publication would it be?
We first explored the possibility of working with one of the publishers who are now undertaking electronic publication: Oxford University Press, Routledge, Cambridge University Press, Chadwyck-Healey, and a few others. Our assumption was that the advantages of commercial publication would be significant, offering access to established marketing venues and putting the burden of management and advertising onto the publishers’ professional staff rather than on the WWP. Most importantly, we expected that the publisher would develop the delivery software and interface, thus solving our first challenge as well. At the time, both Routledge and CUP were in the process of launching ambitious new electronic ventures and developing delivery software which would allow the online publication of richly encoded SGML data over the web.
At the same time, our discussions with these publishers gradually revealed several differences of strategy and attitude which emerged as potential obstacles to collaboration. First, their approach to pricing was fundamentally different from ours, based on an extremely cautious assessment of the potential market. Assuming low sales for what they felt to be a niche product, they also assumed a need for correspondingly high prices in order to make back the investment, which in turn made higher sales still more implausible. Their approach would have recouped costs, but could have jeopardized the wide dissemination which was our fundamental aim. In addition, as we considered what we wanted to achieve by online publication, we found ourselves more and more concerned about retaining control over the data, setting our own schedules for upgrades, and allowing free research use. These were goals to which commercial publishers—given their business model and fiduciary responsibilities—quite reasonably felt cautious about committing themselves.
Our eventual decision to publish the collection ourselves, and to develop our own delivery system, turned out to be a welcome necessity. Two of the publishers’ online delivery systems did not materialize as planned, owing to delays in programming, and began to look less likely to fit our criteria of functionality when completed. In particular both publishers seemed more interested in CD-ROM publication than in online distribution over the internet, a choice which seemed to us to go directly counter to our conception of how the collection would be used and distributed. Working with Chadwyck-Healey would still have been possible, but for a number of reasons we decided that we did not want to be absorbed into Chadwyck-Healey’s Literature Online collection. For one thing, we found that the level of functionality which Chadwyck-Healey had made standard for their collection was considerably below what we had envisioned for RWO and Women Writers Online, and would not be likely to exploit our markup fully. This problem was compounded by the fact that the interface to RWO/WWO would necessarily be made homogeneous with the rest of the Chadwyck-Healey collection, leaving us with little scope for the kinds of special features we felt users would want and that we could provide. In addition we would lose control over how our work would be sold and represented. Self-publication would allow us a degree of autonomy and oversight which would be more valuable than we first imagined. And although (as a number of publishers pointed out to us) it would put the burden of marketing on our shoulders, we felt that our unusually close relationship with a substantial existing audience would give us an advantage in that area.
Our licensing model has in principle always been based on a desire to make the collection broadly accessible to an academic audience, and possibly in the future to the general public. We also understand that the materials in our collection are not necessarily well-known or (at least at present) regarded as an essential part of the academic mainstream, and that electronic publication is still in a marginal position. As a result we need to adopt a strategy which encourages experimentation, and minimizes the risk or cost to a potential user. In other words, we have everything to gain by appearing surprisingly generous and effortless, and everything to lose by imposing even small obstacles in the way of purchase and use. Although this approach appears altruistic, in fact we believe it offers us the best chance at financial viability.
The subscription model which we have developed is entirely motivated by these concerns, although our understanding of how to accomplish these goals has matured considerably over the past two years. Several different points have been at issue:
Sale or license?
Our choice to license rather than sell the WWP textbase emerged from two motivations. The first is largely altruistic: because we envision the textbase as an expanding resource, and because of the constant development of encoding technology, we wanted to make it available in a form which would be most conducive to regular updating, and which would discourage purchasers from getting by with out-of-date versions. The second reason is more self-interested, though it derives from the first: we wanted to be sure that the WWP would receive a steady revenue stream that would allow us to continue our research and our updates to the collection. Selling access (albeit at a lower rate) would even out our revenue stream and keep our ongoing work integrated with the development and use of the textbase. For the user, it would also ensure prompt, regular improvements to the collection and enable us to respond to user feedback.
Institutional license fee, individual license fee, or fee-per-use?
We wanted the decision to purchase the WWP collection to be independent of any moment-by-moment assessment of immediate necessity. Because of the marginal position of women’s writing, we suspected that if an individual instructor had to motivate a purchase, only those who already had a substantial commitment to women’s writing would actually follow through. The effort and cost would appear too great to make it worthwhile if one only wanted to include a single text, or wanted to provide background reading. (Other online content providers have made the same assessment; see Guthrie 1999, 139.) This would be particularly true if one tried to set the per-use fee high enough to recoup a reasonable cost; it would be difficult to avoid entering a self-reinforcing upward spiral in which fear of insufficient revenue motivates a cost which proves prohibitive. On the other hand, if the institution had already subscribed and received unlimited access for all of its members, there would be no obstacle at all even for instructors or students who had only the most casual interest. We also suspected that the material itself, once users started exploring, would be sufficiently compelling to motivate further, more intensive uses. Our goal, therefore, was to sell annual institutional subscriptions by appealing to the librarians’ understanding of the importance of women’s writing, and to encourage faculty to use what for them would be a free resource. Our expectation is that the resulting levels of use will fully repay and justify the institutional subscription fee.
Tiered or flat pricing structure?
We were surprised to find that several of the commercial publishers we approached were strongly in favor of a flat fee. This may have been partly because they were favoring a purchase rather than license arrangement, and might base the purchase price on the number of simultaneous users allowed. However, we preferred not to place limits on simultaneous users, a practice which is particularly disadvantageous for resources that anticipate simultaneous use in classrooms. A tiered pricing structure based on institutional size appeared a better alternative. Many online resource providers use this approach, and librarians regard it as fair, particularly if some flexibility is built in to accommodate larger but poorer schools. Our pricing tiers are as follows:
Group I: more than 25,000 students (undergraduate FTE)
Group II: 10,000-25,000 students
Group III: 5,000-10,000 students
Group IV: 2,000-5,000 students
Group V: 500-2,000 students
Group VI: fewer than 500 students
We also offer an individual rate, and a graduate student rate.
Basis for institutional ranking
We considered several possible rationales for assigning institutions to a particular size group. Enrollment was an obvious choice but using Carnegie classifications was another alternative which some electronic publishers have adopted. We decided on the former because it was easily ascertainable (whereas some institutions belong to more than one Carnegie class). We also decided to use total undergraduate enrollment, rather than the size of the humanities departments, or of the entire institution. We felt that the collection was sufficiently broad-based to be used in a variety of disciplines and in survey courses as well as those for concentrators, so that keying it to the total undergraduate enrollment was reasonable (as this would be comparable from school to school). We also allow for variances to accommodate special cases such as technical colleges with a tiny humanities division.
Actual price points
The actual pricing was the most difficult decision, largely because unlike the other issues it was not susceptible to a common-sense solution, but required a certain amount of guesswork. Our original thoughts (influenced by the pricing of scientific journals and the like) were quite high, at $4000/year for the largest institutions and $1000/year for the smallest, and with fewer separate pricing tiers. From this we came down by degrees, after conferring with librarians and other projects. We also came to feel that with the small resources we could commit to marketing, we wanted to make the purchase decision as easy as possible, to minimize negotiation and caviling, and to maintain the good will we have been fortunate enough to receive from our user community. Consequently our final pricing structure is as follows:
| Group | Undergraduate enrollment (FTE) | Annual fee |
|---|---|---|
| Group I | more than 25,000 students | $1500/year |
| Group II | 10,000-25,000 students | $1000/year |
| Group III | 5,000-10,000 students | $750/year |
| Group IV | 2,000-5,000 students | $400/year |
| Group V | 500-2,000 students | $250/year |
| Group VI | fewer than 500 students | $100/year |
In addition, we decided to offer aggressive discounts (30% for 3 years, 50% for 5 years) to encourage people to commit for the long term (thereby demonstrating our likely longevity) and also to bring in advance revenue. Thus far, subscribers seem to regard the pricing as low or reasonable in relation both to the value received and to the pricing of comparable products.
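The tier and discount arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the treatment of boundary enrollments (for example, exactly 10,000 students) is an assumption because the published ranges overlap at their endpoints, and applying the multi-year discount to the sum of the annual fees is likewise an assumption.

```python
# Illustrative sketch only; names and boundary handling are assumptions.
# (minimum enrollment, group label, annual fee in dollars), largest first.
TIERS = [
    (25_001, "Group I", 1500),   # "more than 25,000"
    (10_000, "Group II", 1000),
    (5_000, "Group III", 750),
    (2_000, "Group IV", 400),
    (500, "Group V", 250),
    (0, "Group VI", 100),        # "fewer than 500"
]

# Discount percentages for the multi-year commitments named in the report.
DISCOUNT_PERCENT = {1: 0, 3: 30, 5: 50}


def tier_for(enrollment):
    """Return the group label for an undergraduate FTE enrollment."""
    if enrollment < 0:
        raise ValueError("enrollment cannot be negative")
    for minimum, label, _fee in TIERS:
        if enrollment >= minimum:
            return label


def annual_fee(enrollment):
    """Return the undiscounted annual fee for an enrollment figure."""
    label = tier_for(enrollment)
    return next(fee for _min, name, fee in TIERS if name == label)


def subscription_total(enrollment, years=1):
    """Total cost of a 1-, 3-, or 5-year subscription, in whole dollars."""
    if years not in DISCOUNT_PERCENT:
        raise ValueError("only 1-, 3-, and 5-year terms are described")
    full_price = annual_fee(enrollment) * years
    return full_price * (100 - DISCOUNT_PERCENT[years]) // 100
```

Under these assumptions, a Group I institution committing for three years would pay 1500 × 3 × 0.70 = $3,150, and for five years 1500 × 5 × 0.50 = $3,750.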
The final problem we needed to address was the question of what subscribers receive for their money, and to what extent we should be concerned about protecting ourselves against willful or accidental harm (for instance, unauthorized copies of texts entering circulation or being resold). Reviewing contracts offered by other resource providers, we found that most contracts are quite restrictive in their provision for reuse of digital materials and printouts. Our position, we felt, was different from that of other providers, in the sense that we were selling access to a collection whose chief value lay in its size and scope as a collection, and its electronic functionality (searching, textual analysis, and so forth). The circulation of individual copies of printouts or HTML files would not have a substantial impact on the value of the collection (and certainly would not discourage institutional subscribers), and it was clear that easy access to such materials would be a strong selling point and a considerable convenience to users. The only real risk would be if an individual recreated the entire resource using illegally downloaded files, and succeeded in passing it off as a competing resource. This seemed unlikely, but also remediable by legal means if it should occur, and we decided that it was a risk worth taking. As a result, our access terms permit unlimited printing and downloading of files for curricular or research use. We also allow institutions to use our materials for interlibrary loan. Finally, in the event of the WWP going out of existence, or being unable to continue to support the textbase, subscribing institutions will receive a full copy of the source data to use as they wish (subject to the terms of the license agreement). This serves to protect the subscriber’s investment and provide reassurance.
The RWO project represented for the WWP an opportunity not only to add an important group of Renaissance materials to our online collection, but also to test and refine our encoding system with a corpus of earlier texts in a wide range of genres. During the period covered by the RWO grant, the WWP not only made substantial improvements on our encoding system based on our research with RWO texts, but we also streamlined our encoding infrastructure and added tools which increase the speed and accuracy of the encoding process. We also developed a customized online delivery system which provides a search and browsing environment suited to teaching and scholarly research.
The WWP uses the Text Encoding Initiative Guidelines for Electronic Text Encoding and Interchange (TEI), with TEI-conformant modifications as necessary to accommodate the idiosyncrasies of early modern texts. These modifications have been carefully documented and will be submitted to the TEI for possible inclusion in the next release of the TEI Guidelines. They fall into several categories:
- cases where TEI prescribes too strict a limitation on where a given textual element may appear, or what it may contain. For instance, TEI does not provide for handwritten annotations on the title page of a document, but the WWP has often needed to transcribe handwriting on title pages of our early texts. Another example on which we spent considerable research time is notes, which in TEI have a very strictly prescribed structure, but which in our collection take various forms not envisioned by TEI.
- cases where on the contrary TEI is more permissive than we wish to be about how a textual element is recorded, though these are less significant.
- cases where the TEI does not provide for certain kinds of information which we feel are important to record. For instance, we wish to record the name and gender of each person who contributed to the production of the text, for retrieval purposes (author, translator, editor, printer, publisher, engraver, etc.), but TEI provides no convenient place to group this information. Similarly, TEI deliberately does not provide an explicit, effective means for recording the original rendition of the text, and we have had to develop such a system ourselves.
In addition to the general structural markup of the text itself, the WWP finds it important to record and mark up various kinds of information which are essential to scholars working with primary sources in digital form, and which help provide a familiar environment for scholarly research. Most important of these is the metadata which preserves detailed bibliographic information on the source text, including Wing and STC number, source library and shelfmark, facts of publication and authorship. This information is recorded in the header for each document and is heavily exploited in our search interface. In addition, we add further documentation about the condition of the source text, including any areas which are damaged or illegible.
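To make the role of this metadata concrete, the following Python sketch models the kinds of header information named above (catalogue number, source library, shelfmark, publication date, and contributors with their roles and genders) and shows how a search interface might filter on it. The class and field names are hypothetical and do not reproduce the WWP's actual TEI header structure; every value in the usage example is a placeholder rather than real bibliographic data.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Contributor:
    name: str
    role: str    # e.g. "author", "translator", "printer"
    gender: str  # e.g. "female", "male", "unknown"


@dataclass
class HeaderRecord:
    """Hypothetical stand-in for the metadata held in a document header."""
    title: str
    catalogue_number: str  # a Wing or STC number
    source_library: str
    shelfmark: str
    pub_year: int
    contributors: List[Contributor] = field(default_factory=list)


def find_texts(records: List[HeaderRecord],
               role: Optional[str] = None,
               gender: Optional[str] = None,
               years: Optional[Tuple[int, int]] = None) -> List[str]:
    """Return titles whose header metadata satisfies every given filter."""
    titles = []
    for rec in records:
        if years is not None and not (years[0] <= rec.pub_year <= years[1]):
            continue
        if role is not None or gender is not None:
            # At least one contributor must match both role and gender.
            if not any((role is None or c.role == role) and
                       (gender is None or c.gender == gender)
                       for c in rec.contributors):
                continue
        titles.append(rec.title)
    return titles
```

For example, with two placeholder records, one listing a female translator in 1590 and one listing a female author and a male printer in 1650, `find_texts(records, role="translator", gender="female")` would return only the first title.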
Encoding support systems
The WWP’s encoding staff use a Unix-based environment with an SGML-aware text editor (Emacs with psgml) for our text encoding work. This basic environment provides constraints which guarantee that the encoded texts conform to the TEI document type definition, and it also provides guidance for the encoder by offering a list of legal TEI elements at any given place in the text. Encoders begin their work with a blank template which already includes standard information and a framework for creating a full TEI header for the document. In addition, the WWP has written several tools which assist the encoder by streamlining the encoding process or by automatically tagging certain kinds of textual features. These tools include:
- automatic tagging and regularization of early use of i/j/u/v/w
- automatic tagging and regularization of Biblical citations (under development)
- automatic tagging of page number sequences
- automatic checking and flagging of errors in collation, encoding of personal names, encoding of rendition (all of which are errors not caught by the standard SGML parser/validator)
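As an illustration of what the first of these tools involves, here is a minimal Python sketch of rule-based u/v and i/j regularization. It is not the WWP's tool: the three rules are deliberately simplified heuristics chosen for the example, the function names are hypothetical, and the rules will produce false positives on some words, which is why output of this kind needs checking by an encoder.

```python
import re

CONSONANTS = "bcdfghjklmnpqrstvwxyz"


def regularize(word):
    """Apply three simplified early-modern spelling rules to one word."""
    # Rule 1: word-initial "v" before a consonant becomes "u" (vp -> up).
    word = re.sub(rf"^v(?=[{CONSONANTS}])", "u", word)
    word = re.sub(rf"^V(?=[{CONSONANTS}])", "U", word)
    # Rule 2: "u" between two vowels becomes "v" (loue -> love).
    word = re.sub(r"(?<=[aeiouAEIOU])u(?=[aeiou])", "v", word)
    # Rule 3: word-initial "i" before a vowel becomes "j" (iudge -> judge).
    word = re.sub(r"^i(?=[aeiou])", "j", word)
    word = re.sub(r"^I(?=[aeiou])", "J", word)
    return word


def changed_pairs(text):
    """Return (original, regularized) pairs for the words a rule altered."""
    pairs = []
    for word in text.split():
        regular = regularize(word)
        if regular != word:
            pairs.append((word, regular))
    return pairs
```

Keeping the original reading alongside the regularized one, as `changed_pairs` does, mirrors the general principle that the source spelling is preserved and the regularization is recorded in addition to it.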
As creators of richly encoded SGML data, the WWP is one of a number of projects currently facing the same problem: the fact that SGML publication software is still scarce and designed for industrial production settings rather than academic projects in the humanities. Tools for publishing SGML content on the World Wide Web (such as DynaWeb, from INSO or, since late 1999, Enigma) are even scarcer and are also not designed with scholarly uses in mind. The advent of XML is widely predicted to be a possible solution to these problems, but at the time the WWP was planning our initial publication, we had the choice of customizing an existing application or of designing one ourselves from scratch. Although the latter option would theoretically have given us more flexibility and control over the resulting product, there were a number of potential concerns. The expense of software development was first among these, particularly because the actual cost of creating a functional system from scratch was difficult to estimate with precision. We also knew that although we could probably develop an SGML-to-HTML transformation system fairly easily for our specific texts, we would not be able to make it general enough to allow for easy expansion, nor could we easily support the rapid content-based indexing provided by commercial software. Finally, creating a new application ourselves would necessarily be an all-or-nothing approach: we risked being caught with no delivery system at all if we encountered any serious problems. We had already experimented with DynaWeb and although its default interface and functionality were ill-suited for our purposes, we thought we could build a customized interface with most of the functionality we sought. The advantages of this approach were that we would be able to start using the system in its uncustomized state almost immediately, and add improvements as we developed them. Furthermore, if the project turned out to be a long-term success, we could design a custom application ourselves later on, possibly taking advantage of the arrival of XML-aware software and support systems.
Accordingly we decided to build a custom interface and based our delivery system on DynaWeb. In DynaWeb the underlying infrastructure of indexing, searching, and processing the encoded data (which is performed by DynaText, an SGML search engine) is separated from the display of this data on the web. The latter works by a system of style sheets which dynamically translate SGML data into HTML for web display. From the user’s point of view, the data is simply HTML which can be viewed with a standard web browser. However, searches and word- or structure-based functions are passed back to the DynaText engine and performed on a preprocessed form of the SGML data, allowing for the exploitation of specialized markup. Thus for instance the user can limit a word search to verse drama, even though HTML has no ability to represent or flag particular genres. The advantages of this general solution for us were considerable: the user would not need any specialized software or skills, the purchasing institution would not need to install anything locally, and the value of our SGML encoding would not be lost by down-translation to HTML (as it would be in a static, one-time translation system). Also unlike systems like SoftQuad’s Panorama, which downloads an SGML text to the user’s computer and allows specialized processing to occur locally, DynaWeb can search and selectively display information from the entire corpus. Panorama requires custom software to be installed locally and can only really handle one document at a time, both disadvantages which ruled it out for the kinds of uses we wanted to encourage.
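The separation described here, between display (a style sheet that flattens rich markup into HTML) and searching (which runs against the rich markup itself), can be illustrated with a small Python sketch. It does not use or imitate DynaWeb's actual style sheet language or the DynaText engine; the sample document, the element-to-HTML mapping, and the function names are all invented for the example, and well-formed XML stands in for SGML so that the standard library parser can read it.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample document; XML stands in for SGML here.
SOURCE = (
    '<text>'
    '<div type="verse-drama"><head>Act 1</head><l>O cruel wit</l></div>'
    '<div type="prose"><p>A treatise of wit</p></div>'
    '</text>'
)

# A toy "style sheet": source element name -> HTML element name.
STYLE = {"text": "div", "div": "section", "head": "h2", "l": "p", "p": "p"}


def to_html(elem):
    """Translate an element tree to HTML; genre information is dropped."""
    tag = STYLE.get(elem.tag, "span")
    parts = [f"<{tag}>", escape(elem.text or "")]
    for child in elem:
        parts.append(to_html(child))
        parts.append(escape(child.tail or ""))
    parts.append(f"</{tag}>")
    return "".join(parts)


def search(root, word, genre=None):
    """Search the source markup, optionally limited to one genre of div.

    Assumes divisions are not nested, as in the sample above.
    """
    hits = []
    for div in root.iter("div"):
        if genre is not None and div.get("type") != genre:
            continue
        for elem in div.iter():
            if elem.text and word in elem.text.lower().split():
                hits.append(elem.text)
    return hits


root = ET.fromstring(SOURCE)
```

The HTML produced by `to_html(root)` carries no trace of the `type` attribute, yet `search(root, "wit", genre="verse-drama")` can still restrict the search to verse drama because it consults the source markup rather than the displayed page.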
On top of this basic system, we created a custom interface which provides several important features:
- Keyword-in-context (KWIC) display of search results: This display, crucial for viewing large result sets, lists search hits with about 10 words of context surrounding each hit word, allowing the user to browse the hit list and quickly identify the hits of interest. The list is also sortable by author, date, and other categories, so that the user can get a quick profile of where the hits occur, or (if sorted by date) of the changing usage patterns over time for a given word. This feature is rarely available in standard industry text delivery systems, although academics have long used such displays in highly individual or customized systems, and in print concordances for even longer.
- Advanced search interface: Our search interface offers the user the ability to do word and phrase searches (including Boolean operators and wildcards), proximity searches, and context-sensitive searches which exploit the text’s markup to narrow a search to particular textual features specific to this collection. In addition, the user can search based on bibliographical information, such as Wing or STC number, source library or shelfmark, length or size of the book, and facts of publication, based on metadata encoded in the TEI header. To these categories we will be adding genre and subject keywords as well within the next year. In our next upgrade, we will be offering the ability to combine these different kinds of searches (for instance, to find the word “wit” within ten words of “love” within dramatic texts written between 1670 and 1680). We will also be offering the ability to save a search, either to requery it more specifically, or to reuse it in a later session.
- Navigational features: The challenge in delivering any large electronic collection is to ensure that the user never feels lost within the structure of the collection, or within any given text. Our customization offers intuitive navigation from text to text, and from section to section within a text; it also provides a clear sense of where the user is within the collection at all times. The system is able to take advantage of the structure imparted by SGML for display and chunking of the text, but at the same time it saves the user from the need to be always aware of the hierarchical structures imposed on the document by the encoding.
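The keyword-in-context display described in the first item above can be sketched in a few lines of Python. This is a minimal illustration and not the WWP's interface code: the `Text` record, the `kwic` function, and the sample sentences and author labels are all invented for the example. It collects each hit with five words of context on either side (about ten in all) and then sorts the hit list by date.

```python
from dataclasses import dataclass

@dataclass
class Text:
    author: str
    year: int
    body: str

def kwic(texts, keyword, width=5):
    """Return (author, year, context) rows with `width` words on each side of a hit."""
    rows = []
    for t in texts:
        words = t.body.split()
        for i, w in enumerate(words):
            # Compare case-insensitively, ignoring adjacent punctuation.
            if w.strip('.,;:!?"').lower() == keyword.lower():
                left = " ".join(words[max(0, i - width):i])
                right = " ".join(words[i + 1:i + 1 + width])
                rows.append((t.author, t.year, f"{left} [{w}] {right}".strip()))
    return rows

# Invented sample data, for illustration only.
texts = [
    Text("Author B", 1675, "Her wit was sharper than any sword in the land"),
    Text("Author A", 1640, "They praised the wit of the lady and her wit alone"),
]

hits = kwic(texts, "wit")
by_date = sorted(hits, key=lambda row: row[1])  # sort the hit list by year
for author, year, context in by_date:
    print(f"{year}  {author:<10} {context}")
```

Sorting on a different field of the row (author, for instance) gives the other views of the hit list mentioned above.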
The early response to Renaissance Women Online and to Women Writers Online has been very positive (see Appendix A for reviews and quotes). Since the population of consumers is divided in most cases between purchasers (the acquisition librarians and consortia who make the actual purchase decision) and end users (faculty, researchers, and students who read the texts), it may be useful to distinguish between their responses.
The response of purchasers in the first five months of publication has been very encouraging. As of February 1, 2000, 70 licenses have already been purchased (see our subscriber list for details), with 91 more currently in negotiation or in a trial period prior to subscribing. Over half of the purchases (39) have been multi-year licenses, indicating a confidence in the future of the collection as well as a desire to take advantage of our discounts. We also have 66 individual subscribers (of whom 15 are paid and 51 are provided gratis to RWO contributors). In our negotiations with purchasers, we have also had the opportunity to hear more specific responses to the RWO/WWP collection. Some points are worth noting in detail.
Our pricing model has met with general approval both for its low cost and for the fairness of its structure. It provides six levels of pricing, based on undergraduate FTE (see above), with the top level set at $1500 per year and the lowest level at $100 per year. In addition, we have offered an aggressive discount program, which provides a 30% discount for three-year licenses and a 50% discount for five-year licenses. This pricing system has had the effect of bringing the RWO/WWO resource within the reach not only of the poorest colleges and universities, but also potentially of public libraries and secondary schools. This affordability has been attested both by the feedback we have received at conferences and by the actual range of institutions which have purchased licenses.
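As a rough worked example of how the discounts affect the total cost of a license, the following Python sketch applies the stated percentages to the top and bottom pricing levels. It rests on two assumptions not spelled out in this report: that the discount is applied to the sum of the annual fees, and that only the two license terms named above carry a discount. The function name `license_cost` is hypothetical.

```python
# Discounts stated in the report: 30% for three-year and 50% for five-year licenses.
DISCOUNT_PERCENT = {3: 30, 5: 50}

def license_cost(annual_fee, years):
    """Total cost in dollars, ASSUMING the discount applies to the summed annual fees."""
    percent = DISCOUNT_PERCENT.get(years, 0)  # other terms assumed undiscounted
    return annual_fee * years * (100 - percent) // 100

for fee in (1500, 100):  # top and bottom pricing levels
    for years in (1, 3, 5):
        print(f"${fee}/year for {years} year(s): ${license_cost(fee, years)}")
```

Under these assumptions a five-year license at either level costs less than three years at the undiscounted annual rate, which may help to explain the popularity of multi-year licenses.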
Our access terms have similarly been met with approval on several counts (see our license page for a copy of our license). Purchasers have been glad to find no arbitrary restrictions on use of the texts in the collection; we allow unlimited browsing, searching, printing and downloading of texts within the licensing institution, which means that faculty and students are free to experiment with the texts and with ways of using them. We also place minimal restrictions on who may use the collection at a licensing institution: we allow for walk-in use at libraries and public clusters, proxy server access from remote sites, and use by visiting faculty and other temporarily affiliated persons. From the purchaser’s viewpoint these provisions also mean a diminished burden of security and a more trusting, collegial relationship between the licensing institution and the resource provider, a relationship which is in general seen as an adversarial one at best.
License modifications: In some cases, purchasing institutions have requested modifications to the license agreement, and these are of interest for what they tell us about how online resources are seen by institutions and in particular by their legal departments. In almost all cases, the proposed revisions revolve around issues of potential litigation: indemnification, the venue of any such litigation, limitation of liability, warranty, and the conditions of termination and payment. These requests reflect the increasing oversight of university legal departments over contracts for online subscriptions, and an increased sense of potential legal risks incurred by such contracts.
User feedback has been slower in coming, since we have not had the same direct contact with faculty that we have had with purchasers. However, we have received email comments, comments in papers delivered at conferences, and oral feedback at conferences, from both our beta-test group and from a range of users (tenured and untenured faculty, graduate students). For details please see Appendix A. In particular, the system has been praised for its powerful search interface, which is generally regarded as going far beyond what is available with other digital collections. The general user interface is seen as intuitive and elegant (again compared with other products). Faculty report that undergraduates have found the contextual materials helpful and pitched at an appropriate level of sophistication. The collection itself receives considerable praise, as always, for its breadth and its contribution to teaching and research.
To help us understand and project the needs of scholarly users, the WWP surveyed a group of scholars and librarians using a survey instrument designed by Daniel Odess, Karen Murphy, and Catharine Hall (all from Brown University), with consultation from Professor Carole Palmer of the University of Illinois. The survey asked 38 questions about the respondents’ use of and attitudes towards electronic texts, the cost and extent of their research-related travel, and their use of primary source materials in research and teaching. We subjected the results of this survey to both qualitative and quantitative analysis, described below.
Survey Design and Aims
The survey was sent to 330 people, all of whom were prior users of WWP text printouts. This group was chosen not at random, but with the aim of representing a range of geographic locations, degrees of access to online technology, and professional positions—the latter largely academic, but also including a sample of librarians and independent scholars. Our response rate was 21% (69 responses).
Our aims for the survey were two-fold. First, we wanted to judge the practical economic impact of electronic texts on academic research, and test hypotheses about the long-term costs of using rare texts in research and teaching, as compared to the use of primary source textbases such as the WWP’s. To this end, we wanted to collect some concrete data on the kinds of costs researchers incur when using rare texts, and the sources of funding that typically support such research.
Our second, broader aim was to understand the range of attitudes and concerns that academics currently feel towards the use of electronic resources, as compared with other forms of textual material. Such knowledge, we felt, could provide important insight both into the way electronic resources are likely to be adopted in research and teaching, and into the needs that such resources must serve in order to become a useful and habitual part of scholarly life.
To elicit the more specific data on costs, we designed a set of detailed questions with multiple-choice answers to limit variation and ensure consistency among responses. For the broader questions about attitudes and text usage, however, our approach was necessarily more complex. We wanted to guide or limit the responses as little as possible, to capture the full range of respondents’ opinions. For this reason, these sections of the survey asked for open-ended comments, inviting the respondents to speak as frankly and fully as possible.
As we discovered, this approach to survey design has advantages and disadvantages. The advantages, as stated above, are that one does not artificially limit what can be said in response to the survey by anticipating a certain range of responses and foreclosing others. However, open-ended questions make it more difficult to derive quantifiable conclusions from the data. To overcome this problem, we created a set of codes that represented the specific themes and issues that we wanted to track in the responses. Each response could then be coded to indicate which themes were present, thus providing a way of identifying patterns and expressing conclusions more precisely.
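The coding step can be pictured with a small Python sketch. The theme codes and respondent identifiers below are hypothetical (the survey's actual code set is not reproduced here), and the sketch assumes the codes were assigned by a human reader; it shows only how coded responses can then be tallied and turned into 0/1 indicators suitable for quantitative analysis.

```python
from collections import Counter

# Hypothetical coded responses: each respondent id maps to the set of theme
# codes a human coder judged to be present in that open-ended answer.
coded = {
    "r01": {"ACCESS_POS", "FUNCTION_POS"},
    "r02": {"ACCURACY_CONCERN"},
    "r03": {"ACCESS_POS", "TECH_CONCERN"},
    "r04": {"ACCESS_POS", "ACCURACY_CONCERN"},
}

def theme_counts(coded):
    """Count how many responses carry each theme code."""
    return Counter(code for codes in coded.values() for code in codes)

def indicator(coded, code):
    """Return a 0/1 vector (ordered by respondent id) marking presence of `code`."""
    return [1 if code in codes else 0 for _, codes in sorted(coded.items())]

counts = theme_counts(coded)
for code, n in counts.most_common():
    print(f"{code:<18} {n}")
print(indicator(coded, "ACCESS_POS"))
```

Once each theme is expressed as an indicator vector, it can be compared with other replies or with demographic data in the same way as the multiple-choice answers.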
Qualitative Survey Results
The Women Writers Project has already reported on the qualitative analysis of the survey results in “Scholarly Habits and Digital Resources: Observations from a User Survey,” presented at the Digital Resources in the Humanities conference in September 1998 and published online. A copy was sent to the Mellon Foundation shortly thereafter, and an excerpt with our findings is also included for convenience as an appendix to this report (see Appendix B). To sum up here very briefly, our conclusions from this analysis were as follows:
1. Attitudes towards online resources were generally positive or at least curious; only a small minority of respondents expressed negative feelings about digital materials or indicated that they would probably not adopt them in their own work. Enthusiasm for digital materials in teaching was evident even for those who were hesitant about using them for personal research. In particular, respondents cited increased functionality (searching, textual analysis, ability to manipulate the text) and increased access as strong advantages of digital materials and research tools.
2. Respondents voiced a number of specific concerns about digital resources (even respondents who reported strong interest in using them) which clearly must be addressed by digital resource providers. These included:
- concerns about the accuracy and reliability of the text, and the integrity of the editorial treatment;
- concerns about the loss of the physical book;
- concerns about technical obstacles or inconveniences surrounding the use of digital resources (e.g. network failure, installation difficulties, ergonomic issues, software incompatibilities);
- concerns about the role of the scholar in producing electronic resources, and how that role is evolving.
3. The current cost of research using rare primary source texts, even when very roughly estimated, is substantial. Nearly half of the respondents reported annual travel costs of $1000 or more, and the average amount reported by those who responded to this question was nearly $1400. Moreover, many respondents indicated that the travel they were able to afford was insufficient for their real research needs, and that their research was hampered by lack of access to rare materials. Although access to digital resources does not altogether obviate the need (or the desire) to consult original materials, it can make this consultation more efficient and productive. It may also render some kinds of research travel unnecessary, allowing available resources to go farther. Many scholars now spend days simply transcribing or reading texts that they have traveled to see, painstaking tasks which limit drastically the number of texts they are able to consult. With prior access to an electronic source, the scholar can perform these preliminary tasks at home and then use precious research time with the original to investigate questions which only the original object can answer (such as watermarks, details of binding, etc.).
The quantitative analysis of the survey results was performed by Professor Walter Freiberger of Brown University with assistance from Vanja Ducik (also at Brown). Our quantitative analysis of the survey results revealed several additional points of interest by allowing us to look at the association between the replies to various survey questions, and between these replies and relevant demographic data.
The most important measure of association between sets of data is the correlation coefficient. A value of zero indicates no association (or correlation); a value of +1 indicates perfect positive correlation, and a value of -1 perfect negative correlation. For each pair of replies, we tested whether the correlation was significantly different from zero (using a p-value of .10 as the cut-off). The pairs for which there was significant correlation are listed and discussed below.
We used the two most commonly used correlation coefficients for our analysis: Pearson’s product-moment correlation coefficient r and Kendall’s rank correlation coefficient tau. The latter is less sensitive to outliers and to assumptions about the underlying distributions. The results of the two analyses were quite consistent (i.e. gave significant p-values for almost identical pairs of data). We also computed Cramér’s coefficient of association (a normalized chi-square) from a cross tabulation of the replies, again with consistent results.
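For readers who wish to see these measures in concrete form, the following Python sketch computes all three on a small invented data set. It is not the analysis code used for the survey. The data are made up; the tau-b variant of Kendall's coefficient is an assumption (the report does not say which variant was used); and the significance check is a simple permutation test, substituted here because the report does not specify the test that was applied.

```python
import math
import random

def pearson_r(x, y):
    """Pearson's product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def kendall_tau_b(x, y):
    """Kendall's rank correlation, tau-b variant (adjusts for ties)."""
    conc = disc = tie_x = tie_y = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            dx, dy = x[i] - x[j], y[i] - y[j]
            if dx == 0 and dy == 0:
                continue            # tied on both variables: ignored
            elif dx == 0:
                tie_x += 1          # tied on x only
            elif dy == 0:
                tie_y += 1          # tied on y only
            elif dx * dy > 0:
                conc += 1           # concordant pair
            else:
                disc += 1           # discordant pair
    return (conc - disc) / math.sqrt((conc + disc + tie_x) * (conc + disc + tie_y))

def cramers_v(table):
    """Cramer's coefficient: chi-square normalized by n * (min(rows, cols) - 1)."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return math.sqrt(chi2 / (n * (min(len(row_tot), len(col_tot)) - 1)))

def permutation_p(x, y, trials=2000, seed=0):
    """Two-sided permutation p-value for Pearson's r (a substitute test; see lead-in)."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    extreme = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            extreme += 1
    return (extreme + 1) / (trials + 1)

# Invented data: professional rank (1-5) and a 0/1 "positive about access" code.
rank = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
access = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
crosstab = [[3, 1], [1, 5]]  # rows: rank 1-2, rank 3-5; columns: code 0, code 1

r = pearson_r(rank, access)
tau = kendall_tau_b(rank, access)
v = cramers_v(crosstab)
p = permutation_p(rank, access)
print(f"r={r:.2f} tau={tau:.2f} V={v:.2f} significant at .10: {p < 0.10}")
```

With only ten invented observations the figures themselves mean nothing; the sketch is intended only to make the definitions of the three coefficients, and the idea of a cut-off, concrete.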
Rank in profession correlates with positive feelings about increased access (p=.022, r=.28), and with negative feelings about technical obstacles (p=.022, r=.28). It correlates negatively with general negative feelings (p=.066, r=-.22), and with use in classrooms (p=.06, r=-.23). That is, more senior faculty are not as likely to use etexts in their teaching, but these faculty are more likely to have positive feelings in general about electronic texts and about increased access to primary source materials in particular. They are, however, more likely to be apprehensive about possible technological problems.
People currently using digital materials are of course more likely to plan to use them in the future, and this was confirmed in our results regardless of whether the current use was teaching or research. Current research use was also associated with increased negative feelings or concerns about ergonomic issues (p=.098, r=.2), which makes sense, given likely first-hand experience with these problems, and with positive feelings about increased access to source materials (p=.09, r=.205) and increased functionality (p=.0001, r=.46). It is significant for us that these are the two areas which seem to emerge most strongly for current users, since these are our two primary goals for the RWO/WWO resource.
Future plans to use digital resources in teaching were correlated with a number of cost concerns: with high travel costs (p=.098, r=.2) and with a sense that research had been hampered by lack of access to rare texts (p=.042, r=.26). These future plans were also correlated with positive feelings about the possibility that digital materials would reduce the cost of access to rare texts (p=.024, r=.247).
Future plans to use digital materials in research were correlated with both positive and negative feelings about these resources. On the negative side, the specific concern which correlated was the lack of scholarly apparatus (p=.08, r=.2); on the positive side, these respondents (like the current users) looked forward to increased access (p=.005, r=.34) and functionality (p=.0, r=.47).
Although these results are locally suggestive, it is difficult to generalize from them to draw larger conclusions beyond what the qualitative analysis yielded. This could be partly the fault of the survey design, which focused more on eliciting nuanced statements about attitudes and preferences; however, it could also be that at this stage clear patterns have not yet emerged.
Although the RWO initiative was originally couched in terms of comparing costs between print and digital resources, and of demonstrating the viability of the digital medium as a tool for scholarship and teaching, it has become evident in the course of our research that this focus was in some respects premature. This is by no means because we have found the digital tools unviable; on the contrary, every indication points to impressive gains and improvements which are made possible by the use of online resources (detailed below). However, a meaningful cost comparison requires a stable product, a comparable kind and quantity of material, and a comparable infrastructure of use. At present these conditions do not yet obtain for a useful comparison between printed texts and the electronic texts which represent the most promising direction of digital resources for academic use. The RWO collection is still, in effect, an experimental tool, a technology which must still be regarded as a research effort.
The terms “experimental” and “research” deserve further explanation, since they seem to imply something which is unfinished or not yet functioning. In fact, as our subscribers are finding, the RWO/WWO site is functioning very well, and the collection itself has reached a state of critical mass where it can be truly useful for research and teaching, although of course it will continue to expand for years to come. What we mean by “experimental” is that this technology does not represent the endpoint of some line of research, but rather its midpoint. Far more than HTML, richly tagged TEI/SGML/XML is a technology with tremendous untapped potential, from the standpoint of both delivery systems and data creation. Software which can fully realize this potential, and the substantial collections which will make online research truly natural and efficient, are still in the process of development. To an extent, even the encoding methodology is still a matter of ongoing research. The still-experimental nature of these matters is not evidence of their intractability, but rather of their scale and complexity, which is directly related to their promise as the foundation for a truly functional digital library system.
If, therefore, we would like to claim that digital resources and tools do offer substantial gains over conventional media, in both cost and function, this claim can most convincingly be made by pointing to the long-term future prospects for these technologies, once the appropriate infrastructure has been established. It is tempting, given the apparent wealth of material currently available on the web (in HTML and in more sophisticated systems), to assume that this material actually represents the achievement and the fulfilled promise of digital technology for the academy. In fact, it represents the mistakes, the wrong turnings, the failed or partially successful experiments, the compromises made in the process of working towards that achievement, and the immense usefulness of these materials is a lucky by-product of experimentation. We can also see from the proliferation of online materials that the issue is not primarily one of viability but of which methodologies are most successful: which ones enable people to make fundamental improvements in their work.
No firm conclusions, we feel, can really be drawn about the future profile or cost of online technology from what is currently considered representative, and least of all from ongoing research efforts, which is how the Women Writers Project still envisions its work. However, what we can describe are the areas in which our research shows online resources offering potential long-term cost savings, substantial changes in the ways that research costs are incurred and met, and (perhaps most importantly) substantial improvements in the efficiency and effectiveness of research and teaching.
Faculty responding to our survey reported average annual travel costs of $1400 to consult research materials at rare book libraries. Since a substantial proportion of this travel is necessitated by the unavailability or limited availability of these materials in other formats (e.g. scholarly editions, facsimile editions, microfilm), easy online access to these materials would obviate a considerable amount of these travel costs. There is currently a problem of scale here, since most rare texts are not in fact available online in a research-friendly form, but there are large-scale initiatives like the WWP’s (notably Early English Books Online) which may be a start.
Online access can also make research travel more efficient, and allow the researcher to accomplish more for each travel dollar. Many scholars spend valuable research time simply reading or transcribing texts or making notes on their content, which equally can be done from an online source. Moreover, scholars often travel to visit collections without knowing exactly which texts they want to consult, and waste time skimming through materials which prove irrelevant. Online resources allow scholars to search large collections and identify materials of interest in advance, allowing them to make the best use of their time in the archive.
These are clearly advantages which are purchased for the scholar by the institution. Clearly the shift in cost (up or down) will depend on whether the institution is paying for the travel to begin with, and on how many of its faculty travel for research. Over the long term, however, once the startup cost of digitization has been met and a system of online research has become natural to the research community, the net effect will be that access to research materials will be much more evenly and widely distributed. Money which formerly supported individuals’ research will now also benefit faculty who would not have qualified for research grants, or who could not afford to subsidize their own travel. The institution may pay the same, or more, but it will get more research for its money.
Curricular xeroxing, course packets, and textbooks
Although we do not have precise estimates of the amount spent by students on course-specific materials, the costs are substantial, on the order of $100 per semester, and many of these materials are useful only for a particular course: textbooks and anthologies of which only small excerpts will be used, specialized course packets, xeroxes of reserve materials. Replacing these with centrally accessible online resources would allow efficient reuse, in addition to the obvious and frequently cited benefits of round-the-clock universal access to reserve materials.
Important changes in costs
Clearly most of the cost savings described above are in fact the result of shifting costs from one level of consumer to another:
- shifts in the funding of the preparation of textual resources: from individual scholars and editors (funded by salaries, small grants, and fellowships) to scholarly digitization projects (funded by large grants, faculty release time)
- shifts in purchasing: from individual volumes (purchased by libraries but also by faculty and by students) to larger collections (purchased by the library only)
- shifts in the allocation of research funding: from travel grants to visit libraries (by departments, for research faculty) to digital collection purchase (by library, for entire campus)
- shifts in the allocation of funding for teaching: from textbook and photocopy purchase (by student) to digital collection purchase (by library)
These shifts are accompanied by changing expectations about the independence of institutions, and a new sense of the need to harness networking and the metaphor of networking as a model for more efficient, more effective use of resources. Indeed, the current high cost of digital resources may perform a backhanded service by forcing institutions to think globally in both the creation and the purchase and dissemination of these resources.
It is important to note that our understanding of digital resource creation is still limited and growing very quickly. The current costs of developing these materials do not represent the ultimate cost of resource creation; rather, they are the costs of discovering how to do resource creation. As a result, generalizing from the current costs can be misleading, and a desire to see an immediate reduction in costs as a result of moving to a digital platform is unrealistic.
Amid the radical changes in the nature and cost of basic research tools—primary texts, library collections, curricular materials—we see the potential for dramatic changes in how research, teaching and learning take place. A student or researcher given access to a substantial full-text collection with a powerful search interface, such as RWO/WWO, can accomplish far more in a given interval than with the same texts in print form. Compared with a textbook or course packet, the online collection offers the student entry into a far broader intellectual landscape; compared with a library, it dramatically reduces the logistical overhead of identifying and getting access to the desired text. And since we also provide for the printing of individual texts, the slow individual work of reading a text in detail is still possible—indeed, easier, because one is working with one’s own printout rather than a fragile original. The combination of these two factors results in a reading and research experience which is profoundly different: one in which the text is always contextualized by the other documents in the collection, and in which avenues of reading may be pursued within a single text, a group of texts, or the collection as a whole. And for students as well as faculty, the advantage of access to rare and fragile texts opens up avenues of research which were simply not possible before. The issue for these people is thus not how to make digital resources a cheaper way of accomplishing the same thing, but rather how to make the most of their potential as a new technology.
Moving even faster than these changes are the expectations users bring to their work with these tools. Contrary to the assertion of Robinson and Taylor quoted at the beginning of this report, we see faculty developing higher expectations for the kinds of research they perform, and imagining new strategies for teaching. One has only to consult the abstracts from recent ACH/ALLC and DRH conferences to see the immense amount of creative work in this area. And again, these faculty are increasingly looking to digital resources not to provide the same thing more cheaply, but to provide something new, something which expands the horizon of expectation. As little as five years ago, the WWP had difficulty getting faculty to imagine what online research would be like, or how they might use online tools. Within the past year, we have found that increasingly they are asking for features which not only exploit the digital medium, but far outstrip our current capabilities and look ahead to future technologies.
What conclusion should we draw from these rising aspirations, particularly since they represent (at least at present) a dramatic rise in the cost of providing an adequate research environment? It is tempting to call for a limitation on these costs, and perhaps a requirement that they be justified by clear, immediate gains in productivity or cost savings elsewhere in the system. At the WWP, our understanding of the current state of digital tools indicates that this is precisely the wrong moment to call for such a reckoning. These tools at present are experimental, and the technologies which show the most promise (rich text encoding, database technology, XML, and similar tools and methods) are only now reaching a point where their practical benefits are being explored and realized. The research costs associated with their development also appear somewhat high, although this point deserves more careful scrutiny. We need to see these costs in their true proportion: supporting not only the development of the individual project or product which has provided a local testbed (as in the case of the RWO collection), but also the much larger context of present and future products which are made possible by the development of these tools and methods. In ten years’ time the system of rich text encoding which the WWP and other projects are working to perfect may support an entire digital library of online full-text resources in the humanities. High as the development costs for these essential technologies appear, then, they must be amortized both over a long productive life and over a very wide range of projects and products.
Whatever the cost, though, the possibilities these tools offer only just keep pace with the expectations which faculty and students, the research community of the new century, are now articulating. If the university is to remain a place which responds to the intellectual needs of its community, it bears the responsibility for supporting the tools and resources that meet those needs. In a sense, the university has begun a long and difficult odyssey by embarking upon the path of new technologies: it is not a path on which one can either turn back or go only a few steps and stop. It may not appear easy or cheap to continue, but again, we need to think of the real potential scope and longevity of the benefits we stand to gain; in the end the results will surely be worth the work. In assisting with the Women Writers Project’s research, and with the development of resources like RWO and Women Writers Online, the Mellon Foundation has contributed greatly to achieving those results.
Library Journal, November 15, 1999
“The Women Writers Project, located at Brown University, has been working since the late 1980s to bring underrepresented women’s writing into the technological realm. Its elegant solution is now available on the web. From the earliest listed work, Margaret Roper’s A Devout Treatise Upon the Pater Noster (1526), up through A Legacy for Young Ladies by Anna Laetitia Barbauld (1826), this constantly growing (178+ entries), full-text database is a treasure trove of lesser-known texts by women. The searchable collection is culled from original texts at 26 major research collections, including the Huntington Library, Bodleian Library, a variety of British libraries, and university libraries at Cambridge, Harvard, and Brown.
“The interface can be complex for sophisticated searching, so users should expect to spend some time with the system to exploit the site to its full potential. But this amazingly rich collection is also immediately accessible using a simple search box. There are many access points into the file, including author, date, publication location, publisher, text size, text length, source library, and Wing or STC numbers. The content is plain text: no images are included (although there are notations where flourishes do appear). Page breaks are delineated, and the text is organized accordingly. A very sweet feature is the ability to choose whether to use frames or not (would that more systems gave us this option!).
“The designers of this resource practice full disclosure, a rare feature in a time when so many other full-text commercial sites fail to describe adequately the extent of their contents. You can read about the WWP’s project history, join its listserv, and keep up via newsletters. There is also ongoing upgrade information: in one instance, an excellent help screen points out a bug in the sorting command while at the same time announcing that it will soon be repaired. The encoding methods employed, including transcription and editorial principles, are easily available via the site index. A bibliography, calls for papers, related sites, etc. are all easily located within the database.
“The Bottom Line: Women Writers Online is a specialized database; any library serving literature and history researchers will be interested in accessing the early female written voice. Highly recommended for large public and academic libraries.”
Names have been withheld for privacy reasons.
“The project is a remarkable contribution to scholarship and teaching.” (professor at Framingham College; email sent in August 1999.)
“I love the website, and I think the contextual materials are particularly well done. They also suggest really helpful starting points for additional research — I’ve already taken some of the site’s ‘advice,’ and I can imagine referring my students to these pages as well.” (professor at Stanford University; email sent in May 1999.)
“We have had an excellent response from the faculty here. One came running up the stairs to tell me how excited he was that we might have access to WWP. So, I don’t think we need to try it any more—we’re sold and would like to subscribe.” (librarian at Mount Holyoke College; email sent in October 1999.)
“We definitely want it [RWO/WWO access] and want to go ahead and get the five year license. It is a wonderful resource.” (librarian at Sweet Briar College; email sent in July 1999.)
“I received an unprecedentedly enthusiastic response from Syracuse University faculty, so I am taking steps to subscribe.” (librarian at Syracuse University; email sent in August 1999.)
Appendix B: Survey results on use of electronic resources
(Full text of original report here.)
Use of Electronic Resources
Our first focus for the survey was to determine the extent to which humanities scholars currently use electronic texts for research and teaching, and the extent to which they plan to do so in the future. 46% of respondents described some current use of electronic texts or online resources in their research. 40% described some use of electronic resources in their teaching (including use of printouts generated from online resources). Altogether, 59% of respondents described using electronic resources in some manner in their current work as scholars and teachers.
43% of respondents described some sort of future use of electronic resources in their research. 52% of respondents described future use of electronic resources in their teaching. Altogether, 67% of respondents described some intention to use electronic resources in the future, in their teaching or research.
The teaching responses were divided fairly evenly between junior and senior faculty with a slight preponderance of the latter, but research use showed nearly twice as many senior faculty as junior faculty. These ratios were also evident in the responses about future use.
As part of the coding process, we distinguished between ‘basic’ use of electronic resources and ‘advanced’ use, where ‘basic’ use involved simple on-screen reading, basic searching, and using printouts for research or teaching. ‘Advanced’ use involved activities requiring more sophisticated framing of research questions, or a more complex interaction between user and text. Advanced classroom activities might involve online editing by students, or computer-assisted textual analysis, or the creation of a class web site. Advanced research activities might include anything from detailed searches using metadata, to the creation of an electronic edition. Of the 28 respondents who indicated that they used electronic resources in their teaching only 6 described what we would categorize as ‘advanced’ use; the same percentage was evident in the descriptions of future classroom activities. Of the 32 research respondents only 4 described advanced use, and this percentage dropped (2 out of 30) when looking at future research use.
Besides the ‘basic/advanced’ axis of comparison, we also noted the level of detail in which respondents described their work with electronic texts, to gauge their knowledge of electronic research and teaching possibilities (although with the caveat that absence of detail in a short-answer survey does not necessarily indicate lack of knowledge). 14 respondents described their work with electronic resources by mentioning a specific research activity which corresponded to a specific set of research questions: for instance, ‘search for pamphlets and relevant commentaries on primary texts’. The more frequent response was of a more general nature, at the level of ‘get access to rare materials’.
The range of research projects and activities was surprisingly broad. Respondents described stylistic and grammatical analysis, manipulation of texts within a corpus using metadata to assist discovery and analysis, comparative work with textual variants, and use of concordances, as well as specific searches for words, phrases, references, etc. There was also considerable interest in using finding aids and metadata to assist with the early stages of research: discovering resources, getting information on provenance, material history, and so forth.
A similar breadth was evident in the teaching projects described. The most frequently mentioned teaching use was the creation of online textbooks or print anthologies drawn from online sources, which confirms the WWP’s own experience. In addition, however, respondents mentioned having students compare different editions online, having them do online editing projects, teaching them literary research methods with online resources (both primary and secondary sources), performing structural analysis of narrative, and teaching analytical search methods. Some also mentioned developing class web sites, whether to provide access to online course materials, or as a showcase for students’ work.
Above all, however, respondents in both categories mentioned access to rare materials and information about them as the primary advantage of electronic texts, with teachers giving particular emphasis to the inclusion of these materials in course anthologies. This interest in access is discussed in more detail below.
If we compare the kinds of electronic research activities described with conventional research activities, we can see a range of diminishing correspondence which complicates efforts to compare the use of print and electronic texts; some basic activities match their conventional counterparts very well, but the more complex activities seem in many cases to be profoundly different. At the level of discovery, where the researcher is seeking textual data (either looking for relevant texts or for particular passages), the activities correspond quite closely, although the process is much easier with electronic resources. Similarly, discovery of textual patterns at an observational level (what is termed close reading in literary studies) is quite similar; with electronic assistance the user can speed up the process or render it more precise and exhaustive, but the process itself is more or less unchanged. The quantitative study of textual patterns, however, would be nearly impossible with conventional texts (although not unknown; the New Shakespeare Society of the 1870s put a great deal of effort into this kind of work). Discovery of textual patterns using the kinds of structural encoding which electronic texts make possible is again almost inconceivable with conventional texts and has no real counterpart in traditional literary study.
Uses of electronic resources in the classroom similarly display a diminishing correspondence with use of traditional materials, as complexity of work increases. In providing access to rare materials, electronic texts provide an easier alternative to photocopies, although their added readability (through control over font size, layout, and other aspects of display) is a feature which cannot be duplicated in photocopied originals. The electronic medium also provides for student collaboration, but teachers have long known how to foster collaboration without electronic assistance; the new medium simply provides a convenient infrastructure whose convenience may or may not produce better results. Computer-assisted textual analysis, however, is likely to be quite different from the work most students do with conventional texts, and the quantitative difference in the amount of data one can generate with these methods amounts to a qualitative difference in the research that is possible: instead of focusing on the process of observation and data-gathering, it allows the student to focus on the textual patterns revealed and the kinds of analysis that produce them. In short, it gives an entirely different kind of insight into the way texts work.
Scholarly attitudes towards electronic resources
The survey responses revealed a number of attitudes and concerns which deserve particular attention because they could have a substantial effect on the adoption of electronic resources. They are also valuable for what they can tell us about users’ needs and what they expect from electronic resources. Six of the most salient are discussed below.
1. General attitudes towards electronic texts
24 respondents (35%) expressed generally negative attitudes towards electronic texts (not just expressing specific concerns, but indicating a fundamental dislike or distrust of electronic texts). Of these, 5 expressed a very strong negative impression.
30 respondents (43%) expressed generally positive attitudes towards electronic texts, including a desire to learn more about them, a desire to use them in the future, or a general optimism about their role in academic work. Of these, slightly over half expressed a very strong positive impression.
There is a slight overlap between these groups: 7 respondents expressed both positive and negative overall impressions. Of these, 3 expressed a strong positive impression and a mild negative impression. None expressed a strong negative impression.
22 respondents (32%) expressed neither a positive nor a negative overall impression, either because their responses were too vague or incomplete to express any emotional color, or because they had a neutral opinion of electronic texts.
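The four counts above can be cross-checked with a little arithmetic. The sketch below is a reader's check, not part of the original survey coding. It assumes that the four figures together account for every respondent exactly once (those who expressed both impressions being counted in each group), derives the implied total, and recomputes the percentages. The variable names are illustrative.

```python
# Counts reported in the discussion of general attitudes above.
negative = 24   # generally negative attitude
positive = 30   # generally positive attitude
both = 7        # expressed both positive and negative impressions
neither = 22    # no overall impression either way

# Inclusion-exclusion: respondents counted in both groups are subtracted once.
# Assumption: these four figures together cover every respondent.
total = negative + positive - both + neither


def share(count, total):
    """Return count as a percentage of total."""
    return 100 * count / total


print(f"implied number of respondents: {total}")
for label, count in [("negative", negative),
                     ("positive", positive),
                     ("neither", neither)]:
    print(f"{label}: {count}/{total} = {share(count, total):.1f}%")
```

Under that assumption the implied total is 69 respondents, and the recomputed shares (34.8%, 43.5%, and 31.9%) agree with the reported 35%, 43%, and 32% to within one percentage point.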
2. Desire for access
52 respondents (75%) indicated that they regarded increased access to textual material as a strength of electronic texts, and interest in issues of access was apparent throughout the survey responses. However, respondents used the term with a range of meanings that deserves closer attention. The domain of the word ‘access’, first of all, seems to split between the idea of gaining access to specific materials, and the broader notion of making materials generally easier to use. In the former category, rare texts and non-canonical texts were of paramount interest. Scholars indicated that one of their chief reasons for using an online textbase would be to increase their access to materials that are difficult to find in print form for teaching and research. The concept of access here engages with the boundaries that physical scarcity, cultural positioning, and fragility have placed around these texts, issues that affect women’s writing to a disproportionate extent. In its broader frame of reference, on the other hand, the term ‘access’ (or ‘accessibility’) expresses a desire that textual materials of all sorts be made available in more useful forms. For student use, this could mean providing a modern-spelling version, or including annotations or glossaries; for research purposes, it could mean the ability to view color images of manuscripts which reveal invisible details, or the ability to compare variant editions of the text.
Access, in the survey responses, had additional connotations as well. For some, it was tied to democratizing the use of rare texts: reducing costs and removing obstacles for students and researchers of limited means, and opening up access to these materials to a much broader audience. Others raised technological issues, noting the increased importance of a networked environment for research and the growing ability to work with textual materials from home, office, or while traveling.
Finally, networked access raised the issue of the ‘accessibility’ of digital resources in the sense of their user interface and functionality.
3. Concerns about accuracy and reliability
24 respondents (35%) cited concerns about the accuracy or authority of electronic texts; for about half of these, concerns amounted to a profound distrust and a potential or actual reason not to use electronic texts. All of the respondents who indicated a strongly negative overall impression of electronic texts also indicated a strong concern about their accuracy.
While the trustworthiness of the text would seem to be of self-evident importance, less predictable were the particular forms that concern for accuracy took in the survey results. Anxiety about the accuracy of electronic texts was so acute that some respondents discussed it even in answer to questions on other subjects, and it clearly represented the single largest obstacle to general scholarly use of electronic texts.
Having said this, however, several other points of interest should be noted. First, concern over the accuracy and reliability of electronic texts was much greater in scholars’ discussion of their research work than in their teaching. Very few respondents cited concern for inaccuracy as a reason for not using electronic texts in their classrooms, whereas a far greater number said that they would either not use electronic texts in their research, or that they would use them only where they could also check them against the originals.
Second, survey respondents expressed two quite different kinds of distrust of electronic texts. The first was an essentially pragmatic sense that currently available texts are not yet being produced to scholarly standards: an opinion based on observation and use of these materials. Respondents in this category seemed able to imagine the possibility of reliable, accurate electronic texts becoming available in the future, and to imagine using them when this happened. Alongside this group, however, was another that tended to express concerns about the fundamental nature of electronic texts. Respondents in this group tended to speak of a gut-level sense of the ‘book-ness’ of books and their aesthetic qualities (‘aesthetic’ here meaning the physical and cultural properties of the book form as well as its attractiveness)—these issues affecting the perceived trustworthiness of the electronic text and its cultural positioning rather than its actual transcriptional accuracy. Concerns of this nature seem more likely to persist even if electronic texts do become a standard and reliable form of scholarly resource. Scholars often attributed these reservations to the nature of their scholarly training, giving these concerns an intrinsically greater durability than the more pragmatic worries described above.
Another frequently cited issue of textual reliability was the question of editorial integrity, including both the textual choices made (choice of edition, editorial treatment) and the scholarly credentials of the editors involved. Most frequently, scholars noted that these choices are not sufficiently foregrounded in most electronic texts, leaving the reader with no sense of what sort of text he or she is using.
4. Interest in functionality
26 respondents (38%) cited function as an advantage of electronic texts. Of these, 10 expressed a strong enthusiasm for this aspect of electronic texts. For many of these respondents, functionality meant the ability to search the texts for words and phrases, or to search collections for particular works. This general sense of the usefulness of basic retrieval is widespread, and is supported by the growing use of the World Wide Web, in which this kind of retrieval has become quite natural and expected. In addition, respondents described more specific kinds of retrieval, including keyword searching, reference searching, and searching which relies on textual markup (for instance, searching for personal names or Biblical references).
Respondents also mentioned more advanced functions which they either use currently or anticipate using in future electronic resources. These include using electronic concordances and performing statistical vocabulary studies, working with textual variants and parallel versions, and supporting the development of other online research tools and projects (for instance, databases or hypertexts).
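To make the idea of markup-based retrieval concrete, here is a minimal sketch in Python. It is not the WWP's delivery system: the sample passage is invented, and the element names (`persName` for a personal name, `bibl` for a Biblical reference) are borrowed from TEI conventions as an assumption about how such a text might be tagged. The point is only that a search can target what a string is, rather than how it is spelled.

```python
import xml.etree.ElementTree as ET

# Invented sample passage; the tagging scheme is an assumption for illustration.
SAMPLE = """
<text>
  <p><persName>Grace</persName> prays for grace, citing <bibl>Psalm 23</bibl>.</p>
</text>
"""


def find_tagged(xml_source, tag):
    """Return the text content of every element with the given tag name."""
    root = ET.fromstring(xml_source)
    return ["".join(el.itertext()).strip() for el in root.iter(tag)]


names = find_tagged(SAMPLE, "persName")   # personal names only
refs = find_tagged(SAMPLE, "bibl")        # Biblical references only
plain_hits = SAMPLE.lower().count("grace")  # naive string search, for contrast

print(names)
print(refs)
print(plain_hits)
```

A plain string search for "grace" matches both the name and the common noun, while the tagged search returns only the name. That difference is what respondents were pointing to when they described searching that relies on textual markup.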
5. Concerns about the loss of the physical book
20 respondents (29%) cited concerns about electronic texts centering on the absence of the physical book. These concerns included a need to refer to physical details of printing and binding, or an aesthetic desire to touch or see the book. While to some extent (as with concerns over reliability) this more aesthetic concern can be thought of as a conceptual gap between the user and the text—evidence of a paradigm shift which may take time to become naturalized—we also need to take seriously the need for information about the physical object to support certain aspects of research. The WWP currently encodes information about the collation, pagination, lineation, and layout of the original, as well as identifying the source copy. We are also exploring the possibility of including images of title pages, illustrations, and perhaps in some cases whole texts.
6. The role of the scholar
Closely tied to the question of editorial integrity above is the role of the scholar in the preparation of electronic texts, and how that role is imagined and evaluated. Concerns about the trustworthiness of the text were quite often framed in terms of a desire for a recognized name and the authority it carries, as a way of assessing the value of a given edition. The lack of such authorities was cited by some as a crucial difference between the cultures of electronic and print media.
Even more prominent in the survey responses, though, was an engagement with the issue of scholarly intervention and its role in the creation of textual resources. A few respondents said that they valued the electronic medium particularly where it offered easy access to an unedited, unmediated version of the text from which they could draw their own conclusions. A number of others, however, said that they valued the input of a scholarly mind, and preferred to use texts edited by a trusted expert rather than the original source. This desire for scholarly intervention, though, was offset by concern about the untrustworthiness of editing in many electronic texts.
These are issues that arise with equal potency in the realm of print texts, but in the context of the electronic medium they take on a peculiar force, since they affect the adoption of the medium as a space for scholarly research. The role of the scholar is particularly important because as yet few scholars are actually involved in the creation of electronic texts. This is both the cause and the effect of the current institutional positioning of work in the new medium: professional assessment does not yet give much (if any) credit for the preparation of electronic resources, and scholarly publication still has a sense of awkwardness about the citation of electronic materials. As a result they remain peripheral to the mainstream of academic work.
Finally, there was some concern—though not as much as might be expected—about specific scholarly choices: about theories of editorial method, which copy-text is chosen, methods of transcription, etc. On the whole people seem willing to trust the preparers of textual resources as long as they feel them to be trustworthy. Presumably this trust could also be extended to digital resources once these become a more familiar part of the humanities landscape. Interestingly, no mention was made of the integrity of the encoding or electronic preparation of the text. From this it appears that scholars are not yet aware of the kinds of specific choices that go into the preparation of an electronic edition, and the effects these may have on its quality—both as scholarship and as a digital product.
At this stage, any conclusions based on the survey data are necessarily provisional. However, some general points emerge and are worth mentioning. First, it is clear that a substantial number of humanities scholars and teachers are now using electronic resources, and still more plan to use them in the future. At present this use is primarily concentrated on simple activities, and is by no means central to most scholars’ work. However, the dependence of humanities research on primary sources has meant that this area has lagged behind other disciplines in adopting digital materials, since electronic primary source texts in the humanities have been scarce until recently, and their reliability and adequacy for scholarly research remain an issue (see Pavliscak et al., 1997). The responses to this survey indicated that even most of those who did not currently use electronic texts at all (whether through lack of materials in their area or through concerns about accuracy) would be interested in using them if they could find accurate, useful materials in their field. The problem to address thus seems to be one of supply rather than demand: if source materials which respond to scholarly needs appear and are affordable, it seems likely that they will be used.
Judging from the responses in this survey, scholarly needs are neither unpredictable nor difficult to meet. First of all, source materials should be minimally mediated and should make explicit their editorial treatment. Hand in hand with an interest in direct access to textual data is a sense of the enduring importance of scholarly expertise and judgment, and clearly one challenge for resource providers is to negotiate this duality: to include and distinguish the human contribution in a way which responds to the user’s needs. Similarly, source materials must be accompanied by careful, detailed information about the text and its relationship to the original source. This is not only for practical reasons such as citation, but also because without this correspondence the electronic text lacks a crucial element of bona fides for the scholarly user. Finally, the intellectual ergonomics of the digital resource must respond to the actual activities involved in research and teaching.
In short, the gap between the culture of use and the available technology should not be impossible to close, since even in scholarly perceptions it does not appear to result from a fundamental incompatibility between information technology and scholarly needs. On the contrary, many of the online resources now being published emerge out of long-term scholarly research, with their roots in profoundly felt desires for access to textual materials and information. They respond at a deep level to scholarly needs, and it would be strange indeed if they were repudiated by the very culture that brought them into being.
References
- Guthrie, Kevin, 1999. “JSTOR: The Development of a Cost-Driven, Value-Based Pricing Model,” in Technology and Scholarly Communication, ed. Richard Ekman and Richard E. Quandt (University of California Press, 1999).
- Hall, Stephen, 1998. “Literature Online—Building a Home for English and American Literature on the World Wide Web,” Computers and the Humanities 32:4 (1998), 285-301.
- Pavliscak, Pamela, Ross, Seamus, and Henry, Charles, 1997. Information Technology in Humanities Scholarship: Achievements, Prospects, and Challenges: The United States Focus. ACLS Occasional Paper, no. 37 (American Council of Learned Societies, 1997). http://www.acls.org/.
- Robinson, Peter, 1996. “Is There a Text in these Variants?” in The Literary Text in the Digital Age, ed. Richard Finneran (University of Michigan, 1996).
- Robinson, Peter, and Taylor, Kevin, 1998. “Publishing an Electronic Textual Edition: The Case of The Wife of Bath’s Prologue on CD-ROM,” Computers and the Humanities 32:4 (1998), 271-284.