Scholars and their tools: building the digital archive of BIQ

By Adam McCune

When I began working on the digital archive of Blake/An Illustrated Quarterly, the vast scale of the project—over 2000 articles in multiple formats (PDF and either XML or HTML) published across nearly 50 years—made systemic corrections and adjustments very difficult. For example, I found that we needed to format each issue’s table of contents differently from regular tables, but the tables of contents were not labeled or distinguished in the XML in any way. I went through nearly 200 issues, labeling each table of contents by hand.
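To illustrate why that label matters, here is a minimal sketch in Python. The element names and the `type="toc"` attribute are assumptions made up for illustration, not the archive's actual markup. The point is only that once a table of contents carries some distinguishing label, a script or stylesheet can select it separately from ordinary tables.

```python
import xml.etree.ElementTree as ET

# Hypothetical issue markup: the element names and the type="toc" label
# are assumptions for illustration, not the archive's real schema.
ISSUE_XML = """
<issue>
  <table type="toc"><row><cell>Contents</cell></row></table>
  <table><row><cell>Ordinary table</cell></row></table>
</issue>
"""


def split_tables(root):
    """Return (toc_tables, regular_tables), sorted by the type attribute."""
    toc, regular = [], []
    for table in root.iter("table"):
        if table.get("type") == "toc":
            toc.append(table)      # labeled: format as a table of contents
        else:
            regular.append(table)  # unlabeled: format as a regular table
    return toc, regular


root = ET.fromstring(ISSUE_XML)
toc_tables, regular_tables = split_tables(root)
```

Without the label, both tables would land in the same list, which is why each table of contents had to be marked before it could be formatted differently.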

To take an example on an even larger scale, when I joined the project, each XML file represented an entire issue (not an individual article), and no tag or label distinguished an article from smaller units (a section of an article) or from larger units (a section of the issue including, for example, book reviews). This meant not only that an article could not be displayed on a separate page from other articles in the same issue, but also that search results could tell you only whether keywords appeared in a particular issue of the journal, not the title of the article in which the keywords appeared. When we decided that this was not an …