This week’s special guest post comes to us from a familiar face: Stephanie Pettigrew, whom you may remember from this year’s CHA Reads! I’m very excited to share this guest post from her, which is based on her work on the upcoming British North America Legislative Database. This database, which is hosted by the University of New Brunswick under the direction of Elizabeth Mancke, collects together all legislation passed by the Pre-Confederation colonies of eastern British North America, including Nova Scotia, Cape Breton, PEI, New Brunswick, Upper Canada, Lower Canada, the United Canadas, and Newfoundland. The database is still under construction, but once it is complete, it will be an invaluable resource to historians of the eighteenth and nineteenth centuries as well as anyone teaching Pre-Confederation Canadian history. It seeks to, among other things, remedy some of the searching problems found in other databases, like Early Canadiana Online (ECO). So without any further ado, enjoy!

Stephanie Pettigrew

Stephanie Pettigrew is a PhD candidate at the University of New Brunswick studying the history of witchcraft in New France. She is also the project coordinator for the British North America Legislative Database (bnald.lib.unb.ca), which seeks to digitize all the pre-Confederation legislative acts of the provincial legislative assemblies.

 

Building a Sustainable, Research-friendly Digital Database from the Ground Up

Digitization is all the rage these days, and it’s not hard to see why. Beyond the initial investment of digitizing archival collections, it is much easier and much cheaper to maintain a digital archive than a physical archive. Don’t get me wrong – I am in no way advocating for the mass shutdown of our physical archives. There will always be a need for the conservation of physical documents. Despite the major advances of technology, there are thousands of things that can go wrong with an online, digital archive: servers can die, and without proper backups a single virus can take out a server and all of its content, and poof! There goes hundreds of thousands of dollars’ worth of work. If you’ve been so careless as to destroy the originals, there goes all possibility of recuperating your content, too. However, in a climate of fewer and fewer travel grants, digitization of primary sources makes research much easier, more cost-effective, and more environmentally friendly for many a researcher (and graduate student), and allows for easier collaboration across geographic spaces.

Home page of the Eastern BNA legislation database.

It is with all of this in mind that we set out to create our own archival database, the British North America Legislative Database (bnald.lib.unb.ca). The purpose of our database is to not only digitize all of the legislative acts of the legislative assemblies of the British North American colonies from 1758 to 1867 (in other words, the Canadian provinces from the start of their respective legislative assemblies to Confederation), but also to make them one hundred percent text searchable on a site that is as low-maintenance as possible. We’ve made all the acts downloadable in both their original and transcribed form. Oh, and it’s free. No university subscription required, and no paywall.

So what makes us different from other web sites that have digitized Canadian legislation? Fair question. We actually source many of our PDF documents from Early Canadiana Online (ECO), with their permission, but there are several differences. Firstly, you can’t search within legislative documents on ECO until you’ve found a specific document. You can only search the tags that have been posted by Early Canadiana, or the titles posted for specific documents – which is extremely problematic. For example, when searching for the pre-Confederation legislative acts of Nova Scotia, you’ll need to use four different search terms (basically small variations on the words “Nova Scotia Legislative Assembly”) before finding them all. (Good luck with that…) Secondly, the collection of acts held by ECO is not actually complete. It’s pretty good, but ECO is missing many acts from the early legislative assemblies of Nova Scotia, all of the ordinances from Cape Breton when it stood as its own province, a few legislative years from the province of New Brunswick, quite a few years from Prince Edward Island, and so on. So what we’re building is a more complete collection with its own internal search engine, one that lets you search every single word of every single act, all at once.

 

The Digitization Process

Sample entry from the database.

In order to achieve this, we’ve taken images of the original acts – either downloaded from ECO or LLMC Digital (a conglomerate of several legal libraries), or photographed by us at the Colonial Office archives in the United Kingdom – and transcribed them. They’re separated into individual acts, categorized by province and year passed, and both the transcribed and original versions are made available on individual nodes (entries) in the database, as you can see in the image above.

The down-and-dirty coding of the database is done by the extremely talented people at UNB’s Centre for Digital Scholarship, who have put up with my insane requests for modifications for almost four years now. Thanks to the structure they’ve built for us, all we have to worry about is the actual digitization.

There are several steps to this, depending on what kind of source you’re starting out with. The first thing that is needed is an image of the original document. Once this image has been secured, it is cleaned up in Photoshop (to fix any problems with the document, like smudges or lines) before being processed by an Optical Character Reader (OCR). Finally, each resulting transcription is reviewed and compared to the original document to ensure its accuracy.

Once the transcription is complete, we load the entire text into the node, in a text box that isn’t visible to the public – its sole purpose is to populate the internal search engine. This internal search engine is what makes the database searchable by word, phrase, place name, or any other criterion that you could possibly think of. Except regnal years. We’re still having an issue with that. Sorry, legal historians.
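To give a rough sense of why a hidden full-text field makes every word searchable, here is a minimal Python sketch of an inverted index. It is an illustration only, not the code that runs the database: the sample `acts` records, the wording of their transcriptions, and the `tokenize`, `build_index`, and `search` helpers are all invented for the example.

```python
import re
from collections import defaultdict

# Hypothetical records: each act has public metadata plus the full
# transcription, which is stored only to feed the search index.
# The transcription wording is invented for illustration.
acts = [
    {"id": 1, "province": "New Brunswick", "year": 1786,
     "transcription": "An Act for the preservation of Moose."},
    {"id": 2, "province": "New Brunswick", "year": 1786,
     "transcription": "An Act for erecting the parish of Moose Island."},
    {"id": 3, "province": "Upper Canada", "year": 1838,
     "transcription": "An Act respecting the late insurrection."},
]

def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9']+", text.lower())

def build_index(records):
    """Map every word to the set of act ids whose transcription contains it."""
    index = defaultdict(set)
    for record in records:
        for word in tokenize(record["transcription"]):
            index[word].add(record["id"])
    return index

def search(index, query):
    """Return the sorted ids of acts containing every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(hits)

index = build_index(acts)
print(search(index, "moose"))         # acts 1 and 2
print(search(index, "moose island"))  # act 2 only
```

Because the index is built from the transcription rather than from titles or tags, any word that appears in an act can be used to find it.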

 

The Highs and Lows of Metadata

This is the main search page for the database.

If you’ve used online archival databases before, you’ve no doubt encountered the metadata search. This is the list of terms or categorizations where you need to choose which term best fits with what you are researching, and hope that the term you’ve selected will return the results you’re looking for. For image archives, metadata is absolutely necessary – you have to categorize images in some way; unfortunately, the importance of images will always be, to a certain extent, subjective. However, we did not want to constrain researchers to metadata terms when it came to research. Just because we think a concept within a certain legislative act was important doesn’t mean someone else will think the same way that we did. After all, document research is completely subjective, and we are attempting to recognize that by making text searchability the focus of the database.

Which is not to suggest that the database is completely devoid of metadata. As you can see in the examples pictured above, we still have lots of metadata. There are some categorizations that will always be useful. For example, you can search by province, year, jurisdictional relevance, and yes, even by concept. But we’ve tried to use our concept tags either to group big ideas together or to solve research problems that we’ve identified while classifying documents.

Here are the results of the “Acadians” tag.

One of the first examples of using metadata to solve a problem was our “Acadians” tag. The first province we completed was New Brunswick (naturally – we are, after all, located in New Brunswick), and there were a large number of documents that had completely misspelled Acadian family and place names (think “Bodro” instead of “Boudreau” and “O’Coin” instead of “Aucoin”) that could be historically significant. While we did our best to make sure that place names had their modern-day equivalent somewhere in the document to make them easier to find, there was nothing we could do about the family names. So we created the “Acadians” tag, to help historians and genealogists find those particular documents.

Searching for the 1837-38 Rebellions.

Similarly, the Upper Canada legislative assembly of 1838 and 1839 passed a number of acts relating to the Rebellion of 1837. However, a variety of terms were used to describe the Rebellion itself and the people involved, from “insurrection” to “the incident last fall” to “the unfortunate individuals.” In order to simplify research relating to the Rebellion of 1837, we created a tag for it.
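The two tags described above solve the same kind of problem: the words a researcher would type are not the words on the page. A small Python sketch shows the idea. Everything in it is hypothetical (the sample records, the field names, the tag labels, and the `text_search` and `tag_search` helpers); it only illustrates how a tag can find documents that a word search misses.

```python
# Hypothetical records; the wording of each excerpt is invented for illustration.
acts = [
    {"id": 10, "text": "land granted to Joseph Bodro", "tags": {"Acadians"}},
    {"id": 11, "text": "petition of Peter O'Coin", "tags": {"Acadians"}},
    {"id": 12, "text": "persons concerned in the late insurrection",
     "tags": {"Rebellion of 1837"}},
    {"id": 13, "text": "relief of the unfortunate individuals",
     "tags": {"Rebellion of 1837"}},
]

def text_search(records, term):
    """Ids of records whose text contains the term (case-insensitive)."""
    return [r["id"] for r in records if term.lower() in r["text"].lower()]

def tag_search(records, tag):
    """Ids of records that an editor has marked with the given tag."""
    return [r["id"] for r in records if tag in r["tags"]]

print(text_search(acts, "Boudreau"))          # [] because the act spells it "Bodro"
print(tag_search(acts, "Acadians"))           # [10, 11]
print(text_search(acts, "rebellion"))         # [] because neither act uses the word
print(tag_search(acts, "Rebellion of 1837"))  # [12, 13]
```

The tag does the work that spelling cannot: someone who has read the document decides what it is about, and the search relies on that decision rather than on the original wording.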

 

Fun with Words!

Searching for moose (mices?)

But what gets really interesting is when you search for random words and look through the results. For those of you who have listened to me talk about this project before, you’ll have to bear with me as you put up with my favourite example once again, which is “moose.”

As you can see in the image above, there are 28 results for the word “moose,” with no other criteria selected. The earliest act is from New Brunswick in 1786 – it’s an act that includes a clause which creates the parish of Moose Island. The second one from the same year is actually an act for the preservation of moose, a very early attempt at conservation.

Searching for the term “seduction”

Another one of my favourites is “seduction.” A search for this term results in five hits; the second one is an act which allows the parents of young women who have been “seduced” and subsequently impregnated to sue the father for support. In this case, “seduce” is used in an archaic sense meaning “have sex with after having promised marriage, then abandon.”

Searching words in this fashion isn’t just fun – it allows researchers to see trends across multiple provinces. One of our MA students used the database to study how the Maritime provinces shared the cost of lighthouses across multiple jurisdictions. A cursory search for “widows” in the database shows how Upper Canada and New Brunswick treated widows of militia members who participated in the War of 1812.
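Comparing provinces amounts to grouping search hits by jurisdiction. As a hedged illustration, the Python sketch below counts hits per province and finds the earliest year in each. The `hits` list is made-up sample data, not real output from the database, and `summarize_by_province` is a hypothetical helper.

```python
from collections import defaultdict

# Made-up search hits for a single term; real results would come from the database.
hits = [
    {"province": "Upper Canada", "year": 1813},
    {"province": "New Brunswick", "year": 1816},
    {"province": "Upper Canada", "year": 1816},
    {"province": "Upper Canada", "year": 1821},
]

def summarize_by_province(results):
    """Return {province: (number of hits, earliest year)}."""
    years = defaultdict(list)
    for hit in results:
        years[hit["province"]].append(hit["year"])
    return {prov: (len(ys), min(ys)) for prov, ys in years.items()}

summary = summarize_by_province(hits)
for province, (count, first_year) in sorted(summary.items()):
    print(f"{province}: {count} hit(s), earliest {first_year}")
```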

In the list of provinces, you’ll notice that there is a category called “Imperial Acts.” This is a relatively new addition to the database. We created the category while finishing up the edits of the Upper Canada legislative acts, when we noticed that a number of acts passed by the Imperial Parliament in London had been appended to the acts of the legislative assembly but had not been passed by the provincial assembly itself. Some of them are complete, and some are only the clauses that directly affect the province which appended the act in question. We’re really looking forward to seeing how many other provinces include such acts, and comparing them once we have them all included in the database under one heading. Even with only a few provinces complete, it’s clear that provinces did not follow a standard practice when it came to Imperial acts.

 

Conclusion

We are far from being done. Of the nine provinces to be completed, we have completed only two (Upper Canada and New Brunswick), and we’ve been working on this project since 2013. We’ve partially completed Cape Breton and United Canada, and have made a start on Lower Canada and Newfoundland. We have only barely started the transcription process for Prince Edward Island. We have yet to begin on Nova Scotia and Vancouver Island. The database is very much in beta mode, and I’m excited to hear your suggestions. We have a procedure for submission of errors, and I’m hoping to start a beta test with researchers in the near future. If you’re interested in participating, please let me know.

One final note: there have been 20 individuals, both graduate and undergraduate, who have participated in the database’s creation (not including the fantastic staff of UNB’s Centre for Digital Scholarship, who have been invaluable in the concept design and programming), every single one of whom has been paid for their work – we don’t pay people in experience or exposure. We don’t shy away from the fact that these projects are enormous in both scope and expense. Done right, we believe it’s worth it.

 


A big thank you to Stephanie Pettigrew for showing us the British North America Legislative Database! I’m totally nerding out over this, and think it’s an absolutely fantastic project! I hope you enjoyed this blog post! If you did, please consider sharing it on the social media platform of your choice. And don’t forget to check back in on Sunday for our regular Canadian History Roundup! And there is only one more week left in this course, so soon we’ll be back to our regular content! See you then!
