Oh man, was this project a doozy. For this post, I am going to reflect on the digital project I made for my course in Digital Public Humanities at George Mason University. To find that project, you can click this link or find it in the tabs at the top, where it has its own page on my blog!
First, let me start by saying that this project, and my first semester, have been a blast. I have learned quite a bit and have even surprised myself with how well I caught on despite having practically no experience with either Digital Humanities as a field or the tools and practices associated with it. This project really helped me bring it all together and use a tool to engage with a topic I am both interested in and invested in. For my project, I used Voyant Tools, a free online application for text analysis that lets you upload files and use a variety of tools to explore the text and visualize it in ways that are not possible through a close reading of the material. My project comprised over 125,000 words across 115 documents. That might not be the biggest number out there, but it would not have been possible to read it all closely in the time we had for the project. So the software I had at hand was perfect and it got the job done.
I chose to do a comparative analysis of American Indian treaties for this project. This topic means a lot to me as an American Indian and to my people. I’ve been studying treaties for years now, and while I can’t say I was too surprised by my results, having the tools to observe and present them in new ways was quite fascinating. I can see that it will be helpful for teaching others, and that is what was most exciting for me. I didn’t necessarily discover anything new about my sources, per se, except that I now have an easier way to visualize what I picked up on in my close readings of these items. I can see how it could get others to the same point much faster than the years it has taken me. I suppose I am trying to say that this tool offers a much friendlier approach for those who are new to studying documents such as these.
The journey for this project, however, wasn’t easy. As I went through the project, I wrote down notes on my experiences and on how my process for gathering, analyzing, and interpreting the data shaped how the project ultimately turned out. At first, I was concerned about not having enough text for my analysis because I was initially only going to use the treaties for my Tribe. After discussing things with my professor and getting student feedback, I knew I had to expand it. But when I started incorporating more treaties, I had to cut unnecessary text to finish on time. As such, I began cutting signatories, as they weren’t going to contribute to what I was trying to observe in the text. However, my peers did offer a good idea: including signatories in future projects to see how they figure into the trends, or maybe even isolating them for their own analysis.
I later decided to remove witnesses, footnotes, and non-treaty-related references as well. This helped to cut down on the frequent appearance of “unique” words that were superfluous and didn’t really describe much about the language of the treaties.
Probably the biggest hurdle for this project, however, was the transcription database I was using. The Oklahoma State Library has digitized the official law books that house the treaties, and most of them (most, as I found out the hard way) have a transcription associated with the digital image. It appears that the transcriptions were generated with OCR and have not been corrected for mistakes. So in copying the transcriptions, I had to go through each treaty that I was making into a text file, edit out the junk text and errors, and reformat it so it looked nice in the “Text Reader” box. This took an absurd amount of time and drastically cut into the time I had set aside for simply making the text files. I really got a feel for why digital projects are laborious and why funding would be necessary to pay someone to do that all day long (or how useful crowdsourcing can be if you can make it mutually beneficial and non-exploitative for those who choose to contribute). In making these corrections, I also rejoined hyphenated words for an easier analysis, on the assumption that the split words would otherwise affect Voyant’s results.
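For anyone facing the same cleanup, part of it can be scripted. Below is a minimal sketch in Python, not what I actually ran (I did my corrections by hand). The function name `clean_transcription` is made up for illustration, and it assumes the broken words are split with a hyphen at the end of a line. It will not catch OCR misreadings, and it will also fuse genuine compounds that happen to break across lines, so the output still needs a human eye.

```python
import re


def clean_transcription(raw: str) -> str:
    """Tidy a copied OCR transcription (hypothetical helper, not a Voyant feature)."""
    # Rejoin words split by a hyphen at a line break: "reserva-\ntion" -> "reservation".
    text = re.sub(r"(\w)-\s*\n\s*(\w)", r"\1\2", raw)
    # Turn single line breaks inside a paragraph into spaces,
    # while leaving blank lines (paragraph breaks) alone.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Squeeze runs of spaces and tabs down to one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


sample = "the reserva-\ntion of  land\nshall\n\nArticle 2"
print(clean_transcription(sample))
```

The regular expressions only handle layout problems. Misspelled or garbled words from the OCR still have to be fixed against the page image.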
For my actual corpora of treaties, my first set goes from the earliest date recorded in the law books (1788) through 1830. From there, the second set continues from 1831 onward. I chose this date for the division because it was in early 1831 that Cherokee Nation v. Georgia was decided, a case that altered the relationship between Tribes and the U.S. I figured this would be reflected in the treaties, and I believe I was correct!
I came to find out that some transcriptions are missing from the database after the treaty with the Teton in 1815. However, by that treaty, I had already surpassed my 50,000-word minimum. So I then began pulling one treaty per year up to 1831. In Group A, the treaties are consecutive from 1788 through 1815, followed by one treaty from each year from 1816 through 1830. In Group B, the treaties are consecutive from Oct. 22, 1832 through the rest of 1833, followed by one to two treaties from each year from 1834 through 1859.
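The Group A selection rule can be written down as a short sketch. I picked my treaties by hand, so this is only an illustration of the rule: the `Treaty` record, the `select_group_a` function, and the placeholder titles are all hypothetical, and "first treaty listed for the year" is an assumption about how to pick the one treaty per year.

```python
from typing import NamedTuple


class Treaty(NamedTuple):
    year: int
    title: str


def select_group_a(treaties, consecutive_until=1815, end_year=1830):
    """Keep every treaty through `consecutive_until`, then one per year through `end_year`."""
    selected = []
    sampled_years = set()
    # sorted() is stable, so treaties from the same year keep their listed order.
    for treaty in sorted(treaties, key=lambda t: t.year):
        if treaty.year <= consecutive_until:
            selected.append(treaty)
        elif treaty.year <= end_year and treaty.year not in sampled_years:
            selected.append(treaty)
            sampled_years.add(treaty.year)
    return selected


# Placeholder records, not real treaty titles.
listing = [
    Treaty(1814, "placeholder A"),
    Treaty(1814, "placeholder B"),
    Treaty(1816, "placeholder C"),
    Treaty(1816, "placeholder D"),
    Treaty(1831, "placeholder E"),
]
for treaty in select_group_a(listing):
    print(treaty.year, treaty.title)
```

Group B would follow the same pattern with different cutoffs, plus an allowance for a second treaty in some years.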
By this point, I also had to change my end date for the treaties. I originally began with 1871 as the end date (when the treaty-making process ended, though I think the last recorded treaty was in 1868). There was just too much work and not enough time, plus I was accumulating far more words in Group B than in Group A. I needed to adjust my Group B corpus so that its word count was more comparable to Group A’s. My first Group B corpus had approximately 30,000 more words, which was skewing the frequent-word tally and painting an inaccurate comparison. I decided to spread my sampling window further out so I could keep the later treaties in my corpus, because I believe they reflect the greater difference in language that I was initially looking for anyway.
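A quick way to keep an eye on that balance while building the corpora is to total the words in each group before uploading anything. This is a rough sketch under stated assumptions: the function names are my own inventions, and it counts words by splitting on whitespace, which will not match Voyant’s own tokenization exactly. It is enough to spot a large gap, though.

```python
def word_count(text):
    """Count whitespace-separated words (a rough stand-in for Voyant's tokenizer)."""
    return len(text.split())


def corpus_totals(groups):
    """Map each group name to the total word count of its documents."""
    return {
        name: sum(word_count(doc) for doc in docs)
        for name, docs in groups.items()
    }


def word_gap(totals):
    """Difference between the largest and smallest group totals."""
    return max(totals.values()) - min(totals.values())


# Tiny stand-in texts; the real groups would hold the full treaty files.
groups = {
    "Group A": ["peace and friendship", "the said tribe cedes"],
    "Group B": ["the United States agree to pay annually", "said reservation"],
}
totals = corpus_totals(groups)
print(totals, word_gap(totals))
```

Running this each time a treaty is added or dropped shows whether the two groups are drifting apart.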
So, suffice it to say, there were some roadblocks and it did prove to be challenging at points. But I had a lot of fun with the project and I am encouraged to continue this endeavor even outside of class assignments. I appreciated the feedback I got and the way my first iteration of this idea developed into what it ultimately became. There is still a good bit of work to be done on it before moving on to other variations, but I think I accomplished what I set out to prove, and it helped me demonstrate what I have learned so far in this course.