For this post, I will be discussing another assignment of mine from my course in Digital Public Humanities (DPH, though I also use just DH). In my last post, I discussed the crowdsourced aspect of Wikipedia and provided a guide on how to evaluate the content of a page conjured up by editors from the worldwide public. In this post, I want to explore the benefits and limitations of crowdsourcing as exemplified in several DH projects presented in this course.
Crowdsourcing – Why?
The reasons why a project team might want to crowdsource–in full or in part–their endeavor will vary depending on the desired outcome and on whether crowdsourcing aligns with the goals and values of the team. If the project faces a laborious (and even monotonous) step, such as the digitization of numerous documents and/or the transcription of said documents, crowdsourcing could prove to be the difference between the life and death of the project. Most professional projects rely on funding, and funding will not last forever. Thus, if a period of the project involves a sometimes monumental task of digitization or transcription, that task could easily eat through the available funding by forcing project staff to take on responsibilities that do not fully utilize capabilities that might otherwise be useful for other tasks.
For example, one of the projects I studied, the Transcribe Bentham project, utilized crowdsourcing for transcribing handwritten manuscripts from Jeremy Bentham, philosopher and founder of Utilitarianism, but found it necessary to assign project staffers to moderate the submitted transcriptions. For this project, salaries were the greatest expense.¹ In the end, crowdsourcing did not quicken the transcription process, but it was surmised that while one of the experts doing this moderating could likely transcribe 10 manuscripts a week by himself (in addition to all other responsibilities), he could moderate 100 transcriptions in a week if they were provided by voluntary core contributors.
This example demonstrates how crowdsourcing can prove invaluable, at least when considering financial cost, for digital projects. Other reasons include hastening the progress of a project, incorporating knowledge held by the greater public that the project team lacks, and expanding cultural heritage and building relationships with the greater community (the most important reason, in my opinion).
Crowdsourcing is, in many ways, an expression of values held by those in the Humanities and those engaging with Digital Humanities. Mia Ridge² explains this by saying:
Crowdsourcing in cultural heritage benefits from its ability to draw upon the notion of the ‘greater good’ in invitations to participate, and this may explain why projects generally follow collaborative and cooperative, rather than competitive, models.
When projects are based on collaboration and cooperation, it becomes natural to want to involve the public–those who will be directly accessing the information–in the process of bringing this knowledge to them.
Public Involvement – How?
Once one has a case to crowdsource a project or part of a project, the next questions are about the tasks themselves–how are you going to involve members of the public and what tasks can they accomplish?
The four projects I studied in this assignment presented me with a variety of tasks to be accomplished, though they were all of a similar nature. Crowdsourcing seems to be largely dependent on a key concept: accessibility. Users must be able to access both the materials they will engage with and the tools they will use to engage with them. In this sense, accessibility is not just about being able to find and see the materials, but also about being able to readily adapt to and use the tools provided. Hence, tasks cannot be so complicated that they require formal background knowledge or technical expertise to participate in the project.
As again described by Mia Ridge, crowdsourcing tasks generally “involve transforming content from one format to another … describing artefacts … synthesizing new knowledge … or producing creative artefacts.”³ The projects I studied involved the transcription of documents and manuscripts, the correction of text, and the categorization of data entries. These tasks relied on my ability to read, rewrite, and format text and to identify consistent features among different items. Such tasks do not require extensive training and can be easily accomplished by members of the public who are likely to have no formal experience. The main concern in this case is that the project organizers provide the tutorials and instruction necessary for members of the public to accomplish the task to the degree the project requires.
Making the Contributions – Who?
As the old adage goes, “you can lead a horse to water, but you can’t make it drink.” In our scenario for crowdsourcing, it would be more like, “you can have the water to drink, but where do you get the horse?” For crowdsourcing to be effective, it is necessary to draw in the crowd in the first place and, within that crowd, to attract those who will actually want to contribute to the project.
For example, another project I observed is known as Trove. This project is meant to have users correct the text in digitized Australian newspapers going back some 150 years. It has been widely successful and has been able to sustain community engagement. But why? Though the reasons could be numerous, we do know that one reason is that the content is highly relevant to the userbase, which is what attracted the sustained interest in the first place. Many Trove users are those with an interest in local history and family genealogical work.⁴ Helping to correct these digitized newspapers has thus helped people conduct research in other areas of interest because the items have become more searchable.
Another example is the project Papers of the War Department, which is concerned with transcribing handwritten digitized documents to rebuild a collection from the United States War Department that was lost in a fire in 1800. Many of the users of this project are those who are interested in Early American History. The material is also highly relevant to specific groups, such as Native American communities and those studying American Indian History, because many of these documents shed light on early American relationships with Native American Tribes during this time period, relationships that were primarily handled by the War Department.⁵
So this provides us with one of three key elements for attracting an audience, the first being relevant material. When the content of the project is of interest to specific demographics, you will draw them to the project. Knowing your potential audience is key to generating the crowd necessary for crowdsourcing. If they feel like they have a personal investment in the project, they will be more likely to contribute. Furthermore, if their contributions are valued and appeal to their sense of the “greater good,” this involvement can be sustained.
This touches on the second element for attracting–and keeping–an audience. Their contributions need to be valued, and users need to see that what they are doing is advancing the project. This can be done simply by showing the work they’ve done in a more final format (such as what the interface for Papers of the War Department does) or by tallying the contributions one has made (such as in another project, Building Inspector).
The third element is one I have somewhat touched on in this post so far: accessibility, but in the form of the interface of the project itself. Interfaces need to balance simplicity and diversity while feeling intuitive. If an interface–the means by which the user interacts with the tools and materials–is overly complicated or lacking in options, users could be turned off from the project or lose enthusiasm to engage with it. In order to sustain the commitment of users, the interface needs to meet the needs of the project so users can accomplish the task without burden, yet remain simple enough that it doesn’t appear to require background expertise.
Though there are evidently many benefits to crowdsourcing, there are some obvious drawbacks to be aware of that can affect the decision to invite members of the public to contribute. First off, you are relying on the untrained labor of the public to contribute to a project. While the idea is to make it so that doesn’t necessarily matter, there are times when it does matter. Amateur skills could lead to poor transcriptions, inappropriate categorization, and erroneous text corrections.
Going back to the example of Transcribe Bentham, the reason why the pace of transcription wasn’t as quick as hoped is that the process demanded a high degree of moderation and a high standard of accuracy before transcriptions were accepted. Some users reported feeling intimidated by the workload and requirements, which led to a decline in submissions and to speculation that, had the experts been paid to only do transcriptions, they could have done five times the amount of work that was produced by the limited number of contributors.
Crowdsourcing needs to be guided. If the project staff fail to do this, public efforts might amount to little advancement for the project and could prove to be folly for the time, effort, and funds dedicated to the crowdsourcing efforts.
1. Causer, Tim, Justin Tonra, and Valerie Wallace. “Transcription Maximized; Expense Minimized? Crowdsourcing and Editing The Collected Works of Jeremy Bentham.” Literary and Linguistic Computing 27, no. 2 (2012), 112.
2. Ridge, Mia. “Crowdsourcing Our Cultural Heritage: Introduction.” In Crowdsourcing Our Cultural Heritage, edited by Mia Ridge. UK: Ashgate, 2014.
4. Ayres, Marie-Louise. “‘Singing for their supper’: Trove, Australian newspapers, and the crowd.” Paper presented at IFLA WLIC, Singapore, July 31, 2013. National Library of Australia.
5. Leon, Sharon M. “Build, Analyse and Generalise: Community Transcription of the Papers of the War Department and the Development of Scripto.” In Crowdsourcing Our Cultural Heritage, edited by Mia Ridge. UK: Ashgate, 2014.