Crowdsourcing in the Arts and Humanities


On 9 April 2013, the Oxford Internet Institute at the University of Oxford held a one-day workshop on arts and humanities crowdsourcing projects, paying close attention to how they affected users, digital curation, public engagement, and knowledge sharing.

According to Dunn and Hedges (2012), crowdsourcing is “the process of leveraging public participation in or contributions to projects and activities.” Brabham (n.d.) offers another definition: “crowdsourcing is an online, distributed problem-solving and production model.”

Does this strategy rely on the wisdom of crowds? In the arts and humanities, the answer is multi-layered. Crowdsourcing depends on participation from “interested and engaged members of the public” (Owens, 2012). Stuart Dunn offered that participants gain specialist knowledge during the crowdsourcing process, and added that the subject matter is what attracts contributors in the first place. However, several speakers demonstrated that only a small percentage of participants were subject matter experts. One can argue that crowdsourcing both distributes time-intensive tasks across many people and harnesses the expertise users bring (or eventually attain) to the table. Most of the case studies presented at this workshop bore out these claims.

Alice Warley (Your Paintings, Public Catalogue Foundation) and Andrew Greg (University of Glasgow) presented their case study of Your Paintings, a project hosted on the BBC domain that has placed over 200,000 images of oil paintings online for the public to view and tag with descriptive keywords. As of this blog post, over 23,000 paintings have been associated with more than four million tags by over 9,000 users (http://tagger.thepcf.org.uk). At the outset, contributors were unduly influenced by pre-existing keywords already associated with works of art; this was mitigated, in part, by hiding previously added tags, which users can now reveal if they wish. Warley also commented that obscure paintings were less likely to be tagged. Greg described their solution: assets are randomly selected for users to tag.
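
To make that selection logic concrete, here is a minimal sketch in Python; the record structure and field names are hypothetical, not the project’s actual data model:

```python
import random

# Hypothetical catalogue records: an identifier plus a running tag count.
paintings = [
    {"id": "PCF-0001", "tag_count": 412},
    {"id": "PCF-0002", "tag_count": 3},
    {"id": "PCF-0003", "tag_count": 0},
]

def next_painting_to_tag(catalogue):
    """Serve a randomly chosen painting to the next tagger.

    A uniform draw keeps well-known works from soaking up all the
    attention; a further refinement would weight the draw toward the
    least-tagged records so obscure paintings surface even sooner.
    """
    return random.choice(catalogue)

print(next_painting_to_tag(paintings)["id"])
```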

Building an accurate and trustworthy process was a central topic in the study of the Oxford Community Collection Model presented by Kate Lindsay (Manager for Engagement & Education Enhancement, University of Oxford). One project that used this model, The Great War Archive, encouraged users to contribute physical objects (to be scanned or photographed) as well as digital ones, adding descriptive metadata along the way. However, to keep up with the volume of submissions, the initiative chose to prioritise archiving the material over creating exhaustive metadata.
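
As a rough illustration of that trade-off, the sketch below models a submission record in which only the essentials are required and richer description stays optional; the field names are my own invention, not the Great War Archive’s schema:

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional

@dataclass
class Submission:
    """A contributed object in a community collection (illustrative only).

    Only the fields needed to archive the item are required; richer
    descriptive metadata stays optional so the intake queue keeps moving.
    """
    contributor: str
    file_path: str                       # scan, photograph, or digital file
    received: date = field(default_factory=date.today)
    title: Optional[str] = None          # descriptive metadata can be added later
    description: Optional[str] = None
    subjects: list[str] = field(default_factory=list)

item = Submission(contributor="A. Volunteer", file_path="scans/letter_1916.jpg")
print(item)
```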

Dr Laura Carletti (University of Nottingham) presented Tate’s Art Maps, a crowdsourcing project that aimed to engage the public by asking them to add geographic metadata to part of the Tate collection. Results of a participant survey demonstrated that the process enriched the way artworks were experienced. During several stages, the project challenged people to explore the area between Tate Britain and Tate Modern and to associate inspirational places with works of art.

Kimberly Kowal (British Library), speaking about a georeferencing project at the British Library, was the first to discuss the value of volunteer motivation. Kowal explained that the top five crowdsourcing contributors were listed on the project’s main web page and were awarded prizes. One person was consistently named as the top contributor. This was the first indication, during the workshop, of the existence of super-users. These individuals typically devoted more time to the project than other participants and were more likely to roll over from one phase of the project to the next.


My representation of a model shown during Stuart Dunn’s presentation (2013)

Peering through the lens of the National Library of Australia’s TROVE project, Mia Ridge also spoke about participant motivation. She framed it in seemingly contradictory terms, as a mix of altruistic and selfish reasons. The project’s goal was to entice active engagement by offering users the opportunity to add to a community’s knowledge while making the process entertaining. To deepen the levels of engagement, TROVE focused on four engagement principles: attending, participating, deciding, and producing (somewhat similar to Dunn’s model shown above). Ridge finished her presentation by crediting technology with lowering barriers to access, but acknowledged that technology alone does not create engagement.

Chris Lintott (Department of Physics, University of Oxford), who runs Zooniverse, argued for the creation of different levels of user engagement, showing that half the data from his crowdsourcing project came from a small percentage of dedicated participants while the other half was produced by many casual contributors.
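
Lintott’s point is easy to see with a little arithmetic. The sketch below, using made-up contribution counts, works out what fraction of contributors it takes to account for half of the work:

```python
def contributors_for_half(counts):
    """Return the fraction of contributors who supply half of all the work.

    `counts` holds per-contributor totals (the figures below are made up).
    """
    ordered = sorted(counts, reverse=True)
    target = sum(ordered) / 2
    running = 0
    for i, c in enumerate(ordered, start=1):
        running += c
        if running >= target:
            return i / len(ordered)
    return 1.0

# Ten contributors: two heavy users and eight casual ones.
example = [500, 450, 60, 55, 50, 40, 30, 20, 10, 5]
print(f"{contributors_for_half(example):.0%} of contributors produce half the data")
```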

The final case study was presented by Tim Causer, who runs the Transcribe Bentham project. He echoed Lintott’s findings: a minority of participants contributed the majority of the crowdsourced work. Initially, crowdsourcing produced less data than a few full-time staff members could have, until one or two super-users became involved. This led naturally into the workshop’s roundtable discussion.

How do you recruit super-users? Marketing these crowdsourcing projects was cited as one way to attract interest, while others suggested meeting with taggers, even conducting interviews. However, selecting ideal candidates is problematic because contributors are self-selected. Furthermore, most would agree that lowering barriers to engagement (for example, not requiring registration) is an important aspect of any crowdsourcing campaign.

Final Thoughts

Unsurprisingly, most arts and humanities crowdsourcing projects are text-based: they transcribe original text, tag images with keywords, or disambiguate terms. The creation of human-readable metadata to foster the aggregation and discrimination of data is at the forefront of most of these initiatives. However, how can crowdsourcing in the arts and humanities be combined with automatically generated metadata (e.g., image or speech analysis) to enrich metadata sets for users to exploit? This question deserves further study.

Another aspect touched upon in the Q&A, but not given much attention, was the impact of crowdsourcing on levels of employment. Many cultural heritage organisations, faced with dwindling budgets, have been forced to seek other ways to maintain output levels while laying off a portion of their workforce. Crowdsourcing, arguably an extension of volunteerism, is an attractive option.

Finally, arts and humanities crowdsourcing would benefit from more participant studies, especially ones that target different types of collections and cultural sectors (galleries, museums, archives, and libraries). Knowing who is most likely to participate, and what motivates and engages them, would increase the success of crowdsourcing initiatives.
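
Returning to the question above about combining crowd-contributed and automatically generated metadata, the sketch below shows one purely hypothetical way of merging the two: crowd tags are trusted as-is, while machine-generated labels are kept only when their confidence clears a threshold.

```python
def merge_tags(crowd_tags, machine_tags, confidence_threshold=0.8):
    """Combine crowdsourced keywords with machine-generated labels.

    Hypothetical logic, not any project's pipeline: crowd tags are kept
    as-is, while automatic labels (e.g. from image analysis) are accepted
    only above a confidence threshold, so the merged set stays useful
    for search and aggregation.
    """
    accepted = {label for label, score in machine_tags.items()
                if score >= confidence_threshold}
    return sorted(set(crowd_tags) | accepted)

crowd = ["portrait", "dog", "oil on canvas"]
machine = {"dog": 0.95, "horse": 0.40, "interior": 0.85}
print(merge_tags(crowd, machine))   # ['dog', 'interior', 'oil on canvas', 'portrait']
```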


