A crowdsourcing project can create joy, inspire curiosity, foster new understanding, encourage wider social connections, promote learning, and help people cultivate new skills and expertise. To help illustrate those possibilities, this chapter offers an introduction that lays the groundwork for other chapters. We define crowdsourcing in cultural heritage and discuss where it sits in relation to other fields. We provide an overview of key concepts, including types of crowdsourcing projects, and some of the complexities you will encounter.
Any account of this large and growing field is necessarily broad. It encompasses a wide range of human knowledge and understandings. To reflect that diversity, we strive to be capacious in our definitions, discussion, and examples, although we have largely left aside the kinds of crowdsourcing projects in the sciences and for-profit sectors. We expand our discussions, when possible, with case studies and examples that might diversify, complicate, or add extra nuances to the larger narratives in this book.
Each crowdsourcing project is driven by a larger purpose. Those purposes can be as diverse as the people who build crowdsourcing projects. We might start with a research question, for example, as when the Zooniverse crowdsourcing platform began with a single project asking people to help classify distant galaxies.1 Along with collectively driven inquiries, many crowdsourcing projects seek to expand access to a given collection by enlisting people’s help with transcribing, tagging, or other tasks. Converting scans into searchable text alone can make items in large collections much easier to find for specialists and casual visitors alike. Crowdsourcing can open up new possibilities for research that could not have existed otherwise, fundamentally changing authoritative practices, definitions, and classifications.
Sometimes, when we talk about a crowdsourcing endeavor, colleagues or friends might ask “aren’t you just asking for free labor?” That is a common enough question, usually coming from people who have not had a chance to appreciate the sense of shared purpose and community that a crowdsourcing project can inspire.2
Crowdsourcing projects can do far more than augment a digitized collection of materials. Crowdsourcing can be transformative. While plenty of Web 2.0 (or “Social Web”) sites invite users to contribute content, crowdsourcing sites differ in that the act of undertaking the task can hold as much value as the content it produces. The process of starting and running a project can present the opportunity to invite people to contribute to the store of knowledge flowing through your organization or institution. Seeing crowdsourcing projects as more than the sum of their tasks or end products allows us to imagine new horizons for our institutions, organizations, and affiliated communities. At its best, crowdsourcing increases public access to cultural heritage collections while inviting the public to engage deeply with these materials.
We learn these lessons from the participants who join our projects. We hear that crowdsourcing projects provide valuable outlets for all kinds of interests and investments. A given project might provide opportunities to volunteer without leaving the house, or a way to pass the time while riding public transit. Others might help a marginalized community reconnect with its scattered or previously inaccessible cultural materials. For stewards of cultural heritage collections, a crowdsourcing project can provide further benefits. When an expanding group of people immerses itself in a digitized collection, the process can lead to new discoveries and fresh perspectives. Participants can teach us a great deal about our collections, especially through the passion and enthusiasm they bring to so many different kinds of materials.
Even so, we know that crowdsourcing projects do not start with a blank slate. Our institutions and organizations inherit the historical legacies of empire, slavery, prejudice, and other inequalities. Those inheritances determine what has been preserved and who gets represented in our collections. Such legacies, of course, shape the digital lives of our collections. Crowdsourcing projects are far from a cure-all. Digital exclusion persists around the world. Deeply set social structures do not change in a day, nor a single project.
Yet clear commitments to certain values can help create spaces for more diverse voices. The British Library’s In the Spotlight project sought to offer spaces for engagement. The Colored Conventions Project’s principles invite people to reflect critically while browsing the recovered histories of Black lives and culture.3 Others, such as Zooniverse, ask project creators to agree to provide open access to the results of their crowdsourcing projects.4 As we discuss here and elsewhere in this book, these intentional approaches to centering certain values in our projects are invaluable. Such approaches do not just provide spaces for dialog — they make for stronger projects that can establish deep and lasting relationships with our communities.
Defining crowdsourcing in cultural heritage is an ongoing task. When applying for funding to write this book, Ridge, Ferriter, and Blickhan defined crowdsourcing in cultural heritage as “a form of digitally-enabled participation that promises deeper, more engaged relationships with the public via meaningful tasks with cultural heritage collections,” and said that, “at its best, digitally-enabled participation creates meaningful opportunities for the public to experience collections while undertaking a range of tasks that make those collections more easily discoverable by others.” Broadly speaking, crowdsourcing tasks may involve some kind of work on an item presented via a digital interface, such as transcribing text or audio, describing it through tags or longer texts, or collecting items for inclusion in a project.
In the quest for less unwieldy terms, acronyms such as GLAM (galleries, libraries, archives, and museums) and LAM (libraries, archives, and museums) may be used. Cultural heritage organizations may also be referred to as “memory organizations.”
Crowdsourcing has helped to provide a framework for online participation with, and around, cultural heritage collections for over a decade. It will continue to change, particularly as technologies such as Artificial Intelligence (AI) and Machine Learning are more closely integrated within systems that combine the work of humans and software into human-computation infrastructures.
The term “crowdsourcing” can be complex. On one hand, “crowdsourcing” is a commonly used term. It is legible to a wide range of people, including across disciplines that may approach or define this area of work in different fashions. On the other hand, it has limits. A “crowdsourcing” project might be tremendously successful with only a handful of participants — hardly a crowd. The term also has roots in the word “outsourcing,” which describes an often-exploitative practice that takes advantage of cheap labor. We expect that debates over suitable alternatives will continue. In the meantime, it helps to refer to digitally-enabled participatory projects with the more succinct term “crowdsourcing.”
Projects in Europe and the sciences often use related terms such as “citizen science,” “citizen history,” or “citizen research.” The Citizen Science Association, an organization with roots in the natural sciences, defines “citizen science” as “the involvement of the public in scientific research, whether community-driven research or global investigations.”5 When citizen science is used in the European context, “science” includes a wider range of disciplines than the STEM-linked term in English, but it can still be read as excluding humanities and cultural heritage projects. We also recognize that “citizen” and “citizenship” are not neutral terms and can exclude or discourage participation from migrant and precarious communities.6 In a related manner, the word “heritage” can carry unproductive connotations in certain contexts. In the United States, for example, it is a term used by white supremacist and nationalist groups — a more malevolent usage than the commonly understood meaning of “cultural heritage” in the United Kingdom and Europe.
Other related terms include digital or online volunteering, online collaboration, and digitally-enabled participation. Niche-sourcing recognizes that many projects have a narrow appeal or may require specialist skills. Microtasking refers to dividing up complex tasks into smaller (“micro”) tasks suitable for crowdsourcing. Ultimately, we want to recognize the importance of intentionally choosing your project terminology and, crucially, recognizing where these choices can result in barriers for your project participants, whether intended or not.7
While crowdsourcing is often framed as a novelty, its methods and sensibilities pre-date the advent of the Internet. Some people argue that crowdsourcing was born in the mid-2000s when the web made digitally-enabled participation more widely available, following Jeff Howe and Mark Robinson’s coining of the term in a 2006 Wired magazine article.8 For example, Brabham writes, “Although crowdsourcing rests on long-standing problem-solving and collaboration concepts, it is a new phenomenon that relies on the technology of the Internet.”9
In contrast, Ridge has argued that crowdsourcing in cultural heritage was transformed by networked digital technologies, but not created by them.10 As a field, it draws on antecedents from a range of different disciplines in which not-for-profit projects have asked people to collect and compile information and objects. These projects have their roots in the nineteenth century, if not earlier.11 Oft-cited examples include the Oxford English Dictionary’s mid-19th century appeal to the reading public to find instances of specific words in early texts.12 This approach was so successful that appeals continue to this day on social media. Other early models include the Green Book, a guide compiled in the 1930s-40s by and for Black people traveling in the United States,13 arguably a proto-crowdsourcing, wiki-style information-gathering effort, and the “co-operative indexing” of 19th-century censuses in the 1980s.14
Versions of crowdsourcing continue to multiply. The field of digital humanities has driven an increasing diversity of projects, as Melissa Terras has described.15 Local historical societies and small museums have devised their own approaches, such as holding gatherings to transcribe records held in local record offices.16 These examples serve as a reminder that, while crowdsourcing is often associated with digital technologies, it retains strong ties to in-person volunteering.
The growth of these varieties of crowdsourcing has advanced amid the shift of the wider Internet to a Web 2.0 era, characterized by websites hosting user-generated content. Along with the more widely known social media platforms, influential milestones in the advance of crowdsourcing include the experiments on “social tagging” through steve.museum17 and the launch of Flickr Commons with the Library of Congress in 2008.18 In the past decade, an increasing number of museums have explored the potential of co-curating exhibitions through public selections of artworks19 or asking the public to share their knowledge on web-based wikis,20 or by contributing metadata to help enrich and remix online collections.21
Factors that make cultural heritage crowdsourcing projects different from other forms of crowdsourcing include the purposes behind their creation, motivations for participation, project content and data output, as well as theories of cultural heritage crowdsourcing. Many of these topics will be illustrated with examples in other chapters.
Crowdsourcing projects can be categorized in different ways.22 Other chapters of this book (including the “Aligning tasks, platforms, and goals” and “Choosing tasks and workflows” chapters) will approach task types in greater detail. This section will provide an overview and several options for how you might group task types together.
One approach is to focus on the activities performed by the contributors and look at the type of tasks performed. Other options include starting from the type of data processed (e.g., text vs images), size of the project (small, volunteer-led project vs large, institution-supported initiative), or the aim of the project (improve the accessibility of collections, generate or process research data, re-balance focus of collections).
Projects will typically be based around a particular set of data, and the tasks will relate to processing that data in one way or another. Projects may also be focused on generating data, for example by collecting existing resources (images, stories) or creating new material (text, pictures, metadata). Many projects feature a range of tasks, perhaps moving from one to another as the project progresses, and many tasks involve more than one process, which could be performed simultaneously or in sequence, by one person or more than one. These combinations of processes, tasks, and people are often called crowdsourcing workflows.
Even though it may be difficult to classify projects by type of task alone, it can nevertheless be useful to consider the main data-centered processes employed in crowdsourcing projects. Common examples include:
Correcting/improving digital material — for example, proofreading texts that have been transcribed manually or automatically (e.g., via optical character recognition, or OCR), or cropping pictures to remove blank margins, e.g., the New York Public Library’s Building Inspector.23
Improving the discoverability of material — for example by transcribing handwritten text or adding or enhancing information about the material by adding or correcting metadata, e.g., the Adler Planetarium’s Tag Along with Adler.24
Finding information about a source — for example, identifying an object in a picture, classifying a text, finding a proper name, e.g., the Conway Library’s World Architecture Unlocked.25
Generating data/material/collection — for example, by creating new material or collecting and bringing together existing resources, e.g., Wikipedia26 or The Great War Archive.27
What all the above processes have in common is that they rely heavily on the abilities of humans to identify and process information, for example, to quickly identify items in a picture, or read and understand the content of a text. Broadly described in this book as a spectrum of collecting-analyzing-reviewing, the individual contribution could be summarized28 as:
“type what you see” — transcription tasks, typing out handwritten text, e.g., the Library of Congress’ By the People project.29
“describe what you see” — adding tags that describe an image, e.g., Flickr Commons.30
“share what you know” — adding factual information based on your knowledge or research, e.g., Wikipedia.31
“share what you have” — uploading a picture or memory to a collection, e.g., Ford’s Theatre’s Remembering Lincoln.32 33
“validate other inputs” — for example, checking and correcting text that has been transcribed, e.g., OCR correction in Trove.34
Tasks can also be grouped by size. Ridge suggests that microtasks are small, rapid, self-contained tasks that can be completed in one or a few clicks, such as classifying items by a small set of categories, identifying pictures that contain a particular object, or adding tags to an image.35 Macrotasks are more complex and may involve more than one action, such as identifying a specific item on a page, marking, classifying, and transcribing it. As such, they will take longer to complete. In addition to micro- and macrotasks, projects can also include other activities, such as supporting contributors, taking part in analysis and evaluation, or developing and maintaining platform functionality. Such metatasks are not necessarily tasks that can be easily defined as having a particular size or scope but are important for the successful running of the project.
This chapter provided an overview of crowdsourcing in cultural heritage contexts — what it is, how it came to be, broad types of projects and tasks, and some of the complexities you will encounter.
The following chapters build on this to expand on why you might work with crowdsourcing in cultural heritage, how values related to the missions and motivations of cultural heritage organizations and contributors can or should inform your projects, who participants are and what motivates them to contribute, and provide practical advice on managing and running projects (tasks, platforms, project management, data, evaluation).