5. Designing cultural heritage crowdsourcing projects
by Mia Ridge, Samantha Blickhan, Meghan Ferriter, Austin Mast, Ben Brumfield, Brendon Wilkins, Daria Cybulska, Denise Burgher, Jim Casey, Kurt Luther, Michael Haley Goldman, Nick White, Pip Willcox, Sara Carlstead Brumfield, Sonya J. Coleman, and Ylva Berglund Prytz
Published on Apr 29, 2021
In this chapter, we provide a guide to high-level project planning that helps anticipate common issues and fit with long-term strategies that include crowdsourcing and public engagement in libraries, museums, archives, and research projects. We include questions designed to create clarity about your goals in considering a values-centered crowdsourcing project, introduce frameworks for thinking about tangible and intangible results and objectives, and help you understand the strengths and weaknesses of your organization in relation to public engagement through crowdsourcing. The chapter also contains prompts for thinking about the resources and technologies available to you, and encourages an iterative approach that includes pilots and usability testing, risk management, and work with stakeholders.
The earliest steps of a crowdsourcing project can be the most challenging, and this chapter provides guidance based on our experiences and understandings of the field. The relevance of these ideas to your situation depends on your project’s goals, stakeholders, scope, institutional setting, partners, and other considerations. This chapter is not intended to be a To-Do list of required steps, and we acknowledge that the ideas introduced here could overburden a modestly scoped project with a short timeline. However, the features of this planning section will be relevant to even small, short-term projects.
You might notice that topics in this chapter are borrowed from other structured activities, including strategic planning, implementation planning, business planning, sustainability planning, and risk management. Full-blown versions of each of those activities might benefit your project now or at a later stage, but we do not attempt to mirror particular protocols here. We instead arrange activities in an arc that might make sense for you as you design a new project.
Articulate motivations and describe target data
If you have arrived here, you likely have a crowdsourcing project already in mind. Some projects may have a research question; others may be focused on some kind of content, with the hope of improving its usability or reach in some way. Depending on the scope of the content, you may need to design and implement several projects to arrive at your goal.
It can be helpful to articulate your challenge and inform your broader project design by asking questions. They can be as simple as, “Why are we doing this?”, with an example answer being, “To get a transcribed version of handwritten texts.” Other questions we have found useful at this point in the process include:
What is the greater value that you seek to produce?
Why is crowdsourcing the best approach? What does it bring that you would not get otherwise?
What is the problem you are trying to solve?
What does this work/project enable?
What does the output data look like?
What values are aligned with this challenge and may guide the ways you undertake it?
What are example/possible research questions that could be answered using what is created through your project?
What are the second-level, but still motivating, questions you want to define?
Who and what is represented in doing your project in this way?
Who is missing or excluded if you undertake this project in the way you imagine it now?
Considering the usefulness, shape, and scope of your challenge at the beginning of the project can help you identify steps to be refined in your project design. This work will also help you identify points for decision-making within your project. Additionally, reflecting on your challenge may help you highlight interested parties and determine which forms of communication and engagement can help you solve your overarching problem.
Once you have given shape to the challenge you would like to address or the problem you are trying to solve, you can identify more explicit project goals and objectives.
Working with data
Data plays a central part in any crowdsourcing project. Some projects create new data, for example, Wikipedia1 where contributors can create new encyclopedia articles, or community collection projects such as the ones run by Europeana,2 where contributors are invited to add their pictures, stories, and objects to an online collection. Others may focus on generating alternative versions of existing resources (such as transcribing a handwritten document) or enhancing existing collections, for example by classifying and cataloging material.
When designing a project, it is important to take into account the form and format of the data being used and created by the project. You should also consider early on how the material will be used, preserved, and shared.
As you design your project, you will need to consider the type of data you will be working with, sources of existing data (if used), the data format going into your platform, and the type and quality of the data coming out. The choice of data to work with is tightly linked to the aims and values of your project, and may even be the primary reason for your project. Because data touches every aspect of a project, we have created a dedicated “Working with crowdsourced data” chapter that includes more detailed discussions of the concepts introduced here.
Expand your goals — it is not just about the data
Crowdsourcing might appeal to you for its potential ability to produce data at a scale not otherwise available to you, but it should not be forgotten that it affords many additional opportunities. You will find many examples of these throughout this book to hopefully inspire and guide you as you design your project.
For example, an additional goal beyond the creation of a dataset might involve informal education. Take a moment to identify your additional high-priority goals for the project. Questions that may help you identify them include:
How can you provide an educational opportunity for participants?
How can you engage historically underrepresented communities?
How can you offer mentoring or insight on career paths?
How can you build support for your institution or partner institutions you may be working with?
A common distinction that might be helpful is between goals and objectives. Goals are the highest-level articulations of ideas, and these are typically somewhat abstract. Objectives articulate concrete actions that advance a project toward its goals; they are easier than goals to translate into work plans and to evaluate progress against. However, goals are often what make it into communications, because they are more of a rallying point. Focus on your goals initially.
If you are seeking examples of goals and objectives, a good place to start might be your organization’s current strategic plan (if one exists). Starting there might give you a jump on thinking about how to pitch your project to your decision-makers. Watch for anchor-points in your organization’s strategic plan — ideas that you can echo or complement.
What are your superpowers? What is your kryptonite?
SWOT (sometimes called TOWS) analysis can further help you identify goals by guiding you through a systematic consideration of your project’s expected Strengths, Weaknesses, Opportunities, and Threats. Strengths and weaknesses are features internal to your project (or to your organization, given that your project is still in the design phase); opportunities and threats are external to your project or organization. Strengths and opportunities are potentially helpful to your project; weaknesses and threats are potentially harmful to your project. At this stage, strengths and opportunities might most directly inform the description of goals, but weaknesses and threats will help you identify risks to manage and assign significance to those risks in later steps.
Some common strengths encountered in cultural heritage crowdsourcing contexts include:
Subject matter expertise, content specialists, deep knowledge of curatorial and archival practices
Interested and engaged audiences, especially educators, lifelong learners, students, and local communities
Opportunities to engage other potential contributors from otherwise non-participant/underserved communities
Similarly, existing projects and collaborators have shared these common weaknesses:
Limited financial resources and staff time to sustain new projects
Limited software development support for iterative design and maintenance of projects
Projects considered “not core to the mission”
Here is an example of how this analysis can be undertaken for a GLAM transcription project, starting with defining goals and challenges and moving to the SWOT analysis.
Challenge — our digitized archival collections are being made available online, but can only be searched using the descriptive metadata, which is, in some cases, minimal
Goal — Invite the public to help create data in the form of text that can help improve access and search of digitized collections in ways that help participants learn about the contents and collections
Objective — Develop thematic focus and supporting materials that help existing contributors and external partners (educators) work together to complete “projects” (in support of the main project) within a specific period
Objective — Acknowledge participation and share progress through monthly project summary and blog post (or webinar or printed material or social media)
One of the most popular and effective ways to support this process is the SMART system. SMART is an acronym for objectives that are Specific, Measurable, Achievable, Realistic, and Time-bound. SMART goals help clarify your purpose and make it easier to use your time efficiently and productively. In the following sections, we show how to apply the SMART approach to translate a crowdsourcing project design into practice.
Project designs need to be converted into specific units of work. No matter how comprehensive, a project design will need to provide a series of specific steps that will allow people on your team to report progress, celebrate successes, and proceed onwards. Often, a project design might have been written sometime before project launch, or by people who have since left the team. No matter the needs, when the rubber meets the road, it can help to have markers to track how things are going, if changes need to be made, or where unexpected success might deserve more resources.
Crowdsourcing requires lots of flexibility, so it is imperative to have a plan in place. Often, a crowdsourcing project’s launch will lead to an initial rush of participation, followed by irregular bursts of activity. These patterns might be predictable, or they might not. Once a project launches, earlier aspirational goals might need to evolve or be recalibrated. A project management plan will help ground all of those adjustments over the life of a project.
Create specific components for managing a project by spelling out the Five Ws: What, Why, Who, Where, Which. This first stage will refine the project design. Outlining these Five Ws upfront will give you an initial overview, helping you to drill further into the specifics as you progress through the remaining SMART goals.
What — what data transformations and communities will this project help create?
Why — why does this crowdsourcing project matter?
Who — who will be invested in this project’s materials, goals, and community?
Where — where are this project’s collections and activities going to take place?
Which — which people, resources, funding, and other needs will be vital to the success of the project?
You may also want to consider these additional questions:
Does the project meet the stated needs, such as improved access, advancing research questions, etc.?
How do we determine an appropriate scope for the project?
Given the often-organic timelines of crowdsourcing projects, how do we determine what kinds of resources are needed?
What will happen if our project gets media coverage and attracts a surge of new participants?
How will we plan to support new participants’ inquiries and other needs?
What other risks might arise specific to transcribing, tagging, translation, or other kinds of crowdsourcing projects?
Once you have established the Five Ws, the next step is to refine them into measurable goals against which you can track your progress. Measurable goals help you to focus on the basic building blocks of your project (how much, or how many), concentrating on the resources needed and defining when the goals will have been achieved. Projects will be composed of both small and large goals, and it is helpful to break these into much smaller chunks or tasks that all contribute to larger milestones. Tasks can then be assigned to the appropriate members of your team (see below), which helps the team stay motivated as tasks are completed and everyone progresses towards the goal. There are numerous project management web applications, such as Basecamp or Trello, that can help you and your team keep this on track and check that you have the resources needed for success.
This is where you need to engage honestly with the facts on the ground. In particular, do you have the necessary budget to achieve your aspirations? In a best-case scenario, you may be able to achieve considerably more than you previously expected by identifying other resources that your organization has to hand. In addition to internal resourcing, consider external factors: both financial and non-financial contributions to your project. Crowdsourcing projects draw on the non-financial contributions of participants. Are there enough potential participants to achieve your goal? Do you have any stretch goals in case your crowd is far larger than expected? If your project is funded by a grant, the payment schedule may be linked to the completion of specific milestones, and payment may arrive some time after the costs for completing those tasks have been incurred. Linking your tasks and milestones to your payment schedule, and articulating this through a project program or a scheduling overview such as a Gantt chart,3 is an essential way to ensure that the project is successful within these financial constraints.
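One rough way to approach the question of whether there are enough potential participants is a back-of-the-envelope capacity estimate. The sketch below is only illustrative: the function name, its parameters, and all of the figures are hypothetical, and the `passes_per_item` parameter assumes a design in which each item may be worked on by more than one participant. Real participation is rarely steady, so treat the result as a starting point for discussion rather than a forecast.

```python
import math


def weeks_to_complete(total_items, active_participants,
                      items_per_participant_per_week, passes_per_item=1):
    """Estimate the number of weeks a crowd would need to finish a collection.

    Assumes steady participation, which real projects rarely see.
    """
    if active_participants <= 0 or items_per_participant_per_week <= 0:
        raise ValueError("participation figures must be positive")
    # Total units of work, counting repeated passes over the same item.
    total_tasks = total_items * passes_per_item
    # Units of work the whole crowd completes in one week.
    weekly_capacity = active_participants * items_per_participant_per_week
    return math.ceil(total_tasks / weekly_capacity)


# Hypothetical figures: 12,000 pages, 40 regular participants,
# 5 pages each per week, every page worked on 3 times.
estimate = weeks_to_complete(12_000, 40, 5, passes_per_item=3)
print(f"Estimated duration: {estimate} weeks")
```

Running the same calculation with optimistic and pessimistic participation figures can help frame stretch goals and show whether the timeline fits a grant's milestone schedule.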
Setting realistic goals for yourself and your team is very important. If your goals are too ambitious, you risk failure and damaging team and participant morale. If your goals are too modest, neither your team nor participants will feel challenged. Staging the benchmarks you hope to achieve is a good way to establish incremental goals.
The selection of source material can be a challenge, especially if you run an ongoing project of many collections rather than a discrete project focused on a single collection. The appropriate source material will depend on decisions made in the project design and platform selection. It can be tempting to shoe-horn additional content into a successful project, but content should be selected that is a fit for both platform and participants. For instance, structured data from a form should not be transcribed in an open text field (which is more appropriate for letters and other manuscript materials): the output will be less useful and the participants will be more frustrated and likely to quit. Participants who are very interested in a certain subject may not easily switch to engaging with drastically different content. Avoid alienating participants with content that belongs to a separate project. For additional information on setting realistic goals related to the capabilities of the current technology, see the “Choosing tasks and workflows” and “Aligning tasks, platforms and goals” chapters.
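To illustrate why form data transcribed into an open text field tends to be less useful, the sketch below contrasts the two kinds of output. The record type, field names, and values are all hypothetical and are not taken from any particular platform; the point is only that named fields can be queried directly, while a single free-text string has to be parsed or read by a person first.

```python
from dataclasses import dataclass


@dataclass
class FormRecord:
    """One row of a (hypothetical) printed form, captured field by field."""
    surname: str
    year: int
    occupation: str


# The same entry captured in a single open text field: every value is
# present, but nothing marks which part is the name, year, or occupation.
free_text = "Smith 1911 weaver"

# Captured as structured fields, each value can be used directly.
records = [
    FormRecord(surname="Smith", year=1911, occupation="weaver"),
    FormRecord(surname="Jones", year=1921, occupation="miner"),
]

# A simple query that needs no parsing or guesswork.
before_1915 = [r.surname for r in records if r.year < 1915]
print(before_1915)
```

Letters and other manuscript materials have no such fixed fields, which is why an open text field suits them better.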
Finally, SMART goals have clear start and end dates. Project managers spend much of their time coordinating everyone else’s time, whether moderating team meetings or adjusting workflows to meet expectations from participants about when new collections will be available for their work. The dates for milestones and achievements set during planning will shift and the project manager will adjust schedules and expectations to keep the project on track.
While many of your outcomes may surprise you, it helps to develop your project with certain outcomes in mind. You might want to see the following outcomes:
Identifying a success story (or stories) to ground your project for use in communicating with institutional leadership
Opportunity to demonstrate values
Facilitating connections between external partners and participants
Learning ways to improve, and iterating as you go
Raising interest in your next project
As you have set out your challenge and then refined your goals and objectives, you may have a clear sense of what you would like to do. However, you might not have clearly defined how you would like to do it. We challenge you to include time to articulate values within your project design. Here are just a few examples of how values can inform your project and task design:
Respecting contributor privacy by integrating practices that responsibly steward participant data and system components that obscure precise identification
Equity (exchange of knowledge, access to enhanced information, acknowledgment)
Commitment to advancing participants by keeping them informed about their continuing opportunities for engagement with the project
Please see the “Identifying, aligning, and enacting values in your project” chapter for more information.
You can enhance your project design by determining early on the key results that will indicate success and the ways you will measure your progress toward those results. Your project design will likely include points at which you will assess your progress. We encourage you to intentionally undertake iterative practices; in other words, include regular points for assessment and evaluation and include time to adjust your project’s design or integrate feedback to improve upon it. As you review your goals, objectives, and values, you can consider what types of key results (evidence, information, or data) will show that you have achieved them.
Wade into evaluation
An invaluable part of planning your crowdsourcing project is evaluation, and perhaps counterintuitively, it is something you want to consider right from the start of the project. It is crucial to allow yourself time at various project stages to 1) reflect on the work you and your team have done; 2) check that your work still aligns with your project values and success metrics; 3) solicit feedback from people outside your direct team; and 4) iterate based on your reflections and feedback received. This process will help you to achieve your goals while ensuring that success in one aspect of your project (e.g., internal goals or research aims) does not happen at the expense of other stakeholders in the project (e.g., participants, volunteers).
At the project design stage, it is important to think about how you will capture data and metrics, how often you will capture them, where you will store them, and which methods you will use to evaluate the data collected. Once you have evaluated the data, you should allow the realizations, feedback, and criticisms to refine your goals and your project.
Crowdsourcing projects thrive by establishing open lines of communication. We encourage you to create methods of eliciting feedback from your range of participants and stakeholders. Furthermore, we suggest that you prepare to embrace and reflect upon criticism. Early critics are often your biggest fans if you take their input seriously and improve your project based on their feedback.
You can find a lot more information about evaluation, and useful suggestions for how to incorporate it into your project, in the “Evaluating your crowdsourcing project” chapter.
Recognize, assess, and combine resources
The goal of this section is to help your team make informed decisions about strategic planning based on an assessment of your resources. The resources you need to run a crowdsourcing project depend on your goals, but you can also design your project to utilize and draw on resources that you have.
The key resources for any crowdsourcing project — indeed for almost any project — can be summarized as:
Each of these will be discussed below within the contexts of crowdsourcing projects.
People and skills
The key to any crowdsourcing project is the people involved. These include the people who facilitate the project and the people who participate in it. In some cases, these may be the same people holding more than one role. In addition to the people who are directly involved in the day-to-day project activity, others that can be important for the project are external partners, funders, data owners/creators, platform designers, and many more. In this sense, crowdsourcing projects are very similar to other projects. It is important to recognize early on who your stakeholders are. These can include participants and contributors, organizers and collaborators, colleagues, data owners, funders, external partners, and others.
Once you have identified your stakeholders, you can use a variety of techniques to understand their needs, including preliminary participant research and requirements gathering. For example, some projects will interview prospective participants from a target community early in the project design process to get their initial ideas, and then continue to involve this group as beta-testers as the project team starts prototyping. Some use exercises such as Friend or Foe4 to articulate stakeholder dynamics.
In parallel to considering project stakeholders, it can be useful to consider the skill sets that you want to involve in your project, whether that is for planning and management purposes or specific project tasks. The skill sets your project will require are ultimately dependent on what kind of project you run. Your project goals, the data you process and create, the technology you use, and the resources you have will all influence your project design. Creating a skills inventory can be useful when drawing up your project plans. The inventory should list the skills you need for your project and allow you to compare that to what is available to you. Identifying gaps in the skills inventory can help you determine whether you need to budget for hiring additional team members and/or identify partnerships with others that possess those missing skills. It may also show you skills that you could draw on, but that you had not thought to include, which can allow you to consider how these could benefit your project.
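A skills inventory can be as simple as two lists compared against each other. The sketch below is a minimal illustration, assuming the inventory is kept as a set of needed skills and a mapping from available skills to the people who hold them; the skill names, people, and function names are hypothetical placeholders.

```python
def skills_gaps(needed, available):
    """Skills the project needs that nobody on the team currently offers."""
    return sorted(set(needed) - set(available))


def unplanned_skills(needed, available):
    """Skills the team could draw on that the plan did not ask for."""
    return sorted(set(available) - set(needed))


# Hypothetical inventory for a small transcription project.
needed = {
    "content expertise",
    "project management",
    "community coordination",
    "data management",
}
available = {
    "content expertise": ["archivist"],
    "project management": ["project lead"],
    "design": ["volunteer designer"],
}

gaps = skills_gaps(needed, available)
extras = unplanned_skills(needed, available)
print("Gaps to hire or partner for:", gaps)
print("Skills not yet in the plan:", extras)
```

The first list points to roles you may need to budget for or find through partnerships; the second highlights skills you could draw on but had not thought to include.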
Focusing specifically on crowdsourcing projects, the section below outlines some of the most common skills that we have found to be useful for creating and maintaining healthy projects. The skills are not necessarily confined to one specific individual, nor do they necessarily merit a full-time role; i.e., a single member of your team may possess multiple skill sets and fill different roles, or a role can be shared between two or more people.
Content specialists bring knowledge of the form and content of the data you are working with, for example, a historian who specializes in a specific period, a linguist with certain language skills, or someone who knows about working with specific data formats. This can be a single individual, or it could require multiple perspectives, depending on the range and breadth of the data. Content specialists can help to set project goals and determine appropriate task and workflow design based on their knowledge of your target data. They can also help with data output questions and offer guidance around quality control and usability of results. Content specialists may also communicate with contributors at all stages of the project, including responding to content-related questions during the active data collection/processing phase (e.g., engaging on project forums or answering questions via email or social media), and working with communications specialists to create public-facing content.
Case study: the role of content specialists on the Great War Archive
In the Great War Archive project, based at the University of Oxford,5 members of the public added their objects and stories from the First World War to an online archive. The contributions were reviewed by subject specialists who could enhance the contributions by providing additional information or details, for example, dates of a specific event, or explanation of a term mentioned in a document from the time. Content specialists would also identify material to feature in blog posts and communications, such as particularly noteworthy objects or material that fit with a particular theme, for example, Christmas-related stories in December or love tokens around Valentine’s Day.
Crowdsourcing projects do not differ significantly from other projects in that they require good project management. Project managers need to be aware of features that are specific to crowdsourcing projects, such as the involvement and importance of the crowd for the successful completion of the project. We discuss this in detail in the “Managing cultural heritage crowdsourcing projects” chapter.
Each community coordinator may have their own approach and skill set. This role will become the public face of the project, so strong written and verbal communication skills are essential. Less quantifiable interpersonal skills such as patience, empathy, and active listening are what enable the community coordinator to forge authentic relationships with participants. Deescalating tension is another useful skill, since participants may experience frustration with the crowdsourcing platform, the source material, each other, or another aspect of the project. Genuine passion about the subject, task, or goal of the project can inspire participants. Specialized knowledge is not required, and general knowledge is often sufficient when paired with a willingness to learn and comfort with not having all the answers. See the “Connecting with communities” chapter for additional information and framing of this role.
A key component of any crowdsourcing project is communication. If you cannot get information about your project to the relevant parties, people will not know about your project and you will not get any participants. At any stage of your project, communication skills are essential for communicating elements of the project to those outside your core project team (these skills are essential for internal communication as well, but the focus here will be on public communication skills). In addition to helping to recruit participants, communication specialists can help with your community outreach, identifying and organizing stakeholders, reviewing your project content for clarity and readability, and creating content updates like blogs and social media posts, newsletters, and calls to action. Individuals with communications skills can help your team to organize and communicate specialized events (e.g., transcribeathons, workshops, or other community-oriented events) and communicate project results.
Case study: the role of communications specialist on Lockdown 2020
Lockdown 2020 is a small project that was set up to capture experiences and reflections at the University of Oxford during the 2020-21 COVID-19 pandemic.6 The project was done on a “best-effort” basis, using existing resources (platform and staff) only. When the team received a small grant, this was used to engage a project member to focus specifically on raising awareness of the project and attracting contributions, in particular from students who until then had been underrepresented in the collection. She promoted the project through existing mailing lists and newsletters, which drew it to the attention of certain groups (mainly staff), and posted news about the growing collection, with featured stories and pictures, on various social media platforms. The greatest effort was spent on identifying potential contributor groups, mainly but not exclusively on social media, and then engaging directly with them. This led to a rise in the number of contributions overall, and a considerable increase in the proportion of student contributions. At the end of the six weeks during which the part-time communications expert had been involved, the number of contributions had more than doubled and included material where students, as well as staff, described their lockdown experience.7
Data organization and management skills are useful at each step of your project’s lifecycle. This can include guiding and/or carrying out any required pre- and post-processing steps to get data into your platform for contributors to work on, or to get it out once it has been processed, possibly converted to a format that can be accessed, used for research, or analyzed in some way. People with these skills can also assist in task and workflow design, and help review early project output for usability.
Data analysis skills can be useful when processing data that has been created by your project. The type of data you work with and the needs of your project will decide what data analysis skills will be required and when they are needed. It can be good to involve data analysis specialists at an early stage in planning your project, to advise on the choice of platform and design of tasks so that you can be sure to get the relevant information out of your crowdsourced data.
Design skills are useful at many different points in the project lifecycle. Team members with user experience design skills can help with your task and workflow design, or offer other input on participant-facing infrastructure. This is important also for making your project accessible to a wide range of participants, for example, by making sure the platform interface is easy to navigate. If your project requires bespoke development, design skills are necessary as part of the development process. Additionally, design skills can be valuable when communicating results, including creating slide decks for talks, data visualizations, or other methods of presenting results to the public.
Web development skills are most frequently needed for projects that require the creation of bespoke infrastructure, but they can also be useful for creating attractive and accessible communication channels, such as project websites, blogs, and other web-based resources.
Crowdsourcing projects can include many technologies, from project management software to data management tools to website infrastructure. The target data you use for your project will have its own technology (a database or collections management system), and additional technology may be required for pre-processing your data before it is ready to be uploaded to your project and presented to participants. You will also need to choose what tool (or tools) you will use to build and host your project. This can be a difficult decision to make, as it will depend on many factors, including your skills inventory, budget, and timeline. In this section, we outline some of the main considerations. The “Aligning tasks, platforms and goals” chapter has more information about evaluating technology to determine what is appropriate for your project and your team. You may also want to look at the “Working with crowdsourced data” chapter in this context.
Existing tools vs. bespoke tools
As part of your strategic planning, you will need to choose whether to use an existing tool (or tools) for your project or to create your own bespoke infrastructure. A third option is to adapt, repurpose, or customize an existing tool.
Existing tools are a great option if you are new to crowdsourcing, or if you do not have the time or resources to create a bespoke solution. They are less expensive (and often free to use), allow you to build on the experience of others, and can be fairly quick to set up. Participant communities and guided tutorials are often a great way to explore these resources before committing to a single one (see the section below on usability assessment for more information).
The trade-off of using existing tools is that you may need to adapt your objectives based on the features that the tools offer, as the options for customizable features may vary. Additionally, you will need to consider what other tools you plan to use, as interoperability may be an issue as well.
Case study: explore crowdsourcing with the Zooniverse Project Builder
Zooniverse is a crowdsourcing platform that hosts a large number of projects and includes an existing community of more than 2.2 million participants. Through our open-source, browser-based Project Builder,8 anyone can design and run their own crowdsourcing project for free. Tools such as the Project Builder can be a useful way for you and your team to quickly set up a test project to explore workflows and task design options in a low-risk environment.
If your project requires specific features that are not available in an existing tool, your team may consider creating (or hiring someone to create) bespoke tools. The benefit of bespoke tool creation is that you can get exactly what you want for your project. However, this process is often prohibitively expensive and difficult to sustain. It is important to weigh your options and consider just how necessary your desired custom features may be.
For scholarly projects, a common approach is to apply for external grant funding to support collaboration with technology developers, e.g., a company that specializes in building cultural heritage software, or academic computer scientists who may be interested in creating or modifying software for their own complementary research interests.
Case study: building custom projects with the Zooniverse platform
The Zooniverse platform also supports custom project and tool development in cases where the Project Builder toolkit is not sufficient for a project’s needs. In these cases, our team will work with external partners to apply for grant funding to support the custom development effort in conjunction with larger project goals.9 For example, in 2020, we joined The National Archives (UK), the Royal Botanic Garden Edinburgh, and the Royal Museums Greenwich to apply for funding from the Arts and Humanities Research Council (AHRC) to support the creation of a new indexing tool that will allow participants to choose what items they want to work on within a project. This work is being carried out as part of the larger Engaging Crowds: Citizen research and heritage data at scale research effort,10 one of eight foundational projects in the AHRC’s “Towards a National Collection: Opening UK Heritage to the World” program.11
As the range and availability of existing tools grow, the chance of finding the right tool for your project also increases. However, an existing tool is often almost right, lacking only a particular feature. When that is the case, you might consider customizing an existing tool to meet your requirements. For some tools, customization by users is the main way they are developed and supported. See, for example, the open-source tool Omeka,12 where users and developers help to enhance the tool and support the user community.
Case study: tool customization at the Newberry Library
Newberry Transcribe is an ongoing project to transcribe diaries, letters, and journals.13 We are attempting to serve two user groups: transcribers, who need to find (and transcribe) new documents, and researchers, who are interested in searching and referencing documents that have already been transcribed. No tool supported both of these approaches, so I customized our tech stack. I use a custom front end (written in Node.js and React) over Omeka and its Scripto14 plugin, which itself uses MediaWiki15 to store its transcription data. I pull from the Omeka API to filter and sort the items and use the MediaWiki API to power the search of the transcribed text.
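The two kinds of API request described in this case study can be sketched roughly as follows. This is an illustration in Python, not the Node.js and React code the Newberry uses. The base URLs and function names are hypothetical, and the sketch assumes a paginated Omeka `/api/items` endpoint and the standard MediaWiki search query (`action=query` with `list=search`). It only builds request URLs and does not contact any server.

```python
from urllib.parse import urlencode

# Hypothetical base URLs, for illustration only.
OMEKA_API = "https://omeka.example.org/api"
MEDIAWIKI_API = "https://wiki.example.org/w/api.php"


def omeka_items_url(page=1, per_page=50):
    """Build a URL requesting one page of items from an Omeka API."""
    query = urlencode({"page": page, "per_page": per_page})
    return f"{OMEKA_API}/items?{query}"


def mediawiki_search_url(search_text, limit=10):
    """Build a MediaWiki full-text search URL for transcribed pages."""
    query = urlencode({
        "action": "query",
        "list": "search",
        "srsearch": search_text,
        "srlimit": limit,
        "format": "json",
    })
    return f"{MEDIAWIKI_API}?{query}"
```

A front end along these lines could fetch item metadata from the first URL to drive browsing for transcribers, and send researchers' search terms to the second.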
Piloting and usability assessment
Once your team has made a technology choice, it can be useful to pilot test a scaled-down version of your project before committing to a full-scale launch. This will allow your team to evaluate the technology you chose and gather feedback from a limited group of participants. It can also help you to refine your task and workflow design, as well as ensure your project will produce usable results and maximize impact and engagement. Piloting a project can also provide a proof of concept to stakeholders, in particular as a way to justify the need for increased resources and to demonstrate the value of your project in a low-risk setting.
Consider designing your pilot so that the output is useful even if the full-scale project cannot take place (for example, if a funding application is unsuccessful). If you want your project to transcribe ten years of records, you may select a sample covering one month, one area, or one source so that the output will constitute a small sub-collection that can be useful as a stand-alone resource. It will help if your pilot data is representative of the range of material you will be working with, to allow you to test your approach on a realistic sample.
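If your records are already catalogued with dates, selecting a one-month pilot slice can be a very small scripting task. The sketch below assumes each record is a dictionary with a `date` field; the record structure, identifiers, and function name are hypothetical.

```python
from datetime import date


def select_pilot_records(records, year, month):
    """Return the records dated within one calendar month."""
    return [
        record for record in records
        if record["date"].year == year and record["date"].month == month
    ]


# Hypothetical catalogue entries, for illustration only.
records = [
    {"id": "log-001", "date": date(1850, 1, 14)},
    {"id": "log-002", "date": date(1850, 2, 3)},
    {"id": "log-003", "date": date(1850, 2, 27)},
    {"id": "log-004", "date": date(1851, 2, 9)},
]

pilot = select_pilot_records(records, 1850, 2)
print([record["id"] for record in pilot])  # ['log-002', 'log-003']
```

The same pattern works for selecting by area or by source: filter on whichever field defines a coherent sub-collection.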
When designing a tool or implementing a workflow, it is important to consider usability. Usability testing looks at how easily participants can perform a particular task.16 The test considers not only how well the tool works but also how well the task was designed. Does the order of actions make sense? Is the workflow easy to follow? Is the task matched to the data and expected outcome? Usability assessment can also feed into the design of support materials and training, and into further development of the platform or the task.
Case study: beta testing to support iterative design at the Boston Public Library
Anti-Slavery Manuscripts at the Boston Public Library was a custom-built Zooniverse project that invited participants to help transcribe more than 12,000 letters from the Boston Public Library’s Anti-Slavery Collection.17 Before launching the project to the public, we beta tested a new text transcription tool that asked participants to click between each word in a line of text. The idea behind this approach was that the positional data created through these clicks could be used to more accurately tokenize strings of text submitted by multiple people, or match up strings of text accurately by word, which would make the process of text aggregation easier for researchers working with data output from the project. However, the Zooniverse participants taking part in the beta test disliked how the new feature made an already time-consuming task take even longer to complete. Based on this feedback, we redesigned the tool to only require participants to mark the start and end points of a line. As it happened, the aggregation methods developed for use with the original “mark each word” method proved to be just as effective with the simplified tool, which meant that the task could be completed more easily, and in a shorter amount of time, without negatively impacting the usefulness of the project output.18
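To give a sense of what matching strings of text by word involves, here is a deliberately simplified sketch of word-by-word aggregation. It is not the aggregation method used by the project: it assumes whitespace tokenization, keeps only the transcriptions that share the most common word count, and takes a majority vote at each word position. The function name is hypothetical.

```python
from collections import Counter


def aggregate_line(transcriptions):
    """Combine several transcriptions of one line into a consensus string."""
    tokenized = [text.split() for text in transcriptions if text.strip()]
    if not tokenized:
        return ""
    # Treat the most common word count as the expected length of the line.
    expected_len = Counter(len(tokens) for tokens in tokenized).most_common(1)[0][0]
    # Only transcriptions of that length can be compared position by position.
    aligned = [tokens for tokens in tokenized if len(tokens) == expected_len]
    consensus = []
    for position in range(expected_len):
        votes = Counter(tokens[position] for tokens in aligned)
        consensus.append(votes.most_common(1)[0][0])
    return " ".join(consensus)
```

A production approach would need to handle transcriptions that split or merge words differently, which is the problem the positional click data was intended to help with.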
As in all projects, the budget required to run your crowdsourcing project will depend on many factors such as scale, timeline, technology choices, and team size. The important thing to remember is that just because your project may involve public participants who are donating their time (i.e., volunteers), it does not mean there is no cost to your organization. You may not be paying for the effort of the participants as such, but crowdsourcing projects still require resourcing. You need to consider how much “care and feeding” will go into sustaining a healthy crowdsourcing project for the entirety of its lifecycle and budget accordingly; otherwise, you run the risk of depleting your resources mid-project.
If grant-writing is part of your strategic planning process, a financial assessment can help to determine gaps in resources and justify costs to potential funders.
Crowdsourcing projects do not exist in a vacuum. Like other projects, they have a range of dependencies and have to fit in with institutional conditions, cultural contexts, and the range of stakeholders involved. Recognizing the contexts for your project is an important part of the (pre-)planning stage for any project. Where crowdsourcing projects may differ is in their relation to the crowd: a potentially very large number of people from different backgrounds, with varying skill sets and motivations. The project cannot exist in isolation but must relate to and interact with its crowd. This is discussed in more detail in the “Supporting participants” chapter. You may also find that the “Connecting with communities” chapter can help when designing your crowdsourcing project.
Common risks and mitigations
Risks might be thought of as events with the potential for significant impact on your ability to accomplish your goals and objectives. These events could arise externally (these could have been recognized as threats in a SWOT analysis, described in detail earlier in this chapter) or internally (these might be missteps taken by you or other members of your team). We have identified the most frequent risks and potential means of reducing the likelihood of those risks being realized based on our collective experience (see the illustration below). Every project is likely to have its own constellation of relevant risks determined by its goals, stakeholders, organizational context, and other factors. You might find it useful to create a similar table for your project at the project’s initiation, with regular re-evaluation during your project’s lifetime.
Toolkit: frequently encountered risks in crowdsourcing for this domain and potential steps to mitigate each
Risk: Inadequate resources assigned to the project.
Mitigation: Careful project design and, later, stewardship of resources.
Risk: Turnover of a key team member.
Mitigation: Creation and implementation of an information management plan that permits a replacement hire to be adequately oriented in a reasonable timeframe.
Mitigation: Adequate accommodation of a replacement event in long-term planning.
Risk: Inopportune downtimes for the project web presence.
Mitigation: Consider the past performance of any prospective technology partners, including any Service Level Agreements.
Risk: Data loss due to server failure or similar event.
Mitigation: Creation and implementation of a data management plan that involves regular data backups.
Mitigation: Consideration of Service Level Agreements when selecting service providers.
Risk: Loss of support from a sponsoring or partnering organization.
Mitigation: Adequate engagement of decision-makers in the planning and execution of the project, including the agreement on what success looks like and an adequate reporting schedule.
Mitigation: Adequate engagement of decision-makers in changes to project scope.
Risk: Prospective contributors choose not to participate.
Mitigation: Adequate engagement of contributors in the planning and execution of the project, potentially involving formal or informal roles.
Mitigation: A clear and engaging articulation of the higher value of the project.
Risk: Contributors become unhappy with the use (or lack thereof) of the data created.
Mitigation: A realistic articulation of likely project outcomes.
Risk: Eventual loss or corruption of content created during the project.
Mitigation: Creation and implementation of information and data management plans that adequately address archiving.
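As one small illustration of the regular data backups named among the mitigations above, the sketch below copies an exported data file into a backup folder and checks that the copy matches the original by comparing SHA-256 checksums. The function names and folder layout are hypothetical, and a real data management plan would also cover scheduling, off-site copies, and retention.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_file(source, backup_dir):
    """Copy source into backup_dir and confirm the copy matches the original."""
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / Path(source).name
    shutil.copy2(source, target)
    if sha256_of(source) != sha256_of(target):
        raise IOError(f"Backup of {source} failed verification")
    return target
```

Recording the checksum alongside each backup also gives you a way to detect later corruption of archived content.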
Sustaining long-term activities
If you intend for your project to endure in the longer term, two topics could be valuable to consider in the project design phase: what might your revenue sources eventually look like, and what organizational framework could produce resilience in a changing operating environment? For example, it might be prudent to write a Memorandum of Understanding (MOU) with partners to accommodate eventual fee-for-service activities. Decision-makers in an organization might like to see the project take advantage of incubator support services if the project could reasonably be a first step in the spin-off of a non-profit organization. Long-term plans for your project might also enter into team member recruitment decisions.
Whatever you can spend on sustainability planning at the outset, we encourage those with a project likely to extend longer-term to schedule a sustainability planning exercise a year or so into the project, once it is beyond a pilot phase and realities of the operating environment have been more fully appreciated. The more time the project has to build a foundation for funding transitions, the better.
Sharing and archiving
An important aspect of a crowdsourcing project is the data that the project produces, whether that is new resources shared by participants, such as First World War memorabilia kept by members of the public; versions of existing material, such as transcriptions of handwritten ship’s diaries; or enhancements of digital records, such as tags added to pictures of paintings. The intended uses of the project output will influence the project design (“What kind of data are you planning to produce?”). It is important to consider not only the form and format of the output but also how it will be preserved and shared, and this is something that you need to include in your project plan. The “Working with crowdsourced data” chapter contains more information about data management planning for crowdsourcing projects.
Communicating and advocating
Your project design process will likely include one or more occasions where you will be required to advocate on behalf of your project. The “Connecting with communities” chapter covers this in more detail, but here we will briefly discuss some common circumstances that may require advocacy and communication as part of your strategic plan.
Sell the idea to decision-makers
If your (or a partner’s) institution requires approval for new projects, you may find yourself in a situation where you have to “sell” your project to the person or group of people who hold the power of approval. This could be an Executive Team, Board, Council, or other institutional leadership body; it could be a grant review panel or other resource allocation committee, or a group of peers with whom you want to collaborate.
Ultimately, in this phase it is up to you to demonstrate to the decision-makers that 1) your project has value; 2) you have identified the skills necessary to achieve your metric for success (and, if there are skills gaps, you have a plan for recruitment/training); and 3) you have considered the risks involved and taken any necessary risk mitigation steps.
It may help, in this step, to point to outcomes from projects of similar size and scope as a way of framing your work alongside known success stories and justifying any resource requests.
Case study: pilot testing and scaling up in the Davy Notebooks Project
For the Davy Notebooks Project, which invites volunteers to help transcribe the manuscript notebooks of Sir Humphry Davy (1778-1829), project leads Professor Sharon Ruston and Dr. Andrew Lacey used pilot testing to “sell” the idea to a funding agency. Through this process, they successfully received two rounds of funding from the UK AHRC: first, to support a pilot project, and then to support a scaled-up version. The pilot project was created and launched on the Zooniverse Project Builder in 2019.19 The pilot aimed to transcribe just five of Davy’s notebooks; participants met that goal and went on to transcribe three additional notebooks, all within a matter of months.20 The team then used the results of the pilot to bid for another round of AHRC funding to support a larger project that will aim to transcribe the remaining set of Davy’s notebooks (~70). This funding is earmarked in part for the Zooniverse development team to create custom features identified as necessary during pilot testing, including additional options for text markup and the ability to transcribe pages of a document in sequence. The expanded project will launch in the spring of 2021.
If you are including prototyping work in your project, this step may come before or after the prototyping process (or both, in the case of applying for follow-on funding). In either case, it will help to point out how this early testing aligns with your project values and will help to ensure that the following phase(s) of your project will benefit from the lessons learned during prototyping.
It can also help to consider early on in the process how you plan to recruit participants for your project. This is another form of advocacy, as you are communicating to potential participants why they might want to take part, why you need their help, and what benefit they will receive from participating. See the “Supporting participants” chapter for more information.
This chapter closes where all good crowdsourcing project designs should begin — with a laser focus on your participants, and clear consideration of how each step in the project moves towards deepening your connection with your community. This is particularly important in the final stages of a project, where the simple act of acknowledging, thanking, and recognizing the people who contribute to your project can help establish trust and connection. Project designs should clearly articulate how this will happen at the end of the project, with common strategies to express gratitude including:
Thanking contributors and participants directly via email, publication, forum discussions, and social media
Listing collaborators and external partners on the website(s), in printed materials, in grant applications, and reports
Acknowledging participants in project publications including datasets and journal articles, and, when appropriate, inviting them to collaborate as co-authors
We began this chapter by saying this was not going to be an exhaustive To-Do list of all the required steps in an ideal Project Design, particularly as this level of detail might be inappropriate for a modestly scoped project with a short timeline. However, acknowledging your participants is the exception — if you do one thing only, always do this!
At the core of the project planning process are the motivations you have for running your project. These will permeate what you do and the decisions you make. The way you design your project will not only have an impact on the project while it is running but also affect its outputs and legacy. A successful crowdsourcing project can have an impact that reaches beyond the project as such.
In this chapter we have provided an overview of approaches and considerations that we have found useful, or even critical, when designing crowdsourcing projects with cultural heritage collections and organizations. Many aspects of planning and designing a crowdsourcing project are similar to what you would find in other projects. You need to define your goals and objectives, identify and mitigate risks, and acquire and manage resources. You have to plan and make decisions about the choice of technology, data, communication, testing, evaluation, and much more. Here we have focused on showing how this can be done in the context of cultural heritage crowdsourcing. We have included suggestions to help you think about the resources you may need and how you can approach the planning and development of your project to make it fit within the context of your organization and support and promote your values. The topics and themes raised here are discussed more extensively in the following chapters. It will be helpful to dig deeper into technology choices in “Aligning tasks, platforms and goals,” designing contributor tasks in “Choosing tasks and workflows,” and overall project management in “Managing cultural heritage crowdsourcing projects.”
I would highlight that it’s better to anchor the step for more community engagement in institutional policies that are drafted every so often. This relates to the “Why” question in an earlier chapter.
Perhaps something about making sure who owns the intellectual property of contributions?
Perhaps differentiate between platforms (Zooniverse, Wikipedia, etc.) and tools (e.g., stand-alone software to be used for transcribing texts).
I feel another point to be made here is about leveraging the power of an already existing crowd connected to a third-party platform. GLAMwiki projects, for instance, have direct access to massive communities of Wikipedians.
I completely second that idea!
Again ‘data’ but above also ‘source material’ was used (in addition to ‘content’)
what does “several PT colleagues” mean on the illustration?
Good question - I assume it’s ‘part time’! The process of commissioning illustrations was very decentralised so the jargon might have been missed.
just to add to my previous comment – aren’t we usually considering data as the crowdsourced (user generated) input created around the content our project is about?
Couple of paragraphs above the term content is used (“Some projects may have a research question; others may be focused around some kind of content, with the hope to improve its usability or reach in some way. Depending on the scope of the content, you may need to design and implement several projects to arrive at your goal.”) and now it seems data is used in the same sense.
Both are valid options but still maybe it’s better to be somewhat more consistent?
A successful project can’t help but have an impact that reaches beyond what its organisers have anticipated. That is what makes a crowdsourcing project truly successful.
I really love this comment.
100% agree with this statement!
If I could put this in capitals, highlight it in gold and have fireworks go off around this sentence, I would. So so so agree with this.
This is also an important skill to have when reporting back to the funders of the project.
I SO agree with this statement! It is such a pivotal part of a crowdsourcing project in so many ways. This is a fabulous section.
“Access” can be so narrowly defined. As a volunteer I don’t just want to “access” what I’ve helped to create, I want to be able to reuse that content. If I’ve transcribed a journal, I want to be able to not just see that transcription on a website, nor just to be able to download that transcription, I would like to be able to freely reuse that transcription in any manner I see fit.
Who will be able to reuse the project’s collections, data and output? Will volunteers be able to reuse the content they create? How will they be able to reuse it?
To me as a volunteer, crowdsourcing projects often seem to be so narrow in their goals and focus. I recognise that clear key objectives are needed to ensure the project delivers on its primary aims but as a volunteer I so wish organisers would also think big and have a much wider view of the potential impact of their project. Again it comes back to reuse. In the example given I would include in the goal that the text should be openly licensed so that it could be reused by anyone for anything. So the resulting online descriptive metadata provided by volunteers could be put to reuses that the project organisers may not ever have considered. Examples that immediately spring to mind: reuse of descriptive metadata in an artwork or piece of clothing, e.g. Andrea Wallace’s metadata skirt https://www.create.ac.uk/blog/2017/03/13/create-postgrad-rijks-award/ or the reuse of the openly licensed text in other platforms such as Wikipedia articles. If organisers are too narrow in their focus they can lose sight of the possible wider impact of their project and as a result organise their project in such a way as to hinder that wider impact.
Also when we’re talking about text it’s not only about access and search but completely new ways of analysis (and thus new knowledge creation) open up when text becomes also machine readable and usable for natural language processing, statistical analysis etc.
I’m also of the opinion that project creators should be guided by the mission statements of their organisation, and that the values stated in the institution’s mission statement should be consistent with the goals of the project. As a volunteer I have come across crowdsourcing projects from institutions that have mission statements to “collect, connect, and co-create knowledge” or “to serve their communities, to enrich lives and inspire discoveries” or “to connect through sharing stories”. Although these may sound very abstract, if project creators take such values to heart it can have a positive practical effect on crowdsourcing projects. For example Auckland Museum uses their mission statement of “connecting through sharing stories of people, lands and seas” to empower their policy of “open by default, closed by exception”. This in turn extends to all the work product of the institution, crowdsourced or otherwise, ensuring it is licensed openly for reuse. And as a volunteer, reuse is what I care about most.
Related to this is “who will be able to reuse what is created through your project?” I say this because I have come across crowdsourcing projects whose mission statements say they want researchers to reuse the results of crowdsourcing but appear to have a very narrow definition of who those “researchers” are. I also believe copyright licensing plays a vital role in either restricting or widening who can reuse what is created through a crowdsourcing project. I’m of the opinion that the dreaded “non commercial use” license is a much bigger dampener, not just on who will volunteer for a project but also on who will reuse what is created through a project, than most project organisers are aware.
thank you, thank you, thank you for this!!!
these questions are incredibly helpful, thank you for spelling them out, particularly as a resource to help communicate challenges and needs to leadership who may not understand the full scale of what is entailed when creating these projects.
and how will it be managed and stored? (as a possible extension on this)
Nor are differentiated skills necessarily confined to the institution. Would it be useful to illustrate how there is often a mix of skills amongst the crowd too?
We can’t change the image but we can update the text so that it’s clear we think skills can exist anywhere. Thanks!
Are the examples in the diagram below the wrong way round? Creating indexed, searchable text sounds pretty concrete to me, whereas creating understanding of audience is much more abstract.
That looks like a question for Meghan, I’ll ask her to check
Thank you for writing with humor and relatability!
Thanks for noting it - it really helps to hear that!
We should add links to portals like SciStarter to help people find other projects to try when thinking about design and motivations