Published on May 04, 2021

Affordances, in the field of human-computer interaction (HCI), are characteristics of a user interface element that help the user perceive what actions they can perform with that element.

Aggregation refers to a stage of the crowdsourcing process where the project team must bring together the many different contributions of participants into a meaningful whole. In some cases, this can refer to displaying multiple annotations for an item, or the process of determining group consensus if a single outcome is desired.
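
As a minimal sketch of the group-consensus case, assuming each participant submits one label per item, a simple majority vote could look like the following. The function name `majority_consensus`, the `min_agreement` threshold, and the sample labels are illustrative assumptions, not the method of any particular platform.

```python
from collections import Counter

def majority_consensus(labels, min_agreement=0.5):
    """Return the most common label if its share of votes exceeds
    min_agreement; otherwise return None (no consensus)."""
    if not labels:
        return None
    label, count = Counter(labels).most_common(1)[0]
    return label if count / len(labels) > min_agreement else None

# Hypothetical contributions from three participants for one item
print(majority_consensus(["cat", "cat", "dog"]))  # cat
# A tie does not pass the threshold, so no consensus is reported
print(majority_consensus(["cat", "dog"]))         # None
```

Items that return no consensus could be shown to more participants or passed to the project team for review.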

Agile is a process of iterative design, originating in the world of software development, that seeks to get a minimum viable product (MVP) up and running as quickly as possible so that real-world feedback can be received and incorporated into the next version.

Appropriation refers to the ways that users adapt or modify a piece of software to fulfill a need that was not anticipated by the software creators. As defined in the field of human-computer interaction (HCI), appropriation describes well-designed software that is flexible enough to allow users to appropriate it for new, unexpected purposes.

Artificial Intelligence (AI) is the recreation of human intelligence by computers or machines. It is sometimes used to refer to machine learning.

Arts and Humanities Research Council (AHRC) is a UK-based funding council, part of UK Research and Innovation, which works in partnership with universities, research organizations, businesses, charities, and government to create the best possible environment for research and innovation to flourish.

Citizen science/history/research refers to research undertaken by non-professional participants. Citizen science should both contribute to knowledge and advancement and increase the public understanding of science. Citizen history or research similarly engages non-specialists, empowering them to build on their interests and contribute knowledge.

Community of Practice is a group of people who come together to fulfill both individual and group goals through their participation in a shared activity relating to a common concern, a set of problems, or an interest in a topic.

Community building is a deliberate strategy to create or extend a geographically or subject area-based community of interest (or a combination of both) to help sustain your crowdsourcing project for its duration.

Community outreach is a way for organizations to reach out to communities and individuals beyond their institutional limits to provide learning or participatory opportunities.

Community representation refers to the important role of representing a community’s viewpoint within a crowdsourcing team, to ensure a range of different concerns and experiences are reflected.

Content specialists are expert storytellers who create engaging, inspiring articles, photos, videos, or mixed media designed to raise awareness among your target audiences and encourage them to participate.

Contributors, in the context of this book, refers to the individuals who make up your project team, rather than participants, who are the “crowd” in your crowdsourcing project.

Cultural competence is the self-reflective ability to be aware of your own potential bias and privilege, in order to better communicate and include people across cultural boundaries.

Data-centered processes prioritize the management and manipulation of data above other considerations, often used in contrast to human-centered design.

Data paper is a journal publication describing a dataset, as opposed to a synthetic publication that might go beyond facts and figures to draw hypotheses, arguments, and conclusions about the research.

Digitally-enabled participation (sometimes tech-enabled participation) is a term that has entered the academic literature to describe collaboration across a digital network, augmenting or replacing practices that were previously undertaken by a physically present team.

Digital Object Identifier (DOI) is a unique combination of letters and numbers that can be used to permanently link to and identify a digital document, such as an article or e-book.
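
To illustrate the "permanently link" part, a DOI is commonly turned into a web link by appending it to the `https://doi.org/` resolver address. The sketch below assumes that convention; the helper name `doi_to_url` and the sample DOI `10.1234/example` are made up for illustration.

```python
from urllib.parse import quote

def doi_to_url(doi):
    """Build a resolver link for a DOI using the doi.org proxy."""
    doi = doi.strip()
    # Accept the common "doi:" prefix that often appears in citations
    if doi.lower().startswith("doi:"):
        doi = doi[4:].strip()
    # Percent-encode characters that are unsafe in URLs, keeping "/"
    return "https://doi.org/" + quote(doi, safe="/")

# Hypothetical DOI, used only as an example
print(doi_to_url("doi:10.1234/example"))
```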

Ethics committee is an internal unit within an organization (e.g., a university) that is responsible for protecting the rights and privacy of human participants in research studies conducted by that organization. The ethics committee typically reviews proposed research studies and has the authority to approve or reject them.

Event hashtag is a short descriptive title or slogan that helps to increase awareness and visibility of an event or project by making it easier for people to find photos and view participant interactions on social media.

Five Ws are the basic building blocks of information gathering: who, what, when, where, why.

Formative evaluation is undertaken while a project is live, to report back to the project team in real-time so that they can respond to feedback, potentially changing direction if results are not favorable or expanding activities that yield positive results.

Geocoding is the process of translating a human-readable location description (e.g., a place name) into a precise location on a map. Reverse geocoding is the opposite — determining the place name or address for a given map location.
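
Both directions can be sketched with a toy lookup table. The example below is a simplified illustration, not a production geocoder: the `GAZETTEER` entries use approximate coordinates, and the function names are assumptions. Real projects would normally rely on an established gazetteer or geocoding service.

```python
import math

# Tiny hypothetical gazetteer: place name -> (latitude, longitude) in degrees
GAZETTEER = {
    "London": (51.5074, -0.1278),
    "Paris": (48.8566, 2.3522),
    "New York": (40.7128, -74.0060),
}

def geocode(name):
    """Place name -> coordinates, or None if the name is unknown."""
    return GAZETTEER.get(name)

def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def reverse_geocode(point):
    """Coordinates -> name of the nearest place in the gazetteer."""
    return min(GAZETTEER, key=lambda name: haversine_km(GAZETTEER[name], point))

print(geocode("Paris"))
print(reverse_geocode((48.9, 2.3)))  # Paris
```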

GLAM sector is an acronym for “galleries, libraries, archives, and museums” and is generally used as shorthand when referring to institutions in the cultural heritage sector.

Historically Black Colleges and Universities (HBCU) are a formally recognized group of colleges and universities in the United States. These schools were established before 1964 with a principal mission of educating black Americans.

Human computation is the use of human intelligence to accomplish computing tasks that are difficult or impossible for computers. For example, in crowdsourced cultural heritage projects, human computation leverages the sophisticated human vision system to transcribe handwritten documents.

Human-centered design is a creative problem-solving method that prioritizes empathy and a deep understanding of human needs.

Institutional Review Board (IRB) — See Ethics committee.

In-person pairing in crowdsourcing events is a collaboration technique where participants are paired up to work together on crowdsourcing tasks.

Livestream enables participants to watch, share, and comment on a live video feed (for example, via Zoom); this can be restricted to invited attendees or open to all, with recordings usually made available after the event.

Machine Learning (ML) is the training of algorithmic models which learn to make increasingly accurate predictions from sample data (usually provided by the crowd) without being explicitly programmed.

Memorandum of Understanding (MoU) is a documented, non-legally-binding agreement between project partners clarifying the responsibility, purpose, and actions of each partner.

Microtasking is a common practice in crowdsourcing where larger, more complex tasks are divided up into very small tasks (microtasks) that can typically be completed by a participant in a short amount of time (e.g., a few seconds).
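
The division step can be sketched as follows, assuming the larger task is a list of items (for instance, the lines on a manuscript page) and each microtask holds a fixed number of them. The function name and the sample data are illustrative.

```python
def split_into_microtasks(items, size):
    """Divide a list of work items into consecutive chunks of at most `size`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

# Hypothetical page of five lines, split into microtasks of two lines each
page_lines = ["line 1", "line 2", "line 3", "line 4", "line 5"]
for task in split_into_microtasks(page_lines, 2):
    print(task)
```

The last microtask may be smaller than the others when the items do not divide evenly.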

Micro-attributions are a way to recognize scholarly contributions smaller than article co-authorship. Crowdsourced contributions, like annotations or transcriptions, can be credited to participants via micro-attributions.

Named Entity Recognition is a computational technique where the proper names of people, locations, and organizations are automatically located and categorized within a text document.

Niche-sourcing refers to engaging a smaller, specialist crowd (sometimes called expert-sourcing).

Natural Language Processing (NLP) is a branch of artificial intelligence focused on computationally analyzing and understanding text written and spoken by humans.

Pareto principle refers to a general rule of thumb that approximately 80% of your results come from 20% of your efforts.
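
One way to check how closely a dataset follows this rule of thumb is to measure what share of the total comes from the largest 20% of entries. The sketch below assumes a list of per-participant contribution counts; the function name and the figures are invented for illustration.

```python
def top_share(counts, fraction=0.2):
    """Share of the total contributed by the top `fraction` of entries."""
    total = sum(counts)
    if total == 0:
        return 0.0
    ranked = sorted(counts, reverse=True)
    # Always include at least one entry, even for very short lists
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / total

# Hypothetical contribution counts for five participants
contributions = [80, 5, 5, 5, 5]
print(f"{top_share(contributions):.0%}")  # 80%
```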

Participants, in this book, refers to the “crowd” element in your crowdsourcing project, as distinct from contributors who may be part of your wider project team.

Partners, partnerships refer to an organization-level relationship with an institution or body that will have an integral role in your crowdsourcing project, including by resourcing (financial or benefit in kind) or contributing staff and expertise.

Personas, in the field of human-computer interaction (HCI), are descriptions of fictional people who represent the various target user groups for a software project.

Personally Identifiable Information (PII) is information about software users or study participants that can individually identify them and may be protected or sensitive, such as their name, email address, home address, or phone number.

Predominantly White Institutions (PWI) is the term used to describe organizations where white people account for 50% or more of the student body. It is usually used in higher education but could just as easily be applied to the GLAM sector.

Pre-processing is a form of data cleaning that removes unreliable or irrelevant information and sometimes assists in the creation of a training data set in advance of machine learning (see Machine Learning above).

Post-processing in Machine Learning methods can refer to the recognition and removal of false positives, leading to the refining of the ML model to result in better quality predictions.

Project lifecycle is a project management term that collectively refers to all the different stages of a project (usually five), including initiating, planning, executing, monitoring and controlling, and closing.

Project team is the group of people who on a day-to-day basis are responsible for the design, management, and maintenance of a project; it may consist of a mix of paid or voluntary, temporary or permanent, junior or senior members with a distribution of responsibilities.

Qualitative data/methods are usually unstructured, nonnumerical data obtained by a researcher through interviews, focus groups, and observation.

Quantitative data/methods focus on the collection of easily measurable data, including metrics such as age, gender, or other demographics. Such data represents the who, what, when, and where, but is merely a starting point toward answering the broader “why.”

Raw data (or primary data) is data at the pre-processing stage, which may therefore contain errors (see Pre-processing and Post-processing above).

Roundtripping is a process by which data is exported from one system and imported into another, where it is improved; the enhanced version is then reimported into the original system.
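
The three steps (export, improve, reimport) can be sketched with an in-memory record store. Everything here is an illustrative assumption: the `store` layout, the function names, and the choice of whitespace clean-up as the "improvement" made in the second system.

```python
def export_records(store):
    """Export every record from the original system as a flat list of dicts."""
    return [{"id": key, **fields} for key, fields in store.items()]

def improve(records):
    """Stand-in for the second system: normalise whitespace in titles."""
    return [{**r, "title": " ".join(r["title"].split())} for r in records]

def reimport(store, records):
    """Write the enhanced fields back to the matching original records."""
    for record in records:
        fields = {k: v for k, v in record.items() if k != "id"}
        store[record["id"]].update(fields)

# Hypothetical original system holding one untidy record
store = {1: {"title": "  Letter   to  Ada "}}
reimport(store, improve(export_records(store)))
print(store[1]["title"])  # Letter to Ada
```

Keeping a stable identifier (`id` here) on every exported record is what allows the enhanced version to be matched back to the original.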

SMART is an acronym used to guide the formation of project objectives. It typically stands for Specific, Measurable, Achievable, Realistic, and Time-bound.

Stakeholders can refer to organizations or individuals who could be both affected by and affect the outcome of your crowdsourcing project and therefore need to be explicitly recognized and managed.

Summative evaluation focuses on assessing the outcome of a crowdsourced project at its conclusion, with success and efficacy judged against the original project aims.

Sunset, sunsetting is the process of planning and carrying out the closure of a crowdsourcing project, ideally in a way that is respectful of participants and preserves their contributions.

SWOT is an acronym for Strengths, Weaknesses, Opportunities, and Threats, providing a planning tool to help assess the likely success of a crowdsourcing project.

Task is a unit of work to be completed by a human or computer within a crowdsourcing project.

Transcribeathon (or transcribe-a-thon) is a physical or virtual event in which participants gather for a predetermined period to work together on a crowdsourced transcription project.

Usability is the extent to which a product can be used by specified users to achieve specific goals. It may include attributes such as learnability, efficiency, memorability, errors and satisfaction.

User-Centered Design (UCD) is an iterative design process that foregrounds the needs of end-users (rather than, e.g., software developers).

User experience (UX) design is a design process that emphasizes creating a product that is not only usable and functional but also provides a high-quality, enjoyable experience.

User-Generated Content (UGC) can refer to any digital content created by your crowdsourced participants and published online, from as little as annotations to full video and audio files.

Values are a clear articulation of your ethics, determining your project priorities, and how you ultimately choose to work with the crowd according to these principles (see the “Identifying, aligning, and enacting values in your project” chapter).

Virtual help desk enables your project team to assist participants who might be encountering technical or other issues through a dedicated virtual chat channel or similar space for submitting queries.

Workflow, in crowdsourcing projects, is a sequence of tasks, typically defined by the project staff, that are performed by humans, computation, or a combination of both.
