Marco Rendina: Eirini, let’s start from the basics. What is crowdsourcing?
Eirini Kaldeli: Crowdsourcing is the process of distributing a task to a group of people, who usually contribute through their work online. In some cases, contributors receive material compensation; in others, their participation is voluntary, with rewards being immaterial, such as personal satisfaction, cultural contribution, or knowledge acquisition. Within the cultural heritage sector, crowdsourcing has long been used to address various challenges, from content collection and transcription to collections tagging and the detection of biased terms in the description of cultural heritage objects. Depending on the nature of the task, participants may need specialised skills or knowledge.
MR: What are the benefits of crowdsourcing for the cultural heritage sector?
EK: A responsible and meaningful crowdsourcing project can bring mutual benefits to cultural heritage institutions and participants. First and foremost, crowdsourcing should be approached as a means to engage citizens with heritage collections. On the one hand, participants have the opportunity to interact and connect with cultural heritage; learn useful information about items and topics in a playful way; share their perspectives and knowledge; co-shape how collections are presented; and collaborate with fellow citizens in a participatory experience. On the other hand, cultural heritage institutions can improve the quality of their collections and make them more discoverable and accessible; raise awareness about their cultural heritage assets; reach new audiences; and gain deeper insights into how their collections are perceived by communities.
MR: Spyros, could you tell us a few words about the CrowdHeritage platform you are working on in the context of the AI4Culture project?
Spyros Bekiaris: CrowdHeritage is an open platform for organising online crowdsourcing campaigns that mobilise people to improve the quality of cultural heritage collections. This could be in relation to different aspects, from multilingual coverage to semantic tagging. Participants are invited to enrich digital collections, either by producing new information (e.g. adding geo-locations) or by evaluating and validating automatic outputs produced by digital tools (e.g. automatic translations or detection of biased language).
CrowdHeritage has been used extensively to stimulate participation in educational environments and citizen science settings by engaging communities including students and pupils, culture lovers, cultural heritage professionals and the general public. So far, the platform has hosted 40 crowdsourcing campaigns in which more than 970 unique contributors have added around 112,000 annotations and evaluated more than 16,000.
MR: Crowdsourcing is rooted in distributed manual effort while AI4Culture is about AI technologies. Can you explain the relationship between the CrowdHeritage platform and AI tools?
SB: CrowdHeritage was originally designed to support campaigns that invite users to add new annotations from scratch. Over the last few years, we have seen increasing interest in coupling CrowdHeritage with AI tools. Such tools, from optical character recognition and machine translation to automatic subtitling and image classification, offer remarkable opportunities for automatically improving the quality of digital cultural heritage collections at scale and with minimal manual effort.
However, relying on purely automatic methods has also revealed several issues that have to be dealt with. We need ways to assess whether the results of AI algorithms are accurate enough for our standards and to compare how different algorithms behave on specific data and against certain criteria. In this context, crowdsourcing is an excellent means to harness collective human intelligence and collect useful insights. The accrued feedback can help us filter out incorrect automatic results, apply appropriate filters to retain what we consider good-quality results, and spot certain shortcomings of AI algorithms. In this interplay with AI, the CrowdHeritage platform is also helpful for producing ground-truth datasets that can be further exploited to adapt AI tools to cultural heritage data.
MR: Eirini, can you provide some concrete examples of how CrowdHeritage has been applied in combination with AI tools?
EK: I can provide many! In the context of the Europeana Translate project, we ran a number of campaigns where participants evaluated the results of a machine translation algorithm trained on Europeana metadata (developed by our AI4Culture partner Pangeanic). This feedback allowed us to improve the quality of the results and also led to the creation of open datasets published on the ELRC-SHARE repository, which gathers language resources across the EU.
In another case study about selecting optimal Super Resolution (SR) models for different image types (which you know very well, Marco!), we set up a campaign in collaboration with the European Fashion Heritage Association (EFHA), where participants were asked to compare and rank a sample of images upscaled by different SR models. The results of this campaign enabled EFHA to select and apply the best SR algorithm depending on the image characteristics.
In the framework of the CRAFTED project, a series of campaigns was organised to evaluate colours automatically identified by AI colour detection algorithms. Analysis of the collected feedback showed that the automatic algorithms repeatedly identified certain colours that were not present and missed some that were, a finding that helped us improve our filtering approach and select the best algorithm setup.
In the DE-BIAS project, we are in the process of setting up a series of campaigns where communities will inspect and evaluate terms flagged by an automatic bias detection tool as containing derogatory language.
MR: That’s really interesting, Eirini, but will cultural heritage institutions be able to use the CrowdHeritage platform to set up their own crowdsourcing campaigns?
EK: Of course! Through the AI4Culture project, a new ‘campaign editor’ feature has been made available on CrowdHeritage, which allows anyone to set up and run a crowdsourcing campaign on the platform. Anyone interested can look at this video tutorial or check out the CrowdHeritage documentation to learn more!
Find out more
In September 2024, the project will launch a platform where a set of open tools will be made available online, together with related documentation and training materials. Keep an eye on the project page on Europeana Pro for more details and stay tuned on the project's LinkedIn and X accounts!