About the project
Europeana Subtitled was a Europeana Generic Services project co-funded by the European Union under the Connecting Europe Facility Programme. Running from June 2021 to November 2022, the project aimed to enable audiovisual media heritage to be enjoyed by professional and non-professional audiences as well as to increase its use through closed captioning and subtitling.
Europeana Subtitled gathered seven major national broadcasters and audiovisual archives from seven European countries to provide high-quality audiovisual materials to Europeana. The project combined AI technology and audiovisual cultural heritage to produce high-quality closed captions and English subtitles for local video content, and created a platform that allows organisations to run crowdsourcing campaigns to revise captions using state-of-the-art editing tools.
Europeana Subtitled also supported cultural heritage professionals with the use of automatic speech recognition (ASR) and machine translation (MT) technologies in the cultural sector through an online training suite consisting of video tutorials, documentation and guidelines, and worked with teachers and museum educators to create learning resources with audiovisual content.
Finally, the project engaged audiences through crowdsourcing events and editorial activities on the Europeana website, in particular through the 'Broadcasting Europe' page and the 'Mass-media and propaganda' online exhibition.
Delivering audiovisual content to Europeana
Through Europeana Subtitled, over 8,000 high-quality audiovisual materials related to the theme ‘Broadcasting Europe’ were published on the Europeana platform, complying with the publishing criteria of content Tier 2 or higher and metadata Tier B or higher, as specified in the Europeana Publishing Framework (EPF). In addition, the project made more than 12,000 updates to content that was already published. These included quality improvements to the data and to rights statements, to allow reuse for educational purposes (InC-EDU).
The audiovisual content was aggregated through the domain aggregator for audiovisual heritage, EUscreen, and was carefully selected by the project partners to highlight the social, political and cultural changes in Europe as seen through television screens in the Netherlands, Slovenia, Greece, Germany, Italy, Romania and Spain.
The Subtitled content is publicly available and the videos can be watched directly on the Europeana website. It also includes freely reusable content, with more than 3,000 records in the Public Domain.
Crowdsourced captioning
Europeana Subtitled project partner Translated developed the Subbit! platform. This audiovisual editing platform allows cultural heritage institutions to set up and run custom campaigns to correct, edit and validate automatically generated subtitles from a variety of languages into English. Subbit! encourages professionals and the general public to work directly with audiovisual archival content and European heritage, highlighting the importance of more accessible audiovisual archival material for multilingual audiences and increasing awareness of audiovisual online collections.
Over the course of the project, crowdsourcing events using Subbit! were held throughout Europe, encouraging students and cultural heritage enthusiasts to engage with audiovisual clips from European heritage collections in a friendly and fun way, while enriching the clips for future use and practising their language skills. Find out more and use the platform.
Subbit! and the Europeana Subtitled AI Pipeline
The subtitles used in the Subbit! platform are created automatically by the Europeana Subtitled AI Pipeline, which generates closed captions and subtitles for videos using automatic speech recognition and machine translation from EU languages to English. Using this tool, approximately 250 hours of AV recordings were post-edited by professionals, while 50 hours were post-edited by the general public.
Comparing subtitles post-edited via the Subbit! platform with the original automatic versions generated by the Europeana Subtitled AI Pipeline allowed project partners to quantify the quality of the automatically generated subtitles in terms of WER and BLEU scores (two metrics commonly used to evaluate automatic speech recognition and machine translation, respectively). The results of this comparison showed that the overall targets for WER and BLEU were successfully achieved.
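As an illustration of the first metric, WER (word error rate) counts the word-level substitutions, deletions and insertions needed to turn the automatic text into the corrected text, divided by the number of words in the corrected text. The sketch below is a minimal, generic implementation and not the project's own evaluation code. The function name `word_error_rate`, the whitespace tokenisation and the example sentences are assumptions made for illustration; real evaluations usually also normalise case and punctuation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word error rate of `hypothesis` measured against `reference`.

    Illustrative assumption: `reference` is the post-edited caption and
    `hypothesis` is the automatically generated one. Words are obtained
    by splitting on whitespace, with no further normalisation.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions needed against an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # delete a reference word
                curr[j - 1] + 1,                  # insert a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitute
            ))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    corrected = "the news at eight"   # made-up post-edited caption
    automatic = "the new at eight"    # made-up ASR output
    print(word_error_rate(corrected, automatic))
```

A lower WER means the automatic captions needed fewer corrections. Because insertions are counted, the value can exceed 1 when the automatic text is much longer than the corrected text. BLEU works in the opposite direction (higher is better) and is normally computed with an established library rather than by hand.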
Both the Europeana Subtitled AI Pipeline and Subbit! platform use the Europeana APIs to retrieve video content and contribute back the generated and validated subtitles and closed captions to the Europeana platform.
‘Broadcasting Europe’ to audiences
A dedicated feature page on the Europeana website called Broadcasting Europe showcases editorial content curated and published by Europeana Subtitled project partners. The 28 editorial pieces highlighted on the page cover a diverse range of topics using audiovisual heritage from local and national archives across Europe. Using broadcasting and communication as its key themes, Europeana Subtitled partners created stories about European identity, social change, politics and tourism (among others) through the lens of cultural diversity, gender equality, youth empowerment, migration, discrimination and social exclusion.
Featuring audiovisual heritage broadcast at local, national and international levels, Europeana Subtitled told stories about important events and personalities that marked the history of the seven participant countries and of Europe at large.
The Europeana Subtitled training suite
The project developed a training suite for people who want to create subtitles or help improve them. Instructional videos on the Subbit! website show how to do both. In addition to technical skills and competencies, these resources also present general good subtitling practices.
Professionals who want a better understanding of the work involved in subtitling their audiovisual cultural heritage were also supported by the project, and could participate in one of three training events organised in winter 2022. The project has converted these events into a training suite which covers:
the steps you need to take to enrich audiovisual collections with subtitles and translation
which variables to take into account to determine the usability of audiovisual collections, from training AI to publication with subtitles and translations and
where to find more information in case you want to start working on subtitles and translation for audiovisual collections.