Posted on Thursday January 14, 2016

Updated on Monday November 6, 2023

New media search methods in the Europeana API

Powerful new search features, filtering queries and retrieval of media files within Europeana's metadata.

The REST API lets you not only search on and retrieve metadata, but also use powerful features based on technical metadata. Technical metadata is metadata extracted from the media files referenced by records, such as the width and height of an image. These features let you search for and filter Europeana records by media information, for instance to return only records that have extra large images, high-quality audio files, or images that match a particular colour. These features were developed as part of the Content Re-use Framework within the Europeana Creative project.

The media search features as described on this page are part of the existing search API, search facets and the record API response.

Background information

Europeana processes all media URLs in all Europeana records (those present in the edm:isShownBy and edm:hasView fields). At specific time intervals it verifies whether each link still resolves and extracts technical metadata from the media file. This information is then made available for search and included in the record API response, and it is updated on a continuous basis.

Cardinality

A Europeana metadata record can contain a reference to zero, one or more media files. When a search is made on a technical metadata property or facet (such as image size), a record is returned if at least one of the media files present in the record matches the search query.

Search

The search API allows searching on the following media parameters:

Parameter | Datatype | Description
media | Boolean | Filter by records where a URL to the full media file is present in the edm:isShownBy or edm:hasView metadata and is resolvable.
colourpalette | String | Filter by images where one of the colours of an image matches the provided colour code. You can provide this parameter multiple times; the search will then do an 'AND' search on all the provided colours. See colour palette.
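As an illustration of the two parameters above, the following is a minimal Python sketch that builds a relative search request. The function name media_search_query is hypothetical, and the lowercase 'true'/'false' spelling is assumed from the examples on this page. Colour codes are percent-encoded, so '#' becomes '%23'.

```python
from urllib.parse import urlencode


def media_search_query(api_key, query, colours=(), media=None):
    """Build a relative search.json request using the media parameters.

    Hypothetical helper: only the parameter names (wskey, query, media,
    colourpalette) come from this page.
    """
    params = [("wskey", api_key), ("query", query)]
    if media is not None:
        # Assumption: booleans are sent as lowercase 'true' / 'false'.
        params.append(("media", "true" if media else "false"))
    # Repeating colourpalette performs an 'AND' search on all given colours.
    for colour in colours:
        params.append(("colourpalette", colour))
    # Keep ':' and '*' readable; everything else (such as '#') is encoded.
    return "search.json?" + urlencode(params, safe=":*")


print(media_search_query("xxxx", "*:*", colours=["#FF0000", "#0000FF"], media=True))
```

Passing the parameters as a list of pairs, rather than a dictionary, is what allows colourpalette to appear more than once in the query string.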

Facets

The Search API returns a list of media-related facets that describe the distribution of media information across the search results. The facets can also be included in search queries to allow very specific media searches, such as querying on image size or audio duration.

The following facets are available in the facets profile in search and can be searched on as well:

Facet name | Datatype | Media type | Description
MEDIA | boolean | | Indicates whether a URL to the full media file is present in the edm:isShownBy or edm:hasView metadata and is resolvable.
MIME_TYPE | string | | Mime-type of the file, e.g. image/jpeg
IMAGE_SIZE | string | Image | Size of an image in megapixels, values: small (< 0.5MP), medium (0.5-1MP), large (1-4MP) and extra_large (> 4MP)
IMAGE_COLOUR | boolean | Image | Lists 'true' for colour images. An alias to this facet is IMAGE_COLOR. Note that you cannot provide the 'false' value to find non-colour images; use the IMAGE_GREYSCALE facet instead.
IMAGE_GREYSCALE | boolean | Image | Lists 'true' for greyscale images. An alias to this facet is IMAGE_GRAYSCALE. Note that you cannot provide the 'false' value to find colour images; use the IMAGE_COLOUR facet instead.
COLOURPALETTE | string | Image | The most dominant colours present in images, expressed in HEX-colour codes. See colour palette.
IMAGE_ASPECTRATIO | string | Image | Portrait or landscape.
VIDEO_HD | boolean | Video | Lists 'true' for videos that have a resolution higher than 576p.
VIDEO_DURATION | string | Video | Duration of the video, values: short (< 4 minutes), medium (4-20 minutes) and long (> 20 minutes).
SOUND_HQ | boolean | Sound | Lists 'true' for sound files where the bit depth is 16 or higher or where the file format is a lossless file type (ALAC, FLAC, APE, SHN, WAV, WMA, AIFF & DSD). Note that 'false' does not work for this facet.
SOUND_DURATION | string | Sound | Duration of the sound file, values: very_short (< 30 seconds), short (30 seconds - 3 minutes), medium (3-6 minutes) and long (> 6 minutes).
TEXT_FULLTEXT | boolean | Text | Lists 'true' for text media types which are searchable, e.g. a PDF with text.
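To get a feel for the IMAGE_SIZE values, the sketch below estimates which value an image of known pixel dimensions would fall under. It is only an approximation under stated assumptions: one megapixel is taken as 1,000,000 pixels, and the handling of the exact boundaries (0.5, 1 and 4MP) is a guess, since the facet descriptions do not say which side is inclusive. The function name image_size_bucket is hypothetical.

```python
def image_size_bucket(width_px, height_px):
    """Estimate the IMAGE_SIZE facet value for an image of the given size.

    Assumptions (not confirmed by the facet descriptions): 1 MP equals
    1,000,000 pixels, and boundary values fall into the larger bucket
    except at 4 MP, which is treated as 'large'.
    """
    megapixels = width_px * height_px / 1_000_000
    if megapixels < 0.5:
        return "small"
    if megapixels < 1:
        return "medium"
    if megapixels <= 4:
        return "large"
    return "extra_large"


for size in [(640, 480), (1000, 800), (1920, 1080), (4000, 3000)]:
    print(size, image_size_bucket(*size))
```

The returned string can be used directly in a filter such as qf=IMAGE_SIZE:large.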

Sample use-case: large openly licensed images of paintings

The following section will help you build a simple application based on the media search and retrieval capabilities of the REST API. For this use-case we will construct API queries to retrieve openly licensed large and extra large images of paintings, display their thumbnails on a page and then display part of their technical metadata on a separate page for the image. This section will provide guidance on how to use the API in order to fulfil this use-case.

Retrieving large and extra large images

We will start with the search query to retrieve the records. For this, we use the following:

search.json?wskey=xxxx&query=what:painting&media=true&qf=IMAGE_SIZE:large&qf=IMAGE_SIZE:extra_large&reusability=open

To break down the search query:
  • wskey=xxxx - API authentication, replace xxxx with your API key.
  • query=what:painting - Search for records where the subject is a painting.
  • media=true - Records where a link to a media file is present in the metadata and where this link resolves to a working media file. Note that this parameter is not actually needed when you query on any of the media facets, which already imply the value of this parameter.
  • qf=IMAGE_SIZE:large - Records where an image is present of a large size (1-4MP).
  • qf=IMAGE_SIZE:extra_large - Records where an image is present of an extra large size (> 4MP). Note that the qf parameter can be included more than once; in this case the two values are combined as an 'OR' query.
  • reusability=open - Ensure that only openly licensed media is present in the search results.
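The query above can also be assembled programmatically. The following Python sketch reproduces it from its parts; the function name build_search_query and its arguments are hypothetical, and only the parameter names and values come from the breakdown above.

```python
from urllib.parse import urlencode


def build_search_query(api_key, query, filters=(), reusability=None, media=None):
    """Assemble a relative search.json request (hypothetical helper)."""
    params = [("wskey", api_key), ("query", query)]
    if media is not None:
        params.append(("media", "true" if media else "false"))
    # Each filter becomes its own qf parameter, e.g. qf=IMAGE_SIZE:large.
    for facet_filter in filters:
        params.append(("qf", facet_filter))
    if reusability is not None:
        params.append(("reusability", reusability))
    # Leave ':' unencoded so the result reads like the query on this page.
    return "search.json?" + urlencode(params, safe=":")


url = build_search_query(
    "xxxx",  # replace with your API key
    "what:painting",
    filters=["IMAGE_SIZE:large", "IMAGE_SIZE:extra_large"],
    reusability="open",
    media=True,
)
print(url)
```

With the arguments shown, the printed string is identical to the search query given earlier in this section.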
Show search results as thumbnails

Now that we have the search query, we need to use its output to render thumbnails of images on a page. Note that the query above does not include pagination; you need to add this yourself using the 'rows' and 'start' parameters of the search API. To render thumbnails of the images in the search results, you need the following information from the search response:

  • id - The identifier of the record.
  • title - The title of the record.
  • edmPreview - The URL to the thumbnail image of the main media file.

With this information, you can build a page which shows the thumbnail (edmPreview), along with a title (title) and with a link to a separate page which at minimum should contain the identifier of the record (id). Next, we will help you create that separate page.
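A minimal Python sketch of this step follows. It rests on several assumptions that are not stated on this page: that the search response lists its results under an "items" key, that "title" and "edmPreview" may be either a string or a list of strings, and that 'start' counts from 1. The function names, the sample response and the example.org URL are hypothetical.

```python
def page_params(page, rows=12):
    """Turn a 1-based page number into 'start' and 'rows' parameters.

    Assumption: the first result has start=1.
    """
    return {"start": (page - 1) * rows + 1, "rows": rows}


def first_value(value, default=None):
    """Return the first element of a list, or the value itself if scalar."""
    if isinstance(value, (list, tuple)):
        return value[0] if value else default
    return value if value is not None else default


def thumbnail_cards(search_response):
    """Collect id, title and thumbnail URL for each result that has a preview."""
    cards = []
    # Assumption: results are listed under the "items" key.
    for item in search_response.get("items", []):
        thumbnail = first_value(item.get("edmPreview"))
        if thumbnail is None:
            continue  # no thumbnail to render for this record
        cards.append({
            "id": item["id"],
            "title": first_value(item.get("title"), "(untitled)"),
            "thumbnail": thumbnail,
        })
    return cards


# Hypothetical sample response, for illustration only.
sample = {
    "items": [
        {"id": "/90402/BK_1978_399", "title": ["Example title"],
         "edmPreview": ["https://example.org/thumbnail.jpg"]},
        {"id": "/0000/no_preview", "title": ["No thumbnail here"]},
    ]
}
print(page_params(2))
print(thumbnail_cards(sample))
```

Each returned card carries the record identifier, which is what the link to the separate page needs.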

Show the large image with its technical metadata

When a user clicks on a thumbnail in the search results, the next step is to display the large (or extra large) image along with its technical metadata. For this, you need to retrieve the record information from the record API. An example query to the record API would be:

/record/90402/BK_1978_399.json?wskey=xxxx

As you can see, the only parameter, aside from your API key, is the record identifier. To display the (extra) large image and information from its technical metadata, you need to parse the record API response as follows:

  • Use the URL from the "edmIsShownBy" field in the "aggregations" class as the URL of the image file. This field only appears once.
  • Iterate through the "webResources" in the same "aggregations" class until you find the WebResource element whose URL ("about") corresponds to the "edmIsShownBy" value. The technical metadata is present in this element.
  • Then, render the technical metadata you want to display, for instance the "ebucoreWidth" and "ebucoreHeight" (width x height in pixels).
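The steps above can be sketched in Python as follows. The field names "aggregations", "edmIsShownBy", "webResources", "about", "ebucoreWidth" and "ebucoreHeight" come from the list above; that the record sits under a top-level "object" key and that "aggregations" is a list are assumptions. The function name, the sample response and the example.org URLs are hypothetical.

```python
def image_with_dimensions(record_response):
    """Find the edmIsShownBy image and the dimensions of its WebResource."""
    # Assumption: the record may be wrapped in a top-level "object" key.
    record = record_response.get("object", record_response)
    for aggregation in record.get("aggregations", []):
        image_url = aggregation.get("edmIsShownBy")
        if not image_url:
            continue
        # Look for the WebResource whose "about" matches the image URL.
        for resource in aggregation.get("webResources", []):
            if resource.get("about") == image_url:
                return {
                    "url": image_url,
                    "width": resource.get("ebucoreWidth"),
                    "height": resource.get("ebucoreHeight"),
                }
        # Image found, but no matching technical metadata.
        return {"url": image_url, "width": None, "height": None}
    return None


# Hypothetical sample response, for illustration only.
sample_record = {
    "object": {
        "aggregations": [{
            "edmIsShownBy": "https://example.org/full.jpg",
            "webResources": [
                {"about": "https://example.org/other.jpg"},
                {"about": "https://example.org/full.jpg",
                 "ebucoreWidth": 4000, "ebucoreHeight": 3000},
            ],
        }]
    }
}
info = image_with_dimensions(sample_record)
print(info["url"], info["width"], "x", info["height"])
```

Returning None for missing dimensions lets the page still show the image when no technical metadata has been extracted for it.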

Other examples

Find all records that match the query ‘Paris’ which are openly licensed and have large images:

search.json?wskey=xxxx&query=Paris&reusability=open&qf=IMAGE_SIZE:large

Find all records that match the query ‘Paris’, have a thumbnail image, are of mime type image/jpeg and have an aspect ratio of 'landscape':

search.json?wskey=xxxx&query=Paris&thumbnail=true&qf=MIME_TYPE:image%2Fjpeg&qf=IMAGE_ASPECTRATIO:landscape

Find all records where the subject is opera and where the results are sound files with a long duration:

search.json?wskey=xxxx&query=what:opera&qf=SOUND_DURATION:long

Find all records where one of the images has a (dominant) red colour:

search.json?wskey=xxxx&query=*:*&colourpalette=%23FF0000
