Derivatives

Derivatives are files, usually generated automatically from a source file, which may be useful to the repository. Examples of derivatives include:

  • smaller or compressed service files
  • thumbnails or poster images for display
  • preservation files in open file formats
  • files containing technical metadata about a file.

Derivative Models

There are two schemes you can use for derivatives:

  • In the standard model, each derivative is a new Media entity linked to the original's parent node. The "Media Use" field is used to distinguish the roles that the various files and media have (Thumbnail, Service File, etc.).
  • In the multi-file media model, derivatives are added to additional file fields on the original file's Media entity (see Multi-File Media).

Derivatives are Actions in Drupal

Derivative configuration is stored using Drupal's Actions.
This means that all derivative configuration, such as the parameters that set a derivative's size and quality, can be edited by a repository administrator in the Drupal GUI under Manage > System > Actions.

As Actions, they can be executed on nodes manually using Views Bulk Operations. They can also be configured to run automatically on media save thanks to Islandora's additions to the Drupal Contexts module.

Derivative actions will replace existing files

If a derivative action runs but the target derivative (as identified by its taxonomy term) already exists, the new file will replace the old file (leaving the rest of the Media intact).
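
The replacement behaviour can be pictured with a minimal Python sketch. It is not Islandora's code: the `Media` and `Node` classes, the `save_derivative` function, and the file URIs are hypothetical stand-ins, used only to show that a second run swaps the file while leaving the rest of the Media untouched.

```python
from dataclasses import dataclass, field


@dataclass
class Media:
    """Hypothetical stand-in for a Drupal Media entity."""
    name: str
    media_use: str  # taxonomy term, e.g. "Thumbnail"
    file_uri: str


@dataclass
class Node:
    """Hypothetical stand-in for a parent node and the media attached to it."""
    media: list = field(default_factory=list)


def save_derivative(node, media_use, new_file_uri):
    """Replace the file on a matching derivative Media, or create a new Media."""
    for media in node.media:
        if media.media_use == media_use:
            media.file_uri = new_file_uri  # only the file changes
            return media
    created = Media(name=f"{media_use} derivative",
                    media_use=media_use,
                    file_uri=new_file_uri)
    node.media.append(created)
    return created


node = Node()
first = save_derivative(node, "Thumbnail", "public://thumb-v1.jpg")
first.name = "Cover thumbnail"  # an edit made to the Media by hand
second = save_derivative(node, "Thumbnail", "public://thumb-v2.jpg")
```

After the second call, `node.media` still holds a single Media, its hand-edited name is preserved, and only `file_uri` points at the new file.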

Derivatives can run automatically with Contexts

With the Contexts module, you can configure derivatives to run under specific conditions. These are set in the "Conditions" section. Islandora provides a number of context conditions including:

  • entity bundle
  • media mimetype
  • term (attached to a media or node, or a node's parent node)
  • whether a node or media "is islandora"

You can set up as many of these as you like, with "and" or "or" logic between them.
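
As a rough illustration of how "and" and "or" logic between conditions plays out, here is a small Python sketch. It does not use the Contexts module's API; the function names, the dictionary shape of the media, and the sample term are all assumptions made for the example.

```python
def has_mimetype_prefix(prefix):
    """Build a condition that checks the media's mimetype (hypothetical shape)."""
    def check(entity):
        return entity.get("mimetype", "").startswith(prefix)
    return check


def has_term(term):
    """Build a condition that checks for an attached taxonomy term."""
    def check(entity):
        return term in entity.get("terms", [])
    return check


def conditions_met(conditions, entity, require_all=True):
    """Combine conditions with "and" (require_all=True) or "or" logic."""
    results = [check(entity) for check in conditions]
    return all(results) if require_all else any(results)


conditions = [has_mimetype_prefix("image/"), has_term("Service File")]
tiff = {"mimetype": "image/tiff", "terms": ["Service File"]}
pdf = {"mimetype": "application/pdf", "terms": ["Service File"]}

tiff_and = conditions_met(conditions, tiff)                    # both conditions hold
pdf_and = conditions_met(conditions, pdf)                      # mimetype condition fails
pdf_or = conditions_met(conditions, pdf, require_all=False)    # the term alone is enough
```

With "and" logic the PDF is skipped because its mimetype does not match; with "or" logic the attached term is enough to trigger the reaction.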

Derivatives are set up in the "Reactions" section. For standard model derivatives, choose the "Derivatives" reaction, which lists all actions, including derivative actions. Note that multi-file media derivatives will not work here.

Multi-file media

The multi-file media derivatives can NOT be selected from within the "Derivatives" reactions. From the "Reactions" pop-up window, you must choose "Derive file for Existing Media". This panel lists only Multi-file media-type derivatives.

Derivatives have Types

When creating a new derivative Action, you choose one of several derivative types. Every derivative Action belongs to one of these types.

| Derivative type name                     | Machine name              | Supplying module         | Expected microservice (software) |
|------------------------------------------|---------------------------|--------------------------|----------------------------------|
| Generate a Technical metadata derivative | generate_fits_derivative  | Islandora FITS (roblib)  | Crayfits (FITS)                  |
| Generate a audio derivative              | generate_audio_derivative | Islandora Audio          | Homarus (FFmpeg)                 |
| Generate a video derivative              | generate_video_derivative | Islandora Video          | Homarus (FFmpeg)                 |
| Generate an image derivative             | generate_image_derivative | Islandora Image          | Houdini (ImageMagick)            |
| Get OCR from image                       | generate_ocr_derivative   | Islandora Text Extraction | Hypercube (tesseract/pdftotext) |

Multi-file media

The derivative types available for multi-file media are the ones marked as "for Media Attachment", e.g. "Generate an Image Derivative for Media Attachment".

Derivatives are created by microservices ("Crayfish")

In Islandora, we generate derivatives using microservices, which are small web applications that each do a single task. Each microservice takes the name of a file as well as some parameters. It runs an executable and returns a transformed file, which can be loaded back into the repository. The microservices in the Islandora stack are:

| Repository | Microservice name | Executable          |
|------------|-------------------|---------------------|
| Crayfish   | Homarus           | FFmpeg              |
| Crayfish   | Houdini           | ImageMagick         |
| Crayfish   | Hypercube         | tesseract/pdftotext |
| Crayfits   | Crayfits          | FITS                |
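
To make "takes the name of a file as well as some parameters" concrete, the following Python sketch builds (but does not send) an HTTP request to an image microservice. Everything specific here is an assumption to check against your installation: the endpoint URL, the file URL, the ImageMagick arguments, and the header names `Apix-Ldp-Resource` and `X-Islandora-Args`, which reflect our understanding of how Crayfish services receive the source file and parameters.

```python
import urllib.request


def build_derivative_request(service_url, source_file_url, accept, args=""):
    """Build a GET request asking a microservice to transform one file.

    Header names are assumptions about the Crayfish convention; verify them
    against the microservice version you are running.
    """
    headers = {
        "Apix-Ldp-Resource": source_file_url,  # which file to transform
        "Accept": accept,                      # desired output mimetype
    }
    if args:
        headers["X-Islandora-Args"] = args     # passed on to the executable
    return urllib.request.Request(service_url, headers=headers, method="GET")


# Hypothetical URLs and arguments, for illustration only.
request = build_derivative_request(
    "http://localhost:8000/houdini/convert",
    "http://localhost:8000/sites/default/files/example.tif",
    "image/jpeg",
    args="-thumbnail 100x100",
)
```

Sending the request (for example with `urllib.request.urlopen(request)`) would return the transformed file in the response body, provided a microservice is actually listening at that address.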

Derivatives are created using an external queue

To send orders to the microservices, Islandora sends messages to an external queue, from which the microservices process the jobs as they are able. This is a robust system that can "operate at scale", i.e. it can handle large batches of uploads without slowing down the repository.

The queue system used by Islandora is ActiveMQ, and the listeners are part of "Alpaca".
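
The reason a queue keeps the repository responsive can be shown with a small in-process analogy. This Python sketch uses the standard library's `queue.Queue` and a thread as stand-ins for ActiveMQ and an Alpaca listener; it is not how those components are implemented, and the file names and the `fake_transform` function are hypothetical.

```python
import queue
import threading


def fake_transform(name):
    """Hypothetical stand-in for a microservice call."""
    return name + ".thumb.jpg"


def run_listener(jobs, results, transform):
    """Take jobs off the queue one at a time until a None sentinel arrives."""
    while True:
        job = jobs.get()
        if job is None:
            jobs.task_done()
            break
        results.append((job, transform(job)))
        jobs.task_done()


jobs = queue.Queue()
results = []
listener = threading.Thread(target=run_listener,
                            args=(jobs, results, fake_transform))
listener.start()

# The "repository" side only enqueues messages; put() returns immediately,
# so a large batch of uploads does not wait on derivative generation.
for name in ["a.tif", "b.tif", "c.tif"]:
    jobs.put(name)

jobs.put(None)   # tell the listener to stop once the backlog is done
jobs.join()      # wait until every queued job has been processed
listener.join()
```

The producer and the listener are decoupled: uploads are acknowledged as soon as the message is queued, and the listener works through the backlog at its own pace.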

Derivative Swimlane Diagram

The following diagram shows the flow of derivative generation from start to finish. A user saves a Media in Drupal, which may trigger Drupal to emit a derivative event to a queue, which is read by Alpaca and sent to a microservice. The microservice gets the original file and makes a transformation, returning the derivative file to Alpaca, which sends it back to Drupal to become a Drupal Media.

[Figure: Derivative process swimlane diagram]


Last update: April 18, 2024