Introduction to Digital Preservation: PREMIS metadata

PREservation Metadata Implementation Strategies (PREMIS)

Preservation Metadata Implementation Strategies (PREMIS) is a metadata standard for recording the information required to preserve digital objects. The standard's documentation and metadata schema are hosted by the Library of Congress, whose website states:

"The PREMIS Data Dictionary for Preservation Metadata is the international standard for metadata to support the preservation of digital objects and ensure their long-term usability. Developed by an international team of experts, PREMIS is implemented in digital preservation projects around the world, and support for PREMIS is incorporated into a number of commercial and open-source digital preservation tools and systems. The PREMIS Editorial Committee coordinates revisions and implementation of the standard." 

Aside from associated documentation, there are two main components to the PREMIS standard:

  1. data model (which includes an object model)
  2. data dictionary

To learn more about PREMIS, please visit: http://www.loc.gov/standards/premis/

Data Model

The data model consists of four Entities: 

[Figure: the PREMIS data model, showing the four entities (Rights, Agents, Events, Objects) arranged in a diamond and linked to one another by arrows]

Objects: An object is a discrete unit of information that can take one of four forms: Intellectual Entity, Representation, File or Bitstream. The PREMIS object model describes the nature of these four types of object, the relationships between them, and how they can be expressed in PREMIS. It is up to the organization to decide how to define and model the digital objects in its collections. Objects are central to the PREMIS standard, and the Object is the only one of the four entities that is mandatory (meaning it must be included in all PREMIS records).

Events: Events describe actions which have happened to objects, particularly during their management within a digital repository. Types of events that can be recorded in PREMIS include: virus scanning, file format validation, and file format migration.

Agents: Agents are the actors that perform events on the object. These can include staff working with the digital repository or the organization as a whole, but they can also include the software used on the object to perform the event (such as virus scanning software).

Rights: The Rights entity covers copyright, licenses and any other restrictions on what a repository can do with an object. For example, a PREMIS rights statement might be drawn from a donor agreement which states that a particular agent is allowed to make a particular object available online.
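
As a rough illustration of how the four entities relate to one another, the sketch below models them as plain Python data classes. It is not part of the PREMIS standard: the class names, field names, identifiers and example values are all hypothetical, and a real PREMIS record carries many more properties than are shown here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PremisObject:
    identifier: str
    category: str  # "intellectual entity", "representation", "file" or "bitstream"


@dataclass
class Agent:
    identifier: str
    name: str
    agent_type: str  # for example "person", "organization" or "software"


@dataclass
class Event:
    identifier: str
    event_type: str
    date_time: str
    outcome: str
    object_ids: List[str] = field(default_factory=list)  # objects the event acted on
    agent_ids: List[str] = field(default_factory=list)   # agents that performed it


@dataclass
class RightsStatement:
    identifier: str
    basis: str
    permitted_act: str
    object_ids: List[str] = field(default_factory=list)  # objects the statement covers
    agent_ids: List[str] = field(default_factory=list)   # agents it grants the act to


def events_for(object_id, events):
    """Return every event that is linked to the given object identifier."""
    return [event for event in events if object_id in event.object_ids]


# Hypothetical example: a file that was virus scanned by a piece of software,
# and a donor agreement that lets the repository put the file online.
tiff = PremisObject("file-001", "file")
scanner = Agent("agent-001", "Example virus scanner", "software")
repository = Agent("agent-002", "Example Repository", "organization")
scan = Event("event-001", "virus scanning", "2024-01-15T10:30:00Z", "pass",
             ["file-001"], ["agent-001"])
access = RightsStatement("rights-001", "donor agreement", "make available online",
                         ["file-001"], ["agent-002"])

print([event.event_type for event in events_for(tiff.identifier, [scan])])
```

The entities refer to each other by identifier rather than by nesting, which mirrors the way the data model treats Objects, Events, Agents and Rights as separate but linked entities.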

What is preservation metadata?

Preservation metadata is defined as:

"Things that most working preservation repositories are likely to need to know in order to support digital preservation." - PREMIS Data Dictionary

"Preservation metadata is intended to store technical details on the format, structure and use of the digital content, the history of all actions performed on the resource including changes and decisions [...]" - PADI, The National Library of Australia

Preservation metadata is often used as an umbrella term for various categories of metadata, all of which enable the ongoing preservation and management of digital objects. These types of metadata include:

  • Provenance metadata: this includes information on the history of the digital object, such as any actions performed on it since it entered the repository (virus scans or file format validation), the actors involved in these activities, and the relationships between files, folders and past versions of the digital object.
  • Technical metadata: this includes the technical information that can be extracted directly from a digital file, such as file format type, file size, colour depth, image dimensions, bit depth and so on.
  • Rights information: this information dictates what you are allowed to do to the digital object and who has the authority to do it.
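
To make the idea of technical metadata more concrete, the following minimal sketch gathers a few values directly from a file's contents using only the Python standard library. The function names and dictionary keys are hypothetical. The checksum is an addition not listed above (it is commonly recorded so that later changes to a file can be detected), and the format guess relies on the file extension alone, so it is only a stand-in for a dedicated format identification tool.

```python
import hashlib
import mimetypes
import os


def describe_bytes(file_name, data):
    """Collect basic technical metadata for a file's name and raw bytes."""
    guessed_type, _encoding = mimetypes.guess_type(file_name)
    return {
        "file_name": file_name,
        "size_bytes": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),
        "guessed_mime_type": guessed_type,  # extension-based guess only
    }


def describe_file(path):
    """Read a file from disk and describe it.

    The whole file is read into memory, which is fine for a sketch but
    would need chunked reading for very large files.
    """
    with open(path, "rb") as handle:
        return describe_bytes(os.path.basename(path), handle.read())


print(describe_bytes("letter.txt", b"Dear reader"))
```

Properties such as colour depth or image dimensions depend on the file format and would need a format-aware library or tool to extract.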

More about preservation metadata can also be found in the Glossary section of this Libguide.

What is the benefit of recording preservation metadata?

There are a number of reasons why preservation metadata is important to an organization. It supports the ability to preserve digital objects and helps to maintain access to them in the long term. Other benefits include:

  • Knowing what is in your collections
  • Supporting ongoing repository management activities
  • Retaining the history of digital objects
  • Ensuring trust in digital collections

Data Dictionary

The PREMIS Data Dictionary (currently at version 3) defines semantic units. Each semantic unit is mapped to one of the four entities (objects, events, agents, and rights), which means that a semantic unit is a property of an entity.

Each entry in the Data Dictionary defines how a semantic unit may be used, and provides practical examples to guide a repository when making implementation decisions for using PREMIS. 
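
As an informal illustration of semantic units as properties of an entity, the sketch below writes out a handful of Object semantic units as a nested Python dictionary and then flattens them into dotted paths. The semantic unit names are taken from the Data Dictionary, but the selection is far from complete, the values (identifier, size, format name, checksum placeholder) are invented for the example, and the dictionary layout and the `flatten` helper are assumptions made here rather than an official serialization.

```python
# A few semantic units describing one File object. Units that hold other
# units (such as objectIdentifier) act as containers.
file_object = {
    "objectIdentifier": {
        "objectIdentifierType": "local",
        "objectIdentifierValue": "file-001",
    },
    "objectCategory": "file",
    "objectCharacteristics": {
        "fixity": {
            "messageDigestAlgorithm": "SHA-256",
            "messageDigest": "placeholder-checksum-value",
        },
        "size": "20480",
        "format": {
            "formatDesignation": {"formatName": "image/tiff"},
        },
    },
}


def flatten(units, prefix=""):
    """Turn nested semantic units into (dotted path, value) pairs."""
    rows = []
    for name, value in units.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            rows.extend(flatten(value, path))  # descend into a container unit
        else:
            rows.append((path, value))
    return rows


for path, value in flatten(file_object):
    print(f"{path} = {value}")
```

Each dotted path shows which container a value belongs to, for example `objectCharacteristics.fixity.messageDigest`, which is a convenient way to check a record against the corresponding Data Dictionary entries.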

Recording PREMIS

A repository can use different options for recording PREMIS metadata. Common approaches are: