Audio Metadata Primer
This document covers the small subset of metadata that pertains to audio preservation projects, specifically the metadata used when working with an audio preservation service provider such as The Audio Archive. We provide technical references as well as practical advice on using the various metadata standards with audio files.
BWF Metadata - Background
The Broadcast Wave Format (BWF) was introduced in 1996, and is an extension to the WAVE file format. A BWF file contains an additional data chunk embedded within the WAVE file header.
BWF files continue to use the .wav extension in the file name (there is no such thing as a .bwf file name extension). The beauty of BWF is that an application that does not recognize the BWF data will still read the WAVE portion of the file, which gives BWF good general compatibility with non-BWF WAVE applications.
To learn more about BWF, see the EBU BWF standards documents in our Standards section on this website.
For the purposes of audio preservation, there are two sets of headers and associated fields in the WAVE-BWF file that concern us, both of which are covered later in this document:
- RIFF INFO_List tags (available to all WAVE files)
- BWF extension (also known as the “bext chunk”)
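To see which of these chunks a given file actually contains, the top-level RIFF chunks can be listed with a few lines of Python. This is a minimal sketch, not a validator: the names list_chunks and _chunk are hypothetical, and the demo data is a synthetic in-memory file rather than a real recording. In a real file, the INFO_List tags appear as a chunk with the ID "LIST" and the BWF extension as a chunk with the ID "bext".

```python
import io
import struct


def list_chunks(stream):
    """Return (chunk_id, size) pairs for the top-level chunks of a RIFF/WAVE stream."""
    riff, _size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")
    chunks = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file
        chunk_id, size = struct.unpack("<4sI", header)
        chunks.append((chunk_id.decode("ascii"), size))
        # Chunk data is padded to an even number of bytes.
        stream.seek(size + (size & 1), io.SEEK_CUR)
    return chunks


def _chunk(chunk_id, payload):
    """Hypothetical helper: wrap a payload in a RIFF chunk header (demo only)."""
    pad = b"\x00" if len(payload) % 2 else b""
    return chunk_id + struct.pack("<I", len(payload)) + payload + pad


# Synthetic demo file: empty bext, fmt and data chunks.
body = b"WAVE" + _chunk(b"bext", bytes(602)) + _chunk(b"fmt ", bytes(16)) + _chunk(b"data", bytes(4))
demo = b"RIFF" + struct.pack("<I", len(body)) + body

found = list_chunks(io.BytesIO(demo))
print(found)
```

For a file on disk, the same function can be called with open(path, "rb") in place of the in-memory stream.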
Where to put the metadata
There are two choices for where to store metadata:
- In a separate file, application, or database
- Embedded in the audio file itself
The choice depends primarily on your institution and how it has decided to handle metadata. If you provide us with the metadata that you want stored in the audio file itself, we will embed it in the WAVE-BWF file with the appropriate tags and in the proper data chunks.
Good cases can be made for either choice. We've seen some institutions even do both.
Catastrophic Metadata
In general, we advocate embedding a minimum set of metadata in the WAVE-BWF file. This is referred to as catastrophic metadata. It allows you to identify your audio files in the event of an IT disaster: if for whatever reason the WAVE-BWF file becomes completely disassociated from an external file, application, or database, it will still be possible to identify the audio file from the catastrophic metadata.
There are different approaches to catastrophic metadata. For example, it would be sufficient to simply store a USID (Unique Source Identifier) in the <bext> chunk as the Originator Reference. This is less human-readable, but with the right software tools, the USID is all that would be needed to identify the file.
Another approach would be to embed just enough descriptive information (Description, Originator, Date, etc.) to identify the recording.
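Either approach amounts to filling a few fixed-width text fields at the start of the <bext> chunk. The sketch below packs such a chunk in Python, assuming the fixed 602-byte layout of the EBU bext chunk with no coding history, ASCII text only, and version 1 of the chunk. The function name pack_bext and the identifier "EXAMPLE-USID-0001" are hypothetical placeholders, not a real USID and not a description of the tools The Audio Archive uses.

```python
import struct

BEXT_FIXED_SIZE = 602  # fixed part of the bext chunk, before any coding history


def pack_bext(description="", originator="", originator_reference="",
              origination_date="", origination_time=""):
    """Pack a minimal bext chunk holding only catastrophic metadata."""

    def field(text, width):
        raw = text.encode("ascii")
        if len(raw) > width:
            raise ValueError(f"{text!r} exceeds {width} bytes")
        return raw.ljust(width, b"\x00")  # unused space is null-padded

    payload = (
        field(description, 256)
        + field(originator, 32)
        + field(originator_reference, 32)
        + field(origination_date, 10)    # yyyy-mm-dd
        + field(origination_time, 8)     # hh:mm:ss
        + struct.pack("<IIH", 0, 0, 1)   # time reference (low, high), version
        + bytes(64)                      # UMID, left empty here
        + bytes(190)                     # remaining bytes, left as zeros
    )
    assert len(payload) == BEXT_FIXED_SIZE
    return b"bext" + struct.pack("<I", len(payload)) + payload


# Approach 1: identifier only (placeholder value, not a real USID).
id_only = pack_bext(originator_reference="EXAMPLE-USID-0001")

# Approach 2: just enough descriptive information to identify the recording.
descriptive = pack_bext(
    description="John Cage Disc Collection, disc 1 side A",
    originator="The Audio Archive",
    origination_date="2006-09-11",
)
```

The resulting bytes would still have to be written into the WAVE file alongside the fmt and data chunks; that step is left out of this sketch.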
WAVE RIFF INFO_List versus BWF <bext> chunk
The INFO_List and <bext> chunks are two different chunks within the WAVE header.
The RIFF INFO_List tags are not recognized as an archival standard, but are used in various commercial audio editing applications as well as some audio players and music distribution systems. Their popularity does not earn them de facto standard status, but they are nonetheless used frequently. Whether to populate these tags is up to the individual archive, and may depend on whether you have applications that can take advantage of these fields.
The BWF <bext> chunk is a recognized and frequently used location for archival metadata. We strongly recommend using the BWF for at least catastrophic metadata.
If you use JHOVE, the JSTOR/Harvard Object Validation Environment, you will find that the RIFF INFO_List tags are unsupported.
BWF Metadata - Compatibility
As a practical matter, there exists a wide range of implementations when it comes to the number of characters found in the WAVE RIFF INFO_List tags, and how many characters are displayed in various applications.
For maximum compatibility across software applications, The Audio Archive recommends that all RIFF INFO_List tags and BWF field descriptions be limited to no more than 64 characters. Should you require more than 64 characters, we recommend that the first 64 characters contain all the critical and essential information, and that the remaining 192 characters (for a total of 256 characters) only contain details knowing that they may be truncated, discarded, or not displayed in some software applications.
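The 64 + 192 = 256 character rule can be expressed as a small helper that separates the part of a description every application should display from the part that may be lost. This is a minimal Python sketch; the function name split_description is hypothetical, and it counts characters rather than encoded bytes, which is the same thing only for plain ASCII text.

```python
CRITICAL_LIMIT = 64   # characters that should carry all essential information
TOTAL_LIMIT = 256     # 64 critical + 192 that may be truncated or hidden


def split_description(text):
    """Split a description into (critical, at_risk) parts per the 64/256 guideline."""
    if len(text) > TOTAL_LIMIT:
        raise ValueError(f"description is {len(text)} characters; limit is {TOTAL_LIMIT}")
    return text[:CRITICAL_LIMIT], text[CRITICAL_LIMIT:]


critical, at_risk = split_description(
    "Cage: Imaginary Landscape No. 1, disc 1 side A, 1939 "
    "(additional notes on the transfer follow here and may not be displayed)"
)
print(repr(critical))
print(repr(at_risk))
```

Reading the first part on its own is a quick way to check that a description still identifies the recording when an application truncates it.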
BWF Metadata - Providing Metadata Field Descriptions
Depending on the complexity of the BWF descriptions, either a Microsoft Excel spreadsheet or a Microsoft Word document is a suitable basic tool for providing the descriptive metadata. If you have XML representations of your metadata, we can work with those, too.
An Excel spreadsheet is preferred if the character count for each description is straightforward and obviously no more than 64 characters. Otherwise, provide the descriptions in a Microsoft Word document, where the “Word Count” tool can be used to check the number of characters in each description.
NOTE: When using the “Word Count” tool, be sure to: (a) highlight the field description for which you want to count the characters, and (b) look at the “Characters (with spaces)” value because spaces also count against the maximum number of characters.
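If the descriptions are kept in a spreadsheet, the same count can be automated after exporting the sheet to CSV. The following Python sketch assumes a CSV export with a header row; the function name overlong_fields and the column names in the demo are hypothetical. Python's len() counts spaces as characters, which matches the “Characters (with spaces)” value.

```python
import csv
import io


def overlong_fields(csv_text, limit=64):
    """Return (row_number, column, length) for every cell longer than limit."""
    problems = []
    reader = csv.DictReader(io.StringIO(csv_text))
    # Row 1 is the header, so data rows are numbered from 2 as in a spreadsheet.
    for row_number, row in enumerate(reader, start=2):
        for column, value in row.items():
            if isinstance(value, str) and len(value) > limit:
                problems.append((row_number, column, len(value)))
    return problems


# Hypothetical two-row export; the first description is 65 characters long.
demo_csv = (
    "Description,Originator\r\n"
    + "x" * 65 + ",The Audio Archive\r\n"
    + "Disc 1 side A,The Audio Archive\r\n"
)
report = overlong_fields(demo_csv)
print(report)
```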
BWF Metadata - Example: WAVE RIFF INFO_List Tags
One way to use the WAVE RIFF INFO_List tags in conjunction with the BWF fields is to make the INFO_List tags common to a group of recordings, whereas the BWF fields will contain details describing just one unique recording. For example:
Title: John Cage Disc Collection
Subject: Music, 1939 - 1951
Engineer: Eric Jacobs, The Audio Archive
Copyright:
Genre: 20th century music, Piano, Percussion, Electronic, Aleatory
Artist: Cage, Couper, Russell, Roldan, Harrison, Beyer, Cowell, Wolff
Keywords: Piano, Percussion, Music, Experimental, Chance, Radio, Dance
Originator Software: WaveLab 6.10
Creation Date: 2006-09-11
Original Medium: 16-inch Electric Transcription (ET) Disc
There is also a "Comment" field available.
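At the byte level, tags like these are stored as sub-chunks of a LIST chunk of type INFO, each identified by a four-character ID. The Python sketch below packs a few of the example values, assuming the standard RIFF INFO IDs (INAM for the title, ISBJ for the subject, ICRD for the creation date) and ASCII text. The function name pack_info_list is hypothetical, and which ID a given application shows under which label is an assumption that varies between programs.

```python
import struct


def pack_info_list(tags):
    """Pack a dict of {four-character INFO ID: text} into a RIFF LIST/INFO chunk."""
    body = b"INFO"
    for tag_id, text in tags.items():
        if len(tag_id) != 4:
            raise ValueError(f"INFO IDs are four characters, got {tag_id!r}")
        data = text.encode("ascii") + b"\x00"  # values are null-terminated strings
        body += tag_id.encode("ascii") + struct.pack("<I", len(data)) + data
        if len(data) % 2:
            body += b"\x00"  # pad each sub-chunk to an even length
    return b"LIST" + struct.pack("<I", len(body)) + body


info_chunk = pack_info_list({
    "INAM": "John Cage Disc Collection",   # Title
    "ISBJ": "Music, 1939 - 1951",          # Subject
    "ICRD": "2006-09-11",                  # Creation Date
})
print(len(info_chunk), info_chunk[:12])
```

Because the tags are shared by a group of recordings, the same packed chunk could be reused for every file in that group.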
[Image: the RIFF INFO_List tags as they appear in the Cube-Tec Audiocube and WaveLab software]
MP3 Metadata - Background
There are many MP3 fields (also known as ID3 tags), but the most common fields for compatibility among most MP3 players are listed further in this section.
The MP3 standard allows for up to 255 characters, but various MP3 players handle the display of these fields differently. For example, an iPod only displays the first 20 characters or so. The number of characters displayed is variable and depends on font kerning and the iPod model screen size. Another iPod limitation is that the "Artist", "Album Title" and song "Title" are truncated when scrolling through lists, but when playing a song, the iPod will display the entire song “Title” (but the “Artist” and “Album Title” will still be truncated).
Windows Media Player, the Apple iTunes player, and other computer-based players will display all 255 characters, although you may have to adjust the window size and column width to see the entire description.
Generally, The Audio Archive advises limiting fields to 32 characters if you envision the MP3 files being accessed on portable players or other specialized applications. Otherwise, using the same field lengths as for the WAVE-RIFF and BWF fields is fine.
If you would like 32-character field descriptions, you will need to provide shorter descriptions for the following fields (MP3 ID3 tags):
MP3 Metadata - MP3 ID3 Tags
Our recommendations on how to generate MP3 ID3 tags if you already use the RIFF INFO_List tags are as follows:
Artist: <USE WAVE-RIFF TITLE>
Album Title: <USE WAVE-RIFF SUBJECT>
Title: <USE BWF DESCRIPTION>
Year: <USE BWF ORIGINATION DATE>
Track Number: <possibly use this to sequence titles in a group>
Genre: <USE WAVE-RIFF GENRE>
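The mapping above can be written as a small lookup in Python. This is a sketch under stated assumptions: the function name id3_from_wave and the dictionary keys are hypothetical, the input dictionaries stand in for metadata already read from the WAVE-BWF file, and the BWF origination date is assumed to be in yyyy-mm-dd form. The optional limit simply cuts text at 32 characters, which is only a fallback; a purpose-written shorter description will read better than a truncated one.

```python
def id3_from_wave(riff_info, bext, track_number=None, limit=None):
    """Derive MP3 ID3 field values from RIFF INFO_List tags and BWF fields."""

    def clip(text):
        return text if limit is None else text[:limit]

    tags = {
        "Artist": clip(riff_info.get("Title", "")),
        "Album Title": clip(riff_info.get("Subject", "")),
        "Title": clip(bext.get("Description", "")),
        "Year": bext.get("OriginationDate", "")[:4],  # year part of yyyy-mm-dd
        "Genre": clip(riff_info.get("Genre", "")),
    }
    if track_number is not None:
        tags["Track Number"] = str(track_number)  # sequences titles within a group
    return tags


id3 = id3_from_wave(
    riff_info={
        "Title": "John Cage Disc Collection",
        "Subject": "Music, 1939 - 1951",
        "Genre": "20th century music, Piano, Percussion, Electronic, Aleatory",
    },
    bext={"Description": "Disc 1 side A", "OriginationDate": "2006-09-11"},
    track_number=1,
    limit=32,
)
for name, value in id3.items():
    print(f"{name}: {value}")
```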
AES-X098B Technical Metadata
The Audio Archive is a member of the AES-X098 standards group, and has been contributing to the development of this draft standard for audio technical metadata over the past two years. We expect this standard to be finalized in late 2008, and have been providing customers with data that conforms to this draft standard since 2006.
AES-X098B technical metadata (over 30 fields) capture for each audio item:
- Media type
- Condition of the media
- Processes used to conserve and play back the media
- Playback anomalies (dropouts, noise, speed, EQ)
The benefits of capturing this information to archivists, researchers and listeners include:
- Assessment of media condition
- Assessment of recording quality
- Identify chronology, sequence, or other relationships between recordings based on physical or electrical properties (media type and manufacturer, playback speed, length and other dimensions, common noise problems, similar damage or deterioration of the media, format)
- Reduce the need to access the physical collection
Examples of AES-X098B are available from The Audio Archive as Microsoft Excel spreadsheets upon request.
File Naming Schema
There is no single best way to name the audio files. Naming conventions will depend on the collection and how it is organized, as well as the media type and format. Possible elements that can be used in a file-naming schema include:
- Accession number
- Collection name
- Date of recording
- Item number
- Serial number
- Disc or tape side
- Part number (recordings that span multiple media)
- Media type (open reel, coarse groove, LP, cassette)
- Format (sample rate, word length, bit rate)
- Intended use (preservation master, access copy)
The above elements are hardly an exhaustive list, but these are commonly used elements for human-generated file names. Depending upon the nature of your collection, you might want to name files based on artist, instrument, geographical location, or any number of possibilities. If you are uncertain which elements to incorporate into a file-naming schema, we will work with you to identify a reasonable schema.
For maximum compatibility across operating systems and media types, we recommend that file names be limited to a maximum of 32 characters (not including the file extension). ISO standards for file systems often restrict string lengths to 128 characters (including the file extension).
Yet another option is to use machine-generated file names, either random or rule-based. Often a database application can generate these names or "handles".
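Both styles of name can be sketched in Python. The function names, the underscore separator, and the element values below are hypothetical illustrations, not a recommended schema. The first helper joins human-chosen elements and enforces the 32-character recommendation (not counting the extension); the second produces a random machine-generated handle, using a UUID as one possible source of randomness, whose hexadecimal form happens to be exactly 32 characters long.

```python
import uuid

MAX_STEM = 32  # recommended maximum, not including the file extension


def build_name(*elements, extension=".wav"):
    """Join schema elements into a file name and enforce the length limit."""
    stem = "_".join(elements)
    if len(stem) > MAX_STEM:
        raise ValueError(f"{stem!r} is {len(stem)} characters; limit is {MAX_STEM}")
    return stem + extension


def machine_handle(extension=".wav"):
    """Return a random machine-generated name (32 hexadecimal characters)."""
    return uuid.uuid4().hex + extension


# Hypothetical elements: collection, item number, disc side, intended use.
name = build_name("cage", "0042", "a", "pm")
handle = machine_handle()
print(name)
print(handle)
```

A random handle carries no meaning of its own, so it relies on a database or on embedded metadata to identify the recording.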