I recently worked on a project that highlighted findability issues with unstructured content and the need for appropriate tagging with values from a controlled vocabulary.
At the heart of this project was Digital Asset Management (DAM), a rapidly growing area as more multimedia content is distributed online, particularly for marketing purposes. The inherent problem with digital assets is that there is potentially a large amount of information about what a piece of content is, but little information describing what that content is about. Unlike other content, which may contain text or sit within surrounding textual context, digital assets do not typically contain text, and rarely any that is structured for discovery by search engines. Any textual, searchable elements must be associated with digital assets through metadata. Metadata describing what the content is, including attributes like video length, number of pixels, and file size, can be associated with the content and is often attributed automatically through business rules.
What the asset is about, however, is not inherent. It must be associated with the content, either manually or automatically when the content is loaded, once business rules have been thought out and established.
This can be established for single-format assets, but things get a little more complicated when content can be assembled from “pieces” which may or may not share common values once reassembled into something new. In other words, what an asset is about may change once it is combined with other elements to form a new and unique asset.
An additional challenge with digital assets is versioning. Digital assets are frequently large and high-resolution, requiring a lot of storage space and management as versions are created and retained. Since digital assets are expensive and time-consuming to create, reuse and repurposing are common, leading to multiple versions and formats of a single original asset.
Though not the single answer to the challenges of DAM, a controlled vocabulary of descriptive terms used in association with common metadata attributes can help to manage the versions and combinations of assets from creation to reuse to final storage or disposition. Since there is no, or very little, textual context for digital assets, the descriptive vocabulary can be tailored to fit existing assets and those yet to be created, creating uniformity in description and a basis for refining searches to locate assets.
Tagging assets with terms from a controlled vocabulary provides the descriptive metadata used for tracking and locating assets. Some of the takeaways from a recent DAM project address the above challenges and how taxonomy values would assist with DAM.
The most difficult aspect is how to index what a digital asset is about. Stock photo companies, such as Getty Images, have indexers with established rules and controlled vocabularies in order to provide searchable context for images. The same can be done in any organization, but it requires an established controlled vocabulary for tagging and training for the subject matter experts (SMEs) who index and tag the content. The SMEs should have common guidelines for applying the controlled vocabulary to assets so that tagging is consistent.
What we found in our project was multiple and potentially far-flung asset creators uploading assets and tagging them with no rules and no controlled vocabulary. Centralized tagging provides the most control over the application of terms, but decentralized tagging can be streamlined either by creating fixed dropdown values for content tagging or by centrally reviewing and retagging submitted assets.
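To make the fixed-values idea concrete, here is a minimal sketch of how submitted tags might be checked against a controlled vocabulary, with anything outside the vocabulary routed to central review. The vocabulary terms, the function name `split_tags`, and the normalization rule are all assumptions for illustration; they do not come from the project described here or from any particular DAM tool.

```python
# Hypothetical controlled vocabulary; the terms are illustrative only.
CONTROLLED_VOCABULARY = {"product launch", "holiday campaign", "customer testimonial"}


def split_tags(submitted_tags, vocabulary=CONTROLLED_VOCABULARY):
    """Separate submitted tags into accepted terms and terms needing review."""
    accepted, needs_review = [], []
    for tag in submitted_tags:
        # Assumed normalization: lowercase and collapse repeated whitespace.
        normalized = " ".join(tag.lower().split())
        if normalized in vocabulary:
            accepted.append(normalized)
        else:
            # Keep the original wording so a reviewer can see what was typed.
            needs_review.append(tag)
    return accepted, needs_review


accepted, needs_review = split_tags(["Product  Launch", "xmas promo"])
print(accepted)      # terms that match the vocabulary
print(needs_review)  # free-text terms for central review and retagging
```

A dropdown in an upload form amounts to the same check applied before submission, while the review list corresponds to the centralized retagging step.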
Another challenge is the “parts is parts” nature of digital assets. Unlike assembling text from multiple sources, assembling multiple assets into a new and unique creation which itself will be versioned, tracked, and stored can lead to a disconnect between the tagging of the parts and the tagging of the whole. Retaining metadata for the individual assets within an assemblage retains the original context, but an additional round of tagging should be applied to the final product to capture the change in function. For example, we encountered assets which had been created in parts for reuse but were all fused into a final 30-second commercial. Those individual elements could be reused in other contexts. Both the elements and the combined whole should have applied terms.
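One way to picture this is a record for the assembled asset that carries its own terms while keeping a link to each part and its original tagging. The sketch below is only an illustration under assumed names: the `Asset` class, its fields, and the example terms are hypothetical and not taken from a specific DAM system.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    asset_id: str
    about_terms: set = field(default_factory=set)  # what this asset is about
    parts: list = field(default_factory=list)      # component assets, if any

    def inherited_terms(self):
        """Collect the terms applied to every component, recursively."""
        terms = set()
        for part in self.parts:
            terms |= part.about_terms | part.inherited_terms()
        return terms


# Hypothetical parts created for reuse, each with its own tagging.
logo = Asset("logo-sting", {"brand identity"})
footage = Asset("beach-footage", {"summer", "travel"})

# The finished commercial gets its own round of tagging for its new function.
commercial = Asset("spot-30s", {"holiday campaign"}, parts=[logo, footage])

print(commercial.about_terms)        # terms for the whole
print(commercial.inherited_terms())  # original context retained from the parts
```

Keeping the two sets separate means the parts stay findable under their original terms when they are reused elsewhere, and the whole is findable under the terms that describe its final purpose.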
Asset versioning is really a function of the tools used to track and store digital assets. However, applying controlled vocabulary terms early in the asset creation process will also help with locating individual versions of assets as the content changes. For example, the first draft of an asset should carry the necessary core metadata elements, such as date of creation and author. The content and context of what the asset is about should be tagged as well. As the message of the asset changes in subsequent versions, so too will the terms describing what the asset is about. This retagging will help to clarify and track changes to assets over time.
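As a rough sketch of how retagging can surface changes between versions, the example below compares the "about" terms of two versions of one asset. The record layout, the field names, the author, the dates, and the function `term_changes` are all hypothetical; a real DAM tool would store and compare versions in its own way.

```python
def term_changes(previous_terms, current_terms):
    """Return the terms added and removed between two versions of an asset."""
    previous, current = set(previous_terms), set(current_terms)
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }


# Hypothetical version records: core metadata plus "about" terms.
versions = [
    {"version": 1, "author": "jdoe", "created": "2024-01-10",
     "about": ["summer", "travel"]},
    {"version": 2, "author": "jdoe", "created": "2024-02-02",
     "about": ["travel", "family"]},
]

changes = term_changes(versions[0]["about"], versions[1]["about"])
print(changes)  # shows how the message shifted between the two versions
```

Because the terms come from a controlled vocabulary, a difference between versions reflects a real change in what the asset is about and not just a different choice of words.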
Controlled vocabulary is just one aspect of managing digital assets, but if it is developed in conjunction with a DAM project, asset tracking can be brought under control before volume and versioning issues cost more time and money to fix down the line.