Lee Romero

On Content, Collaboration and Findability

Archive for December 8th, 2008

Enterprise Taxonomy – A Vision

Monday, December 8th, 2008

In my continuing coverage of a variety of content management and knowledge management topics, I thought it time to share some thoughts and experiences on managing an enterprise taxonomy for a corporation. I am planning a few posts on the topic – starting with the vision for the taxonomy that we developed at the start of our efforts and that has helped to guide us, then moving on to the management process, some insight into usage of the taxonomy, and a description of what the taxonomy looks like.

One important note – a lot of the initial legwork for the taxonomy was done by Frank Montoya and Meredith Lavine, so credit to them for getting things moving.

When we started out developing an enterprise taxonomy, the company had nothing in place as any kind of content taxonomy – there was an implicit navigational taxonomy for web sites, and there was an ad hoc taxonomy in “keywords”-type fields in a number of content management systems throughout the company. We knew that to be successful, we needed to bring more formality to the taxonomy.

As we set about trying to define what we wanted in the taxonomy, we also realized we needed to ensure we were on common ground about what we were trying to accomplish – otherwise, it was easy to imagine the taxonomy being pulled all over the place, making it hard to achieve meaningful results in the long run. We needed some type of common vision for the taxonomy.

In working with a core group of stakeholders, we came up with the following statements as our vision for the enterprise taxonomy.

The Enterprise Taxonomy will:

  • Be adopted for use in all systems that manage content or documents for those classifications that are defined within the Taxonomy
  • Be used to tag content within those systems in order to ensure consistent language to describe our content
  • Enhance the search experience for users through that tagging
  • Be managed as its own asset, including defining the classifications and the values used within those classifications
  • Use appropriate systems of record when possible to define the set of values used for a particular classification
  • Enable monitoring of changes to the taxonomy values by content managers

One note on this vision – it uses the term “classification” in a number of locations. Within our nomenclature, you can read “classification” as meaning the same thing as a “facet” in a faceted taxonomy.

Some of these are pretty straightforward statements, but I thought I’d share a few thoughts on some of them.

First – part of the vision is that the taxonomy is managed as its own asset – what does that mean? It means:

  • The taxonomy is a piece of content (actually, many pieces of content) subject to the same types of business rules we apply to other content.
  • The taxonomy is subject to workflows for review of changes.
  • The taxonomy is subject to periodic reviews.
  • Changes to the taxonomy can be “staged” in the way other content changes can be staged for review.
  • The taxonomy must be treated as an asset with value.
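
To make the idea of staged, reviewed changes a little more concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not a description of the actual system: the class names (ProposedChange, StagedTaxonomy), the status values and the sample item types are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class ProposedChange:
    """A single staged change to one classification (hypothetical model)."""
    classification: str
    action: str  # "add" or "remove"
    value: str
    status: str = "staged"  # staged -> approved -> published
    comments: List[str] = field(default_factory=list)


class StagedTaxonomy:
    """Keeps published values separate from changes still under review."""

    def __init__(self, published: Dict[str, Iterable[str]]):
        self.published: Dict[str, Set[str]] = {
            name: set(values) for name, values in published.items()
        }
        self.changes: List[ProposedChange] = []

    def propose(self, classification: str, action: str, value: str) -> ProposedChange:
        change = ProposedChange(classification, action, value)
        self.changes.append(change)
        return change

    def approve(self, change: ProposedChange) -> None:
        change.status = "approved"

    def publish(self) -> int:
        """Apply approved changes only; return how many were applied."""
        applied = 0
        for change in self.changes:
            if change.status != "approved":
                continue
            values = self.published.setdefault(change.classification, set())
            if change.action == "add":
                values.add(change.value)
            elif change.action == "remove":
                values.discard(change.value)
            change.status = "published"
            applied += 1
        return applied


# Illustrative data only.
tax = StagedTaxonomy({"item_type": ["White Paper"]})
add_datasheet = tax.propose("item_type", "add", "Datasheet")
add_faq = tax.propose("item_type", "add", "FAQ")
add_datasheet.comments.append("Needed for product launch content.")
tax.approve(add_datasheet)
applied = tax.publish()
```

In this sketch the unapproved "FAQ" proposal stays staged and does not reach the published values, which is the point of treating taxonomy changes like any other reviewed content change.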

The vision also notes that it will use systems of record. Our taxonomy is broken into many classifications (facets), several of which overlap with other business entities in the company – product lines, solutions, geographies, etc. Whenever possible, we literally (in a system, database sense) integrate the taxonomy to pull data from systems of record for those classifications that have a system of record. This provides many advantages:

  • Commonality across systems for users.
  • Standardized language between content tagging language and language used within business intelligence systems.
  • Changes to classifications that have a system of record can be managed using the appropriate business process in that system of record – the taxonomy review process does not need to include these classifications (we assume that the system of record will ensure appropriate reviews are performed).
  • Ownership of the values for these classifications can be kept closer to the business responsible for them. That is, we have enabled a distributed ownership model.
  • This helps minimize which classifications must be reviewed within the taxonomy itself – keeping the taxonomy much more nimble. The classifications that need to have a review within the taxonomy are those that are pretty much purely about content (item type being an example).
  • Eventually, this will enable deeper integration between business intelligence systems and content management systems through direct linkage of business objects (say, a product) to content tagged with that. This linkage can be done using standard database mechanisms. (Something we have not yet implemented, though.)
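
The split between classifications backed by a system of record and those managed within the taxonomy can be sketched roughly as follows. This is a simplified illustration, not the real integration: the Classification class, the fetch_product_lines function and the sample values are assumptions made up for the example, and a real system of record would be queried through its own database or API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Classification:
    """One facet of the taxonomy, e.g. product line or item type."""
    name: str
    # Hypothetical hook: a callable that reads values from a system of record.
    source: Optional[Callable[[], List[str]]] = None
    # Values curated in the taxonomy itself (used only when source is None).
    local_values: List[str] = field(default_factory=list)

    def values(self) -> List[str]:
        # Prefer the system of record when one exists.
        if self.source is not None:
            return sorted(self.source())
        return sorted(self.local_values)

    def needs_taxonomy_review(self) -> bool:
        # Only locally managed classifications go through taxonomy review.
        return self.source is None


def fetch_product_lines() -> List[str]:
    # Stand-in for a query against a product master system (assumed).
    return ["Servers", "Storage", "Networking"]


taxonomy: Dict[str, Classification] = {
    "product_line": Classification("product_line", source=fetch_product_lines),
    "item_type": Classification(
        "item_type", local_values=["White Paper", "Datasheet"]
    ),
}

# Classifications that still need review inside the taxonomy itself.
review_queue = [
    name for name, c in taxonomy.items() if c.needs_taxonomy_review()
]
```

Here only item_type lands in the review queue, since product_line delegates both its values and its change process to the (assumed) system of record.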

Given that the taxonomy is managed as an asset, we also felt it was important that content managers be able to monitor changes within the taxonomy. This means:

  • Content managers have a means to easily find and review all changes being considered to the taxonomy (for classifications managed within the taxonomy – though many classifications managed in a system of record also provide this).
  • Content managers should be able to comment on any specific proposed change.
  • Content managers should be able to inspect any entity (classification, or value, etc.) within the taxonomy and view a life history of it within the taxonomy – when it was added, changed, deprecated, etc.
  • Content managers should be able to view all classifications and values – including ones that are no longer “active” within the taxonomy (they have been deprecated).
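
One way to picture the life history and the handling of deprecated values is the small Python sketch below. It assumes a simple model in which each value keeps its own list of dated events and deprecated values are flagged inactive rather than deleted; the names (TaxonomyValue, list_values), the event labels and the sample dates are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


@dataclass
class TaxonomyValue:
    classification: str
    label: str
    active: bool = True
    # Each entry is (date, event, note); the event names are illustrative.
    history: List[Tuple[date, str, str]] = field(default_factory=list)

    def record(self, when: date, event: str, note: str = "") -> None:
        self.history.append((when, event, note))
        # Deprecation keeps the value and its history, but marks it inactive.
        if event == "deprecated":
            self.active = False


def list_values(values: List[TaxonomyValue], include_inactive: bool = False) -> List[str]:
    return [v.label for v in values if v.active or include_inactive]


# Illustrative data only; the labels and dates are made up.
brochure = TaxonomyValue("item_type", "Brochure")
brochure.record(date(2007, 3, 1), "added")
brochure.record(date(2008, 6, 15), "deprecated", "Replaced by Datasheet")
datasheet = TaxonomyValue("item_type", "Datasheet")
datasheet.record(date(2008, 6, 15), "added")

values = [brochure, datasheet]
active_labels = list_values(values)
all_labels = list_values(values, include_inactive=True)
```

Keeping deprecated values around, rather than deleting them, is what lets a content manager still see them and read their history when older content remains tagged with them.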

So there’s a start on the taxonomy. Up next, I’ll provide some insight on the details of what the taxonomy looks like.