Lee Romero

On Content, Collaboration and Findability
January 12th, 2009

Enterprise Taxonomy – The Structure

(Editor’s note: I started this post several weeks ago, got busy with a lot of other things in the meantime, and am only now getting back to it. Apologies for the lengthy pause in the discussion.)

In my last post, I described the vision we developed for our taxonomy and provided a little bit of insight on how it’s managed. I thought some might find it interesting to understand the structure within the taxonomy at a deeper level.

When we initiated our taxonomy effort, we started (as I think most do) by collecting a lot of the language used throughout our enterprise in a big spreadsheet. We went through the language and organized it into a variety of facets and, for many of those facets, organized the values into a hierarchy. We managed the taxonomy in a spreadsheet for a while with some success, but there were problems (of course):

  1. It was not possible to integrate the spreadsheet in any meaningful way with the systems that needed to use the taxonomy;
  2. It was always a challenge to ensure people had access to the most recent view of the taxonomy;
  3. It was hard to meaningfully integrate the taxonomy with the source systems that provide many of its labels (to pull in values from those source systems).

Given these challenges, a developer resource and some good insights about what the taxonomy needed to do, we created a relatively simple application that has made the taxonomy much more visible and much more directly integrated with other systems. Note: It’s very likely that a commercial product would provide what we’ve done and a lot more, but when we set out on this it was not feasible to spend “hard” money, so we spent “soft” money in the form of a developer’s time. Perhaps not the best strategy, but it’s been successful for our needs so far.

Given the problems we had with the “spreadsheet approach”, my primary interest was to solve the problems of access, display and integration. I was not interested in a system that provided a UI for maintaining the taxonomy. Two things supported that choice: I’ve strived to have most of the taxonomy sourced from business systems, and managing the remaining values has primarily been a one-person job, done by someone who is familiar with databases and can update them directly.

So, the taxonomy system comprises the following components:

  1. A SQL database (built in MySQL to be specific);
  2. A web application that provides a view of what is in the database – basically a mirror of the database structure which is described below;
  3. A set of processes that run on schedules to pull data from source systems into the taxonomy;
  4. An XML output following a formal(ish) specification to allow other systems to pull values from the taxonomy.
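
To make the first and last components a little more concrete, here is a minimal sketch of how facets and hierarchical values could sit in a SQL database and be written out as XML for other systems to pull. It is not our actual schema or XML specification: the table names (facet, term), the columns (parent_id, source), the XML element names and the sample values are all hypothetical, and SQLite stands in for MySQL so the example runs without a server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per facet, one row per value ("term").
# A term with a NULL parent_id is a top-level value in its facet.
SCHEMA = """
CREATE TABLE facet (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE term (
    id        INTEGER PRIMARY KEY,
    facet_id  INTEGER NOT NULL REFERENCES facet(id),
    parent_id INTEGER REFERENCES term(id),
    label     TEXT NOT NULL,
    source    TEXT  -- assumed column: which system supplied the value
);
"""


def build_demo_db():
    """Create an in-memory database with a few made-up values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO facet (id, name) VALUES (1, 'Product')")
    conn.executemany(
        "INSERT INTO term (id, facet_id, parent_id, label, source) "
        "VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, None, "Software", "catalog"),
            (2, 1, 1, "Databases", "catalog"),
            (3, 1, None, "Hardware", "manual"),
        ],
    )
    return conn


def export_xml(conn):
    """Return the whole taxonomy as an XML string, nesting child terms."""
    root = ET.Element("taxonomy")
    facets = conn.execute("SELECT id, name FROM facet ORDER BY id").fetchall()
    for facet_id, facet_name in facets:
        facet_el = ET.SubElement(root, "facet", name=facet_name)
        rows = conn.execute(
            "SELECT id, parent_id, label FROM term "
            "WHERE facet_id = ? ORDER BY id",
            (facet_id,),
        ).fetchall()
        # First pass: build every element, so parents can be looked up
        # regardless of the order in which rows were inserted.
        elements = {
            term_id: ET.Element("term", id=str(term_id), label=label)
            for term_id, _, label in rows
        }
        # Second pass: attach each term to its parent or to the facet.
        for term_id, parent_id, _ in rows:
            parent = facet_el if parent_id is None else elements[parent_id]
            parent.append(elements[term_id])
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(export_xml(build_demo_db()))
```

The point of the sketch is that once the values live in tables, both the web view and the XML feed are just different readings of the same rows, which is what the spreadsheet could not give us.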

In my next post (possibly later today, even), I’ll provide more details on the structure – closer to a data model for the bits and pieces that comprise the entire taxonomy.
