Lee Romero

On Content, Collaboration and Findability

Archive for February, 2009

Embedding Knowledge Sharing in Performance Management

Tuesday, February 10th, 2009

In my last post, I wrote about a particular process for capturing “knowledge nuggets” from a community’s on-going discussions, and toward the end of that write-up I described some ideas about what motivates members to take part in that knowledge capture process and how it might translate to an enterprise. All of those ideas were fairly general, and as I considered them it occurred to me that another topic is worth exploring: what kinds of specific objectives could an employee be given that would (hopefully) increase knowledge sharing in an enterprise? What can a manager (or, more generally, a company) do to give employees an incentive to share knowledge?

Instead of approaching this from the perspective of what motivates participants, I am going to write about some concrete ideas that can be used to measure how much knowledge sharing is going on in your organization. Ultimately, a company needs to build into its culture and values an expectation of knowledge sharing and management in order to have a long-lasting impact. I would think of the more tactical and concrete ideas here as a way to bootstrap an organization into the mindset of knowledge sharing.

A few caveats: First – Given that these are concrete and measurable, they can be “gamed” like anything else that can be measured. I’ve always thought measures like this need to be part of an overall discussion between a manager and an employee about what the employee is doing to share knowledge and not (necessarily) used as absolute truth.

Second – A knowledge sharing culture is much more than numbers – it’s a set of expectations that employees hold of themselves and others; it’s a set of norms that people follow. That being said, I do believe that it is possible to use concrete numbers to understand the impacts of knowledge management initiatives and to understand how much those expectations and norms are “taking hold” in the culture of your organization. Said another way – measurement is not the goal, but if you cannot measure something, how do you know its value?

Third – I, again, need to reference the excellent guide, “How to use KPIs in Knowledge Management” by Patrick Lambe. He provides a very exhaustive list of things to measure, but his guide is primarily written around ways to measure the KM program as a whole. Here I am trying to personalize it down to an individual employee and to setting that employee’s objectives related to knowledge sharing.

In the rest of this post, I’ll assume that your organization has a performance management program and that the program includes defining objectives that employees need to complete during a specific time period. The ideas below are applicable in that context; a minimal sketch of how such measures might be rolled up per employee follows the list.

  • Community membership – Assuming your community program has a way to track community membership, being a member of relevant communities can be a simple objective to accomplish.
  • Community activity – Assuming you have tools to track activity by members of communities, this can give you a way to set objectives related to being active within a community (which I think is much more valuable than simply being a member). It’s hard to set specific objectives for this type of thing, but the objective could simply be – “Be an active member of relevant communities”. Some examples:
    • If your communities use mailing lists, you can measure posts to community mailing lists.
    • If your communities use a collaboration tool, such as a wiki or blog or perhaps shared spaces, measure contributions to those tools.
    • If your communities manage community-based projects, measure involvement in those projects – tasks, deliverables, etc.
    • Assuming your communities hold events (in-person meetings, webcasts, etc.), measure participation in those events.
  • Contribution to a corporate knowledge base – An obvious suggestion. Assuming your organization has a knowledge base (perhaps several?), you can set expectations for your employees’ contributions to these.
    • Measure contributions to a document management system. More specifically, measure usage of contributions as well.
    • If your organization provides product support of any sort, measure contributions to your product support knowledge base
    • If you have a corporate wiki, measure contributions to the corporate wiki
    • If you have a corporate blog, measure posts and comments on the corporate blog
    • Measure publications to the corporate intranet
    • In your services organization (if you have one), measure contributions of deliverables to your clients, especially ones of high re-use value.
    • Measure relevance or currency of previously contributed content – Does an employee keep their contributions up to date?
  • A much different aspect of a knowledge sharing culture is to also capture when employees look for knowledge contributed by others – that is, the focus cannot simply be on how much output an employee generates but also on how effectively an employee re-uses the knowledge of others.
    • This one is harder for me to get my head around because, as hard as it can be to assign any credible value to the measurements listed above, it’s harder to measure the value someone gets out of received knowledge.
    • Some ideas…
    • Include a specific objective related to receiving formalized training – while a KM program might focus on less formal ways to share knowledge, there’s nothing wrong with this simple idea.
    • If your knowledge management tools support it, measure usage by each employee of knowledge assets – do they download relevant documents? Read relevant wiki articles or blog posts?
    • Measure individual usage of search tools – at least get an indication of when an employee first looks for assets instead of re-inventing the wheel.
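To make the list above more concrete, here is a minimal sketch of how measures like these might be rolled up into a simple per-employee scorecard. Everything here is hypothetical: the source names, counts and data shapes are stand-ins for whatever your mailing list archive, wiki, knowledge base and search tools can actually report.

```python
# A minimal sketch (hypothetical data and names) of rolling up per-employee
# knowledge sharing measures from several sources into one scorecard.
from collections import defaultdict

# Each dict stands in for an export from one tool: {employee_id: count for the period}
sources = {
    "mailing_list_posts": {"alice": 14, "bob": 3},
    "wiki_edits": {"alice": 22, "carol": 7},
    "kb_contributions": {"bob": 5, "carol": 1},
    "search_sessions": {"alice": 40, "bob": 12, "carol": 18},  # the re-use side
}

def build_scorecards(sources):
    """Combine per-source counts into one scorecard per employee."""
    scorecards = defaultdict(dict)
    for source_name, counts in sources.items():
        for employee, count in counts.items():
            scorecards[employee][source_name] = count
    return dict(scorecards)

for employee, card in sorted(build_scorecards(sources).items()):
    print(employee, card)
```

The point is not the code itself; it is that each measure ends up as something a manager and an employee can review together rather than a number treated as absolute truth.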

Not all of these will apply to all employees and some employees may not have any specific, measurable knowledge sharing objectives (though that seems hard to imagine regardless of the job). An organization should look at what they want to accomplish, what their tool set will support (or what they’re willing to enhance to get their tool set to support what they want) and then be specific with each employee. This is meant only as a set of ideas or suggestions to consider in making knowledge sharing an explicit, concrete and measurable activity for your employees.

Rolling Up Objectives

Given some concrete objectives to measure employees with, it seems relatively simple to roll those objectives up to management in order to measure (and set expectations up front for) knowledge sharing by a team of employees, not just individual employees. On the other hand, a forward-thinking organization will define group-level objectives which can be cascaded down to individual employees.

Given either of these approaches, a manager (or director, VP, etc.) may then have both an organizational level objective and their own individual objectives related to knowledge sharing.

Knowledge Sharing Index

Lastly – while I’ve never explored this, several years ago a vice president at my company asked for a single index of knowledge sharing. I would make the analogy to something like a stock index – a mathematical combination of measurements of different aspects of knowledge sharing within the company. A single number that somehow denotes how much knowledge sharing is going on.

I don’t seriously think this could be meaningful but it’s an interesting idea to explore. Here are some definitions I’ll use to do so:

  • You would need to identify your set of knowledge sharing activities to measure – Call these A1, … , An. Note that these measurements do not need to really measure “activity”. Some might measure, say, the number of members in your communities at a particular time or the number of users of a particular knowledge base during a time period.
  • Define how you measure knowledge sharing for A1, … , An – for a given time t, the measurement of activity Ai is Mt,i
  • You then need to define a starting point for measurement – perhaps a specific date (or week or month or whatever is appropriate) whose level of activity represents the baseline for measurement. Call these B1, …, Bn – basically, Bi is M0,i
  • Assuming you have multiple types of activity to measure, you need to assign a weight to each type of activity that is measured – how much impact does change in each type of activity have on the overall measurement? Call these W1, …, Wn.

Given the above, you could imagine that the “knowledge sharing index” at any moment in time could be computed as (I don’t know how to make this look like a “real” formula!):

Knowledge index at time t = Sum (i = 1 … n) of Wi * ( Mt,i / Bi )

A specific example:

  1. Let’s say you have three sources of “knowledge sharing” – a corporate wiki, a mailing list server and a corporate knowledge base
  2. For the wiki, you’ll measure total edits every week; for the list server, you’ll measure total posts to all mailing lists on it; and for the knowledge base, you’ll measure contributions and downloads (as two separate measures).
  3. In terms of weights, you want to give the mailing lists the least weight, the wiki an intermediate weight and the combined knowledge base the most weight. Let’s say the weights are 15 for the mailing lists, 25 for the wiki, 25 for the downloads from the knowledge base and 35 for contributions to the knowledge base. (So the weights total to 100!)
  4. Your baseline for future measurement is 200 edits in the wiki, 150 posts to the list server, 25 contributions to the knowledge base and downloads of 2,000 from the knowledge base
  5. At some week after the start, you take a measurement and find 180 wiki edits, 160 posts to the list server, 22 knowledge base contributions and 2200 downloads from the knowledge base.
  6. The knowledge sharing index for that week would be 96.8 (see the sketch below). This is “down” even though most measures are up (which simply reflects the relative importance of the one factor that is down).
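To check the arithmetic, here is a minimal sketch that computes the index from the weights, baselines and measurements in the example above. The activity names are just labels I made up for this illustration.

```python
# A minimal sketch of the knowledge sharing index, using the numbers from the example.
weights  = {"wiki_edits": 25, "list_posts": 15, "kb_contributions": 35, "kb_downloads": 25}
baseline = {"wiki_edits": 200, "list_posts": 150, "kb_contributions": 25, "kb_downloads": 2000}
week_n   = {"wiki_edits": 180, "list_posts": 160, "kb_contributions": 22, "kb_downloads": 2200}

def knowledge_index(weights, baseline, measurement):
    """Sum over activities of Wi * (Mt,i / Bi)."""
    return sum(
        weights[activity] * (measurement[activity] / baseline[activity])
        for activity in weights
    )

print(round(knowledge_index(weights, baseline, baseline), 1))  # 100.0 - the baseline week
print(round(knowledge_index(weights, baseline, week_n), 1))    # 96.8 - the example week
```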

If I were to actually try something like this, I would pick the values of Wi so that the baseline measurement (when t = 0) comes to a nice round value – 100 or something. (Since each Mt,i / Bi equals 1 at the baseline, the index at t = 0 is just the sum of the weights, so weights totaling 100 give you exactly that.) You can then imagine reporting something like, “Well, knowledge sharing for this month is at 110!” Or, “Knowledge sharing for this month has fallen from 108 to 92”. If nothing else, I find it amusing to think so concretely in terms of “how much” knowledge sharing is going on in an organization.

There are some obvious complexities in this idea that I don’t have good answers for:

  1. How to manage a new means of measuring activity becoming available? For example, your company implements a new collaboration solution. Do you add it in as a new factor with its own weight and simply accept that there will be a step change in the measure that means nothing except that a new source was added? Do you try to retroactively adjust the weights of sources already included to keep the metric “smooth”?
  2. How to handle retiring a source of activity? For example, you retire that aging (but maybe still used extensively) mailing list server. Same question as above, though perhaps simpler – you could just retroactively remove measurements from the now-retired source to keep a smooth picture.
  3. How to handle (or do you care to handle?) a growing or shrinking population of knowledge workers? Do you care if your metric goes up because you acquired a new company (for example) or do you need to normalize it to be independent of the number of workers involved?

In any event – I think this is an interesting, if academic, discussion and would be interested in others’ thoughts on either individual performance management or the idea of a knowledge sharing index.

Retiring a Community and Capturing its Knowledge

Thursday, February 5th, 2009

Recently, there was a thread of discussion on the com-Prac list about the “death of a community” and a follow-up discussion about what discussion-produced knowledge CoPs should capture and how they should capture it.

I found these to be very interesting and thought-provoking discussions. In this post, I will write about two aspects of these discussions – the retiring of a community and also a case study in how a community centered around a mailing list meets the challenge of knowledge capture.

Before getting into the details – I wanted to (re-)state that I recognize that a community is (much) more than a mailing list – community members interact in many ways, some online, some in “real space”. That being said, I also know that for many communities the tool of choice for group communication is a mailing list, so in this post, I will write about issues related to the use of mailing lists, though the ideas can be transferred to other means of electronic exchange. As John D. Smith notes in the second thread:

“All of the discussion about summarization so far assumes that a community almost exclusively lives on one platform. As Nancy alluded to, I think the reality is quite a bit more messy. Note the private emails between Eric and Miguel that were mentioned in this thread. We ourselves interact in LOTS of different locations.”

In other words, even if you could solve the knowledge capture challenge for one mode of discussion (mailing lists) you are still likely missing out on a lot of the learning and knowledge sharing going on in the community. Keep that in mind!

Retiring a Community, or at least a community’s mailing list

As I’ve written about before, within the context of my current employer’s community program, mailing lists and their related archives are an important part of our community of practice initiative (and, by extension, our KM program). We have not developed a formal means to retire (or “execute”, in the terms used in the first thread mentioned above) a community, but we do have a formal process for retiring mailing lists. While the following is about mailing lists, I think the concepts can scale up to any community – though it might require aggregating similar insights about other channels used by the community.

Within our infrastructure, many of the existing mailing lists are associated with one (or more) communities and we provide a simple means for anyone to request a new mailing list. There is a very light review process (primarily focused on ensuring that the requested list is different enough from existing lists and doesn’t have such a small topic space that it will likely be very under-utilized), which means that over time we can end up with a lot of mailing lists. Without some regular house-cleaning, this situation can have a very negative impact on a user’s discovery process – hundreds and hundreds of mailing lists means a lot of confusion.

One way we grapple with this is to use the communities as a categorization of mailing lists. Instead of leaving a user with hundreds of mailing lists to wade through, we encourage them to look for a community in which they’re interested and, through that community, find associated mailing lists. This normally reduces the number of mailing lists to consider down to a small handful.

However, we have still needed a house-cleaning process, so several years ago, this is what we set up:

  • All mailing lists are reviewed on a periodic basis – usually around once every six to twelve months.
  • When reviewed, the following criteria are used to identify candidates for retirement (a rough sketch of this selection logic follows the list):
    • Age of the list (it must be a certain age in order to give new lists time to “get off their feet”)
    • New subscriptions to the list (if someone newly joins what is otherwise an un-utilized list, that represents at least *potential* utilization in the future – so no need to shut it off)
    • Posting activity on the list (if a list is old enough and has not had anyone newly join and has not had any activity in a specific span of time, it becomes a candidate). Note that even a single post removes the list from candidacy (we do not attempt to quantify the value of a post or anything like that).
  • Once a list of candidate mailing lists is identified, the moderators for that list are contacted and asked if the list is needed
    • If a list has no identified moderators or (more commonly) the moderators of record are no longer with the company, all of the list’s members are contacted (via an email sent directly to the members, not via the mailing list itself, as that introduces the “one” post that then keeps the list “alive” in the next review).
  • Regardless of who the question is asked of, the contact with the list is positioned as a proposal to retire the list and people only need to reply if they do not align with that proposal; a target date for reply is also provided (no reply by that date is taken as alignment with retiring the list).
    • Replies saying, “Go ahead and retire” do nothing except confirm the proposal.
    • However, even one reply requesting retention of the list takes the list off the list of retirement candidates – that is, everyone has the same weight to veto the retirement.
    • As for the archives of the list, we also state that the archives will be retained even if the list is retired unless a moderator states that the archives are not needed. (The archives are included in our enterprise search, so they remain as a potential knowledge source even if the list does not have continued value in supporting on-going discussions.)
  • Assuming a list is not removed from the candidate list (i.e., it can be retired), the remaining process is simply to remove it from the list server – I won’t bore you with the details of that here.
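For what it is worth, the candidate-selection step above boils down to a simple filter. The sketch below is illustrative only; the field names and thresholds are placeholders, not the actual tooling behind our list server.

```python
# A rough sketch of identifying mailing lists that are candidates for retirement.
# Field names and thresholds are illustrative placeholders.
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

@dataclass
class MailingList:
    name: str
    created: date
    last_new_subscription: Optional[date]
    last_post: Optional[date]

MIN_AGE = timedelta(days=365)       # give new lists time to "get off their feet"
QUIET_PERIOD = timedelta(days=365)  # no posts and no new subscribers in this span

def retirement_candidates(lists, today):
    candidates = []
    for ml in lists:
        old_enough = (today - ml.created) >= MIN_AGE
        no_new_members = ml.last_new_subscription is None or (today - ml.last_new_subscription) >= QUIET_PERIOD
        # Even a single post within the quiet period removes the list from candidacy.
        no_posts = ml.last_post is None or (today - ml.last_post) >= QUIET_PERIOD
        if old_enough and no_new_members and no_posts:
            candidates.append(ml.name)
    return candidates
```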

In our environment, doing this once a year typically reduces the count of lists by about 10% – though the count of lists has remained remarkably stable over time, which suggests we see roughly that same amount of growth over the following year. On the other hand, if we did not proactively review and retire lists like this, we would be seeing an ever-growing list of mailing lists, making it harder for everyone to find the lists that are engendering valuable discussions.

Knowledge Capture

Or… How to lift knowledge out of the on-going discussion of a community into a better form of reusability.

If a community uses a tool like a mailing list to engender discussion and knowledge sharing, how does the community capture “nuggets” of knowledge from the discussion in a more easily digestible form? Does the community even need to (perhaps not, given a sophisticated enough means to find information in the archives)?

I have no magic solution to this problem but I did find another comment to be very illustrative of one aspect of the original discussion – who “owns” the archives of a community’s discussion and what is the value of those archives? Even in their raw form, why do those archives have value? As Nancy White notes:

“I suspect that only a small percentage of the members (over time) would actually use the archives. But because they hold the words of members, there may be both individual and collective sense of ownership that have little to do with “utility.””

The rest of this post will be a brief description of a knowledge capture process I’m very familiar with – though I’m not sure if it will transfer well into other domains. For this description, I’m going completely outside of the enterprise and to a community of which I’m a member that revolves around a table-top fantasy war game named Warhammer.

A bit of background: Warhammer is a rather complex game, with a rulebook that weighs in at several hundred pages and about a dozen additional books that provide details on the various types of armies players can use. All told, probably something like 1,000 pages describing the rules and background of the game. Given the complexity of the game, it is very common that during any given game the players will run into situations not covered well by the rules – these are usually areas involving interactions of special rules for the armies playing. In the many online forums / mailing lists that exist, one of the most frequent types of discussions revolves around these situations and how to interpret the rules. Many of the same questions come up repeatedly – obvious fodder for an FAQ.

(As an aside, given that Warhammer is published and sold by a company – Games Workshop – one could argue that they should publish all of the relevant FAQs. They do publish FAQs and errata, but they do so at a sporadic pace at best and do not address many of the frequently asked questions.)

One particular Warhammer-related community of which I’m a member – the Direwolf (DW) community – has established a pretty well defined means to gather these FAQs and publish them back to the Warhammer community at large. A brief overview of the process:

  • A subset of the community is elected by the community each year to act as the FAQ council. This group normally includes one person responsible for questions related to the main rule book, one for each specific army and one person who’s responsible for maintaining the FAQ documents themselves (so, all told, about 15 people). [As another aside, I happen to be a member of this FAQ council currently, which is how I’m familiar with the process it uses.]
  • Each member of the group is responsible for monitoring discussions within the community’s mailing list related to their specific area of focus and bringing those questions to the FAQ council for consideration when they are believed to be “frequently asked” enough to warrant inclusion.
    • In addition, the council actively solicits questions specific to individual armies when a new book comes out for an army – this solicitation includes both members of the DW community and also a few other highly populated Warhammer-related communities.
  • Once a question (or set of questions) is identified for the FAQ council, the group discusses (in a mailing list available just to FAQ council members) potential answers and comes to a consensus (or at least a majority) on the answer.
    • Most commonly, the group will agree on an interpretation but occasionally, explicit polling is done to ensure at least a majority of the group agrees with an interpretation.
  • The FAQ documentation is then updated to include the relevant questions and answers and is then published on the internet and made available to anyone who plays the game.

Netting it out: A community-selected subset of the community monitors the community for questions in their area of expertise, vets an answer with the rest of the FAQ council, and then the FAQ documentation is updated as appropriate.

This is pretty straightforward, but the value of this effort is reflected in the fact that the game publisher now very commonly uses input from the Direwolf FAQ council in considering their own responses to FAQs and also in the fact that many players from around the world use the Direwolf FAQ to ensure a consistent interpretation of those “fuzzy” areas of the game. A true value add for the Warhammer community at large.

That being said, this process does take quite a bit of energy and commitment, especially on the part of the “keeper” of the documentation, to keep things up to date. In this case, I believe that the value-add for members of the council is knowing that they are contributing to the Warhammer community at large and also knowing that they are helping themselves in their own engagement of playing the game.

How does this translate into a community of practice within an enterprise?

  • It’s possible that an exact parallel of the above could work in many communities.
  • Even if the position isn’t “elected”, some type of rotating responsibility among community members to monitor and gather FAQs (or other knowledge artifacts) could be very valuable for both the community and the member(s) who perform the job.
    • Within an enterprise that seems like an approach that will have longer legs than having a community manager (someone who helps facilitate the community but who might otherwise not have a strong vested interest in the domain of the community) responsible for this.
  • Ensuring that community members do perceive value in their involvement in the process is going to be a key component – What’s in it for them? The answer could be any number of things:
    • Professional development opportunities (learning a lot more about areas in which they don’t normally work)
    • Visibility to other members of the community / career growth opportunities
    • Helping themselves be more successful in their own job (they are ensuring there is a source of gathered knowledge to be used)

Enterprise Search Best Bets – a good enough practice?

Tuesday, February 3rd, 2009

Last summer, I read the article by Kas Thomas from CMS Watch titled “Best Bets – a Worst Practice” with some interest. I found his thesis to be provocative and posted a note to the SearchCoP community asking for others’ insights on the use of Best Bets. I received a number of responses taking some issue with Kas’ concept of what best bets are, and also some responses describing different means to manage best bets (hopefully without requiring the “serious amounts of human intervention” described by Kas).

In this post, I’ll provide a summary of sorts, describe some of the ways people shared for managing best bets, and also describe the way we have managed them.

Kas’ thesis is that best bets are not a good practice because they are largely a hack layered on top of a search engine and require significant manual intervention. Further, if your search engine isn’t already providing access to appropriate “best bets” for queries, you should get yourself a new search engine.

Are Best Bets Worth the Investment?

Some of the most interesting comments from the thread of discussion on the SearchCoP include the following (I’ll try to provide as cohesive a picture of the sentiment as I can but will only provide parts of the discussion – if I have portrayed anyone’s intent incorrectly, that’s my fault and not the original author’s):

From Tim W:

“Search analytics are not used to determine BB … BB are links commonly used, enterprise resources that the search engine may not always rank highly because for a number of reasons. For example, lack of metadata, lack of links to the resource and content that does not reflect how people might look for the document. Perhaps it is an application and not a document at all.”

From Walter U:

“…manual Best Bets are expensive and error-prone. I consider them a last resort.”

From Jon T:

“Best Bets are not just about pushing certain results to the top. It is also about providing confidence in the results to users.

If you separate out Best Bets from the automatic results, it will show a user that these have been manually singled out as great content – a sign that some quality review has been applied.”

From Avi R:

“Best Bets can be hard to manage, because they require resources.

If no one keeps checking on them, they become stale, full of old content and bad links.

Best Bets are also incredibly useful.

They’re good for linking to content that can’t be indexed, and may even be on another site entirely. They’re good for dealing with … all the sorts of things that are obvious to humans but don’t fit the search paradigm.”

So, lots of differing opinions on best bets and their utility, I guess.

A few more pieces of background for you to consider: Walter U has posted on his blog (Most Casual Observer) a great piece titled “Good to Great Search” that discusses best bets (among other things); and, Dennis Deacon posted an article titled, “Enterprise Search Engine Best Bets – Pros & Cons” (which was also referenced in Kas Thomas’ post). Good reading on both – go take a look at them!

My own opinion – I believe that best bets are an important piece of search and agree with Jon T’s comment above that their presence (and, hopefully, quality!) give users some confidence that there is some human intelligence going into the presentation of the search results as a whole. I also have to agree with Kas’s argument that search engines should be able to consistently place the “right” item at the top of results, but I do not believe any search engine is really able to today – there are still many issues to deal with (see details in my posts on coverage, identity, and relevance for my own insights on some of the major issues).

That being said, I also agree that you need to manage best bets in a way that does not cost your organization more than their value – or to manage them in a way that the value is realized in multiple ways.

Contrary to what Tim W says, and as I have written about in my posts on search analytics (especially the use of search results usage), I do believe you can use search analytics to inform your best bets, but they do not provide a complete solution by any means.

Managing Best Bets

From here on out, I’ll describe some of the ways best bets can be managed – the first few will be a summary of what people shared on the SearchCoP community and then I’ll provide some more detail on how we have managed them. The emphasis (bolding) is my own to highlight some of what I think are important points of differentiation.

From Tim W:

“We have a company Intranet index; kind of a phone book for web sites (A B C D…Z). It’s been around for a long time. If you want your web site listed in the company index, it must be registered in our “Content Tracker” application. Basically, the Content Tracker allows content owners to register their web site name, URL, add a description, metadata and an expiration date. This simple database table drives the Intranet index. The content owner must update their record once per year or it expires out of the index.

This database was never intended for Enterprise Search but it has proven to be a great source for Best Bets. We point our ODBC Database Fetch (Autonomy crawler) at the SQL database for the Content Tracker and we got instant, user-driven, high quality Best Bets.

Instead of managing 150+ Best Bets myself, we now have around 800 user-managed Best Bets. They expire out of the search engine if the content owner doesn’t update their record once per year. It has proven very effective for web content. In effect, we’ve turned over management of Best Bets to the collective wisdom of the employees.”

From Jim S:

“We have added an enterprise/business group best bet key word/phrase meta data.

All documents that are best bet are hosted through our WCM and have a keyword meta tag added to indicate they are a best bet. This list is limited and managed through a steering team and search administrator. We primarily only do best bets for popular searches. Employee can suggest a best bet – both the term and the associated link(s). It is collaborative/wiki like but still moderated and in the end approved or rejected by a team. There is probably less than 1 best bet suggestion a month.

If a document is removed or deleted the meta data tag also is removed and the best bet disappears automatically.

Our WCM also has a required review date for all content. The date is adjustable so that content will be deactivated at a specific date if the date is not extended. This is great for posting information that has a short life as well as requiring content owners to interact with the content at least every 30 Months (maximum) to verify that the content is still relevant to the audience. The Content is not removed from the system, rather it’s deactivated (unpublished) so it no longer accessible and the dynamic links and search index automatically remove the invalid references. The content owner can reactivate it by setting the review date into the future.

If an external link (not one in our WCM) is classified as a best bet then a WCM redirect page is created that stores the best bet meta tag. Of course it has a review/expiration so the link doesn’t go on forever and our link testing can flag if the link is no longer responding. If the document is in the DMS it would rarely be deleted. In normal cases it would be archived and a archive note would be placed to indicate the change. Thus no broken links.

Good content engineering on the front end will help automate the maintenance on the back end to keep the quality in search high.”

The first process is external to the content and doesn’t require modifying the content (assuming I’m understanding Tim’s description correctly). There are obvious pros and cons to this approach.

By contrast, the second process embeds the “best bet” attribution in the content (perhaps more accurately in the content management system around the content) and also embeds the content in a larger management process – again, some obvious pros and cons to the approach.

Managing Best Bets at Novell

Now for a description of our process: the process and tools in place in our solution are similar to the description provided by Tim W. I spoke about this topic at the Enterprise Search Summit West in November 2007, so you might be able to find the presentation for it there (though I could not just now in a few minutes of searching).

With the search engine we use, the results displayed in best bets are actually just a secondary search performed whenever a user performs any search – the engine searches the standard corpus (whatever context the user has chosen, which would normally default to “everything”) and separately searches a specific index that includes all content that is a potential best bet.

The top 5 (a number that’s configurable) results that match the user’s search from the best bets index are displayed above the regular results and are designated “best bets”.
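In other words, every query fans out to two indexes. The sketch below only illustrates that idea; the engine object and its search method are placeholders, not the actual API of the engine we use.

```python
# Illustrative sketch of the "secondary search" for best bets described above.
# `engine.search(...)` is a placeholder for whatever query API your engine exposes.
BEST_BETS_LIMIT = 5  # configurable in our setup

def search_with_best_bets(engine, query, corpus_index, best_bets_index):
    best_bets = engine.search(index=best_bets_index, query=query, limit=BEST_BETS_LIMIT)
    regular = engine.search(index=corpus_index, query=query)
    # Best bets are displayed above the regular results, not merged into them.
    return {"best_bets": best_bets, "results": regular}
```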

How do items get into the best bets index, then? Similar to what Tim W describes, on our intranet, we have an “A-Z index” – in our case, it’s a web page that provides a list of all of the resources that have been identified as “important” at some point in the past by a user. (The A-Z index does provide category pages that provide subsets of links, but the main A-Z index includes all items so the sub-pages are not really relevant here.)

So the simple answer to, “How do items get into the best bets index?” is, “They are added to the A-Z index!” The longer answer is that users (any user) can request an item be added to the A-Z index and there is then a simple review process to get it into the A-Z index. We have defined some specific criteria for entries added to the A-Z, several of which are related to ensuring quality search results for the new item, so when a request is submitted, it is reviewed against these criteria and only added if it meets all of the criteria. Typically, findability is not something considered by the submitter, so there will be a cycle with the submitter to improve the findability of the item being added (normally, this would include improving the title of the item, adding keywords and a good description).

Once an item is added to the A-Z index, it is a potential best bet. The search engine indexes the items in the A-Z through a web crawler that is configured to start with the A-Z index page and goes just one link away from that (i.e., it only indexes items directly linked to from the A-Z index).
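As a rough illustration of that crawl configuration (start at the A-Z page and follow links exactly one level deep), here is a small standard-library sketch. The URL is hypothetical and the real crawling is, of course, done by the search engine itself.

```python
# Illustrative depth-one "crawl": collect the URLs linked directly from the A-Z index page.
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen

A_TO_Z_URL = "http://intranet.example.com/a-z-index"  # hypothetical address

class LinkCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

def best_bet_targets(index_url=A_TO_Z_URL):
    """Return the pages one link away from the A-Z index - the best bets candidates."""
    html = urlopen(index_url).read().decode("utf-8", errors="replace")
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(index_url, href) for href in collector.links]
```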

In this process, there is no way to directly map specific searches (keywords) to specific results showing up in best bets. The best bets will show up in the results for a given search based on normally calculated relevance for the search. However, the best bet population numbers only about 800 items instead of the roughly half million items that might show up in the regular results – as long as the targets in the A-Z index have good titles and are tagged with the proper keywords and description, they will normally show up in best bets results for those words.

Some advantages of this approach:

  • This approach works with our search engine and takes advantage of a long-standing “solution” our users are used to (the A-Z index has long been part of our intranet and many users turn to the A-Z index whenever they need to find anything, so its importance is well-ingrained in the company).
  • Given that the items in the A-Z index have been identified at some point in the past as “important”, we can arguably say that everything that should possibly be a best bet is included.
  • We have a point in a process to enforce some findability requirements (when a new item is added).
  • The items included can be any web resource, regardless of where it is (no need to be on our web site or in our CM system)
  • This approach provides a somewhat automated way to keep the A-Z index cleaned up – the search engine identifies broken links as it indexes content and, by monitoring those for the best bets index, we know when content included in the A-Z has been removed.
  • Because this approach depends on the “organic” results from the engine (just on a specially-selected subset of content), we do not have to directly manage keyword-to-result mapping – we delegate that to the content owner (by way of assigning appropriate keywords in the content).

Some disadvantages of this approach:

  • The tool we use to manage the A-Z index content is a database, but it is not integrated with our content management system. Most specifically, it does not take advantage of automated expiration (or notification about expiration).
  • As a follow-on from the above point, there is no systematically enforced review cycle on individual items to ensure they are still relevant.
  • Because this approach depends on the organic results from the engine, we can not directly map keywords to specific results. (Both a good and bad thing, I guess!)
  • Because the index is generated using a web crawl (and not by indexing a database directly, for example), some targets (especially web applications) still end up not showing up particularly well, because it might not be possible to have the home page of the application modified to include better keywords or descriptions, or (in the face of our single sign-on solution) a complex set of redirects sometimes results in the crawler not indexing the “right” target.