Over the past several years of working closely with the enterprise search solution at Deloitte, I have tried to look "outside" as best I can to understand how others in the industry evaluate their solutions and where ours fits.
I've attended a number of conferences and webcasts and read papers (many of which, I'll admit, were highlighted by Martin White on Twitter – I can't recommend following Martin enough!).
One thing I have never found is a common way to evaluate or talk about enterprise search solutions. I have seen several people (including Martin) comment on how relatively little research there is on enterprise search (as opposed to internet search, which has a lot of research behind it), and I am sure a significant reason for that is the lack of a common way to evaluate the solutions.
If we could compare in a systematic way, we could start to understand how to do things like:
Why do we not have a common set of definitions?
One possibility is certainly that I have still not read up enough on the topic – perhaps there is a common set of definitions – if so, feel free to share.
Another possibility is that this is a result of dependency on the metrics implemented within the search solutions enterprises are using. I have found that these are useful, but they don't come with much detail or clarity of definition and, more importantly, they don't seem consistent across products. That said, I have relatively limited exposure to multiple search solutions – again, I would be interested in insights from those who have more (perhaps any consultants working in this space?).
And one more possible driver behind the lack of commonality is the proprietary nature of most implementations. I try to speak externally as frequently as I can, but I am always hesitant (and have been coached) not to be too detailed about the implementation.
I do plan to put up a small series here, though, with some of the more elemental components of our metrics implementation for comparison with anyone who cares to share.
More soon!
This is the second post in a series I have planned about the language found throughout your search log – all the way into the "long tail" – and how feasible it might or might not be to understand it all.
My previous post, "80-20: The lie in your search log?", highlighted how the slope of the "short head" of your search terms may not be as steep as anecdotes would say. That is, there can be a lot less commonality within a particular time range, even among the most common terms in your search log, than you might expect.
After writing that post, I began to wonder about the overall re-use of terms over periods of time.
In other words:
Even while commonality of re-using terms within a month is relatively low, how much commonality do we see in our users’ language (i.e., search terms) from month to month?
To answer this, I needed to take the entire set of terms for a month, compare it with the entire set from the next month to determine the overlap, then compare the second month's set of terms to a third month's, and so on. Logically not a hard problem, but quite a challenge in practice due to the volume of data I was manipulating (large only relative to the tools I had to manipulate it).
So I pulled together every single term used over a period of about 18 months, broke them into the set used in each of those months, and performed the comparison.
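For anyone who wants to try the same comparison, the core of it is simple set arithmetic – a minimal Python sketch, assuming the log has already been reduced to (month, term) pairs (the names and the light normalization here are illustrative, not a description of my actual implementation):

```python
from collections import defaultdict

def month_over_month_overlap(searches):
    """searches: iterable of (month, term) pairs, e.g. ("2008-06", "travel policy").
    Returns a list of (month, pct) where pct is the percentage of that month's
    unique terms that also appeared in the previous month."""
    terms_by_month = defaultdict(set)
    for month, term in searches:
        terms_by_month[month].add(term.strip().lower())  # light normalization

    months = sorted(terms_by_month)
    overlaps = []
    for prev, curr in zip(months, months[1:]):
        shared = terms_by_month[curr] & terms_by_month[prev]
        overlaps.append((curr, 100.0 * len(shared) / len(terms_by_month[curr])))
    return overlaps
```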
Before getting into the results, a few details to share for context about the search solution I'm writing about here:
My expectation was that comparing the entire set of terms from one month to the next would show a relatively high percentage of overlap. What I found was not what I expected.
If you look at the unique terms and their overlap, the average overlap between months was a shockingly low 13.2%. In other words, over 86% of the terms used in any given month were not used at all in the previous month.
If you look at the total searches performed and the percentage of searches performed with terms from the prior month, this goes up to an average of 36.2% – reflecting that the terms that are re-used in a subsequent month are among the most common terms overall.
As you can see, the amount of commonality from month-to-month among the terms used is very low.
What can you draw from this observation?
In a brief discussion about this with noted search analytics expert Lou Rosenfeld, his reaction was that this represented a significant amount of change in the information needs of the users of the system – significant enough to be surprising.
Another conclusion I draw from this is that it provides another reason why it is very hard to meaningfully improve search across the language of your users. Based on my previous post on the flatness of the curve of term use within a month, we know that we need to look at a pretty significant percentage of distinct terms each month to account for a decent percentage of all searches – 12% of distinct terms to account for only 50% of searches. In our search solution, that 12% doesn't seem that large until you realize it still represents about 6,000 distinct terms.
Coupling that with the observation from the analysis here means that even if you review those terms for a given month, you will likely need to review a significant percentage of brand new terms the next month, and so on. Not an easy task.
Having established just how challenging this can be, my next few posts will provide some ideas for grappling with the challenges.
In the meantime, if you have any insight on similar statistics from your solution (or statistics about the shape of the search log curve I previously wrote about), please feel free to share here, on the SearchCoP on Yahoo! groups or on the Enterprise Search Engine Professionals group on LinkedIn – I would very much like to compare numbers to see if we can identify meaningful generalizations from different solutions.
Recently, I have been trying to better understand the language in use by our users in our search solution, and to do that, I have been trying to determine what tools and techniques are available for the job. This is the first post in a planned series about this effort.
I have many goals in pursuing this. The primary goal has been to identify trends from the whole set of language in use by users (and not just the short head). This goal supports the underlying business desire of identifying content gaps or (more generally) places where the variety of content available in certain categories does not match the variety expected by users (i.e., how do we know when we need to target the creation and publication of specific content?).
Many approaches to this do focus on the short head – typically the top N terms, where N might be 50 or 100 or even 500 (some number that’s manageable). I am interested in identifying ways to understand the language through the whole long tail as well.
As I have dug into this, I realized an important aspect of this problem is to understand how much commonality there is to the language in use by users and also how much the language in use by users changes over time – and this question leads directly to the topic at hand here.
There is an anecdote I have heard many times about the short head of your search log: that "80 percent of your searches are accounted for by the top 20% most commonly-used terms". I now question this and wonder what others have seen.
I have worked closely with several different search solutions in my career and the three I have worked most closely with (and have most detailed insight on) do not come even close to the above assertion. Chart 1 shows the usage curve for one of these. The X axis is the percent of distinct terms (ordered by use) and the Y axis shows the percent of all searches accounted for by all terms up to X.
From this chart, you can see that it takes approximately 55% of distinct terms to account for 80% of all searches – that is a lot of terms!
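If you want to plot the same curve from your own log, the computation is simple – a minimal Python sketch, assuming you can export one raw term per search performed (the function names are mine):

```python
from collections import Counter

def coverage_curve(terms):
    """terms: list of raw search terms, one entry per search performed.
    Returns (pct_of_distinct_terms, pct_of_all_searches) points, with
    distinct terms ordered from most- to least-used."""
    counts = Counter(t.strip().lower() for t in terms)
    ordered = sorted(counts.values(), reverse=True)
    total = sum(ordered)
    points, running = [], 0
    for i, count in enumerate(ordered, start=1):
        running += count
        points.append((100.0 * i / len(ordered), 100.0 * running / total))
    return points

def pct_terms_needed(points, target=80.0):
    """Smallest percentage of distinct terms accounting for target% of searches."""
    return next(pct_terms for pct_terms, pct_searches in points
                if pct_searches >= target)
```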
This curve shows the usage for one month – I wondered how similar this would be for other months and found (for this particular search solution) that the curves for every month were essentially identical!
Wondering if this was an anomaly, I looked at a second search solution I have close access to, to see whether it might show signs of the "80/20" rule. Chart 2 adds the curve for this second solution (it's the blue curve – the higher of the two).
In this case, you will find that the curve is “higher” – it reaches 80% of searches at about 37% of distinct terms. However, it is still pretty far from the “80/20” rule!
After looking at this data in more detail, I have realized why I have always been troubled at the idea of paying close attention to only the so-called “short head” – doing so leaves out an incredible amount of data!
In trying to understand why the usage curves are so different, even though neither is close to adhering to the "80/20" rule, I realized that there are some important distinctions between the two search solutions:
I’m not sure how (or really if) these factor into the shape of these curves.
In understanding this a bit better, I hypothesize two things: 1) the shape of this curve is stable over time for any given search solution, and 2) the shape of this curve tells you something important about how you can manage your search solution. I am planning to dig more to answer hypothesis #1.
Questions for you:
I will be writing more on these search term usage curves in my next post as I dig more into the time-stability of these curves.
In my last post, I wrote about a particular process for capturing "knowledge nuggets" from a community's on-going discussions, and toward the end of that write-up, I described some ideas about what motivates members to be involved in this knowledge capture process and how that might translate to an enterprise. All of the ideas I wrote about were pretty general, and as I considered them, another topic occurred to me: what kinds of specific objectives could an employee be given that would (hopefully) increase knowledge sharing in an enterprise? What can a manager (or, more generally, a company) do to give employees an incentive to share knowledge?
Instead of approaching this from the perspective of what motivates participants, I am going to write about some concrete ideas that can be used to measure how much knowledge sharing is going on in your organization. Ultimately, a company needs to build into its culture and values an expectation of knowledge sharing and management in order to have a long-lasting impact. I would think of the more tactical and concrete ideas here as a way to bootstrap an organization into the mindset of knowledge sharing.
A few caveats: First – Given that these are concrete and measurable, they can be “gamed” like anything else that can be measured. I’ve always thought measures like this need to be part of an overall discussion between a manager and an employee about what the employee is doing to share knowledge and not (necessarily) used as absolute truth.
Second – A knowledge sharing culture is much more than numbers – it's a set of expectations that employees hold of themselves and others; it's a set of norms that people follow. That being said, I do believe it is possible to use concrete numbers to understand the impact of knowledge management initiatives and how much those expectations and norms are "taking hold" in the culture of your organization. Said another way – measurement is not the goal, but if you cannot measure something, how do you know its value?
Third – I, again, need to reference the excellent guide, "How to use KPIs in Knowledge Management" by Patrick Lambe. He provides an exhaustive list of things to measure, but his guide is primarily written around measuring the KM program. Here I am trying to personalize it down to an individual employee and setting that employee's objectives related to knowledge sharing.
In the rest of this post, I'll assume that your organization has a performance management program and that the program includes defining objectives for employees to complete during a specific time period. The ideas below are applicable in that context.
Not all of these will apply to all employees and some employees may not have any specific, measurable knowledge sharing objectives (though that seems hard to imagine regardless of the job). An organization should look at what they want to accomplish, what their tool set will support (or what they’re willing to enhance to get their tool set to support what they want) and then be specific with each employee. This is meant only as a set of ideas or suggestions to consider in making knowledge sharing an explicit, concrete and measurable activity for your employees.
Given some concrete objectives to measure employees with, it seems relatively simple to roll those objectives up to management in order to measure (and set expectations for, up front) knowledge sharing by a team of employees, not just individual employees. Alternatively, a forward-thinking organization could define group-level objectives and cascade them down to individual employees.
Given either of these approaches, a manager (or director, VP, etc.) may then have both an organizational level objective and their own individual objectives related to knowledge sharing.
Lastly – while I've never explored this, several years ago a vice president at my company asked for a single index of knowledge sharing. I would make an analogy to something like a stock index – a mathematical combination of measurements of different aspects of knowledge sharing within the company. A single number that somehow denotes how much knowledge sharing is going on.
I don’t seriously think this could be meaningful but it’s an interesting idea to explore. Here are some definitions I’ll use to do so:
Given the above, you could imagine the "knowledge sharing index" at any moment in time being computed as (forgive the notation – I don't know how to make this look like a "real" formula!):
Knowledge index at time t = Sum (i=1…N) of Wi * ( Mt,i / Bi )
A specific example:
- Let’s say you have three sources of “knowledge sharing” – a corporate wiki, a mailing list server and a corporate knowledge base
- For the wiki, you’ll measure total edits every week, for the list server, you’ll measure total posts to all mailing lists on it and for the knowledge base, you’ll measure contributions and downloads (as two measures).
- In terms of weights, you want to give the mailing lists the least weight, the wiki an intermediate weight and the combined knowledge base the most weight. Let’s say the weights are 15 for the mailing lists, 25 for the wiki, 25 for the downloads from the knowledge base and 35 for contributions to the knowledge base. (So the weights total to 100!)
- Your baseline for future measurement is 200 edits in the wiki, 150 posts to the list server, 25 contributions to the knowledge base and downloads of 2,000 from the knowledge base
- At some week after the start, you take a measurement and find 180 wiki edits, 160 posts to the list server, 22 knowledge base contributions and 2200 downloads from the knowledge base.
- The knowledge sharing index for that week would be 25 * (180/200) + 15 * (160/150) + 35 * (22/25) + 25 * (2200/2000) = 22.5 + 16.0 + 30.8 + 27.5 = 96.8. This is "down" even though most measures are up (which simply reflects the relative importance of one factor, which is down).
If I were to actually try something like this, I would pick the values of Wi so that the baseline measurement (when t= 0) comes to a nice round value – 100 or something. You can then imagine reporting something like, “Well, knowledge sharing for this month is at 110!” Or, “Knowledge sharing for this month has fallen from 108 to 92”. If nothing else, I find it amusing to think so concretely in terms of “how much” knowledge sharing is going on in an organization.
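To make the example above concrete, here it is as a minimal Python sketch (the measure names are mine):

```python
def knowledge_index(weights, baselines, measurements):
    """Index at time t = sum over i of W_i * (M_t,i / B_i)."""
    return sum(weights[k] * measurements[k] / baselines[k] for k in weights)

weights      = {"list_posts": 15, "wiki_edits": 25, "kb_downloads": 25, "kb_contribs": 35}
baselines    = {"list_posts": 150, "wiki_edits": 200, "kb_downloads": 2000, "kb_contribs": 25}
measurements = {"list_posts": 160, "wiki_edits": 180, "kb_downloads": 2200, "kb_contribs": 22}

print(knowledge_index(weights, baselines, measurements))  # ~96.8 for the example week
```

Note that because the weights total 100 and every baseline ratio is 1 at the start, the baseline week scores exactly 100 – the round starting value I mentioned above.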
There are some obvious complexities in this idea that I don’t have good answers for:
In any event – I think this is an interesting, if academic, discussion and would be interested in others’ thoughts on either individual performance management or the idea of a knowledge sharing index.
In my previous two posts, I've written about some basic search analytics and then some more advanced analysis you can also apply. In this post, I'll write about the types of analysis you can and should be doing on data captured about the usage of search results from your search solution. This could largely fall under the "advanced" analytics topic, but for our search solution it is not built in – it was implemented only in the last year through some custom work – so it feels different enough (to me), and has enough detail within it, that I decided to break it out.
When I first started working on our search solution and dug into the reports and data we had available about search behavior, I found we had things like:
and much more. However, I was frustrated by this because it did not give me a very complete picture. We could see the searches people were using – at least the top searches – but we could not get any indication of "success", or even of what people found useful in search. The closest we got from the reports was the last item listed above, which in a typical report might look something like:
[Table: Search Results Pages – percentage of searchers reaching each page of results]
However, all this really reflects is the percentage of searchers who visit each page number of results – so 95% of users never go beyond page 1, and the engine assumes that means they found what they wanted there. That's a very bad assumption, obviously.
I wanted to be able to understand what people were actually clicking on (if anything) when they performed a search! I ended up solving this with a very simple solution (simple once I thought of it) that I believe emulates what Google (and probably many other search engines) do. I built a simple servlet that takes a number of parameters – including a URL (encoded) and the various pieces of data about a search result target – stores an event built from those parameters in a database, and then forwards the user to the desired URL. The search results page was then updated to provide the URL for that servlet in the search results instead of the direct URL to the target. That's been in place for a while now and the data is extremely useful!
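To sketch the idea (here in Python with Flask rather than the Java servlet I actually built – the route and parameter names are illustrative, not the actual ones):

```python
from flask import Flask, redirect, request

app = Flask(__name__)

@app.route("/search/click")
def track_click():
    """Record the click event, then forward the user to the real target."""
    target = request.args.get("url", "/")  # Flask decodes the encoded URL parameter
    event = {
        "term": request.args.get("term"),  # the search that produced this result
        "rank": request.args.get("rank"),  # position of the clicked result
        "page": request.args.get("page"),  # which page of results it was on
    }
    store_event(event)   # e.g., an INSERT into a click-event table
    return redirect(target)  # in production, validate target against your own domains

def store_event(event):
    ...  # persist to your database of choice
```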
By way of explanation, the following are the data elements being captured for each “click” on a search result:
This data provides a lot of insight on behavior. You can guess what someone might be looking for based on the searches they perform, but you can come a lot closer to understanding what they're really looking for by seeing what they actually accessed. Of course, it's important to remember that this does not necessarily equate to the user finding what they were looking for – it may only indicate which result looked most attractive to them – so there is still some uncertainty in understanding this.
While I ended up having to do some custom development to achieve this, some search engines will capture this type of data, so you might have access to all of this without any special effort on your part!
Also – I assume it would be possible to capture a lot of this using a standard web analytics tool. I had several discussions with our web analytics vendor about this, but resource constraints kept it from getting implemented, and it also seemed to depend on the target of the click being instrumented in the right way (having JavaScript in it to capture the event). So any page that did not have that (say, a web application whose template could not be modified) or any document (a PDF, for example) would likely not be captured correctly.
Given the type of data described above, here are some of the questions and actions you can take as a search analyst:
You can also combine data from this source with data from your web analytics solution to do some additional analysis. If you capture the search usage data in your web analytics tool (as I mention above should be possible), doing this type of analysis should be much easier, too!
Here’s a wrap (for now) on the types of actionable metrics you might consider for your search program. I’ve covered some basic metrics that just about any search engine should be able to support; then some more complex metrics (requiring combining data from other sources or some kind of processing on the data used for the basic metrics) and in this post, I’ve covered some data and analysis that provides a more comprehensive picture of the overall flow of a user through your search solution.
There are a lot more interesting questions I've come up with in the time I've had access to the data described above (and to the data discussed in my previous two posts), but many of them seem a bit academic, and I have not been able to identify possible actions to take based on their insights.
Please share your thoughts or, if you would, point me to any other resources you might know of in this area!
In my last post, I provided a description of some basic metrics you might want to look into using for your search solution (assuming you’re not already). In this post, I’ll describe a few more metrics that may take a bit more effort to pull together (depending on your search engine).
First up – there is quite a lot of insight to be gained from combining your search analytics data with your web analytics data. It is even possible to capture almost all of your search analytics in your web analytics solution, which makes this combination easier, though that can take work. For your external site, it's also very likely that your web analytics solution will provide insight on the searches that lead people to your site.
A first useful piece of analysis you can perform is to review your top N searches, perform the same searches yourself and review the resulting top target’s usage as reported in your web analytics tool.
A second step would be to review your web analytics report for the most highly used content on your site. For the most highly utilized targets, determine the obvious searches that should expose those targets, then try those searches out and see where the highly used targets fall in the results.
Another fruitful area to explore is to consider what people actually use from search results after they’ve done a search (do they click on the first item, second? what is the most common target for a given keyword? Etc.). I’ll post about this separately.
I’m sure there are other areas that could be explored here – please share if you have some ideas.
When I first got involved in supporting a search solution, I spent some time understanding the reports I got from my search engine. We had our engine configured to provide reports on a weekly basis and the reports provided the top 100 searches for the week. All very interesting and as we started out, we tried to understand (given limited time to invest) how best to use the insight from just these 100 searches each week.
We quickly realized that there was no really good, sustainable answer, and this was compounded by the fact that the engine reported two searches as different if there was *any* difference between them (even something as simple as a case difference – even though the engine itself does not consider case when doing a search – go figure).
In order to see the forest for the trees, we decided what would be desirable is to categorize the searches – associate individual searches with a larger grouping that allows us to focus at a higher level. The question was how best to do this?
Soon after trying to work out how to do this, I attended Enterprise Search Summit West 2007 and attended a session titled “Taxonomize Your Search Logs” by Marilyn Chartrand from Kaiser Permanente. She spoke about exactly this topic, and, more specifically, the value of doing this as a way to understand search behavior better, to be able to talk to stakeholders in ways that make more sense to them, and more.
Marilyn’s approach was to have a database (she showed it to me and I think it was actually in a taxonomy tool but I don’t recall the details – sorry!) where she maintained a mapping from individual search terms to the taxonomy values.
Since then, I've been working on the same type of structure and have made good headway. Further, I've also managed to find a way to capture every single search (not just the top N) in a SQL database, so it's possible to view the "long tail" and categorize that as well. I still don't have a good automated solution for anything like auto-categorizing the terms, but the level of re-use from one reporting period to the next is high enough that dumping in a new period's data requires categorization of only part of the new data. [Updated 26 Jan 2009 to add the following] Part of the challenge is that you will likely want to apply many of the same textual conversions to your database of captured searches that are applied by your search engine – synonyms, stemming, lemmatization, etc. These conversions can help simplify the categorization of the captured searches.
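To make that concrete, here is a minimal sketch of the kind of normalization I mean (Python; the stemmer and the one-entry synonym list are stand-ins for whatever conversions your engine actually applies):

```python
from nltk.stem import PorterStemmer  # pip install nltk

stemmer = PorterStemmer()
SYNONYMS = {"vacation": "holiday"}  # illustrative; mirror your engine's synonym list

def normalize(term):
    """Apply roughly the same conversions the engine applies, so that variants
    of a search ("Holidays", "holiday", "vacation") collapse onto one form."""
    words = [SYNONYMS.get(w, w) for w in term.strip().lower().split()]
    return " ".join(stemmer.stem(w) for w in words)

def categorize(term, category_map):
    """category_map: dict of normalized term -> taxonomy value, built by hand."""
    return category_map.get(normalize(term), "UNCATEGORIZED")
```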
Anyway – the types of questions this enables you to answer and why it can be useful include:
Another useful type of analysis you can perform on search data is to look at simple metrics of the searches. Louis Rosenfeld identified several of these – I’m including those here and a few additional thoughts.
[Chart: searches per word count]
[Chart: search length vs. number of searches]
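The tabulation behind the first chart is trivial to produce – a minimal Python sketch, assuming a list of raw search strings:

```python
from collections import Counter

def searches_per_word_count(searches):
    """searches: list of raw search strings.
    Returns {word_count: number_of_searches_with_that_many_words}."""
    return Counter(len(s.split()) for s in searches)
```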
Another interesting view of your search data is hinted at by the discussion above of "secondary" search words – words that are used in conjunction with other words. I have not yet managed to complete this view (for lack of time and, frankly, because the volume of data is a bit daunting with the tools I've tried).
The idea is to parse your searches into their constituent words and then build a network between the words, where each word is a node and the links between words represent the strength of the connection between them – "strength" being the number of times those two words appear in the same searches.
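Computing the link strengths themselves is straightforward, even if visualizing the network is not – a minimal Python sketch (the resulting counts could then be loaded into whatever graph or visualization tool you prefer):

```python
from collections import Counter
from itertools import combinations

def cooccurrence_edges(searches):
    """searches: list of raw search strings.
    Returns a Counter mapping word pairs (a, b), with a < b alphabetically,
    to the number of searches containing both words - the link 'strength'."""
    edges = Counter()
    for s in searches:
        words = sorted(set(s.lower().split()))  # dedupe words within one search
        edges.update(combinations(words, 2))
    return edges
```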
Having this available as a visual tool to explore words in search seems like it would be valuable as a way to understand their relationships and could give good insight on the overall information needs of your searchers.
The cost (in my own time, if nothing else) of taking the data and manipulating it into a format that could be explored this way, however, has been high enough to keep me from doing it without some more concrete ideas for what actionable steps I could take from the insight gained. I'm just not confident enough to think that this would expose anything much more than "the most common words used tend to be used together most commonly".
I'm sure I've missed a lot of interesting additional types of analyses above – feel free to share your thoughts and ideas.
In my next post, I’ll explore in some more detail the insights to be gained from analyzing what people are using in search results (not just what people are searching for).
In my first few posts (about a year ago now), I covered what I call the three principles of enterprise search – coverage, identity, and relevance. I have posted on enterprise search topics a few times in the meantime and wanted to return to the topic with some thoughts to share on search analytics and provide some ideas for actionable metrics related to search.
I’m planning 3 posts in this series – this first one will cover some of what I think of as the “basic” metrics, a second post on some more advanced ideas and a third post focusing more on metrics related to the usage of search results (instead of just the searching behavior itself).
Before getting into the details, I also wanted to say that I've found a lot of inspiration in the writing and speaking of Louis Rosenfeld and also Avi Rappoport, and I strongly recommend you look into their work. A specific webinar to share with you is "Site Search Analytics for a Better User Experience", which Louis presented in a Search CoP webcast last spring. Good stuff!
Now onto some basic metrics I’ve found useful. Most of these are pretty obvious, but I guess it’s good to start at the start.
That’s all of the topics I have for “basic metrics”. Next up, some ideas (along with actions to take from them) on more complex search metrics. Hopefully, you find my recommendations for specific actions you can take on each metric useful (as they do tend to make the posts longer, I realize!).
My last several posts have been focused on various aspects of community metrics – primarily those derived from the use of a particular tool (mailing lists) used within our communities. While quite fruitful from an analysis perspective, these are not the only metrics we’ve looked at or reported on. In this post, I’ll provide some insights on other metrics we’ve used in case they might be of interest.
Before going on, though, I also wanted to highlight what I’ve found to be an extremely thorough and useful guide covering KPIs for knowledge management from a far more general perspective than just communities – How to Use KPIs in Knowledge Management by Patrick Lambe. I would highly recommend that anyone interested in measuring and evaluating a knowledge management program (or a community of practice initiative specifically) read this document for an excellent overview for a variety of areas. Go ahead… I’ll wait.
OK – now that you've read a very thorough list, I will also direct you to the blog of Miguel Cornejo Castro, who has published on community metrics. I know I've seen his paper on this before, but in digging just now I could not come up with a link to it. Hopefully, someone can provide a pointer.
UPDATE: Miguel was kind enough to provide the link to the paper I was recalling in my mention above: The Macuarium Set of CoP Measurements. Thanks, Miguel!
If you can provide pointers to additional papers or writings on metrics, please comment here or on the com-prac list.
With that aside, here are some of the additional metrics we've used in the past (when we were reporting regularly on the entire program, it was generally done quarterly – that should give you an idea of the span we looked at each time we assembled this):
This is my last planned post on community metrics for now. I will likely return to the topic in the future. I hope the posts have been interesting and also have provided food for thought for your own community programs or efforts.
In my last post, I described some ideas about how to get a sense of knowledge flow within a community using some basic metrics data you can collect. I thought it might be useful to provide a more active visualization of the data from a sample community. As always, the data has been obfuscated a bit here, but the underlying numbers are mostly accurate – I believe it provides a more compelling "story" of sorts to see data that at least approximates reality.
I knew that Google provides its own visualization API offering quite a lot of ways to visualize data, including a "Motion Chart" – which I'd seen in action before and found a fascinating way to present data. So I set about trying to determine a way to use that type of visualization with the metrics I've written about here.
The following is the outcome of a first cut at this (requires Flash):
This visualization shows each of the lists associated with a particular community as a circle (if you hover over a circle, you’ll see a pop-up showing that list’s name – you can click on it to have that persist and play with the “Trails” option as well to see the path persist).
The default options should have “Cumulative Usage” on the Y axis, Members on the X axis, “Active Members” as the color and “Usage” as the size.
An interpretation of what you’re seeing – once you push play, lists will move up the Y axis as their total “knowledge flow” grows over time. They’ll move right and left as their membership grows / shrinks. The size of a circle reflects the “flow” at that time – so a large circle also means the circle will move up the Y axis.
It’s interesting to see how a list’s impact changes over time – if you watch the list titled “List 9” (which appears about Sept 05 in the playback), you’ll see it has an initial surge and then its impact just sort of pulsates over the next few years. Its final position is higher up than “List 7” (which is present since the start) but you can see that List 7 does see some impact later in the playback.
You can also modify which values show in which part of this visualization – if you try some other options and can produce something more insightful, please let me know!
I may spend some time looking at the other visualization tools available in the Google Visualization API and see if they might provide value in visualizing other types of metrics we've gathered over time. If I find something interesting, I'll post back here.
In my series on metrics about communities of practice, I’ve covered a pretty broad range of topics, including measuring, understanding and acting on:
In this post, I’ll slightly change gears and present some thoughts on a more research-like use of this data. First, an introduction to what drove this thinking.
A few years back, as we were considering some changes in the navigational architecture on our intranet, I heard the above statement and it made me scratch my head. What did this person mean – there is nothing going on in communities? There sure seemed to be a lot of activity that I could see!
A quick bit of background: though I have not discussed much about our community program outside of the mailing lists, every community had other resources that it utilized – one of the most common being a site on our intranet. On top of that, at the time of the discussion mentioned above, communities actually had a top spot in the global navigation on our intranet – which provided the typical menu-style navigation to the top resources employees needed. One of the top-level menus was labeled "communities", and its sub-menu items included a subset of the most strategic / active communities. A very nice and direct way to guide employees to these sites (and through them to the other resources available to community members, like the mailing lists I've discussed).
Back to the discussion at hand – As we were revisiting the navigational architecture, one of the inputs was usage of the various destinations that made up the global navigation. We have a good web analytics solution in place on our intranet (the same we use on our public site) so we had some good insight on usage and I could not argue the point – the intranet sites for the communities simply did not get much traffic.
As I considered this, a thought occurred to me – what we were missing is that we had two distinct ways of viewing "usage" or "activity" (web site usage and mailing list membership / activity) and we were unable to merge them. An immediate question occurred to me: what if, instead of a mailing list tool, we used an online forum tool of some sort (say, phpBB or something similar)? Wouldn't that merge these two factors? The acts of posting to a forum or reading forums immediately become web-based activities that we could measure, right?
Given the history of mailing list usage within the company, I was not ready to seriously propose that kind of change, but I did set out to try to answer the question – Can we somehow compare mailing list activity to web site usage to be able to merge together this data?
The rest of this post will discuss how I went about this and present some of the details behind what I found.
The starting point for my thinking was that the rough analogy to make between web sites and mailing lists is that a single post to a mailing list can be thought of as equivalent to a web page. The argument I would make is that (depending on the software used, of course), for a visitor to read a single post using an online forum tool, they would have to visit the page displaying that post. So our first component is:
Pc = the number of posts during a given time period for a community
In reality, many tools will combine together a thread into a single page (or, at least, fewer than one page per comment). If you make an assumption that within a community, there’s likely an average number of posts per thread, we could define a constant representing that ratio. So, define:
Rc = the ratio of posts per thread within a community for a given time period
Note that while I did not discuss it in the context of the review of activity metrics, it is possible with the activity data we are gathering to identify threads, and so we can compute Rc:
Tc = total threads within a community for a given time period
Rc = Pc / Tc
Now, how do we make an estimate of how many page views members would generate if they visited the forum instead of having posts show up in their mailbox? The first (rough, and quite poor) guess would be that every member reads every post. This is not realistic, and getting an accurate answer would likely require some analysis directly with community members. That being said, I think, within a constant factor, the number of readers can be approximated by the number of active members within the community (it's true that any active member can be assumed to have read at least some of the posts – their own). A couple more definitions, then:
Mc = the number of members of a community at a given time
Ac = the number of active members within a community for a given time period
In addition to assuming that active members represent a high percentage of readers, I wanted to reflect the readership (which is likely lower) among non-active members (AKA “lurkers”). We know the number of lurkers for a given time period is:
Lc = the number of lurkers within a community over a given time period = (Mc – Ac)
So we can define a factor representing the readership of these lurkers
PRc = the percent of lurkers who would read posts during a given time period (PR means “passive reader”)
Can we approximate PRc for a community from data we are already capturing? At the (fuzzy) level of this argument, I would argue that the ratio of active members to total members is probably echoed within the lurker community, and so it can be used to estimate the share of lurkers who will read any given post:
PRc ~= Ac / Mc
So, with the basic components defined above, the formula that I have worked out for computing a proxy for web site traffic from mailing lists becomes:
Uc = the “usage” of a community as reflected through its mailing list
= Pc * (Ac + PRc * Lc) / Rc
= Pc * (Ac + Ac / Mc * Lc) / Rc
= Pc * (Ac + Ac / Mc * (Mc – Ac)) / Rc
= (2 * Pc * Ac – Pc * Ac^2 / Mc) / (Pc / Tc)
= 2 * Ac * Tc – Ac^2 * Tc / Mc
So with that, we have a formula which can help us relate mailing list activity to web site usage (up to some perhaps over-reaching simplifications, I’ll admit!). All of these factors are measurable from the data we are collecting and so I’ll provide a couple of sample charts in the next section.
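For anyone who wants to compute this from their own data, here is the simplified formula as a short Python sketch (the parameter names are mine; note that Pc cancels out of the simplified form entirely):

```python
def community_usage(threads, members, active_members):
    """Proxy for web-style 'usage' of a mailing-list community:
    Uc = Pc * (Ac + (Ac / Mc) * (Mc - Ac)) / Rc, with Rc = Pc / Tc,
    which simplifies to 2 * Ac * Tc - Ac^2 * Tc / Mc (Pc cancels out)."""
    Ac, Tc, Mc = active_members, threads, members
    return 2 * Ac * Tc - (Ac ** 2) * Tc / Mc
```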
Here are a few samples of measuring this “usage” over a series of quarters in various communities.
As you will see in the samples, this metric shows a wide variance in values between communities, but relative stability of values within a community.
The first sample shows data for a small community. As before, I have obfuscated the data a bit, but you can see a big jump early in the lifecycle and then an extended period of low-level usage. The spike represents the formal "launch" of the community, when a first communication went out to potential members and many people joined. The drop-off to low-level usage shown here represents, I believe, a challenge for the community to address in order to make the community more vital (of course, it could also be that other ways of observing "usage" of the community might expose that it actually is very vital).
The second sample shows data for a large, stable community – you’ll note that the computed value for “usage” is significantly higher here than in the above sample (in the range of around 30,000-40,000 as opposed to a range of 500-1,000 as the small community stabilized around).
Well, after putting the above together, I realized that if you ignore the Rc factor (which converts the measurement of these “member-posts” into a figure purportedly comparable to web page views), you get a number that represents how much of an impact the flow of content through a mailing list has on its members – indirectly, a measure of how much information or knowledge could be passing through a community’s members.
The end result calculation would look something like:
Kc = the knowledge flow within a community for a given period
= 2 * Pc * Ac – Pc * Ac^2 / Mc
This concept depends on making the (giant) leap that the "knowledge content" of a post is equivalent across all posts, which is obviously not true. For the intellectual argument, though, one could introduce a factor measured for each post and replace Pc (which has the effect of treating the knowledge content of every post as "1") with the sum of that evaluation across all of a community's posts (where each post is scored 0-1 on a scale representing its "knowledge content").
I have not done that analysis, however (it would be a very subjective and manually intensive task!). Within an approximation that's probably no less accurate than all of the assumptions above (said with appropriate tongue-in-cheek), I would argue that you could simply multiply Kc by a constant factor (representing the average knowledge content of a community) and have the same effect.
Further, if you use this calculation primarily to compare a community with itself over time, the constant factor likely does not change over time and you can simply remove it from the calculation (again, with the qualifier that you can then only compare a community to itself!) – and you are left with the above definition of Kc.
So far, I've provided a fairly complicated description of this compound metric and a couple of sample charts showing it for sample communities. Some obvious questions you might be asking:
To be honest, so far I have not been very successful in answering these questions. In terms of being actionable – this data might lend itself to the types of actions you take based on web analytics; however, there is no obvious (to me) analog to the conversion that is a fundamental component of web analytics. It seems more likely an after-the-fact measure of what happened than a forward-looking tool that can help a community manager or community leader focus the community.
In terms of validity, I'm not sure how to go about measuring whether this metric is "valid". Some ideas that come to mind for comparison include:
I'd be very happy to hear from someone who might have some thoughts on how to validate this metric or (perhaps even better) poke holes in it and point out its failings.
Whew! If you’re still with me, you are a brave or stubborn soul! A few thoughts on all of this to summarize: