Key Points:
A content inventory is only as good as the data in it, and content inventories often contain incorrect data. For instance, I have seen a team trying to make content-cutting decisions based on an inventory whose last-modified dates were wrong, although that was not immediately obvious to the team. An even more common problem is an out-of-date inventory. Figuring out whether the data in a content inventory is accurate can be tough (at least for large inventories), but here are some ideas for increasing its quality.
First, what are the characteristics of a high-quality content inventory?
- Each of the fields is correct (as of the time you need to analyze the inventory) for each content item in the inventory.
- It contains one row for each piece of content in the site you are inventorying.
- You have the information you need to make business decisions about your content.
In this blog post let's concentrate on the first characteristic of a high-quality content inventory.
Spot check for accuracy
One of the best ways of checking for content inventory accuracy is to randomly select rows from an inventory and then look at the data to see if it looks right. Just to be clear, I really mean random: you have a spreadsheet of, say, 1,000 pieces of content and you let the computer pick a random sample of rows (for example, see this random sampler from Add-Ins.com if you’re using Excel). By taking a random sample you are less likely to be pulled to the magnetic areas of your site and more likely to see potential problem areas. Once you have your sample, compare both across fields (for instance, if the last-modified date is older than the creation date, then you have a problem) and across content items to look for inconsistencies.
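If your inventory is not in Excel, the same spot check can be sketched in a few lines of Python. This is only an illustration: the rows, the field names (url, created, last_modified), and both helper functions are made up for the example, not taken from any particular inventory tool.

```python
import random
from datetime import date

# Hypothetical inventory rows; the field names are assumptions for illustration.
inventory = [
    {"url": "/about", "created": date(2010, 1, 5), "last_modified": date(2012, 3, 1)},
    {"url": "/news/launch", "created": date(2011, 6, 1), "last_modified": date(2011, 5, 1)},
    {"url": "/products", "created": date(2009, 9, 9), "last_modified": date(2013, 2, 2)},
    {"url": "/contact", "created": date(2010, 4, 4), "last_modified": date(2010, 4, 4)},
]


def sample_rows(rows, k, seed=None):
    """Let the computer pick k rows at random (all rows if k exceeds the total)."""
    rng = random.Random(seed)  # pass a seed only if you want a repeatable sample
    return rng.sample(rows, min(k, len(rows)))


def rows_modified_before_created(rows):
    """Cross-field check: a last-modified date earlier than the creation date is suspect."""
    return [row for row in rows if row["last_modified"] < row["created"]]


# Pull a small random sample to eyeball by hand.
for row in sample_rows(inventory, 2):
    print(row["url"], row["created"], row["last_modified"])

# The cross-field check is cheap enough to run over the whole inventory.
for row in rows_modified_before_created(inventory):
    print("Suspect dates:", row["url"])
```

The random sample is for a human to look at; the date comparison is the kind of cross-field rule you can run against every row once you have spotted the pattern.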
Use the right source in the first place
People often fixate quickly on a particular tool for creating inventories, but make sure you are using the right source for the data. This may mean joining together data from multiple systems.
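As a rough sketch of what joining data from multiple systems can look like, the Python below merges two exports keyed by URL. The two sources (a CMS export and an analytics export), their fields, and the join_sources helper are all hypothetical; your systems will have different names and may need a different key.

```python
# Hypothetical exports from two systems, keyed by URL (assumed data).
cms_export = {
    "/about": {"title": "About us", "last_modified": "2012-03-01"},
    "/products": {"title": "Products", "last_modified": "2013-02-02"},
}
analytics_export = {
    "/about": {"page_views": 1200},
    "/old-promo": {"page_views": 3},
}


def join_sources(primary, secondary):
    """Merge two sources by key, recording which source each item appeared in."""
    joined = {}
    for key in sorted(primary.keys() | secondary.keys()):
        row = {"url": key}
        row.update(primary.get(key, {}))
        row.update(secondary.get(key, {}))
        # Items missing from one system are often worth a closer look.
        row["in_primary"] = key in primary
        row["in_secondary"] = key in secondary
        joined[key] = row
    return joined


for url, row in join_sources(cms_export, analytics_export).items():
    print(url, row)
```

Keeping the in_primary and in_secondary flags makes gaps visible: a page that one system knows about and the other does not is a hint that one of your sources is incomplete.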
Consider longevity
Sometimes a quick inventory can answer immediate questions, and that makes sense. But often you need something more permanent, and things that start off as temporary have a way of becoming permanent. So when you develop a content inventory, at least consider its effective lifetime, and if it is meant to be short-lived, clearly indicate its creation date.
Check aggregated views
In many ways this is the opposite of the spot check above: you can create summary views of your content inventory (for example topic and site inventories). In an analysis to cut content, this will naturally occur when you are doing what-if analysis to bucket content. Other aggregated views could be histograms of creation dates; if you see a smattering of content with dates far from the rest, that may be a place to dig around to see if the data is accurate.
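A creation-date histogram does not need a charting tool. The sketch below, in Python, counts items per creation year and lists the years with very few items. The dates are invented, and the "one item or fewer" threshold is an arbitrary assumption you would tune for your own inventory.

```python
from collections import Counter
from datetime import date

# Hypothetical creation dates pulled from an inventory (assumed data).
created_dates = [
    date(2010, 1, 5), date(2010, 4, 4), date(2010, 11, 30),
    date(2011, 2, 14), date(2011, 6, 1), date(2011, 6, 2), date(2011, 9, 9),
    date(1970, 1, 1),  # a lone outlier, often a sign of a default or missing value
]


def creation_year_histogram(dates):
    """Count how many content items were created in each year."""
    return Counter(d.year for d in dates)


def sparse_years(histogram, max_count=1):
    """Years with very few items: candidates for a closer look, not proof of an error."""
    return sorted(year for year, count in histogram.items() if count <= max_count)


histogram = creation_year_histogram(created_dates)

# A quick text histogram, one row per year.
for year in sorted(histogram):
    print(year, "#" * histogram[year])

print("Years to dig into:", sparse_years(histogram))
```

A sparse year is only a prompt to go and look at those rows; the content may really be that old.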
Drop fields
Just because some tool gives you a field to include in your inventory doesn’t mean you should include it. If you discover a field is not accurate, and it isn’t essential, then drop the field.
Automate
Related to using the right source, you should attempt to automate the creation of your content inventory as much as possible, or create a mechanism to update key fields of the inventory.