The prior post, "Taxonomy Mappings: Be Careful When Integrating," gave some examples and described the problem of taxonomy mappings. A related problem is false precision in your tags. In thinking about this more, it occurs to me that there are two useful rules of thumb to keep in mind whenever tagging or pulling content (whether the content is automatically tagged, mapped from another taxonomy, or mapped by hand):
- You can't tag in a coarse-grained taxonomy and pull based on a fine-grained taxonomy (for example, if you have a system that only tags to "Washington, DC Metro Area," then you won't be able to pull by "Washington, DC," since any given piece of content tagged in that system may only be relevant to "Alexandria, VA").
- You can't tag in a fine-grained taxonomy when you are only using coarse information to determine the tagging (for example, if all you know about a group of content is that the items are all about animals, you can't tag each piece of content to frogs, cats, dogs, etc.).
In both of these cases, when you pull by the fine-grained taxonomy there is a false sense of precision (and you can get grossly wrong results).
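To make the first rule concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the three-term taxonomy, the content item names, and the `pull` helper are hypothetical and not taken from any particular tagging system. It separates items that certainly match a fine-grained query from items that were tagged too coarsely to know either way.

```python
# Hypothetical taxonomy: child term -> parent term (None marks the root).
PARENT = {
    "Washington, DC Metro Area": None,
    "Washington, DC": "Washington, DC Metro Area",
    "Alexandria, VA": "Washington, DC Metro Area",
}


def ancestors_and_self(term):
    """Return the term followed by each of its ancestors up to the root."""
    chain = []
    while term is not None:
        chain.append(term)
        term = PARENT[term]
    return chain


def pull(content_tags, query_term):
    """Split content into certain matches and uncertain matches.

    An item certainly matches when its tag is the query term or one of its
    descendants. It is uncertain when its tag is an ancestor of the query
    term, meaning it was tagged more coarsely than the query.
    """
    certain, uncertain = [], []
    query_ancestors = ancestors_and_self(query_term)[1:]
    for item, tag in content_tags.items():
        if query_term in ancestors_and_self(tag):
            certain.append(item)
        elif tag in query_ancestors:
            uncertain.append(item)
    return certain, uncertain


# Hypothetical content, each item tagged to exactly one term.
content_tags = {
    "metro-transit-story": "Washington, DC Metro Area",
    "capitol-story": "Washington, DC",
    "old-town-story": "Alexandria, VA",
}

certain, uncertain = pull(content_tags, "Washington, DC")
print("certain:", certain)
print("uncertain:", uncertain)
```

Pulling by "Washington, DC" can only vouch for the item tagged at that level. The item tagged to the metro area lands in the uncertain list, and treating it as a match is exactly the false precision described above. Pulling by the coarse term, on the other hand, safely returns all three items.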
Another way of stating the rules of thumb above:
- You have to originally tag (or possibly go through the effort of retroactively tagging, perhaps through automated concept extraction) all content to at least as fine-grained a taxonomy as you're going to pull from,
- without artificially tagging more precisely than you are accurate.
Of course, by far the most preferable treatment is that all content, across the various systems you want to pull from (onto the same web page, for example), is tagged to the same fine-grained taxonomy (or one at least as fine-grained as you ever expect to need to pull from). Otherwise you'll have to resort to taxonomy mappings or to retroactively tagging content.