Don’t Skip Steps, but Don’t Boil the Ocean: Practical Advice on Taxonomy Documentation

My colleague Connor Cantrell recently wrote a piece on taxonomy governance, placing it in the context of collection lifecycle management in libraries. One of the primary reasons for governance, whether of taxonomies, metadata, content, data, records, books, academic journals, or any other digital or physical information asset, is to ensure that the asset is managed in a consistent and predictable way using standardized and repeatable processes. Of course, this also means that someone needs to write those policies and processes down, and someone else (or even the original someone) needs to be able to find and understand them. Which brings me to the oft-neglected, postponed, or flat-out ignored, but critical, task of documentation.

Taxonomy documentation in an enterprise environment often consists of a large and dispersed set of documents, websites, collaboration tools such as wikis and message boards, and, of course, individual experience and expertise. Managing this material in a controlled way that supports findability, reuse, and process standardization is a common, and hard-to-solve, taxonomy governance problem.

It’s no exaggeration to say that, at some level, taxonomy documentation has been a pain point for essentially every client I’ve worked with over the course of my career. Factor recently overhauled taxonomy documentation and initiated an ongoing documentation program with a large enterprise client. I’ll be using that experience to discuss some of the challenges we faced and specific steps for starting and nurturing a grassroots taxonomy documentation program.

Specific issues we encountered included:

  • Multiple user types with very different needs.
  • A large volume of legacy documentation of varying quality, formality, age, and applicability: wikis, websites, and many documents.
  • Undocumented organizational knowledge held by practitioners and SMEs.
  • Adoption and use of the documentation.
  • Limited resources.
  • Defining what “done” means.

This article will present a reusable, bottom-up approach to capturing and managing taxonomy documentation, including identification of key user groups, content prioritization, socialization, and ongoing management.

The starting point of this particular project was almost exactly what I described in the introduction above. An inventory of existing taxonomy documentation uncovered:

  • Extensive, good quality, but largely outdated, taxonomy documentation in a pre-existing Confluence Spaces wiki.
  • Several other wikis and SharePoint sites with relevant workflow and system administration documentation.
  • Dashboards with analytics on content tagging and taxonomy use.
  • Numerous documents, spreadsheets, and diagrams listing and describing taxonomy inventories, taxonomy flow between systems, and other aspects of the taxonomy ecosystem.
  • Undocumented processes – “I don’t know, but I know who to ask.”

One of the first questions to answer, and one that’s crucial to avoiding “boiling the ocean,” is the scope of the documentation. In this case, we were given these strong, high-level mandates, which were invaluable in terms of avoiding scope creep:

  • Collect and evaluate existing taxonomy documentation, delete or update outdated material, and publish it in a new Confluence Space (a wiki). This mandate also imposed, whether we liked it or not, an important technology constraint: the platform was chosen for us.
  • Create new documentation if necessary, but focus on reuse of existing materials.
  • Reduce the overall amount of the documentation and dramatically simplify the navigation structure of the wiki.
  • Create a sustainable process to ensure the documentation remains useful and current moving forward.

Of course, one of the most important scope questions is who the users of the documentation are. There’s a strong tendency for the person creating documentation to capture what they believe is useful and important, but that can lead you astray. For example, we found a large amount of legacy documentation about taxonomy theory, basic introductory materials, learning about taxonomies by classifying wine, that sort of thing. (As an aside, this is very common – taxonomists love to tell these kinds of stories!) When we asked ourselves who the audience for this was, though, we couldn’t come up with a good answer. The three user groups for our documentation were the taxonomy team (i.e., professional taxonomists), taxonomy business users, and system owners and administrators of taxonomy-consuming systems. Experienced taxonomists don’t need that material, while business and technical stakeholders aren’t interested in it. With that in mind, it was an easy decision to eliminate a significant amount of legacy material.

  • Identify your audience and understand what information they need from the documentation.
  • Don’t skip user research. If they’re available, analytics are invaluable, both for evaluating legacy materials and tracking ongoing use of documentation. For example, if a page receives literally zero views, month after month, you probably don’t need it. Obviously, also talk to the users; ask them what information they need but struggle to find, what they regularly refer to, and so on. Depending on the scale and resources of the project, more formal user research might be appropriate.
  • Introductions to taxonomies, program overviews, business justifications, and similar materials are important but probably don’t belong in a documentation repository. Focus on providing material that supports proactive, self-service problem solving for specific user groups.
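
To make the zero-views check concrete, here is a minimal Python sketch. It is not tied to any particular wiki's analytics; it assumes you have already exported monthly view counts per page into a simple mapping, and the function name `find_unused_pages`, the `min_months` threshold, and the sample page titles are all hypothetical.

```python
from typing import Dict, List


def find_unused_pages(monthly_views: Dict[str, List[int]], min_months: int = 3) -> List[str]:
    """Return titles of pages with zero views in each of the last `min_months` months.

    `monthly_views` maps a page title to its view counts, oldest month first.
    This input shape is an assumption, not the format of any specific tool.
    """
    unused = []
    for title, views in monthly_views.items():
        recent = views[-min_months:]
        # Only flag pages that have a full window of data and no views in it,
        # so newly published pages are not flagged prematurely.
        if len(recent) == min_months and all(v == 0 for v in recent):
            unused.append(title)
    return sorted(unused)


# Hypothetical analytics export: page title -> views per month.
sample_views = {
    "Classifying wine: an intro to taxonomy": [0, 0, 0, 0],
    "How to request a new term": [12, 9, 15, 11],
    "Legacy tagging workflow": [3, 0, 0, 0],
    "New page": [0],
}

candidates = find_unused_pages(sample_views)
print(candidates)
```

A list like this is only a starting point for review. A page with no views may still be worth keeping if users say they need it but can't find it, which is why the analytics should be paired with actually talking to users.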

Once the three sets of users were defined, identifying the specific documentation to include was relatively straightforward. A related question is the architecture of the repository itself. In our case, with three fairly distinct sets of users, we decided to essentially have three sub-sites within our wiki, with clear separation between them. The specific materials for each section include:

  • Taxonomy business users: The primary need of business users was understanding what taxonomies are available and how to access them. To this end, this section provided a business-focused overview of taxonomy management systems and capabilities, a read-only view of available vocabularies, links to regularly updated taxonomy reports and downloads, links to contact the taxonomy team with questions or to make requests through Jira, and an evolving section on creating and using subsets of vocabularies for specific business use cases.
  • System administrators and owners of taxonomy-consuming systems: We partnered with technical stakeholders to create this section. Their needs were primarily for API and system configuration information to manage integrations within the taxonomy and content ecosystem. It would have been difficult for the taxonomy team to write this material, so our role was to add information specific to taxonomy management when it was needed, ensure consistent format and navigation, and add links to other sections when appropriate.
  • Taxonomy management team: This section was for our own use. We use it for project tracking, specifically the step-by-step status of various taxonomy management tasks, as a store of institutional knowledge in the form of “SOPs” for regular taxonomy management and maintenance tasks, and for taxonomy governance guidelines and processes.

Finally, after you’ve done all of the above, how do you keep it fresh and, most importantly, how do you get anyone else to look at it? Maintenance is simple but not necessarily easy: resources have to be dedicated to doing it. Leaving it as a side project is a recipe for creating one more half-finished, sporadically updated wiki, of which there are many in any enterprise environment. After the initial setup, the ongoing level of effort isn’t onerous, but it can’t be ignored. It needs to be someone’s (ideally several someones’) job to maintain it.

Socialization is also an ongoing process and success is most likely when the problem is attacked from multiple directions. To start with, leadership support is absolutely essential. In our case one leader drove the project from the start, and their persistence led to advocacy from other leaders. Persistence is also important; insist on people accessing the documentation, and also adding to it. Build the mental muscle memory so that it becomes natural to look at the documentation. Partnership with the technical team was also valuable for us, as they became both active users and contributors.

  • Sustainable documentation management requires resources.
  • Leadership support is essential: not necessarily executive-level, but support from day-to-day, on-the-ground managers.
  • Partnership with stakeholders, especially when the scope includes areas beyond the immediate expertise of the documentation team, is extremely helpful.
  • Persistence in referring to the documentation, and in pushing users towards it, is also essential.
  • Maintaining, expanding, and improving documentation over time will make it more valuable, and more widely used.

We made use of the common enterprise wiki platform, Confluence Spaces, to capture documentation for taxonomy management and technical implementation processes. By identifying a small set of key user groups and focusing on their needs we were able to sort through a large volume of legacy content that had accumulated over the course of many years. This resulted in a collection of documentation resources that provided good value to taxonomy stakeholders and was of a manageable size for publication on the wiki.

Critical Points

  • Don’t skip steps!
  • Why do it in the first place? The common joke is that someone needs to be able to pick up the work if we’re all hit by a bus or, more positively, win the lottery. The reality is less extreme, but we have all experienced it: people constantly change jobs, retire, or are laid off; businesses restructure and change vendors; and a task that is important but infrequent has to be reinvented every time it comes up. All of these and more are good reasons to spend time on documentation.
  • Above all, a sustainable documentation program requires dedicated resources. There are many viable approaches to providing resources but documentation can’t just be something that’s maintained in someone’s “spare time.”
  • A central repository for documentation is also a key requirement. Familiarity and ease of implementation are important considerations for adoption, and the best platform for documentation is likely to be one that’s already in use in your organization for a similar purpose. A basic platform that’s used is more valuable than a high-tech one that’s not. As for specific platforms, wikis are a common choice because they’re readily available through ubiquitous enterprise platforms such as Confluence, SharePoint, GitHub, and so on. There’s a low barrier to creating one, and it worked well for us. They’re not perfect and far from the only possibility, though. For example, files in Google Drive, which has basic workflow and metadata capabilities, are another very accessible option. Dedicated documentation platforms, knowledge bases, service management platforms like ServiceNow, issue tracking platforms, and other tools can all be effective documentation repositories. These tools offer a wide range of features and differing levels of structure.
  • Understand the needs of your users. In the example I described, we were publishing documentation for professional taxonomists, taxonomy business users, and owners and administrators of taxonomy-consuming systems. These groups have very different needs, and it was most effective to have three sections that were clearly separated.
  • A related point: don’t forget about governance for documentation. Be clear about what is, and is not, included, both during the initial ramp-up and as part of the ongoing program.
  • Documentation should be proactive and enable self-service problem solving.
  • You don’t have to capture and report everything. In fact, over-documenting is counterproductive. Reuse and link out to other resources whenever possible. For example, every system and every enterprise environment has its own set of quirks. Documentation that addresses this makes sense and can be very helpful. But does every operation of a taxonomy management system need to be documented in detail? Probably not. If good quality documentation is available from a vendor, for example, just link to it.
  • What’s the right level of detail to capture? First, documentation is – usually – not for the person who created it. Referring back to my days as a chemist, a standard rule of thumb for what should be entered in a lab notebook, which was derived from patent standards, is that information about an experimental procedure should be sufficient that a practitioner who is unfamiliar with the specific work, but “skilled in the art,” can reproduce it. I think that applies nicely to documentation as well.