Taxonomies at Any Scale: Three Lessons for Archives, Corporations, and Anyone in Between

This article is a collaboration between the newest members of the Factor team, Jake Fatooh and Sam Stringer, who come from backgrounds in e-commerce and community archives, respectively.

What could a community archive and a global tech corporation possibly have in common? When it comes to taxonomy, more than you might think.

Community archives focus on preserving specific slices of history, taking care to curate materials most relevant to the communities they’re designed to serve. Major technology corporations, on the other hand, aim to serve a vast audience at a massive scale. They want to curate as much information as possible, for as many users as possible.

While these entities differ considerably in size, resources, and motivation, they share at least one trait: a wealth of information that requires careful governance and curation. Thoughtfully crafted taxonomies help organizations make sense of their information, regardless of scale, audience, or purpose. Here are three taxonomy lessons that will hold true for any organization—no matter its size.

Lesson 1: Audience Matters

When designing a taxonomy, understanding user needs is key to crafting language and structure that are relevant, recognizable, and intuitive to the people who will use it.

In a community archive, where the focus may be niche, taxonomy design centers content. However, if the terms used to describe that content are inaccessible to the user community, the taxonomy is not doing its job.

Constructing a taxonomy requires reconciling the concepts embedded in the content with language users can relate to, ensuring that the system reflects both the material and the people it serves.

When not limited by content, taxonomists can center user experience to create potentially massive taxonomies. An archive about political organizing might need terms related to food insecurity and grocery prices, but likely won’t need to categorize types of cheese found in the grocery store. A tech enterprise, on the other hand, may strive for this level of granularity if its goal is to capture and classify every aspect of its users’ daily lives to power smart devices.

Knowing your audience’s needs is crucial. If a small-scale environment has too large a taxonomy, end users will be overwhelmed and the information they need will often be too hard to find. At a larger scale, however, taxonomies must capture enough concepts to support product functionality and facilitate user engagement.

Lesson 2: Consistent Metadata Improves Usability

Some community archives rely on “folksonomies”: de facto taxonomies created by user-generated keyword tags. While folksonomies offer great flexibility for user-driven environments, they can quickly devolve into chaos because they lack a crucial taxonomy feature: consistency. Using a taxonomy of consistent terms to classify archival materials helps group similar materials and improve discoverability, making the archive easier for its users to navigate.
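To make the contrast concrete, here is a minimal sketch of how free-form tags might be reconciled with a small controlled vocabulary. Everything in it is an illustrative assumption rather than a description of any particular archive's system: the preferred terms, the variant tags, and the function names are all hypothetical.

```python
from collections import defaultdict

# Hypothetical controlled vocabulary: each preferred term lists the
# user-generated variants that should be treated as the same concept.
PREFERRED_TERMS = {
    "Food insecurity": {"food insecurity", "hunger", "food access"},
    "Grocery prices": {"grocery prices", "food prices", "cost of groceries"},
}


def normalize(tag: str) -> str:
    """Lowercase a tag, treat hyphens and underscores as spaces, and collapse whitespace."""
    return " ".join(tag.replace("-", " ").replace("_", " ").lower().split())


# Reverse lookup from normalized variant to preferred term.
VARIANT_TO_PREFERRED = {
    normalize(variant): preferred
    for preferred, variants in PREFERRED_TERMS.items()
    for variant in variants
}


def group_by_preferred_term(tagged_items):
    """Group item IDs under preferred terms.

    tagged_items is an iterable of (item_id, raw_tag) pairs. Returns a
    (groups, unmapped) tuple, where unmapped holds the raw tags that have
    no entry in the vocabulary and need a taxonomist's review.
    """
    groups = defaultdict(set)
    unmapped = set()
    for item_id, raw_tag in tagged_items:
        preferred = VARIANT_TO_PREFERRED.get(normalize(raw_tag))
        if preferred is None:
            unmapped.add(raw_tag)
        else:
            groups[preferred].add(item_id)
    return dict(groups), unmapped
```

With this sketch, items tagged “Food-Insecurity” and “hunger” would land under the same preferred term, while a tag the vocabulary does not cover is set aside for review instead of silently creating a new category.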

At a larger scale, consistent metadata is crucial for supporting automation and AI tools because it gives systems the context they need to understand, categorize, and process data reliably. Without standardized and accurate metadata, automated workflows can become unreliable and AI tools may struggle to interpret data accurately, leading to biased results or inefficiencies.

This is especially true in e-commerce environments. Too often, online product taxonomies are paired with inconsistent metadata. When product metadata is inconsistent and disorganized, customers have a harder time finding and purchasing products. People can’t buy what they can’t find. If your taxonomy is properly governed and your style values are managed consistently, customers are more likely to find your products, which in turn supports sales.
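One way to picture governed attribute values is a simple audit that flags product records whose metadata falls outside an approved list. The sketch below is only an illustration under stated assumptions: the attribute names, the allowed values, and the `audit_product` function are hypothetical and not drawn from any specific e-commerce platform.

```python
# Hypothetical approved values for each governed attribute.
ALLOWED_VALUES = {
    "style": {"Crew Neck", "V-Neck", "Polo"},
    "color": {"Black", "Navy", "White"},
}


def audit_product(product: dict) -> list[str]:
    """Return a list of metadata problems found in one product record.

    A governed attribute is flagged when it is missing or when its value
    is not in the approved list. An empty list means the record passed.
    """
    problems = []
    for attribute, allowed in ALLOWED_VALUES.items():
        value = product.get(attribute)
        if value is None:
            problems.append(f"{attribute}: missing")
        elif value not in allowed:
            problems.append(f"{attribute}: unexpected value {value!r}")
    return problems
```

A record with the style “V-Neck” passes this check, while one entered as “v neck” is flagged. That small difference is exactly the kind of inconsistency that can keep a product out of filtered search results.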

Lesson 3: Organizational Values Shape Taxonomy Outcomes

Names have power, and creating labels can easily become a political exercise. Taxonomies shape how information is organized and prioritized, tacitly reflecting the values of the organization overall.

In a community archive, the domain itself can reflect the information most important to that community; how the concepts in that domain are named and organized can also demonstrate the archive’s values.

Archives dealing with colorful histories, for example, may be pressured to sanitize or academize language. The decision to do so (or not) signals a political choice. These decisions respond to the archive’s subject matter as well as the community it serves.

As the domain scales, though, so does the audience. Customer-facing terms used by large organizations require careful crafting to maximize user recognition and approval. For an organization that values quality and customer satisfaction, quality metrics can provide insight into taxonomy success by measuring product performance.

When naming product categories, for example, it’s important to know that the majority of your American customers would more readily recognize the term “cellphone” than “mobile phone.” Quality metrics and quality data overall will lead to a better customer experience, better online sales, and a better, more informed taxonomy.

Creating and implementing well-designed taxonomies is a complex process that looks different for every organization. Taxonomies provide a framework not only for classification, but for communicating an organization’s values and shaping the user experience, whether those users are a small group of community archivists or a vast audience of online shoppers.

At any scale, taxonomies are an essential part of supporting information architecture, ensuring that organizations can understand and access their own data.