I recently re-read a pre-LLM blog post about our work creating ontologies to support an AI-powered chatbot. Going in, my sense was that our argument about the need for strong information foundations was still relevant. After re-reading it, it was clear that taxonomies play a crucial role in any AI application, and that this role is especially important for generative AI. Based on that past project and the possibilities of generative AI, taxonomies can directly support and improve the following:
1. Bias Mitigation
Generative AI reflects and compounds biases present in its training data. Taxonomies can help reduce bias by explicitly defining relationships between concepts, categories, and domains. Maintaining a taxonomy that actively addresses bias lowers the risk of wrong or offensive responses from generative models. For example, taxonomies can associate terms with balanced and neutral definitions. Taxonomies are also an essential tool for making sure that a full set of concepts or categories is used to define a domain.
2. Semantic Understanding and Contextualization
This was called out directly in the initial blog post, and it remains essential for generative AI. Taxonomies define relationships between terms, categories, and contexts. Grasping nuances like synonyms and homonyms, and disambiguating queries, is a crucial part of any information-rich endeavor, and generative AI is no exception. For example, if a user asks about “jaguar,” taxonomies help the AI distinguish between the animal, the car brand, or the sports team based on the context of the query.
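As a rough illustration of the “jaguar” example, here is a minimal Python sketch of context-based disambiguation. The `SENSES` table, the context words in it, and the `disambiguate` function are all hypothetical and made up for this post; a real taxonomy would hold far richer relationships, and a production system would likely score context in a more sophisticated way than counting shared words.

```python
import re

# Hypothetical toy taxonomy: each sense of an ambiguous term carries a few
# context words. These words are illustrative assumptions, not real data.
SENSES = {
    "jaguar": {
        "animal": {"cat", "habitat", "rainforest", "prey", "species"},
        "car brand": {"car", "engine", "dealer", "model", "price"},
        "sports team": {"game", "season", "score", "quarterback", "roster"},
    }
}


def disambiguate(term, query):
    """Return the sense whose context words overlap the query most, or None."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best_sense, best_overlap = None, 0
    for sense, context in SENSES.get(term.lower(), {}).items():
        overlap = len(words & context)
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(disambiguate("jaguar", "What is the price of a new jaguar model?"))
# prints: car brand
```

When no context word matches, the function returns `None`, which is the point at which a chatbot could ask the user a clarifying question instead of guessing.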
3. Dynamic Query Expansion
Similar to #2, the structure and relationships within a taxonomy/ontology provide additional context. Taxonomies offer structured ways to extend queries dynamically. If a user inquires about “laptop features,” the taxonomy could allow the chatbot to generate more detailed questions or responses about processor types, memory, or battery life, enhancing the interaction.
The following examples are from the original post:
Capabilities of taxonomy for a chatbot
- Synonyms, homonyms, antonyms, etc.
- Query disambiguation (e.g., “Turkey” the animal vs. “Turkey” the country)
- Query expansion / refinement (e.g., “Terrier” → Dog)
- Identify relationships across domains (e.g., Dog → Therapeutic Aids)
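The query expansion and cross-domain examples above can be sketched in a few lines of Python. This is only a sketch under stated assumptions: the `TAXONOMY` dictionary, its `broader` / `narrower` / `related` keys, and the `expand_query` function are hypothetical names chosen for illustration, and the entries simply mirror the examples in this post.

```python
# Hypothetical toy taxonomy. Each term lists broader, narrower, and related
# terms; the entries mirror the examples in the text and nothing more.
TAXONOMY = {
    "terrier": {"broader": ["dog"], "narrower": [], "related": []},
    "dog": {
        "broader": [],
        "narrower": ["terrier"],
        "related": ["therapeutic aids"],  # cross-domain relationship
    },
    "laptop features": {
        "broader": [],
        "narrower": ["processor types", "memory", "battery life"],
        "related": [],
    },
}


def expand_query(term):
    """Return the term plus its directly linked terms, without duplicates."""
    key = term.lower()
    expanded = [key]
    node = TAXONOMY.get(key)
    if node is None:
        return expanded  # unknown term: nothing to expand
    for relation in ("broader", "narrower", "related"):
        for linked in node[relation]:
            if linked not in expanded:
                expanded.append(linked)
    return expanded


print(expand_query("Terrier"))
# prints: ['terrier', 'dog']
print(expand_query("laptop features"))
# prints: ['laptop features', 'processor types', 'memory', 'battery life']
```

The sketch only follows direct links. Whether to walk further up or down the hierarchy is a design choice that depends on how broad the chatbot's answers should be.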
4. Continuous Learning and Adaptation
As generative AI models learn from new inputs over time, taxonomies provide a consistent framework for integrating this new information without compromising the organization of existing data. Taxonomy-driven learning paths ensure that as the chatbot’s knowledge base expands, it remains aligned with business goals and user needs.
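One way to picture “integrating new information without compromising the organization of existing data” is a taxonomy that only accepts a new term when it attaches to a concept that already exists. The Python sketch below assumes a simple single-parent hierarchy; the `Taxonomy` class, its methods, and the sample terms are hypothetical and not taken from the original project.

```python
class Taxonomy:
    """Hypothetical single-parent hierarchy used only for illustration."""

    def __init__(self, root):
        # Maps each term to its parent; the root has no parent.
        self.parent = {root: None}

    def add_term(self, term, parent):
        """Add a term under an existing parent, rejecting conflicts."""
        if term in self.parent:
            raise ValueError(f"term already exists: {term!r}")
        if parent not in self.parent:
            raise KeyError(f"unknown parent: {parent!r}")
        self.parent[term] = parent

    def path(self, term):
        """Return the chain of terms from the root down to the given term."""
        chain = []
        while term is not None:
            chain.append(term)
            term = self.parent[term]
        return list(reversed(chain))


taxonomy = Taxonomy("products")
taxonomy.add_term("laptops", "products")
taxonomy.add_term("battery life", "laptops")
print(taxonomy.path("battery life"))
# prints: ['products', 'laptops', 'battery life']
```

Because every new term must name an existing parent and duplicates are rejected, existing paths never change when the vocabulary grows.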
Of course, none of these benefits are unique to AI projects.
All information-rich initiatives, applications, and capabilities benefit from well-modeled ontologies or taxonomies.
AI or not, organizations are working in complex, heterogeneous information environments that contain data and information from many sources. Taxonomies/ontologies/knowledge models are essential tools for operating in a controlled and predictable way that minimizes risk and returns useful information.