
Knowledge graphs and GenAI: integration architecture and challenges

We discuss common architectures for integrating knowledge graphs with generative AI and the challenges that arise when building and maintaining a knowledge graph for GenAI.

ONTOFORCE team
3 December 2024 · 3-minute read

As the AI landscape rapidly advances, combining knowledge graphs with generative AI (GenAI), specifically large language models (LLMs), is emerging as an effective way to build powerful solutions for demanding use cases, especially in the life sciences industry. Knowledge graphs offer structured, domain-specific information and reasoning that can enhance GenAI's reliability and accuracy. On the flip side, GenAI can strengthen a knowledge graph through natural language processing, edge prediction, and more. This symbiotic relationship enhances both technologies' capabilities.

While this synergy is often discussed, the practical challenges and intricacies are sometimes overlooked. In this blog post we’ll detail how knowledge graphs and GenAI can be combined and discuss the common challenges that pop up when building and maintaining a knowledge graph for GenAI.   

Architectures for integrating knowledge graphs with GenAI 

Knowledge graphs provide structured, reliable data that knowledge workers depend on to navigate and solve complex tasks. GenAI, on the other hand, brings creativity and interpretive power, allowing systems to understand and respond to user intent. Together, they act like a "left brain" and "right brain" for data systems, with knowledge graphs grounding responses in factual data, and LLMs interpreting user queries interactively and translating knowledge back to them. 

Over time, various architectures have been developed to combine LLMs and knowledge graphs effectively:

[Figure: summarization and basic querying with a knowledge graph]

Query translation models 
One approach is to use LLMs to translate user queries into structured, graph-compatible queries. This setup reduces the gap between user intent and data access, allowing users to type natural language questions that are then interpreted by the system. This approach relies solely on graph data, making it a simpler, more controlled model. Although effective, it is limited in its conversational capabilities.
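
The query-translation pattern can be sketched as follows. This is a minimal, illustrative sketch, not a production implementation: the schema snippet, predicate names, and the `call_llm` function are all assumptions standing in for a real graph schema and a real LLM client.

```python
# Sketch of the query-translation pattern: the LLM is shown the graph
# schema alongside the user's question and asked to return a structured
# (here, SPARQL) query. `call_llm` is a placeholder stub; swap in any
# real LLM client. All predicate names are illustrative.

GRAPH_SCHEMA = """\
:Publication :hasAuthor :Author .
:Author :affiliatedWith :Organization .
:Organization :locatedIn :Country ."""

def build_translation_prompt(question: str) -> str:
    """Embed the schema so the model can only use known predicates."""
    return (
        "Translate the question into a SPARQL query.\n"
        f"Graph schema:\n{GRAPH_SCHEMA}\n"
        f"Question: {question}\n"
        "Return only the SPARQL query."
    )

def call_llm(prompt: str) -> str:
    # Stub standing in for a real LLM call; returns a canned translation.
    return (
        "SELECT ?pub WHERE { ?pub :hasAuthor ?a . "
        "?a :affiliatedWith ?org . ?org :locatedIn :US . }"
    )

def translate(question: str) -> str:
    return call_llm(build_translation_prompt(question))

query = translate("Which publications have authors based in the U.S.?")
```

Because the generated query runs only against the graph, every answer is grounded in graph data, which is what makes this the simplest and most controlled of the three architectures.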

[Figure: retrieval-augmented generation with a knowledge graph]

RAG-based solutions 
Retrieval-augmented generation (RAG) solutions enable more conversational interactions. Here, an LLM interprets user queries, identifies key concepts, retrieves the relevant information from the knowledge graph, and uses GenAI to structure a response. This approach provides users with context-rich answers, though the integration adds complexity and requires clear guardrails to maintain data accuracy. 
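
The RAG loop described above can be sketched with a toy in-memory triple store. Everything here is illustrative: the triples, the keyword-based concept extraction (a real system would use the LLM for entity recognition), and the `generate_answer` stub standing in for the generation step.

```python
# Sketch of a knowledge-graph RAG loop: extract concepts from the user
# query, retrieve matching triples, then hand the retrieved facts to the
# LLM as grounding context. Triples and stubs are illustrative only.

TRIPLES = [
    ("imatinib", "treats", "chronic myeloid leukemia"),
    ("imatinib", "targets", "BCR-ABL"),
    ("dasatinib", "treats", "chronic myeloid leukemia"),
]

def extract_concepts(question: str) -> list[str]:
    # A real system would use the LLM for entity recognition; here we
    # simply match against entities the graph already knows about.
    entities = {s for s, _, _ in TRIPLES} | {o for _, _, o in TRIPLES}
    return [e for e in entities if e in question.lower()]

def retrieve(concepts: list[str]) -> list[tuple]:
    # Pull every triple touching a recognized concept.
    return [t for t in TRIPLES if t[0] in concepts or t[2] in concepts]

def generate_answer(question: str, facts: list[tuple]) -> str:
    # Stub for the generation step: a real LLM would phrase the answer,
    # constrained via the prompt to the retrieved facts (the guardrail).
    context = "; ".join(f"{s} {p} {o}" for s, p, o in facts)
    return f"Based on the graph: {context}."

question = "What treats chronic myeloid leukemia?"
facts = retrieve(extract_concepts(question))
answer = generate_answer(question, facts)
```

The guardrail lives in the final step: the prompt instructs the model to answer only from the retrieved facts, which is how RAG keeps conversational answers tied to graph data.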

[Figure: task-oriented agent-based system with a knowledge graph]

Task-oriented agent-based systems 
This approach breaks a complex user request into subtasks. Each subtask is handled by a separate agent, and the agents can interact with and validate one another. While effective, this architecture requires high-quality knowledge graph data as well as well-defined data relationships.
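
The decomposition-and-validation flow can be sketched as below. This is a hedged sketch under stated assumptions: the planner, both worker agents, and the checker are stubs (a real system would back each with an LLM and knowledge graph queries), and the returned values are placeholder strings.

```python
# Sketch of a task-oriented agent system: a planner splits the request
# into subtasks, each subtask is routed to a specialized agent, and a
# checker agent validates the combined result. All agents are stubs.

def planner(request: str) -> list[str]:
    # A real planner would be an LLM deciding which subtasks apply;
    # this stub returns a fixed decomposition.
    return ["find_targets", "find_trials"]

AGENTS = {
    # Each agent would normally query the knowledge graph;
    # the returned values here are placeholders.
    "find_targets": lambda: ["target-A (placeholder)"],
    "find_trials": lambda: ["trial-1 (placeholder)"],
}

def checker(results: dict) -> bool:
    # Validation agent: every subtask must have produced output.
    return all(results.values())

def run(request: str) -> dict:
    results = {task: AGENTS[task]() for task in planner(request)}
    if not checker(results):
        raise ValueError("validation failed")
    return results

out = run("Which trials study drugs against targets of imatinib?")
```

The quality requirement in the text shows up directly here: if the graph's relationships are poorly defined, the individual agents return incomplete results and the checker has nothing reliable to validate against.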

Common challenges in optimizing a knowledge graph for GenAI

Building a knowledge graph that meets the demands of generative AI requires addressing several challenges. Here are a few challenges that our ONTOFORCE knowledge graph experts have seen repeatedly: 

  • Closing the domain gap: Users’ mental models often differ from the knowledge graph model. For instance, a user might ask for all “publications by authors in the U.S.,” but if the graph lacks a “based in” relationship, this query cannot be fulfilled.  
  • Poor descriptions: LLMs rely on concise, clear labels in the knowledge graph to interpret data points accurately. Striking a balance between providing sufficient context and avoiding label noise is crucial.
  • Ambiguous concepts and relationships: Overlapping terms, like “clinical protocol” and “clinical trial,” or “drug” and “medicine,” can create ambiguity. Just like a human, an LLM will struggle to reason with ambiguity. 
  • Handling technical imperatives: Sometimes, technical constraints require additional “bookkeeping” or intermediary concepts within the graph. These concepts facilitate complex queries but may confuse users.  
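
The domain gap in the first bullet can be caught programmatically: before executing a translated query, check that every term it uses actually exists in the graph. The sketch below assumes SPARQL-style prefixed names and a naive regex; the term names are illustrative, and a production check would parse the query properly rather than pattern-match.

```python
# Sketch: detect a domain gap by comparing the terms a translated query
# uses against the terms the graph actually defines. The schema set and
# the query are illustrative; note the graph has no ":basedIn" relation.
import re

KNOWN_TERMS = {":hasAuthor", ":affiliatedWith", ":US"}

def missing_terms(sparql: str) -> set[str]:
    # Naive extraction of prefixed names; a real check would parse
    # the query with a SPARQL parser instead of a regex.
    used = set(re.findall(r":\w+", sparql))
    return used - KNOWN_TERMS

# The user asked for "publications by authors based in the U.S.", and
# the LLM invented a ":basedIn" predicate the graph does not have.
query = "SELECT ?p WHERE { ?p :hasAuthor ?a . ?a :basedIn :US . }"
gaps = missing_terms(query)
```

Surfacing the gap (`{":basedIn"}` here) lets the system respond honestly that the relationship is not modeled, rather than silently returning an empty result.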

Rigorous testing and fixing failure modes

Testing is vital when transitioning prototypes into reliable production systems. Using a combination of testing approaches ensures the knowledge graph is intuitive and performs well for both LLMs and end users.

Ready to learn more about testing approaches? Our latest webinar details our recommended testing approaches and also provides tangible strategies to address the common challenges of building and optimizing a knowledge graph for GenAI. Watch now!
