The value of knowledge graphs for AI & ML applications

Knowledge graphs bring scattered data together in a single, searchable location, allowing efficient implementation of AI & ML. Several ways to combine their strengths already exist, and an increased focus on user experience will expand their applicability.

ONTOFORCE team

Knowledge graphs are collaborative tools

Knowledge graphs connect data from different groups of people and different organizations. They form a collaborative platform that provides space for individuals from various scientific backgrounds to work together with a common understanding, improving domain knowledge over time. Because these graphs make data traceable, they are very useful for complex collaborations in well-regulated domains such as the pharmaceutical industry. In addition, the implementation of artificial intelligence (AI) in general comes with its own regulations and policies. Knowledge graphs provide a clear audit trail, not only allowing users to trace back the origin of data, but also how it has been processed. This results in easy identification and correction of gaps in the available data.
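To make the audit-trail idea concrete, provenance like this is often recorded with the W3C PROV-O vocabulary. The minimal Python sketch below uses rdflib to link a derived dataset to its source and to the processing step that produced it; the dataset names, URIs, and pipeline label are invented for the example.

```python
# Minimal provenance sketch with rdflib and the W3C PROV-O vocabulary.
# All URIs, dataset names, and the pipeline label are illustrative.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, XSD

PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("prov", PROV)
g.bind("ex", EX)

raw = EX["raw-trial-export"]                 # original source dataset
cleaned = EX["cleaned-trial-data"]           # derived, standardized dataset
pipeline = EX["normalization-pipeline-v2"]   # the processing step

# Record where the derived dataset came from and which process produced it,
# so a suspicious value can be traced back to its source and transformation.
g.add((raw, RDF.type, PROV.Entity))
g.add((cleaned, RDF.type, PROV.Entity))
g.add((pipeline, RDF.type, PROV.Activity))
g.add((cleaned, PROV.wasDerivedFrom, raw))
g.add((cleaned, PROV.wasGeneratedBy, pipeline))
g.add((pipeline, PROV.endedAtTime,
       Literal("2024-05-01T12:00:00", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```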

AIs are (kind of) people too

A bold statement indeed, but not as far off as you might think. People and organizations need common standards, processes, and norms to collaborate efficiently. A clear, unambiguous understanding of the presented information and its origin is also crucial to forming correct and trustworthy conclusions. Furthermore, when new information becomes available, you want to update your prior beliefs and feed this back into the existing pool of knowledge by sharing it with others.

Sound familiar? These challenges also apply to AI and machine learning (ML) applications. This is important because there will always be a human decision maker accompanying these technologies. Knowledge-based models not only support collaboration between different people and organizations, but also between humans and AI. Acknowledging these common challenges is, therefore, very important.

How knowledge platforms and AI work together

Collaborations between knowledge platforms and AI come in different flavors, each with its own level of maturity. The examples below range from basic applications, such as sending data to a machine learning algorithm, to deeper integrations of AI within knowledge graphs that deliver far more value.

Classical machine learning

If you want to build a model that identifies patients at high risk of cardiovascular disease, you will probably start by collecting all the available data. Next, you need to identify key features in these data that are associated with the level of risk. These features, however, consist of different data types and are scattered across several databases, each using its own coding systems and structures.

To bring uniformity to this chaos, knowledge graphs can be a huge help: they combine the data into a single platform, enabling integration and standardization as well as efficient feature selection and extraction. And here, user experience becomes increasingly important. We are now reaching a point where users don't need to know the intricacies behind the knowledge graph in order to extract meaningful conclusions.
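As a rough sketch of that workflow, the Python example below builds a tiny in-memory graph with rdflib, pulls standardized patient features with a single SPARQL query, and trains a scikit-learn classifier on them. The patient records, URIs, and predicates are invented for illustration; in practice the query would run against a governed knowledge graph endpoint.

```python
# Sketch: using a knowledge graph as a single source of ML features.
# The toy records, URIs, and predicates below are made up for illustration.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import XSD
from sklearn.linear_model import LogisticRegression

EX = Namespace("http://example.org/")
g = Graph()

# A few toy patient records: (id, age, smoker, cardiovascular event yes/no).
patients = [("p1", 64, 1, 1), ("p2", 41, 0, 0), ("p3", 58, 1, 1), ("p4", 35, 0, 0)]
for pid, age, smoker, event in patients:
    p = EX[pid]
    g.add((p, EX.age, Literal(age, datatype=XSD.integer)))
    g.add((p, EX.smoker, Literal(smoker, datatype=XSD.integer)))
    g.add((p, EX.hadCardioEvent, Literal(event, datatype=XSD.integer)))

# One SPARQL query returns standardized features, regardless of where the
# underlying records originally lived.
rows = g.query("""
    PREFIX ex: <http://example.org/>
    SELECT ?age ?smoker ?event WHERE {
        ?p ex:age ?age ; ex:smoker ?smoker ; ex:hadCardioEvent ?event .
    }
""")
X = [[int(r.age), int(r.smoker)] for r in rows]
y = [int(r.event) for r in rows]

model = LogisticRegression().fit(X, y)
print(model.predict([[60, 1]]))  # predicted risk class for a new patient
```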

Embeddings encoding the meaning of concepts

Embeddings are long vectors of numbers that capture how closely concepts are related to each other. In the context of scientific papers, for example, each number reflects how closely words are related to each other in the text. This leads to similar embeddings for words with similar meanings. With this knowledge in mind, you can predict, for example, the next word in a sentence. A knowledge graph takes this a step further out of the box by providing the relationships between concepts that are needed to create an embedding.

For example, if you want to find candidates for drug repurposing and you have a knowledge graph with data on existing drugs, you can produce embeddings for these data and see how the drug targets, pathways, conditions, and so on of the different candidates relate to each other. Afterwards, you can use these embeddings to train a model that finds drugs with similar properties.
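The Python sketch below illustrates the ranking step with made-up, pre-computed embedding vectors; in a real setting these vectors would be learned from the graph's triples (targets, pathways, indications) with a knowledge graph embedding method such as TransE.

```python
# Sketch: ranking drugs by embedding similarity.
# The 4-dimensional vectors are placeholders; real embeddings would be
# learned from the knowledge graph's triples.
import numpy as np

drug_embeddings = {
    "drug_A": np.array([0.9, 0.1, 0.3, 0.7]),
    "drug_B": np.array([0.8, 0.2, 0.4, 0.6]),   # similar profile to drug_A
    "drug_C": np.array([0.1, 0.9, 0.8, 0.1]),   # very different profile
}

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

query = "drug_A"
scores = {
    name: cosine(drug_embeddings[query], vec)
    for name, vec in drug_embeddings.items()
    if name != query
}

# The highest-scoring drugs share the most graph context with the query drug
# and are the first repurposing candidates to inspect.
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(name, round(score, 3))
```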

This is already more advanced than the classical machine learning explained above, since feature selection happens automatically and the model is able to spot connections invisible to the human eye. However, this power comes with limitations. First, the data used must be complete, accurate, and of very high quality. Second, the model is a black box: if a wrong prediction is made, it is difficult to trace back its origin. The best way to deal with these shortcomings is to combine different complementary technologies, so that they mitigate each other's weaknesses and efficiently answer the user's questions.

Retrieval augmented generation (RAG)

Large language models (LLMs) are very good at understanding human queries and figuring out what the user wants to know, which makes them highly accessible. However, LLMs are built to generate text, not to store knowledge. They are therefore prone to accidentally 'making things up' and providing users with wrong information. That's where retrieval augmented generation comes to the rescue. By grounding the model's answers in data retrieved from the knowledge graph, the LLM can produce a response augmented with curated knowledge. This improves the accuracy of generative AI models, mitigating the risk of wrongful information production and providing explainability for the generated outcomes.
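The Python sketch below outlines this retrieve-then-generate flow. The SPARQL endpoint, query, predicates, and the generate function are placeholders rather than a specific product API; the point is that the LLM answers only from facts fetched out of the knowledge graph.

```python
# Sketch of a RAG flow: retrieve facts from the knowledge graph first,
# then let the LLM answer using only those facts as context.
# Endpoint URL, predicates, and generate() are hypothetical placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

def retrieve_facts(drug_name: str) -> list[str]:
    sparql = SPARQLWrapper("https://example.org/sparql")  # hypothetical endpoint
    sparql.setQuery(f"""
        PREFIX ex: <http://example.org/>
        SELECT ?target ?indication WHERE {{
            ?d ex:label "{drug_name}" ; ex:hasTarget ?target ; ex:indication ?indication .
        }} LIMIT 20
    """)
    sparql.setReturnFormat(JSON)
    results = sparql.query().convert()
    return [f"{drug_name} targets {b['target']['value']} for {b['indication']['value']}"
            for b in results["results"]["bindings"]]

def answer(question: str, drug_name: str, generate) -> str:
    facts = retrieve_facts(drug_name)
    prompt = (
        "Answer using ONLY the facts below. If the facts are insufficient, say so.\n"
        "Facts:\n- " + "\n- ".join(facts) + f"\n\nQuestion: {question}"
    )
    # `generate` is whatever LLM client the project already uses.
    return generate(prompt)
```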

Read our blog post on retrieval augmented generation (RAG) and learn how it can effectively address LLM hallucinations and out-of-date training data.

Feeding knowledge back into the graph

After AI or ML has created valuable new information, it may be useful to feed this information back into our current knowledge pool. That's where the capabilities of knowledge graphs and knowledge platforms become really valuable, because they form a ready-made environment to retrieve, normalize, structure, and validate all this knowledge. This allows you to manage which information you want to include in your AI or ML analysis. You can, for example, include only high-confidence data or only human-generated knowledge. Feeding data back into the knowledge graph thus provides a fine-tuned approach to recycle this knowledge and use it effectively in future applications. The human role in mediating this knowledge exchange will become increasingly important, again highlighting the crucial role that user experience plays.
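A minimal sketch of that write-back step is shown below in Python with rdflib: only predictions above a confidence threshold are added to the graph, and each new statement carries its score and source so it can be filtered later. The predicate names, threshold, and model label are illustrative choices, not a prescribed schema.

```python
# Sketch of the write-back step: keep only high-confidence predictions and
# annotate each added statement with its score and source.
# Predicates, threshold, and model label are illustrative.
from rdflib import Graph, Namespace, Literal, BNode
from rdflib.namespace import RDF, XSD

EX = Namespace("http://example.org/")
g = Graph()

# Hypothetical model output: (drug, predicted target, confidence).
predictions = [
    ("drug_A", "protein_X", 0.94),
    ("drug_B", "protein_Y", 0.52),   # too uncertain, will be skipped
]

CONFIDENCE_THRESHOLD = 0.9
for drug, target, score in predictions:
    if score < CONFIDENCE_THRESHOLD:
        continue
    statement = BNode()  # reified statement so the score stays attached to the claim
    g.add((statement, RDF.type, RDF.Statement))
    g.add((statement, RDF.subject, EX[drug]))
    g.add((statement, RDF.predicate, EX.predictedTarget))
    g.add((statement, RDF.object, EX[target]))
    g.add((statement, EX.confidence, Literal(score, datatype=XSD.double)))
    g.add((statement, EX.generatedBy, Literal("embedding-model-v1")))

print(g.serialize(format="turtle"))
```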

The future: Knowledge graphs for the typical user

We’re moving towards a world where knowledge graphs can be continuously updated in real time, orchestrated by ML. As previously mentioned, humans play an important part in this evolution because new knowledge needs to be inspected and validated. Luckily, knowledge platforms form the perfect infrastructure for people, multi-modal data, and AI to collaborate using a common language. Over the next few years these models will probably become more and more accessible due to the increased attention to user experience. In this way, knowledge graphs will become useful to everyone looking for an answer amongst the chaos of data.

Hungry for details? View our webinar and learn how to elevate your AI and ML analytics with knowledge graphs.

Or, get best practices and tips for modeling, data quality, and knowledge utilization for generative AI and machine learning in life sciences research in this blog.

How about discussing your project with our experts?

Book a meeting