ONTOFORCE: Accelerating drug discovery using knowledge graphs

Knowledge graph

Pharmaceutical research investment

According to Deloitte, the 20 leading pharmaceutical companies spent $139 billion on research and development in 2022. Despite this investment, the rate of new drug approvals has slowed in recent years. This gap points to the need for a shift in drug discovery methods, especially in underserved therapeutic areas.

Automation and the Pharma 4.0 vision

Automating drug discovery workflows at research sites offers a way forward and aligns with the Pharma 4.0 vision, which emphasizes aggregating clinical, process development, and manufacturing data to improve outcomes.

 


Challenges of siloed data

Biotech and pharma companies accumulate large volumes of data over time, including information about molecules, lab experiments, and clinical trials.

However, researchers often find it difficult to access and use this data effectively because it is stored in different places. The painstaking process of connecting and consolidating this data can take weeks or even months, impeding the drug discovery journey.

Crafting the solution: the knowledge graph

The solution to fragmented and disconnected data sources lies in making use of a knowledge graph.


The architecture of knowledge graph construction and downstream tasks

Creating a knowledge graph involves two important parts: the user interface and the data storage system.

Data integration services

Extract, transform, load (ETL) processes convert and combine the data efficiently and prepare it for easy access. Monitoring and notifications make sure the data keeps flowing smoothly.
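The ETL pattern described above can be sketched in a few lines. This is a minimal illustration, not ONTOFORCE's pipeline: the two record layouts, the field names (`cmpd`, `compound_name`, `ic50_nm`, `phase`) and the in-memory `store` are all assumptions made for the example. Bad rows are logged and skipped, standing in for the monitoring and notification step.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("etl")

# Hypothetical raw records from two sources that name the same field differently.
LAB_ROWS = [{"cmpd": " aspirin ", "ic50_nm": "120"}]
TRIAL_ROWS = [{"compound_name": "Aspirin", "phase": "3"}]


def transform_lab(row):
    """Normalize a lab row: clean the compound name, parse the numeric value."""
    return {"compound": row["cmpd"].strip().lower(), "ic50_nm": float(row["ic50_nm"])}


def transform_trial(row):
    """Normalize a clinical trial row into the same shape as lab rows."""
    return {"compound": row["compound_name"].strip().lower(), "phase": int(row["phase"])}


def run_etl():
    """Extract rows from each source, transform them, and load them into one store."""
    store = {}
    jobs = ((LAB_ROWS, transform_lab, "lab"), (TRIAL_ROWS, transform_trial, "trial"))
    for rows, transform, source in jobs:
        for row in rows:
            try:
                record = transform(row)
            except (KeyError, ValueError) as exc:
                # Stand-in for monitoring: report the bad row and keep going.
                log.warning("skipping bad %s row %r: %s", source, row, exc)
                continue
            key = record.pop("compound")
            store.setdefault(key, {}).update(record)
    return store
```

Calling `run_etl()` merges both sources under one normalized compound key, so a researcher can read lab and trial data for the same compound from a single place.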


Enhancing knowledge graph with predictive models

Predictive models extend what the knowledge graph can do. They support tasks such as checking toxicity or recommending similar drugs, which make the knowledge graph more impactful and give researchers more useful information.

Search functionality

The knowledge graph is also useful for searching. It helps users find specific information and important details. This information can be used to make smart choices, like adjusting experiments or improving drug production using past data.


Data licensing and maintenance

One critical aspect of building and maintaining a knowledge graph is data licensing.

Understanding licensing terms

Many sources provide data under various licenses, and it's crucial to understand the licensing terms and ensure that the data is available for use.

Regular license review

Moreover, maintaining data licenses is not a one-time task. It requires regular reviews because sources and their licenses can change over time. This ongoing process ensures that the knowledge graph remains compliant with the latest licensing terms.
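One way to keep license reviews from being forgotten is to record when each source was last checked and flag the ones that are overdue. The sketch below assumes a hypothetical `SourceLicense` record and a 180-day review interval; both are illustrative choices, not a recommendation from the source.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class SourceLicense:
    """Hypothetical record of a data source and its license review status."""
    source: str
    license_name: str
    last_reviewed: date


def due_for_review(licenses, today, max_age_days=180):
    """Return the names of sources whose last review is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [lic.source for lic in licenses if lic.last_reviewed < cutoff]
```

For example, with one source reviewed in January and another in September, a check run on 1 October returns only the January source.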

Creating globally unique URIs

After integrating data sources into the knowledge graph, the first step is to create unique URIs for entities.

Choosing the right URI structure

The choice of URI structure depends on various factors, including whether a source already provides its own URI scheme.
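A minimal sketch of URI creation, assuming a placeholder namespace (`https://example.org/kg`) and a hypothetical `mint_uri` helper: it reuses an identifier that the source already publishes as a URI, and otherwise builds one from the source name, entity type, and local ID.

```python
from urllib.parse import quote

# Placeholder namespace for the example; a real graph would use its own domain.
BASE = "https://example.org/kg"


def mint_uri(source, entity_type, local_id, source_uri=None):
    """Return a globally unique URI for an entity.

    If the source already publishes a URI for the entity, reuse it.
    Otherwise combine source, type, and local ID, so that the same local ID
    in two different sources can never collide.
    """
    if source_uri:
        return source_uri
    parts = (source.lower(), entity_type.lower(), str(local_id).strip())
    # Percent-encode each part so spaces and slashes cannot break the path.
    return BASE + "/" + "/".join(quote(part, safe="") for part in parts)
```

Including the source in the path is a deliberate choice here: uniqueness comes from the structure itself, and deciding which URIs refer to the same real-world entity is left to a later matching step.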


 

Navigating equivalency matching in knowledge graphs

As the knowledge graph grows with data from multiple sources, the challenge of equivalency matching arises.

Multi-layered equivalency approach

Equivalency matching means finding entities in the graph that represent the same concept, even if they're from different sources.
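The idea can be sketched with two matching layers and a union-find structure that groups entities into equivalence clusters. The layers chosen here (a shared cross-reference identifier, then a normalized name) and the entity fields `uri`, `name`, and `xrefs` are assumptions for illustration; the source does not specify which layers are used.

```python
from collections import defaultdict


def normalize(name):
    """Lowercase a name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def match_equivalents(entities):
    """Cluster entities that share a cross-reference or a normalized name."""
    parent = {e["uri"]: e["uri"] for e in entities}
    seen = {}  # matching key -> URI of the first entity that carried it
    for e in entities:
        keys = [("xref", x) for x in e.get("xrefs", [])]  # layer 1: shared identifiers
        keys.append(("name", normalize(e["name"])))       # layer 2: normalized names
        for key in keys:
            if key in seen:
                parent[find(parent, e["uri"])] = find(parent, seen[key])
            else:
                seen[key] = e["uri"]
    clusters = defaultdict(set)
    for uri in parent:
        clusters[find(parent, uri)].add(uri)
    return sorted(sorted(cluster) for cluster in clusters.values())
```

Because matches are transitive, an entity that shares only an identifier with one record and only a name with another still lands in the same cluster. Name matching alone can produce false positives, so a production system would likely weight or restrict that layer.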


Modularization strategies for scalable systems

Building and maintaining a knowledge graph with data from numerous sources requires a scalable approach.

Advantages of modularization

Modularization provides several advantages, including:

  • Parallel processing: Processing modules independently enables faster data integration.
  • Scalability: Technologies like Kubernetes enable scaling processing resources for better efficiency.
  • Isolation: If an issue arises in one module, it doesn't disrupt the entire process. Only the affected module needs attention.

Reintegration challenge

However, modularization also introduces the challenge of reintegrating data modules to ensure that all links within the global knowledge graph are established correctly. This step involves stitching together the modularized data.
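Parallel processing, isolation, and the stitching step can be illustrated together. In this sketch each module is built independently in a thread pool, a failing module is recorded without stopping the others, and a final pass checks that every cross-module link points at a known entity. The function names and the row layout (`uri`, `links`) are hypothetical, and a local thread pool stands in for whatever orchestration a real deployment uses.

```python
from concurrent.futures import ThreadPoolExecutor


def build_module(name, rows):
    """Build one module: map each entity URI to the URIs it links to."""
    if any("uri" not in row for row in rows):
        raise ValueError(f"module {name!r} has a row without a URI")
    return {row["uri"]: row.get("links", []) for row in rows}


def build_all(sources):
    """Build every module in parallel; a failure stays isolated to its module."""
    built, failed = {}, {}
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(build_module, name, rows)
            for name, rows in sources.items()
        }
    # Leaving the with-block waits for all submitted work to finish.
    for name, future in futures.items():
        try:
            built[name] = future.result()
        except ValueError as exc:
            failed[name] = str(exc)
    return built, failed


def stitch(built):
    """Reintegrate modules: keep links whose target exists, report the rest."""
    known = {uri for module in built.values() for uri in module}
    edges, dangling = [], []
    for module in built.values():
        for uri, links in module.items():
            for target in links:
                (edges if target in known else dangling).append((uri, target))
    return edges, dangling
```

The `dangling` list is the useful output of the stitching pass: it names exactly which links could not be established in the global graph, so only the affected modules need attention.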

 

Data quality control and maintenance

Maintaining data quality is crucial throughout knowledge graph construction and maintenance.

Stages of quality control

Downloading

The downloading stage checks data sources for changes in structure or format. Validation ensures that the downloaded data matches expectations.
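A download check of this kind might look like the following sketch, which assumes the source delivers CSV and that the expected column names are known in advance. The column list and the `validate_download` helper are hypothetical.

```python
import csv
import io

# Hypothetical expected layout for one source file.
EXPECTED_COLUMNS = ["compound_id", "name", "smiles"]


def validate_download(text, expected=EXPECTED_COLUMNS, min_rows=1):
    """Return a list of problems found in a downloaded CSV; empty means it passed."""
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames or []
    problems = []
    missing = [col for col in expected if col not in header]
    extra = [col for col in header if col not in expected]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    row_count = sum(1 for _ in reader)
    if row_count < min_rows:
        problems.append(f"only {row_count} data rows, expected at least {min_rows}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a single report show everything that changed at the source.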

Modularization

During the modularization stage, we perform quality control checks on each module, which include link verification and data normalization.

Integration

During the integration stage, we combine data domains into the global knowledge graph and conduct quality control checks. A regression test compares the current data with previous versions to identify any significant changes.
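A simple form of such a regression test compares entity counts per type between two builds and flags large swings. The 10% threshold and the `regression_report` helper are assumptions for this sketch; a real test would likely compare more than counts.

```python
def regression_report(previous, current, max_change=0.10):
    """Flag entity types whose count changed by more than max_change (a fraction).

    previous and current map an entity type to its count in each build.
    Types that appear or disappear entirely are always flagged.
    """
    report = {}
    for entity_type in sorted(set(previous) | set(current)):
        old = previous.get(entity_type, 0)
        new = current.get(entity_type, 0)
        if old == new:
            continue
        # A type that is new in this build has no baseline, so treat it as infinite change.
        change = (new - old) / old if old else float("inf")
        if abs(change) > max_change:
            report[entity_type] = (old, new)
    return report
```

Each flagged entry carries the old and new counts, which gives a reviewer enough context to decide whether the change reflects a genuine source update or an integration fault.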

Collaboration and issue resolution

Collaboration plays a vital role in maintaining data quality and resolving issues.

Approaches for issue resolution

When issues arise due to source-specific changes, there are two primary approaches for resolution:

  • Adjust logic: Logic within the knowledge graph can be adjusted to handle specific issues.
  • Contact source: Collaboration with the original data source is often the preferred approach.

Conclusion


Building and maintaining a knowledge graph for drug discovery is a complex but highly rewarding endeavor.


Key components of successful knowledge graph construction and maintenance


Achieving success in knowledge graph construction and maintenance hinges on elements such as data licensing compliance, the creation of globally unique URIs, equivalency matching, modularization for scalability, data quality control, and collaborative issue resolution.

By using knowledge graphs, the pharmaceutical industry can speed up drug discovery and promote innovation in the search for new treatments.