Delivering Data Driven Value

How FAIR Is My Data: The FAIRe(nough) Benchmark at AstraZeneca

Leveraging the full potential of data is the key to gaining new insight into the scientific and developmental challenges faced by the pharmaceutical industry today. The FAIR principles offer a conceptual framework to meet this challenge by focusing on making data Findable, Accessible, Interoperable and Reusable, but how these concepts are realised varies greatly with the context and the critical needs of each use case.

As a key step in the journey to FAIR-compliant data, organisations need to evaluate their data against a set of measurable, well-defined metrics. The FAIRe(nough) benchmark has been developed at AZ with a focus on integration of data products and systems against a specific context or use case. Providing both high- and low-level assessment metrics, the benchmark is an essential enabler for anyone facing a data integration and FAIRification project.

In this talk, we will present the main characteristics of the FAIRe(nough) benchmark, provide an overview of its development and initial application use cases, and discuss the perspectives and future plans for development within AZ.

Improving the Efficiency and Effectiveness of R&D: Data Science and Standards

The Pistoia Alliance is pleased to announce a series of three webinars hosted by Accenture Boston Innovation Hub. As part of our overall theme of Improving the Efficiency and Effectiveness of R&D, each webinar will explore the power of collaboration to solve shared challenges, drive transformation and reduce barriers to innovation in R&D.

  • Collaborate to Innovate the Lab of the Future
  • Data Science and Standards
  • AI/ML in R&D Automation

Learn from industry experts and thought leaders as they share the latest trends and best practices in life sciences R&D and discuss how we can collaborate to innovate.

Ontologies in Action, Role in Data-Centric Biomedical Projects

Targeted drug discovery is data-intensive and involves extracting statistical trends from events of interest collected over a large population. But relevant data in biomedical research is often hard to find. High-throughput experiments churn out data quickly, but much of it is devoid of context due to missing labels, wrong annotations, and irretrievable storage. And in the case of predicting rare events, such as cell lines that respond to a specific drug, only a few thousand relevant samples will be available.

Generating more data is essential, but we often do not sufficiently leverage existing data. Ontologies are key to good interoperability, enabling data and metadata integration and allowing researchers to leverage more of what is known in their scientific domains to make data-driven decisions faster. In addition, the advanced relationships in ontologies help mine for unexpected co-occurrences and suggest novel uses of drugs or similarities between diseases. This webinar will demonstrate how ontologies are essential to understanding the incoming data deluge for an enterprise and how they help retrieve the most relevant data for R&D efforts.

IDMP Core Ontology – Enable Patient Safety with a Semantic Approach

The voice of the empowered patient is on the rise. Through a series of keynote presentations, focused discussions, and breakout sessions, the Pistoia Alliance’s Annual European Conference focused on Patient Centricity in R&D in an effort to look at the challenges and opportunities for embedding patient centricity within the biopharma value chain, from early discovery right through to post-commercialization and its potential for real-world data insights.

Session Summary

This session will explore the Pistoia Alliance’s new IDMP Ontology project and its potential global implementation approach.

  • Cross-industry collaboration with an iterative implementation approach
  • Implementation of the ISO IDMP Logical Model with a FAIR ontological approach
  • Enable the interoperability of data and metadata across the value chain

Speakers

  • Sheila Elz, Master Data Manager, Bayer
  • Heiner Oberkampf, CEO, Accurids
  • Gerhard Noelken, Project Lead, Pistoia Alliance
  • Mike Meriton, Co-Founder & COO, EDM Council

UDM (Unified Data Model) for chemical reactions – past, present and future

The UDM (Unified Data Model) is an open, extendable and freely available data format for the exchange of experimental information about compound synthesis and testing. The UDM was initially developed in a collaborative project between Elsevier and Roche, in which chemical reaction data from a variety of disparate data sources at Roche were consolidated and integrated into the Roche in-house version of the Reaxys database. Elsevier adapted the UDM to its needs and then donated its pre-4.0 release to the Pistoia Alliance for further development together with the five project founders (Elsevier, Roche, BIOVIA, GSK and Novartis, joined later by BMS), who contributed funding and expertise to the Pistoia Alliance UDM project between 2017 and 2020. The latest UDM version, 6.0, was made freely available to the community under the MIT license in January 2021. This article discusses the past, present, and future of the UDM exchange format, along with the factors that contribute to its successful adoption.

Maximizing Data Value for Biopharma through FAIR and Quality Implementation: FAIR Plus Q

Over recent years, there has been exciting growth in collaboration between academia and industry in the life sciences to make data more Findable, Accessible, Interoperable and Reusable (FAIR) to achieve greater value. Despite considerable progress, the transformative shift from an application-centric to a data-centric perspective, enabled by FAIR implementation, remains very much a work in progress on the ‘FAIR journey’. In this review, we consider use cases for FAIR implementation. These can be deployed alongside assessment of data quality to maximize the value of data generated from research, clinical trials, and real-world healthcare data, which are essential for the discovery and development of new medical treatments by biopharma.

HELM Toolkit

HELM (Hierarchical Editing Language for Macromolecules) enables the representation of a wide range of biomolecules (proteins, nucleotides, antibody-drug conjugates, etc.) whose size and complexity render existing small-molecule and sequence-based informatics methodologies impractical or unusable.

Expert Panel Discussion: Ontologies and FAIR

Biopharmaceutical industry R&D continues the shift from being application-centric to being data-centric in recognition of the idea that, while technologies and applications may come and go, it is the data assets from internal and external sources that really drive drug discovery and development. Therefore, it is critically important for organizations to manage data and metadata effectively so that they may be used and reused to provide maximum value, and those organizations that do this best will have a significant and enduring advantage over their competitors.

The ultimate goal of the FAIR Data Principles is to help researchers increase the reusability of data. In particular, increasing the interoperability of data using the formal, standardized methods suggested by FAIR lays the foundation for establishing a shared understanding of the meaning of the data. Ontologies are key to good interoperability, which enables data and metadata integration and allows researchers to leverage more of what is known in their scientific domains to make higher quality decisions faster. Knowledge graphs, increasingly relied upon by researchers to help them understand the complex relationships between biomedical entities, are impossible to implement without ontologies.

Ontologies can be challenging to manage and deploy, but those activities are much more successful when started with data and metadata that are FAIR. The use of the FAIR Principles, conversely, can help make better ontologies. One major challenge today, however, is the limited availability of expertise in knowledge engineering and the FAIR Data Principles across biopharmaceutical industry R&D, which can hinder efforts to build and leverage knowledge graphs and other ways of making the best use and reuse of data. We have assembled a panel of experts to help you navigate this space, to help you understand why it’s valuable to leverage the interplay of ontologies and FAIR, and what a difference it can make in R&D decision making.

Speakers

Moderator: Ted Slater, Senior Director, Product Management PaaS, Elsevier
 
Panel Participants:

  • Jane Lomax, Head of Ontologies, SciBite
  • Peter McQuilton, Product Owner, GSK
  • Nathalie Conte, Data infrastructure Lead (Omic Data), AstraZeneca
  • Sabine Schefzick, Director, Science and Clinical Analytics and Informatics, Pfizer

IDMP Viewer

IDMP Viewer is a Java application designed to make both the IDMP structure and its content as easy to access as possible. It can serve as both a web application and a REST API. IDMP Viewer is an open-source project hosted by the EDM Council. See https://github.com/edmcouncil/onto-viewer for details.

FAIR4Clin Implementation Guide

A Guide for Clinical Trial and Healthcare Data

This work was done and is maintained as part of the FAIR implementation project, an initiative by the Pistoia Alliance, a not-for-profit organization that facilitates pre-competitive collaboration in the life sciences industry. The FAIR4Clin guide consists of three parts: Introduction, Metadata and Application.

Enhancing Access to Clinical Trial Data for Secondary Use

While the life sciences R&D industry is re-imagining clinical trial design in the age of digitalization, historical clinical trial data remains an important source of evidence that could inform today’s drug discovery and development.

In many pharma companies, it takes considerable time for an internal business function, such as research, to gain access to the company’s historical clinical data assets. Specifically, secondary use of clinical trial data needs approval by the appropriate internal governance function (legal/compliance/medical). Inter alia, this approval is granted upon verifying that the informed consent form (signed by the patient involved in the clinical trial) provides permission for the company to access the data for secondary use.

Furthermore, in a global clinical trial, patients will be recruited from many different nation-states, with their different languages, and each nation-state will have a national competent agency (regulatory agency) which may have a view on the eligibility of the secondary use of the data from that clinical trial.

At the Pistoia Alliance, we are exploring how the FAIRification of such data, along with advanced analytics including AI/ML/NLP-powered systems, can be used to access, share and reuse patient data from historical clinical trials, while taking into account any applicable regulations and legislation.

In this talk, our presenters will highlight the key aspects to be considered in our path towards finding a common solution to this common problem.

Featured Topics

  • F.A.I.R. and Shared Data, Transforming the Ecosystem to drive Insights and Advanced Analytics
  • The principal legal and ethical considerations regarding the repurposing of clinical trial data
  • Policy-based data access

Learning Objectives

At the conclusion of this session, participants should be able to:

  • Recognize the role of advanced analytics in transforming data sharing
  • Consider the pathways afforded to us for the repurposing of (‘the secondary use of’) clinical trial data within the current legal and ethical frameworks for data privacy and confidentiality
  • Evaluate data access through purpose and policy

Speakers

  • Benjamin Szilagyi, MSc, VP, Head Insights Data & Experimental Analytics, Roche
  • Francis Crawley, Executive Director, GCP Alliance
  • Chris Edwards, Solution Architect, Patient Data, AstraZeneca