Datafairy Bioassay Annotation

Project Charter

This project aims to convert unstructured assay protocol descriptions into a high-quality FAIR data set, and create standards for this information.

This project will:

Revolutionize biotech R&D by standardizing research methods and improving reproducibility.
Save time for scientists by reducing assay planning efforts and avoiding failed experiments.
Help data scientists harmonize datasets, cleanse data, and enable advanced analytics, ML, and AI.
Foster precompetitive collaborations and interoperable scientific data initiatives.
Lower internal bioassay curation costs and simplify regulatory submissions.
Aid assay kit vendors and CROs by increasing visibility in public databanks like ChEMBL and PubChem.
Improve scientific publications’ quality through common assay annotation standards and public data banks.
Maximize research funders’ ROI by increasing the value of funded science.

The project do this by addressing key issues:

Over 1.4 million assay protocols in publications lack suitable formats for automated mining.
Research organizations spend 4–12 weeks per assay selecting, setting up, and validating protocols, often leading to wasted efforts.
Current manual curation methods are labor-intensive, while automated systems remain inaccurate.
Many organizations already convert unstructured assay protocols into machine-readable forms, duplicating costly efforts.
Obsolete assay protocols and evolving technologies hinder reproducibility and historical data interpretation.

What will the project deliver?

Share costs for converting published bioassay data into accurate, machine-readable FAIR data using a community-defined model.
Develop a FAIR data model based on public ontologies like BioAssay Ontology and promote it as an industry standard.
Ensure FAIR data is publicly accessible after a brief exclusivity period for partners.