Regulatory Repository | 30+ Years of Historical Data

We have built an integrated repository of regulatory data in the Life Sciences space using Open Data. This allows users to trace a drug's or medical device's entire lifecycle including:

  • Starting with chemical compounds (NLM's PubChem)
  • Through clinical trials (ClinicalTrials.gov, WHO's ICTRP)
  • Documentation on the regulatory pathway (IND, NDA, etc.)
  • Reported adverse events (FDA's FAERS / MAUDE)
  • Manufacturer payments to providers (HHS' OpenPayments)
  • Medicare reimbursement data (CMS' Provider Utilization and Payment Data)

The Repository offers users a 360-degree view of each previously cleared drug or medical device.

Medicine Navigator

Once we have integrated all the Open Data related to a drug, we'll build a navigation mechanism that lets users explore this massive amount of data through personalized "views".

We think these "views" would be a good starting point:

Intellectual Property View: relevant data extracted from the US Patent and Trademark Office
Chemical View: compound-level information
Payments & Providers View: this US-specific view displays the money flows in the US healthcare system
Research View: publications relevant to the specific drug
Regulatory View: data gathered across offices of regulatory agencies
Foreign Views: country-specific data from sources outside the US (initially Mexico and Spain)

Spain

Here's a view of a type of Open Data sourced from Spain's regulatory agency, AEMPS.

FDA's 510(k)

We download and process thousands of PDFs from the FDA and extract all the text from those documents.
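
The following is a minimal sketch of this kind of batch text-extraction step, not a description of our production pipeline. It assumes the PDFs are already downloaded to a local folder and that the third-party pypdf library is installed; the function names `pdf_to_text` and `extract_folder` are hypothetical and used for illustration only.

```python
from pathlib import Path
from typing import Callable, Dict


def pdf_to_text(pdf_path: Path) -> str:
    """Extract the text of one PDF with pypdf (assumed third-party dependency)."""
    from pypdf import PdfReader  # imported lazily so the batch logic works without it

    reader = PdfReader(str(pdf_path))
    # extract_text() may yield nothing for scanned pages that lack a text layer
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def extract_folder(
    folder: Path, extractor: Callable[[Path], str] = pdf_to_text
) -> Dict[str, str]:
    """Map each PDF file name in `folder` to its extracted text."""
    texts: Dict[str, str] = {}
    for pdf_path in sorted(folder.glob("*.pdf")):
        try:
            texts[pdf_path.name] = extractor(pdf_path)
        except Exception as exc:
            # One corrupt file should not stop the whole batch
            texts[pdf_path.name] = ""
            print(f"skipped {pdf_path.name}: {exc}")
    return texts
```

The extractor is passed in as a parameter so that a different backend (for example, an OCR step for scanned documents) could be swapped in without changing the batch loop.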

EMA - Summary of Opinions

Here's a view of a type of Open Data sourced from the European Union's regulatory agency, EMA.

Open Data to Train AI, ML

The data stored in our Open Data Repository module can be used to train AI / ML models. Think of this data as the "Ground Truth" of what has happened in the US pharma space over the last 30 years.

Users can leverage Life Sciences-specific regulatory documents crafted before the age of "AI", including:

  • 40,000+ protocols, Statistical Analysis Plans (SAPs), and Informed Consent Forms (ICFs)
  • 70,000+ FDA application files
  • 110,000 full FDA labels ("SPL")

Users can train their models on the text extracted from all of those documents, which together contain 600+ million words.
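
As a rough illustration of how extracted text could be packaged for model training, here is a minimal sketch. It assumes the text is already available as a mapping from a document ID to its text; the JSONL record layout (`id`, `text`, `n_words`) and the function names are hypothetical, not a format the Repository is known to export.

```python
import json
import re
from typing import Dict, Iterator

WORD_RE = re.compile(r"\w+")


def count_words(text: str) -> int:
    """Count word-like tokens (runs of letters, digits, or underscores)."""
    return len(WORD_RE.findall(text))


def to_jsonl_records(texts: Dict[str, str]) -> Iterator[str]:
    """Yield one JSON line per non-empty document, ready to write to a .jsonl file."""
    for doc_id, text in texts.items():
        if not text.strip():
            continue  # skip documents where no text was extracted
        record = {"id": doc_id, "text": text, "n_words": count_words(text)}
        yield json.dumps(record)
```

Summing `n_words` over all records gives a corpus-level word count, which is one simple way to sanity-check the size of a training set before use.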

We can also include additional Open Data from other US agencies, including:

  • CMS – Medicare
  • HHS – healthcare
  • NLM – research and publication references

Contact us

Please contact us for more details.