Regulatory Repository | 30+ Years Historical Data
We have built an integrated repository of Life Sciences regulatory data from Open Data sources. It lets users trace the entire lifecycle of a drug or medical device, covering:
- Chemical compounds (NLM's PubChem)
- Clinical trials (ClinicalTrials.gov, WHO's ICTRP)
- Regulatory pathway documentation (IND, NDA, etc.)
- Reported adverse events (FDA's FAERS / MAUDE)
- Manufacturer payments to providers (HHS' OpenPayments)
- Medicare reimbursement data (CMS' Provider Utilization and Payment Data)
The Repository offers users a 360-degree view of each previously cleared drug or medical device.
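As a rough illustration of what "integrated" means here, the sketch below groups rows from several sources under one product key. The stage names, the `LifecycleRecord` class, the name-based key, and the sample rows are all hypothetical placeholders, not the Repository's actual schema; real record linkage would need far more robust identifiers than a lowercased name.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class LifecycleRecord:
    """All records gathered for one product, grouped by lifecycle stage."""
    product_key: str
    stages: dict = field(default_factory=lambda: defaultdict(list))


def build_repository(source_rows):
    """Group (stage, product_key, payload) rows into one record per product."""
    repo = {}
    for stage, product_key, payload in source_rows:
        # Naive normalization (an assumption): trim and lowercase the name.
        key = product_key.strip().lower()
        record = repo.setdefault(key, LifecycleRecord(product_key=key))
        record.stages[stage].append(payload)
    return repo


# Hypothetical sample rows; the product names are placeholders.
rows = [
    ("compound", "ExampleDrug", {"source": "PubChem"}),
    ("clinical_trial", "exampledrug ", {"source": "ClinicalTrials.gov"}),
    ("adverse_event", "EXAMPLEDRUG", {"source": "FAERS"}),
    ("payment", "OtherDrug", {"source": "OpenPayments"}),
]
repo = build_repository(rows)
print(sorted(repo["exampledrug"].stages))
```

Keying every source on one normalized identifier is what allows a single lookup to return the compound, trial, adverse event, and payment records together.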
Medicine Navigator
Once we have integrated all the Open Data related to a drug, we'll build a navigation mechanism that lets users explore this massive amount of data through personalized "views".
We think these "views" would be a good starting point:
Spain
This view presents Open Data sourced from Spain's regulatory agency, AEMPS.
FDA's 510(k)
We download and process thousands of PDFs from the FDA and extract all the text from those documents.
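A minimal sketch of such a batch step might look like the following. It assumes the PDFs are already downloaded to a local folder, and it takes the actual PDF-to-text function as a parameter (`extract_text` is a hypothetical callable, for example a thin wrapper around whatever PDF library is in use), since the source does not say which tool performs the extraction.

```python
from pathlib import Path
from typing import Callable, Dict


def extract_corpus(pdf_dir: Path, extract_text: Callable[[Path], str]) -> Dict[str, str]:
    """Run `extract_text` on every .pdf under `pdf_dir`; skip files that fail."""
    corpus = {}
    for pdf_path in sorted(pdf_dir.rglob("*.pdf")):
        try:
            text = extract_text(pdf_path)
        except Exception as exc:  # one bad file should not stop the batch
            print(f"skipped {pdf_path.name}: {exc}")
            continue
        # Collapse runs of whitespace left over from page layout.
        corpus[pdf_path.name] = " ".join(text.split())
    return corpus
```

Passing the extractor in as an argument keeps the loop independent of any particular PDF library and makes it easy to test with a stand-in function.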
EMA - Summary of Opinions
This view presents Open Data sourced from the European Union's regulatory agency, EMA.
Open Data to Train AI, ML
The data stored in our Open Data Repository module can be used to train AI / ML models. Think of this data as the "Ground Truth" of what has happened in the US pharma space over the last 30 years.
Users can leverage Life Sciences-specific regulatory documents crafted before the age of "AI", including:
- 40,000+ Protocols, Statistical Analysis Plans (SAPs), and Informed Consent Forms (ICFs)
- 70,000+ FDA application files
- 110,000 full FDA labels ("SPL")
Users can train their models on the text extracted from these documents, which totals 600+ million words.
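One common way to hand extracted text to a training pipeline is one JSON object per line (JSONL). The sketch below is only an illustration under that assumption: the field names (`id`, `text`, `n_words`) and the two sample documents are hypothetical, and the whitespace-based word count is a simplification.

```python
import json


def to_training_records(corpus):
    """Yield one JSON line per document with its text and a word count."""
    for doc_id, text in corpus.items():
        record = {"id": doc_id, "text": text, "n_words": len(text.split())}
        yield json.dumps(record)


# Hypothetical placeholder documents, not real label or protocol text.
corpus = {
    "label-001": "Take one tablet daily with water.",
    "protocol-002": "Primary endpoint is overall survival.",
}
lines = list(to_training_records(corpus))
total_words = sum(json.loads(line)["n_words"] for line in lines)
print(total_words)
```

Storing a per-document word count alongside the text makes it straightforward to report corpus size or to filter out very short documents before training.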
We can also include additional Open Data from other US agencies, including:
- CMS – Medicare
- HHS – healthcare
- NLM – research and publication references