Real-World Evidence Sources for Drug Safety: Registries and Claims Data Explained

Evelyn Ashcombe

When a new drug hits the market, the real test doesn’t happen in a clinical trial; it happens in the real world. Millions of people take it, with different health conditions, lifestyles, and genetics. That’s where real-world evidence comes in. Two of the most powerful tools for tracking drug safety outside controlled trials are patient registries and claims data. They don’t replace clinical studies, but they fill in the gaps that trials can’t see, such as rare side effects, long-term risks, or how a drug performs in older adults or people with multiple chronic conditions.

What Exactly Is Real-World Evidence?

Real-world evidence (RWE) isn’t made in labs or under strict trial conditions. It’s pulled from everyday healthcare interactions: what doctors record, what pharmacies dispense, what insurers pay for. The U.S. Food and Drug Administration (FDA) officially defined RWE in 2018 as clinical evidence derived from real-world data (RWD). That means data collected during normal care, not research. Registries and claims data are the backbone of this. They’re not new (the FDA has used them since the 1980s), but their role in regulatory decisions has exploded since 2017. Between 2017 and 2021, the FDA approved 12 drugs or new uses where RWE played a direct role. Five of those relied specifically on claims or registry data.

How Disease Registries Work

Disease registries are like detailed medical diaries for groups of patients with the same condition. Think of them as structured databases that track everything: diagnosis, treatments, lab results, imaging, even how patients feel day to day. The Cystic Fibrosis Foundation Patient Registry, for example, helped spot safety signals for ivacaftor, a drug that worked brilliantly in specific genetic subgroups that weren’t well represented in the original trials. Without this registry, those risks might have gone unnoticed for years.

These registries vary in scale. Some are small, run by a single hospital with a few hundred patients. Others, like the SEER cancer registry, cover nearly half the U.S. population. What makes them powerful is depth. A 2021 study found registries provide 37% more detail on long-term outcomes than claims data alone. They capture things like blood test results, imaging reports, and patient-reported symptoms, all data that insurance claims simply don’t record.

But they’re not perfect. Setting up a registry takes 18 to 24 months and costs between $1.2 million and $2.5 million upfront. Annual upkeep runs $300,000 to $600,000. Participation rates are often only 60-80%, meaning the data might not represent everyone. And nearly one in three academic registries shut down within five years due to funding gaps.

Claims Data: The Power of Scale

Claims data is what insurers and government programs like Medicare use to pay doctors and hospitals. Every time someone gets a prescription, visits the ER, or has a lab test, it gets coded and billed. These codes (ICD-10 for diagnoses, CPT for procedures, NDC for drugs) create a massive, continuous trail of healthcare activity.
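To make that trail concrete, here is a minimal sketch of how a coded claim might be represented and queried. The `Claim` fields, the function name, the 90-day window, and the placeholder NDC are all assumptions made for illustration; real claims files have many more fields and vary by payer.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Claim:
    """One billed service (hypothetical, heavily simplified layout)."""
    patient_id: str
    service_date: date
    icd10: Optional[str] = None  # diagnosis code, e.g. "I21.9"
    cpt: Optional[str] = None    # procedure code
    ndc: Optional[str] = None    # dispensed drug code


def patients_with_event_after_drug(claims, drug_ndc, icd10_prefix, window_days):
    """Return patients with a matching diagnosis within window_days of first fill."""
    first_fill = {}
    for c in claims:
        if c.ndc == drug_ndc:
            prev = first_fill.get(c.patient_id)
            if prev is None or c.service_date < prev:
                first_fill[c.patient_id] = c.service_date

    flagged = set()
    for c in claims:
        start = first_fill.get(c.patient_id)
        if start is None or c.icd10 is None:
            continue
        days_since_fill = (c.service_date - start).days
        if c.icd10.startswith(icd10_prefix) and 0 <= days_since_fill <= window_days:
            flagged.add(c.patient_id)
    return flagged


# Made-up records; "00000-0000-00" is a placeholder, not a real NDC.
claims = [
    Claim("p1", date(2024, 1, 5), ndc="00000-0000-00"),
    Claim("p1", date(2024, 2, 1), icd10="I21.9"),
    Claim("p2", date(2024, 1, 10), ndc="00000-0000-00"),
    Claim("p3", date(2024, 1, 1), icd10="I21.4"),
]
flagged = patients_with_event_after_drug(claims, "00000-0000-00", "I21", 90)
print(flagged)  # {'p1'}
```

A match like this is only a starting point: it says a diagnosis was billed after a dispensing, not that the drug caused it.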

Because claims data covers millions of people, it’s unmatched for spotting rare side effects. In 2015, the FDA analyzed 1.2 million Medicare records over five years to check if entacapone (used for Parkinson’s) increased heart risks. No link was found. In 2014, they used 850,000 records to review olmesartan (a blood pressure drug) for risks in diabetics. These studies would’ve been impossible in clinical trials, which rarely enroll more than a few thousand people.

Major commercial claims databases include MarketScan (published under the Truven Health and later the IBM name; the figures cited for it range from 150 to 200 million lives) and Optum Clinformatics (100 million). Medicare claims can follow a patient for 15+ years. That’s critical for spotting delayed side effects, such as liver damage or increased cancer risk, that only show up after years of use.

But claims data has blind spots. It doesn’t capture lab values, physical exam findings, or patient symptoms unless they lead to a billable service. Only 45-60% of lab results are recorded. Diagnosis codes can be wrong, with error rates of up to 20% according to AHRQ. And it doesn’t tell you why a drug was prescribed or whether the patient actually took it.

Registries vs. Claims Data: A Side-by-Side Look

Comparison of Registries and Claims Data for Drug Safety Monitoring

Feature | Registries | Claims Data
Population Size | 1,000-50,000 patients | 100 million+ patients
Clinical Detail | High (87% lab result completeness) | Low (45-60% lab result completeness)
Longitudinal Coverage | 5-10 years (typically) | 15+ years (Medicare)
Data Completeness | 68-92% | 95-98% for inpatient visits
Best For | Rare diseases, detailed outcomes, specialized populations | Common drugs, large-population safety signals
False Positive Rate | Low (due to clinical context) | Up to 22% (requires clinical review)

For rare events, like a side effect affecting 1 in 10,000 people, claims data needs about 1 million records to detect it reliably. Registries, because they’re more detailed, can do it with 500,000. That’s why oncology leads in registry use: cancer drugs often target small, genetically defined groups. Cardiovascular drugs, used by millions, rely more on claims data.
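For a sense of scale, the sketch below computes a simple lower bound: how many patients are needed to see at least one event with a given probability. It assumes independent patients, and the `capture_fraction` parameter (the share of true events the data source actually records) is a hypothetical knob, not a figure from the studies above. This is not how the 1 million and 500,000 figures were derived; reliably separating a signal from background rates takes far more patients than merely observing one case.

```python
import math


def patients_to_observe_one_event(event_rate, confidence=0.95, capture_fraction=1.0):
    """Smallest n such that P(at least one recorded event) >= confidence.

    Assumes independent patients; capture_fraction is an illustrative
    assumption for events the data source fails to record.
    """
    effective_rate = event_rate * capture_fraction
    if not 0 < effective_rate < 1:
        raise ValueError("effective event rate must be between 0 and 1")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be between 0 and 1")
    # Solve 1 - (1 - r)^n >= confidence for n.
    return math.ceil(math.log(1 - confidence) / math.log1p(-effective_rate))


n_full = patients_to_observe_one_event(1 / 10_000)
n_half = patients_to_observe_one_event(1 / 10_000, capture_fraction=0.5)
print(n_full, n_half)  # roughly 30,000 and 60,000
```

Halving the share of events that get recorded roughly doubles the patients needed, which is one way to see why a less detailed source needs a larger sample.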

Why the FDA and EMA Are Betting Big on Both

The FDA’s Sentinel Initiative, launched in 2008, connects 11 major healthcare systems and three claims processors to monitor safety across 300 million patient records. It’s the gold standard for real-time surveillance. In 2022, the FDA reviewed 107 RWE submissions, up from just 29 in 2018. That’s roughly a 270% increase in four years.

The European Medicines Agency (EMA) launched Darwin EU in 2021 to do the same across the EU. By October 2023, it connected 32 databases covering 120 million people. Both agencies now recommend combining registries and claims data. A 2023 ICH proposal found this hybrid approach cuts false safety signals by 40%. Why? Registries confirm the clinical reality behind a statistical spike in claims data.

Dr. Amy Abernethy, former FDA deputy commissioner, said registry data can offer evidence nearly equivalent to randomized trials for certain safety questions. Dr. Janet Woodcock, former head of FDA’s drug division, called claims databases “indispensable” for catching rare risks. But not everyone is convinced. Dr. Joseph Ross from Yale warns that claims data alone often leads to false alarms: 22% of initial signals turned out to be noise after clinical review. That’s why regulators now demand more than just code patterns. They want context.

What’s Changing in 2024 and Beyond

The rules are tightening. In January 2024, the FDA released draft guidance requiring registries to have at least 80% data completeness on key variables to be accepted for safety studies. That’s a big deal: it means half-baked registries won’t cut it anymore.
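A completeness check of this kind can be sketched in a few lines. How the guidance itself defines completeness is not spelled out here, so the per-variable share of non-missing values used below is an assumption, and the field names and records are invented.

```python
def completeness(records, key_variables):
    """Share of records with a non-missing value, per key variable."""
    if not records:
        raise ValueError("no records to assess")
    total = len(records)
    return {
        var: sum(1 for r in records if r.get(var) is not None) / total
        for var in key_variables
    }


def meets_threshold(records, key_variables, threshold=0.80):
    """True if every key variable reaches the completeness threshold."""
    return all(v >= threshold for v in completeness(records, key_variables).values())


# Invented registry extract; None marks a missing value.
records = [
    {"age": 61, "hemoglobin": 13.2, "weight_kg": 80},
    {"age": 47, "hemoglobin": 11.9, "weight_kg": None},
    {"age": 70, "hemoglobin": None, "weight_kg": 66},
    {"age": 55, "hemoglobin": 14.1, "weight_kg": None},
    {"age": 38, "hemoglobin": 12.7, "weight_kg": 72},
]
report = completeness(records, ["age", "hemoglobin", "weight_kg"])
print(report)
print(meets_threshold(records, ["age", "hemoglobin"]))
print(meets_threshold(records, ["age", "hemoglobin", "weight_kg"]))
```

In this toy extract, age and hemoglobin clear the 80% bar while weight does not, so the registry would pass or fail depending on which variables count as key.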

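The FDA’s 2023-2027 RWE Action Plan promises to develop 5-7 new statistical standards for claims data analysis by 2025. One major focus: fixing immortal time bias, a statistical error that arises when follow-up time before a patient’s first prescription is counted as time on the drug. Left uncorrected, it distorts risk estimates, typically making a drug look safer than it is. Proper methods can reduce this bias by 35-50%.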
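A toy cohort shows how immortal time bias arises. In this sketch the `Patient` fields, the `rate_ratio` function, and all numbers are made up for illustration. The naive analysis credits every day of an ever-treated patient's follow-up to the drug; the corrected one counts the days before the first prescription as unexposed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Patient:
    first_rx_day: Optional[int]  # day of first prescription; None if never exposed
    end_day: int                 # day of event or end of follow-up (entry = day 0)
    had_event: bool


def rate_ratio(patients, split_person_time):
    """Event rate ratio, exposed vs. unexposed person-time.

    With split_person_time=False, pre-prescription days of treated
    patients are (wrongly) counted as exposed time.
    """
    exposed_days = unexposed_days = 0
    exposed_events = unexposed_events = 0
    for p in patients:
        if p.first_rx_day is None:
            unexposed_days += p.end_day
            unexposed_events += p.had_event
            continue
        if split_person_time:
            unexposed_days += p.first_rx_day
            exposed_days += p.end_day - p.first_rx_day
        else:
            exposed_days += p.end_day
        # In this toy cohort, treated patients' events occur after first_rx_day.
        exposed_events += p.had_event
    return (exposed_events / exposed_days) / (unexposed_events / unexposed_days)


cohort = [
    Patient(first_rx_day=100, end_day=200, had_event=True),
    Patient(first_rx_day=300, end_day=400, had_event=False),
    Patient(first_rx_day=None, end_day=150, had_event=True),
    Patient(first_rx_day=None, end_day=400, had_event=False),
]
naive = rate_ratio(cohort, split_person_time=False)
corrected = rate_ratio(cohort, split_person_time=True)
print(f"naive: {naive:.2f}, corrected: {corrected:.2f}")  # naive: 0.92, corrected: 4.75
```

In this made-up cohort the naive ratio sits below 1 while the corrected ratio is well above it, because the padded exposed person-time dilutes the event rate among treated patients.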

Technology is catching up too. Novartis started blending wearable data, like heart rate and activity levels, from patients on Entresto with claims records in 2023. AI tools now scan millions of records for unusual patterns, reducing false positives by 28% according to a 2024 JAMA study. The FDA’s REAL program is now standardizing registry data collection for 20 priority diseases, starting with rare conditions where traditional monitoring fails.

Who’s Using This, and Why It Matters

The global real-world evidence market was worth $2.14 billion in 2022. By 2030, it’s expected to hit $10.7 billion. Pharmaceutical companies are spending 8-12% of their pharmacovigilance budgets on RWE, up from 3-5% in 2017. Why? Because regulators demand it. Investors demand it. And patients deserve better safety monitoring than what clinical trials alone can offer.

For patients, this means drugs are being watched more closely after approval. For doctors, it means better data to guide treatment choices. For companies, it means faster approvals and fewer surprises down the line. Registries and claims data aren’t just tools; they’re becoming the new standard for how we understand drug safety in the real world.

14 Comments:
  • Shayne Smith, December 6, 2025 at 15:18

    Honestly? This is the kind of stuff that actually saves lives. I work in pharmacy and see how often we miss things until it's too late. Real-world data isn't glamorous, but it's real.

  • pallavi khushwani, December 8, 2025 at 10:06

    I grew up in a small town in India where people just took meds and hoped for the best. Knowing that someone's tracking this stuff across millions of records... it gives me hope. Not just for me, but for my grandma who's on five different pills.

  • brenda olvera, December 9, 2025 at 21:35

    This is why I love American healthcare even when it sucks. We've got the data. We've got the will. We just need to stop arguing long enough to use it.

  • olive ashley, December 10, 2025 at 05:25

    So you're telling me the FDA is now relying on insurance billing codes to decide if a drug kills people? Wow. So next they'll use TikTok trends to approve cancer meds. I'm just saying.

  • Ibrahim Yakubu, December 12, 2025 at 02:17

    Nigeria has no such system. My cousin died from a drug reaction and no one even knew it was the drug. We have no registries, no claims data, just prayers and hope. This is a luxury only rich countries can afford.

  • Brooke Evers, December 13, 2025 at 06:13

    I just want to say thank you to everyone who works on these databases. I know it's boring, behind-the-scenes work, but when my mom was on that new blood thinner and we saw the warning pop up in her chart because of a pattern found in claims data? That saved her. You don't get medals for this, but you should.

  • Saketh Sai Rachapudi, December 15, 2025 at 05:13

    America thinks it's so advanced but they still can't even spell 'registry' right in their reports. And claims data? Ha! My cousin works at a hospital and says 40% of the codes are just guesses. This whole system is built on sand.

  • joanne humphreys, December 16, 2025 at 08:04

    I'm curious how they handle cultural differences in reporting. Like, in some communities, people don't report side effects because they don't trust the system. Does that skew the data? Or is that just ignored?

  • Kay Jolie, December 18, 2025 at 00:10

    The convergence of longitudinal claims datasets with granular registry phenotyping represents a paradigmatic shift in pharmacovigilance architecture. We're no longer just detecting signals; we're reconstructing clinical narratives at scale. The ICH E2E framework is now fundamentally inadequate without this hybridized epistemic infrastructure.

  • Dan Cole, December 18, 2025 at 07:25

    You know what's ironic? We spend billions on clinical trials that only include 2% of the population, then act shocked when the drug behaves differently in the real world. The real trial has been running for decades. We just refused to look at the results.

  • Billy Schimmel, December 20, 2025 at 04:10

    So basically, we're using insurance records to catch drug side effects because we don't trust doctors to report them. That's... not reassuring.

  • Max Manoles, December 20, 2025 at 12:04

    The FDA's 80% data completeness requirement is long overdue. I've reviewed registry proposals that were missing vital lab values and demographic info. If you can't track hemoglobin levels or age distribution, you're not studying safety; you're guessing.

  • Katie O'Connell, December 21, 2025 at 09:42

    It is imperative to acknowledge that the integration of real-world evidence into regulatory decision-making necessitates a rigorous adherence to methodological standards, particularly with regard to confounding variable adjustment and temporal alignment of exposure windows. Without such precision, the validity of inferred causal relationships remains inherently compromised.

  • Akash Takyar, December 22, 2025 at 23:49

    I am grateful that our healthcare systems are finally moving toward data-driven safety monitoring. However, we must ensure that funding for registries is sustained, and that participation rates are improved through community engagement and transparent communication. This is not merely a technical issue; it is a moral obligation.