Bridging the Gap: Key Data Elements in Oncology RCTs vs. RWD

Conducting Real-World Evidence (RWE) research in oncology requires access to diverse data sources, such as patient registries, electronic health records (EHRs), claims data, and cancer-specific databases. These sources often vary in completeness and quality, which limits their suitability for some research questions. Additionally, the rapid advancement of cancer therapies, including immunotherapies, targeted therapies, and combination treatments, makes it harder to draw definitive conclusions from real-world data (RWD). Despite these challenges, RWE research holds great potential for advancing personalized medicine in oncology, as it can identify patterns in real-world patient populations and inform treatment decisions based on individual characteristics. This personalized approach could lead to more targeted and effective cancer therapies and improved patient outcomes. However, while randomized controlled trials (RCTs) provide highly structured and reliable data, RWD often lacks key clinical details, which limits its suitability for comparative effectiveness research, external control arm studies, and regulatory decision-making.

 

This article explores the critical oncology data elements that are systematically collected in RCTs and are key to answering research questions, but are often missing or inconsistently reported in RWD.

 

Key Data Elements Missing in RWD Compared to RCTs

1. Tumor-Specific Characteristics

  • TNM Staging: RCTs systematically document tumor size, lymph node involvement, and metastasis, whereas RWD often lacks structured staging data and relies instead on physician notes or diagnostic codes.
  • Biomarker and Genomic Data: Molecular alterations (e.g., EGFR, ALK, KRAS mutations, PD-L1 expression) are routinely required in trials, often as eligibility or stratification criteria, but are infrequently available in RWD unless it is linked with specialized molecular testing databases.
  • Histology and Tumor Grading: RCTs precisely categorize tumor subtypes, whereas RWD may have missing or vague histopathology reports.

 

2. Treatment-Specific Data

  • Line of Therapy (LoT) Sequencing: RCTs rigorously define first-line, second-line, and subsequent treatments, while RWD databases often lack structured LoT information, making it difficult to infer the sequence of treatments a patient received.
  • Discontinuation Reasons: Trials capture whether treatment cessation was due to adverse events (AEs), lack of efficacy, or patient preference. In RWD, only prescription stops or administrative data may be available, without specific reasons.
  • Dosing Adjustments & Modifications: RCTs meticulously track dose reductions and delays, while RWD typically records only initial prescriptions without details on adjustments.
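
To illustrate why LoT has to be inferred in RWD, here is a minimal rule-based sketch in Python that derives lines of therapy from dated drug administrations. The function name, the 28-day regimen window, and the 120-day gap rule are illustrative assumptions, not a validated algorithm; real studies usually tune such rules per tumor type.

```python
from datetime import date, timedelta

# Assumption: drugs first given within 28 days of the line start belong to one regimen.
REGIMEN_WINDOW = timedelta(days=28)
# Assumption: a gap of more than 120 days between administrations starts a new line.
MAX_GAP = timedelta(days=120)


def infer_lines_of_therapy(administrations):
    """Group (date, drug) tuples for one patient into inferred lines of therapy.

    Returns a list of dicts with keys "line", "start", "end", and "drugs".
    """
    lines = []
    current = None
    for day, drug in sorted(administrations):
        if current is None:
            current = {"line": 1, "start": day, "end": day, "drugs": {drug}}
            continue
        gap_too_long = day - current["end"] > MAX_GAP
        in_window = day - current["start"] <= REGIMEN_WINDOW
        new_drug = drug not in current["drugs"]
        if gap_too_long or (new_drug and not in_window):
            # Close the current line and open the next one.
            lines.append(current)
            current = {"line": current["line"] + 1, "start": day,
                       "end": day, "drugs": {drug}}
        else:
            # Same regimen: extend the current line.
            current["drugs"].add(drug)
            current["end"] = day
    if current is not None:
        lines.append(current)
    return lines


# Hypothetical patient: platinum doublet, then a switch to docetaxel.
records = [
    (date(2023, 1, 5), "carboplatin"),
    (date(2023, 1, 5), "pemetrexed"),
    (date(2023, 1, 26), "carboplatin"),
    (date(2023, 1, 26), "pemetrexed"),
    (date(2023, 2, 16), "pemetrexed"),
    (date(2023, 6, 1), "docetaxel"),
    (date(2023, 6, 22), "docetaxel"),
]
for line in infer_lines_of_therapy(records):
    print(line["line"], line["start"], line["end"], sorted(line["drugs"]))
```

Note that this sketch cannot tell a switch caused by progression from one caused by toxicity, which is exactly the discontinuation-reason gap described above.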

 

3. Disease Progression and Response

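  • Objective Response Rate (ORR): RCTs use imaging-based RECIST (Response Evaluation Criteria in Solid Tumors) to classify each patient's response as complete response, partial response, stable disease, or progressive disease; ORR is the proportion of patients with a complete or partial response. In RWD, radiologic assessments are typically not standardized.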
  • Progression-Free Survival (PFS): RCTs monitor disease progression through predefined imaging schedules, whereas RWD often infers progression based on treatment changes or hospital admissions.
  • Minimal Residual Disease (MRD) Status: Measured in trials for hematologic malignancies but rarely available in RWD.
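
The progression proxy mentioned under PFS can be sketched as follows: when no imaging-based progression date exists, the start of the next treatment or death is taken as the event, and patients with neither are censored at last contact. The function and parameter names are hypothetical, and using the next treatment start as a stand-in for progression is an assumption that may misdate true progression.

```python
from datetime import date


def proxy_pfs_days(index_date, next_treatment_start=None,
                   death_date=None, last_contact=None):
    """Return (days, event_observed) for a treatment-change-based PFS proxy.

    The event is the earliest of next treatment start or death. If neither
    is recorded, the patient is censored at last_contact.
    """
    events = [d for d in (next_treatment_start, death_date) if d is not None]
    if events:
        return (min(events) - index_date).days, True
    if last_contact is None:
        raise ValueError("need an event date or a last-contact date for censoring")
    return (last_contact - index_date).days, False


# Hypothetical patients: one switches therapy, one is censored.
print(proxy_pfs_days(date(2023, 1, 5), next_treatment_start=date(2023, 6, 1)))
print(proxy_pfs_days(date(2023, 1, 5), last_contact=date(2023, 3, 5)))
```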

 

4. Safety and Adverse Event Reporting

  • CTCAE-graded Toxicities: RCTs document toxicity severity based on the Common Terminology Criteria for Adverse Events (CTCAE), while RWD tends to underreport side effects, often capturing only severe toxicities that require hospitalization.
  • Onset and Resolution of Adverse Events: In trials, exact dates of AE onset and resolution are recorded, whereas RWD sources lack structured AE timelines.
  • Patient-Reported Outcomes (PROs): Fatigue, pain, and quality-of-life metrics are assessed in RCTs but are absent or inconsistently recorded in RWD.

 

5. Survival Outcomes

  • Overall Survival (OS): RCTs follow patients for survival under a defined protocol, so death dates are close to complete, whereas RWD often depends on linkage to national death registries, which may have reporting delays.
  • Cause of Death: Trials differentiate between disease-related and treatment-related mortality, while RWD often lacks granularity.

 

6. Performance Status: The Case of ECOG Scores

ECOG Performance Status is a critical prognostic factor in oncology but is often missing, inconsistently recorded, or inferred in RWD.

 

Table 1: ECOG Performance Status in RCTs compared to RWD

Aspect | RCTs | RWD
Availability | Always collected at baseline and follow-up | Often missing or inconsistently documented
Standardization | Assessed systematically by investigators | May be recorded as free-text in EHRs or not captured at all
Granularity | Clearly defined criteria (ECOG 0-5) | Inferred from physician notes or treatment patterns
Impact on Treatment Decisions | Used for eligibility criteria and stratification | May not always influence recorded clinical decisions
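
As a rough illustration of what recovering ECOG from free-text EHR notes involves, the sketch below uses a simple regular expression. The pattern, the function name, and the example notes are assumptions made for illustration. A production pipeline would also need to handle ranges such as "0-1", negation, and numbers that happen to follow the keyword (for example, dates).

```python
import re

# Keyword, then up to 20 non-digit characters (e.g. " PS: "), then a score 0-5.
ECOG_PATTERN = re.compile(
    r"\b(?:ECOG|performance\s+status|PS)\b"
    r"[^0-9\n]{0,20}?"
    r"([0-5])\b",
    re.IGNORECASE,
)


def extract_ecog(note):
    """Return the first ECOG-like score found in a clinical note, or None."""
    match = ECOG_PATTERN.search(note)
    return int(match.group(1)) if match else None


# Hypothetical note snippets.
notes = [
    "ECOG PS: 1 at baseline",
    "Performance status 2, tolerating therapy",
    "Patient ambulatory, no PS documented",
]
for note in notes:
    print(repr(note), "->", extract_ecog(note))
```

Even when extraction works, the result reflects what the clinician chose to write down, so scores obtained this way are not equivalent to the investigator-assessed values collected in trials.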

 

Why Do These Gaps Matter?

 

Missing or incomplete data in RWD can lead to:

  • Bias in comparative studies: Incomplete staging or performance status data can distort treatment effect estimates.
  • Misclassification of treatment patterns: Lack of structured LoT information hinders accurate real-world effectiveness assessments.
  • Regulatory challenges: Incomplete safety data can limit the acceptability of RWD-based evidence for decision-making.

 

Addressing RWD Gaps: Potential Solutions

  • Data Linkage: Combining claims, EHRs, pathology reports, and genomic registries enhances completeness.
  • Natural Language Processing (NLP): Extracting tumor characteristics, performance status, and adverse events from unstructured clinical notes recovers information that structured fields miss.
  • Structured Data Capture in EHRs: Encouraging the use of oncology-specific data fields (e.g., TNM staging, ECOG scores, biomarker results) improves capture at the point of care.
  • Machine Learning for Data Imputation: Models can estimate missing ECOG scores or disease progression markers from the data that are available.
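
The data linkage idea can be sketched in a few lines: join a genomic registry onto EHR records by patient identifier and compare field completeness before and after. The record layout, field names, and helper functions are hypothetical. The sketch assumes a shared, reliable patient ID, which real linkage projects often have to establish through probabilistic matching.

```python
def link_records(ehr, genomics):
    """Left-join genomic results onto EHR records keyed by patient ID.

    Assumption: an EHR value takes precedence; registry values only fill gaps.
    """
    linked = {}
    for pid, record in ehr.items():
        merged = dict(record)
        for field, value in genomics.get(pid, {}).items():
            if merged.get(field) is None:
                merged[field] = value
        linked[pid] = merged
    return linked


def completeness(records, fields):
    """Return the share of patients with a non-missing value for each field."""
    n = len(records)
    if n == 0:
        return {field: 0.0 for field in fields}
    return {
        field: sum(r.get(field) is not None for r in records.values()) / n
        for field in fields
    }


# Hypothetical data: None marks a missing value.
ehr = {
    "P1": {"stage": "IV", "ecog": 1, "egfr": None},
    "P2": {"stage": None, "ecog": None, "egfr": None},
    "P3": {"stage": "IIIB", "ecog": 0, "egfr": "negative"},
    "P4": {"stage": "IV", "ecog": None, "egfr": None},
}
genomics = {
    "P1": {"egfr": "exon 19 deletion"},
    "P2": {"egfr": "negative"},
}

fields = ["stage", "ecog", "egfr"]
print("before:", completeness(ehr, fields))
print("after: ", completeness(link_records(ehr, genomics), fields))
```

A completeness report like this is also a useful first check of whether a source is fit for a given research question, since linkage improves only the fields that the added source actually carries.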

 

In conclusion, while RWD holds immense potential for oncology research, key data gaps in staging, biomarker data, response assessments, safety reporting, and performance status must be addressed to ensure its utility for regulatory, clinical, and epidemiological applications. By leveraging advanced data integration, structured documentation, and machine learning techniques, the oncology community can improve the robustness and reliability of RWD in complementing traditional RCT evidence.

By Nadia Barozzi

Passionate about data-driven insights and the advancement of Real World Evidence research, drug safety and pharmacovigilance.