Accepted to BioNLP 2026

SCoPE: Planning for Hybrid Querying over Clinical Trial Data

Structured Clinical hybrid Planning for Evidence retrieval in clinical trials

Suparno Roy Chowdhury, Manan Roy Choudhury, Tejas Anvekar, Muhammad Ali Khan, Kaneez Zahra Rubab Khakwani, Mohamad Bassam Sonbol, Irbaz Bin Riaz, Vivek Gupta

Arizona State University, Mayo Clinic

1,500 questions | 31 target fields | 159 x 32 table

Abstract

We study clinical trial table reasoning, where answers are not directly stored in visible cells and must be inferred from semantic understanding through normalization, classification, extraction, and lightweight domain reasoning. We introduce SCoPE, a multi-LLM planner-based framework that decomposes the problem into row selection, structured planning, and execution. Across 1,500 hybrid reasoning questions over oncology clinical-trial tables, explicit planning improves grounded row-level reasoning accuracy over direct prompting and stronger tabular baselines, while maintaining a favorable accuracy-efficiency tradeoff.

Method

  1. Executor row selection: identify candidate relevant rows from the visible table.
  2. Planner structured reasoning: predict source field, relevant columns, reasoning rules, and output constraints.
  3. Executor final generation: apply the plan and return row-aligned predictions.
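The three stages above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: `Plan`, `run_pipeline`, the `toy_*` functions, and the example table are all hypothetical, and the keyword-based stand-ins take the place of the LLM calls the framework would actually make.

```python
from dataclasses import dataclass, field
from typing import Callable

Row = dict[str, str]  # one table row: column name -> cell text


@dataclass
class Plan:
    """Hypothetical schema for the four items the planner predicts."""
    source_field: str
    relevant_columns: list[str]
    reasoning_rules: list[str]
    output_constraints: list[str] = field(default_factory=list)


def run_pipeline(
    question: str,
    table: list[Row],
    select_rows: Callable[[str, list[Row]], list[int]],
    make_plan: Callable[[str, list[Row]], Plan],
    execute: Callable[[Plan, Row], str],
) -> dict[int, str]:
    # Stage 1 (executor): pick candidate row indices from the visible table.
    row_ids = select_rows(question, table)
    # Stage 2 (planner): build one structured plan from the candidate rows.
    plan = make_plan(question, [table[i] for i in row_ids])
    # Stage 3 (executor): apply the plan per row, keeping answers row-aligned.
    return {i: execute(plan, table[i]) for i in row_ids}


# Toy stand-ins for the LLM calls (assumptions, for illustration only).
def toy_select(question: str, table: list[Row]) -> list[int]:
    q = question.lower()
    return [i for i, row in enumerate(table) if row["cancer_type"].lower() in q]


def toy_plan(question: str, rows: list[Row]) -> Plan:
    return Plan(
        source_field="phase",
        relevant_columns=["phase"],
        reasoning_rules=["Phase 3 or Phase 4 counts as late phase"],
        output_constraints=["answer yes or no"],
    )


def toy_execute(plan: Plan, row: Row) -> str:
    return "yes" if row[plan.source_field] in {"Phase 3", "Phase 4"} else "no"


table = [
    {"trial_id": "T1", "cancer_type": "Breast", "phase": "Phase 3"},
    {"trial_id": "T2", "cancer_type": "Lung", "phase": "Phase 3"},
    {"trial_id": "T3", "cancer_type": "Breast", "phase": "Phase 2"},
]
predictions = run_pipeline(
    "Which breast cancer trials are late phase?",
    table, toy_select, toy_plan, toy_execute,
)
print(predictions)
```

Keying the output by row index is one way to keep predictions row-aligned, so each answer can be traced back to the row it was derived from.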

Dataset

The benchmark contains 1,500 programmatically augmented hybrid reasoning questions constructed from an expert-authored seed set of 500 questions over oncology clinical-trial data.

Statistic        Value
Rows             159
Columns          32
Unique trials    105
Cancer types     19
Total questions  1,500
Target fields    31

Main Results

SCoPE improves grounded row-level reasoning over direct prompting and tabular baselines. In the reported results, it scores highest on Qwen3 (63.19 vs. 56.32 for the best baseline) and on GPT-OSS (74.31 vs. 74.17). On Llama-3.3 it ties CoT on Table F1 (70.87) while improving grounding metrics.

Method     Qwen3 F1  Llama-3.3 F1  GPT-OSS F1
Zero Shot  56.32     66.96         73.50
CoT        55.37     70.87         74.17
Few-Shot   54.74     69.38         73.99
EHRAgent   32.99     30.99         34.85
SCoPE      63.19     70.87         74.31
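The margins implied by the table can be checked with a few lines of arithmetic. The sketch below copies the reported F1 scores and computes, per model, SCoPE's gain over the best-scoring baseline; the variable names are illustrative only.

```python
# F1 scores copied from the results table (method -> model -> F1).
f1 = {
    "Zero Shot": {"Qwen3": 56.32, "Llama-3.3": 66.96, "GPT-OSS": 73.50},
    "CoT":       {"Qwen3": 55.37, "Llama-3.3": 70.87, "GPT-OSS": 74.17},
    "Few-Shot":  {"Qwen3": 54.74, "Llama-3.3": 69.38, "GPT-OSS": 73.99},
    "EHRAgent":  {"Qwen3": 32.99, "Llama-3.3": 30.99, "GPT-OSS": 34.85},
    "SCoPE":     {"Qwen3": 63.19, "Llama-3.3": 70.87, "GPT-OSS": 74.31},
}

gains = {}
for model in ("Qwen3", "Llama-3.3", "GPT-OSS"):
    # Best score among all non-SCoPE methods for this model.
    best_baseline = max(
        scores[model] for method, scores in f1.items() if method != "SCoPE"
    )
    gains[model] = round(f1["SCoPE"][model] - best_baseline, 2)

print(gains)  # {'Qwen3': 6.87, 'Llama-3.3': 0.0, 'GPT-OSS': 0.14}
```

The gain is largest on Qwen3 (6.87 points over Zero Shot), small on GPT-OSS (0.14 over CoT), and zero on Llama-3.3, where SCoPE and CoT both reach 70.87.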

Acknowledgment

Supported by the Mayo Clinic and Arizona State University Alliance for Health Care Collaborative Research Seed Grant Program.