22 Advancing clinical trial reporting and AI integration: Optimizing protocol data extraction using LLMs and regulatory best practices

Ramya Sri Baluguri; Nicholas Anderson

doi:10.1017/cts.2024.713

Published online by Cambridge University Press: 11 April 2025

Ramya Sri Baluguri: University of California, Davis
Nicholas Anderson: University of California, Davis

Abstract

Objectives/Goals: This study aimed to enhance clinical trial data management by applying large language model (LLM) information retrieval and generation techniques within the clinical trial reporting workflow. We focused on improving reporting compliance, reducing human labor, and promoting a standardized reporting structure and data quality oversight.

Methods/Study Population: We compared approved protocols from UC Davis IRB-approved investigator-initiated studies with the records of the same studies as reported to ClinicalTrials.gov. Our baseline data extraction system employs commercial LLMs and retrieval-augmented generation (RAG) to isolate data sources within the secure extraction environment. We stratified protocol documents into easy, complex, and random categories based on study focus, document complexity, the extent of amendments or modifications, and completion metrics from ClinicalTrials.gov. We developed a pilot web-based architecture to capture variations in categorization, labeling, and reporting style, and compared the generated extraction data. Evaluation was primarily qualitative, through review by expert staff.

Results/Anticipated Results: Our results revealed significant variations in reporting quality, with dependencies stemming from multiple authors and stages throughout the clinical trial protocol lifecycle. Based on these variations, we used prompt engineering to improve the compliance of the pilot application's output with the Protocol Registration and Results System (PRS) structured data format for various study types. We piloted the assisted workflow with prospective studies, partnering with study investigators and clinical trial office staff to assist in reviewing and creating clinical trial reports. Initial studies reported by our system were approved and released to the public by PRS staff. We are refining content generation and workflows for different study components and evaluating their use in quality and training areas.

Discussion/Significance of Impact: Our system fosters collaboration, efficient review, and compliance with clinical trial reporting standards. It supports the promise of AI-driven assistance in clinical trial management, design, and reporting. We focus on the multiple stakeholders, expertise, and data flows in the organizational management of clinical and translational science.
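
Illustrative sketch (not the authors' system): the abstract does not describe the implementation, so the following is only a minimal example of the retrieval step in a RAG-style extraction, in which protocol passages relevant to a reporting field are selected and placed in a prompt. All names here (`tokenize`, `score`, `retrieve`, `build_prompt`), the sample protocol text, and the prompt wording are hypothetical. Simple token overlap stands in for the embedding-based retrieval a production system would more likely use, and no LLM is called.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(query, chunk):
    """Count tokens shared by the query and the chunk (toy relevance score)."""
    q = Counter(tokenize(query))
    c = Counter(tokenize(chunk))
    return sum(min(q[t], c[t]) for t in q)

def retrieve(query, chunks, k=2):
    """Return the k chunks with the highest overlap score."""
    ranked = sorted(chunks, key=lambda ch: score(query, ch), reverse=True)
    return ranked[:k]

def build_prompt(field, chunks, k=2):
    """Assemble a prompt restricted to the retrieved protocol excerpts."""
    context = "\n".join(f"- {c}" for c in retrieve(field, chunks, k))
    return (
        f"Using only the protocol excerpts below, state the {field}.\n"
        f"Excerpts:\n{context}\n"
        "If the excerpts do not contain it, answer 'not stated'."
    )

# Hypothetical protocol passages, invented for illustration only.
protocol_chunks = [
    "The primary outcome measure is change in systolic blood pressure at 12 weeks.",
    "Participants must be adults aged 18 to 65 years.",
    "The study will enroll 120 participants at a single site.",
]

prompt = build_prompt("primary outcome measure", protocol_chunks, k=1)
print(prompt)
```

Restricting the prompt to retrieved excerpts is one way to keep generated registration fields traceable to the protocol text, which is what makes expert review of the output practical.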

Type: Informatics, AI and Data Science
Information: Journal of Clinical and Translational Science, Volume 9, Issue s1, April 2025, p. 8

DOI: https://doi.org/10.1017/cts.2024.713
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
