420 Computable Phenotyping with “Big Data” as a Foundation for Artificial Intelligence Algorithm Construction: Puberty as a Transdisciplinary Case Example

Published online by Cambridge University Press:  03 April 2024

Lorah D. Dorn
Affiliation:
The Pennsylvania State University

Abstract

OBJECTIVES/GOALS: Artificial intelligence (AI) depends on quality machine learning (ML) algorithms constructed with high-quality training data. This TL1 trainee project develops a disease-agnostic computable phenotype framework for ML algorithm construction, modeling male puberty as a case example.

METHODS/STUDY POPULATION: A computable phenotype of male puberty was constructed to answer the question: “Does early pubertal timing increase the risk of developing type II diabetes (T2D) in males?” A computable phenotype of males < 18 years old was created in the TriNetX© Diamond Network using Boolean operator data queries. TriNetX© contains patient electronic health record information (ICD-10 diagnoses, anthropometric measures). An exploratory analysis of patient counts reflecting various computable phenotypes allowed the outcome (T2D) to be compared between males diagnosed with precocious puberty (E30.1, the ICD-10 code for early pubertal timing) and those without, controlling for body mass index (BMI).

RESULTS/ANTICIPATED RESULTS: Subjects (n=12,996,132) displayed the following computable phenotype: male, < 18 years old, without ever having a BMI documented >85th percentile. Males diagnosed with precocious puberty (E30.1) had 6.89 times the odds of developing T2D when aged 14-18 years old compared with those without the diagnosis (OR 6.89, 95% CI: 5.17-9.19, p<0.0001). Next steps involve training an ML model on the health data of each computable phenotype grouping, with anticipated results identifying underlying salient pathophysiologic variables. A generalized computable phenotype approach is being further developed to: 1) explore clinical questions in large databases like TriNetX©, and 2) model disease development with AI/ML algorithm construction.

DISCUSSION/SIGNIFICANCE: Computed phenotypes suggest that males with precocious puberty may have increased T2D risk. Next steps are to use subject data to train an AI/ML algorithm, develop the model to identify salient pathophysiologic variables, and synthesize a generalized AI/ML developmental research framework for dissemination.
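As an illustration only, the two steps described in the abstract (a Boolean phenotype filter followed by an odds ratio comparison) can be sketched in Python. Everything named here is an assumption: the `Patient` record, its fields, and both functions are hypothetical and do not reflect the TriNetX© query interface or data schema, and the counts in the example are made up rather than taken from the study. The abstract does not say how the confidence interval was computed; the sketch uses a standard Wald interval on the log odds scale, which is consistent with the reported interval being roughly symmetric around the OR on that scale.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Patient:
    """Hypothetical flat patient record (not the TriNetX schema)."""
    sex: str
    age_years: float
    # Highest BMI percentile ever documented; None if no BMI was recorded.
    max_bmi_percentile: Optional[float]
    icd10_codes: FrozenSet[str]


def in_base_phenotype(p: Patient) -> bool:
    """Boolean AND of the criteria listed in the abstract:
    male, < 18 years old, never a documented BMI > 85th percentile.
    Assumption: a patient with no documented BMI satisfies the last criterion."""
    bmi_ok = p.max_bmi_percentile is None or p.max_bmi_percentile <= 85
    return p.sex == "M" and p.age_years < 18 and bmi_ok


def has_precocious_puberty(p: Patient) -> bool:
    """Exposure flag: ICD-10 code E30.1 appears in the record."""
    return "E30.1" in p.icd10_codes


def odds_ratio_wald(a: int, b: int, c: int, d: int,
                    z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald confidence interval for a 2x2 table.

    a: exposed with outcome       b: exposed without outcome
    c: unexposed with outcome     d: unexposed without outcome
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration; these are NOT the study's data.
example_or, example_lower, example_upper = odds_ratio_wald(20, 80, 10, 90)
print(f"OR = {example_or:.2f}, 95% CI: {example_lower:.2f}-{example_upper:.2f}")
```

In a workflow like the one described, the filter would define the cohort, the exposure flag would split it into two groups, and the four cell counts would come from tallying the outcome within each group.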

Type
Precision Medicine/Health
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (https://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is unaltered and is properly cited. The written permission of Cambridge University Press must be obtained for commercial re-use or in order to create a derivative work.
Copyright
© The Author(s), 2024. The Association for Clinical and Translational Science