Emerging Scholars – Mentors

EMERGING RESEARCH SCHOLARS AI PhD MENTORS


Department of Neurology

Abbas Babajani-Feremi, Ph.D.

Dr. Babajani’s lab is dedicated to advancing the integration of neuroimaging and electrophysiological modalities, such as magnetoencephalography (MEG), intracranial electroencephalography (EEG), and functional MRI (fMRI), with cutting-edge AI techniques, including deep learning algorithms. We focus on developing AI-driven methodologies to better understand and diagnose neurological disorders, particularly epilepsy and neurodegenerative diseases such as Alzheimer’s Disease (AD), Lewy Body Dementia (LBD), and Parkinson’s Disease (PD). By leveraging multi-modal neuroimaging and electrophysiological data, we aim to uncover novel biomarkers and neural patterns that can enhance diagnostic accuracy, guide therapeutic interventions, and deepen our understanding of disease mechanisms. Additionally, we are exploring innovative AI approaches to decode speech from brain signals, with the goal of creating advanced brain-computer interface (BCI) systems. These systems hold significant potential for enhancing communication capabilities in individuals with speech impairments, particularly in patients with conditions such as amyotrophic lateral sclerosis (ALS) or those recovering from stroke, offering new pathways for interaction and improving quality of life. We welcome ERS-AI PhD students with a background in AI and a passion for interdisciplinary research to join our team in pushing the boundaries of neuroscience and AI integration.


Department of Neuroscience

Michelle Bedenbaugh, Ph.D.

In the Bedenbaugh Lab, we have two overarching research questions: how does an individual choose between satiating hunger and fulfilling other need states, and how do perturbations during critical periods of development influence neural circuit formation, physiology, and behavior across the lifespan? To answer these questions, we use a whole-brain approach to reveal brain regions at the intersection of feeding and other motivated behaviors and to identify how developmental perturbations disrupt neural networks. One of the main components of this approach is an advanced neuroanatomical pipeline that combines tissue clearing and light sheet microscopy with computational neural mapping. This pipeline employs deep learning algorithms to detect labeled neurons and register their precise locations within the Allen Brain Atlas Common Coordinate Framework, enabling systematic analysis of circuit-wide patterns. Through this unbiased, whole-brain approach, we hope to identify new brain regions and neural networks that may aid us in the treatment of several metabolic and neuropsychiatric diseases.


Department of Neuroscience

Sara Burke, Ph.D.

Higher cognitive functions that decline in old age and the early stages of AD, such as memory and executive functions, are supported by neural networks distributed across the medial temporal lobe (MTL) and prefrontal cortex (PFC). Critically, these structures are among the earliest to accumulate pathology in AD, for which aging is the single greatest risk factor. While the precise mechanisms that render the aged brain vulnerable to neurodegeneration remain to be determined, it is known that aging is associated with a host of regionally specific neurobiological alterations within the PFC and MTL that do not correlate with one another. This fact presents a major challenge for the development of effective therapeutics because higher cognition is supported by networks distributed across these vulnerable areas. Thus, targeted interventions that restore function in one brain region may neglect or exacerbate dysfunction in another, hindering the restoration of normal cognition. As such, interventions that target the optimization of “cognitive networks” rather than discrete brain regions may be more effective for improving behavioral outcomes in older adults. To do this, we need new technologies that can link cellular changes at the microscopic level to global changes in the macroscopic brain networks that cooperate to support higher cognitive function. A current focus of my research program is developing machine learning methods that co-register different imaging platforms and classify cellular activity, so that cellular changes can be linked to global brain connectivity.


Department of Physiology and Aging

Erica Dale, Ph.D.

More than half of the ~275,000 traumatic spinal cord injuries (SCI) that occur globally each year are at the cervical level, leading to paralysis and respiratory compromise or failure. Approximately 20-30% of cervical SCI (cSCI) patients will require ventilator support, for which there are very few therapeutic options for recovery. Indeed, the leading cause of morbidity and mortality after cSCI is respiratory compromise. Even in cases where mechanical ventilation is not required, many people with SCI are unable to cough to clear their airways and thus die of pneumonia. Acute epidural electrical stimulation has emerged as a strategy to restore vital motor, sensory, and autonomic functions in both experimental and clinical settings after SCI. For example, after spinal injury, epidural stimulation improves cardiovascular, bladder, and trunk stability via neuromodulation of spinal neural networks. More recently, we have shown modest success in eliciting respiratory neuroplasticity in the spinal neural network controlling breathing after short-term epidural stimulation in rats. Although a limited number of underlying mechanisms have been proposed, little is known to date about how epidural stimulation elicits this motor function at the neuronal level. Even less is known about the capacity of epidural stimulation to promote long-lasting recovery and device independence, or about the stimulation paradigms by which this could occur. Thus, it is imperative to functionally map the stimulation parameter space in order to characterize and optimize recovery.


Department of Health Outcomes & Biomedical Informatics

Yi Guo, Ph.D., FAMIA

Recent advances in artificial intelligence (AI), including natural language processing (NLP) and large language models (LLMs), offer powerful tools to enhance autism spectrum disorder (ASD) phenotyping, early risk prediction, and the development of clinical decision support (CDS) tools. These models can extract nuanced features from unstructured notes, such as developmental delays and behavioral observations, that are rarely captured in structured electronic health record (EHR) fields. However, few studies have applied LLMs to ASD research or tested their utility in real-world pediatric settings. Existing ASD prediction models typically rely on structured codes and static features, which limits their accuracy and generalizability. Furthermore, clinical implementation of AI models remains limited due to concerns about workflow compatibility, clinician trust, and performance biases. To address these gaps, we propose to integrate advanced AI methods with multi-sourced EHR data to develop scalable tools for early ASD detection, evaluated through a silent trial in routine care.

In this project, we propose to develop, validate, and test analytic tools that integrate advanced AI methods with large-scale, multi-sourced EHR data for early ASD detection. We will leverage longitudinal EHR data from UF Health and the OneFlorida+ Clinical Research Network.


Department of Neuroscience

Habibeh Khoshbouei, Ph.D.

Landmark scientific discoveries support the neural population doctrine, in which the neuronal population, not the single neuron, is the essential unit of computation in many brain regions. New computing technologies have enabled neuroscience research at the level of the neural population. The long-term goal of our research is to apply artificial intelligence to the analysis of dopamine neural populations to decode neural dynamics. We recently employed live-cell calcium imaging in midbrain slices of DAT-cre/loxP-GCaMP6f (DAT-GCaMP6f) mice of either sex, together with computational analyses, to show that functional network connectivity differs greatly between the substantia nigra pars compacta (SNc) and ventral tegmental area (VTA). Using complex network analysis, we found a higher incidence of hyperconnected (i.e., hub-like) neurons in the VTA than in the SNc. The lower number of hyperconnected neurons in the SNc is consistent with the interpretation that the SNc dopamine neuronal network is less resilient to the neuronal loss implicated in neurological disorders. Our ongoing studies extend this work in vivo to freely moving DAT-GCaMP6f mice of either sex, using live-cell calcium imaging through microendoscopic lenses. This approach enables imaging of previously inaccessible dopamine neuronal populations deep within the midbrain of freely moving animals exposed to saline or methamphetamine.
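As a rough illustration of how hub-like neurons might be flagged from calcium traces, the sketch below builds a functional network by thresholding pairwise Pearson correlations and then counts each neuron's connections. This is not the lab's published pipeline: the function names, the correlation threshold of 0.7, and the rule that a hub connects to at least half of the other neurons are all assumptions made for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length activity traces."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat trace carries no correlation information
    return cov / (sx * sy)

def functional_degrees(traces, corr_threshold=0.7):
    """For each neuron, count the other neurons whose |correlation| meets the threshold."""
    n = len(traces)
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if abs(pearson(traces[i], traces[j])) >= corr_threshold:
                degrees[i] += 1
                degrees[j] += 1
    return degrees

def hub_neurons(traces, corr_threshold=0.7, hub_fraction=0.5):
    """Indices of neurons linked to at least hub_fraction of the other neurons.

    Both thresholds are hypothetical choices, not values from the source.
    """
    degrees = functional_degrees(traces, corr_threshold)
    min_degree = hub_fraction * (len(traces) - 1)
    return [i for i, d in enumerate(degrees) if d >= min_degree]

# Toy traces (made-up numbers): neurons 0-2 co-vary strongly, neuron 3 does not.
traces = [
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0],
    [0.1, 1.1, 2.0, 3.2, 3.9, 5.1],
    [5.0, 4.0, 3.1, 2.0, 1.0, 0.0],
    [1.0, -1.0, 1.0, -1.0, 1.0, -1.0],
]
degrees = functional_degrees(traces)
hubs = hub_neurons(traces)
```

Comparing the fraction of hubs between two regions, such as the SNc and VTA, would then come down to running the same function on each region's traces with identical thresholds.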


Department of Health Outcomes and Biomedical Informatics

Mei Liu, Ph.D.

My long-term research goal is to develop innovative Artificial Intelligence/Machine Learning (AI/ML) methods to support Predictive, Preventive, Personalized, and Participatory (P4) medicine. The ERS-AI PhD student will be working on collaborative research projects that address challenges in EHR-data analysis such as federated learning, transfer learning, and personalized learning for more accurate and robust disease prediction and risk factor identification. Developing AI/ML algorithms that will improve model reproducibility, interpretability, transportability, and fairness will be a central focus of the research projects. The first project to which the student will be recruited will involve model development for acute kidney injury (AKI) prediction, prognosis, and sub-phenotyping using multi-institutional electronic health records (EHRs).


Department of Neuroscience

Freddyson Martinez-Rivera, Ph.D.

In the Martínez-Rivera Lab, we are committed to incorporating bioinformatic, computational, and artificial intelligence (AI) approaches to advance the understanding of fundamental mechanisms of maladaptive behaviors, including those involving substance abuse, depression, aggression, decision making, and stress. Our projects are focused on generating behavioral, cellular, and transcriptional/genetic datasets using innovative techniques integrating machine learning and statistical tools, such as behavioral tracking systems, cellular recordings, and RNA-sequencing pipelines. These efforts are combined with the HiPerGator’s infrastructure, essential for the data collection, storage, and subsequent analyses.


Department of Neuroscience

Andrew Maurer, Ph.D.

Preclinical Assays of Hippocampal-Prefrontal Cortical Circuit Engagement for Application in Therapeutic Development

The high failure rate of translating discovery science to positive clinical outcomes in the treatment of psychiatric diseases demonstrates the necessity of improving the efficiency and rigor of the therapeutic development pipeline. To this end, the critical importance of advancing the discovery of in vivo physiological and behavioral measures of the engagement of specific circuits for normal cognitive function has been acknowledged across funding initiatives. The hippocampus (HPC)-prefrontal cortical (PFC) circuit is critical for affective processing as well as higher cognitive functions and is vulnerable in a number of mental health disorders. Although disrupted functional connectivity in the HPC-PFC circuit is a common feature of anxiety, bipolar disorder, schizophrenia, and autism, how local cellular interactions within this circuit manifest as large-scale temporal coordination to support higher cognitive functions remains unknown. Addressing this fundamental gap in our knowledge will establish a foundation for using circuit-based models for therapeutic target discovery and as screening tools for novel drug efficacy. The long-term goal of this proposal, in line with the Funding Opportunity Announcement (PAR-19-289), is to enhance the therapeutic development pipeline for mental illness treatment by optimizing, evaluating, and mechanistically testing neurophysiological and behavioral measures of circuit engagement. The primary objective of this proposal, which is the first step towards achieving our goal, is to relate behavioral performance on the rodent analog of the Paired Associates Learning (PAL) task, part of the human Cambridge Neuropsychological Test Automated Battery (CANTAB) assessment, and surface EEG recordings to invasive neurophysiological measures of neural coordination in the HPC-PFC circuit.
Through an innovative series of experiments that integrate in vivo neurophysiological local field potential (LFP) recordings, circuit manipulation, surface EEG, and behavior, we will optimize, evaluate, and mechanistically test novel noninvasive biomarkers of HPC-PFC circuit engagement by pursuing the following specific aims: 1) Optimize behavioral and noninvasive EEG biomarkers for inferring HPC-PFC circuit engagement and temporal coordination, 2) Evaluate behavioral and noninvasive EEG biomarkers for determining HPC-PFC circuit engagement through pharmacological manipulation, and 3) Mechanistically test HPC-PFC projections as a driver of surface EEG organization. The proposed research is innovative because it integrates a clinically relevant behavioral task, designed to be analogous to human cognitive assessments, with surface EEG measures that translate across mammals. This will enable the optimization, evaluation, and testing of novel and translatable measures of HPC-PFC circuit engagement in the context of higher cognition and global neural organization. The significance of this contribution will be to provide novel diagnostic tools that can be used to enhance the therapeutic development pipeline for treating mental illness.

The student selected for this project will work on interfacing predictive algorithms, leveraging AI tools and techniques, to anticipate intracortical activity based on cortical EEG and behavior. Through this work, the student will make advances that are directly translatable to the clinic.


Department of Biochemistry & Molecular Biology

Robert McKenna, Ph.D.

Adeno-associated virus (AAV) vectors are widely used in gene therapy due to their safety profile and ability to deliver genetic material to a wide range of tissues. However, challenges such as immune response, limited tissue specificity, and cargo capacity constrain their broader application. Leveraging artificial intelligence (AI), particularly machine learning and deep learning, offers a transformative approach to optimizing AAV design by harnessing structural insights from their three-dimensional (3D) capsid architecture.


AI models provide the opportunity to interrogate high-resolution 3D structures of AAV capsids to identify patterns that govern tissue tropism, immune evasion, and transduction efficiency. A dataset of capsid structures, obtained through cryo-electron microscopy and X-ray crystallography, affords the opportunity to train algorithms to predict the functional impact of amino acid substitutions or insertions on capsid performance. For instance, generative models can potentially propose novel capsid variants with desired properties, such as reduced immunogenicity or enhanced specificity for target tissues such as the brain.


Furthermore, AI should be able to assist in mapping antigenic regions on the capsid surface that are recognized by neutralizing antibodies. By identifying and redesigning these regions, AI-guided engineering could produce capsids that better evade pre-existing immunity in patients. Additionally, AI models could simulate protein folding and stability, helping predict whether engineered variants will maintain structural integrity and function under physiological conditions.


In summary, AI-driven design pipelines have the potential to significantly accelerate the discovery cycle, reducing reliance on labor-intensive trial-and-error experimentation. Tools like AlphaFold and Rosetta, integrated with reinforcement learning or evolutionary algorithms, allow for rapid in silico screening of capsid libraries to select promising candidates for experimental validation.


Department of Biochemistry & Molecular Biology

Matthew Merritt, Ph.D.

Dr. Merritt’s project uses AI approaches, primarily neural networks, for automated quantitation and denoising of nuclear magnetic resonance (NMR) data. AI has well-known abilities in image recognition, and by its very nature, a neural network can evaluate a target image almost instantaneously once it is trained. The speed and robustness of neural network approaches suggest that their application to the spectral denoising/fitting and quantitation problem in NMR could be very profitable. Initial results using a deep learning neural network produced an increase in signal-to-noise ratio (SNR) of 200 to 1 for 13C NMR spectra (1). Using traditional Fourier transformation methods, the SNR is proportional to the square root of the number of scans, which means that it takes 4 times the number of scans to give twice the SNR. A gain in SNR of 200 is equivalent to running the same sample 40,000 times longer. Given that most 13C spectra acquired in my lab take at least 6 hours to acquire, the time savings possible with this approach are truly transformational.
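The square-root relationship above can be checked with a few lines of arithmetic. The sketch below assumes only that SNR scales as the square root of the number of scans, as stated in the text; the function names are illustrative and do not come from any NMR software.

```python
import math

def snr_gain_from_scans(scan_multiplier):
    """SNR improvement obtained by acquiring scan_multiplier times as many scans."""
    return math.sqrt(scan_multiplier)

def scans_for_snr_gain(snr_gain):
    """Factor by which the number of scans must grow to reach a given SNR gain."""
    return snr_gain ** 2

def equivalent_hours(base_hours, snr_gain):
    """Acquisition time needed to match an SNR gain by signal averaging alone."""
    return base_hours * scans_for_snr_gain(snr_gain)

double_snr_cost = scans_for_snr_gain(2)        # 4x the scans for 2x the SNR
gain_200_cost = scans_for_snr_gain(200)        # 40,000x the scans for 200x the SNR
hours_for_6h_sample = equivalent_hours(6, 200) # averaging time to match the gain
```

For a 6-hour spectrum, matching a 200-fold SNR gain by averaging alone would take 240,000 hours, which is why a denoising approach is attractive.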


Department of Pharmacology & Therapeutics

Aline C. Oliveira, MSc, Ph.D.

The Oliveira Lab is dedicated to uncovering how disruptions in the brain–lung axis contribute to the pathophysiology of pulmonary hypertension (PH). Our research focuses on the role of neuro-immune interactions in driving maladaptive neural plasticity and sustaining sympathetic overactivation, a key feature of PH progression.


To investigate these mechanisms, we integrate advanced neuroscience tools with established cardiopulmonary techniques. Our approach includes live-cell imaging in brain slices, right heart catheterization, immunohistochemistry, and molecular analyses, paired with translational validation in human tissues and cells. By bridging brain function with cardiopulmonary outcomes, our lab aims to identify novel targets for therapeutic intervention in pulmonary hypertension.


Department of Neuroscience

Nancy Padilla-Coreano, Ph.D.

The Padilla-Coreano Lab studies how the brain facilitates social behaviors using tools at the intersection of neuroscience and Artificial Intelligence. Specifically, the lab uses mouse models to study the neural mechanisms of social competence, that is, how we adjust our social behavior based on information. Two key elements of this research goal are being able to measure social behaviors and understanding the relationship between behavior and brain activity. The lab uses Artificial Intelligence to tackle both. The PI is a co-developer of a recent Deep Learning tool (AlphaTracker) that performs pose estimation for multiple-animal tracking (Padilla-Coreano et al., 2020 preprint). Furthermore, the lab has active collaborations with machine learning scientists at UF to create new tools that analyze behavior by incorporating temporal information and structure for unbiased automatic behavior classification. In addition, the lab is focused on studying neural function at the network level. By recording neural activity from multiple brain regions simultaneously, we can identify which circuits and sequences of circuits lead to important social behaviors. Given the complexity of the data (both neural and behavioral), Artificial Intelligence helps identify the causal relationship between neural activity and behavior. The PI has applied similar approaches to predict behaviors and conditions from neural activity, and the lab will expand this approach to consider neural activity from a whole network.


Department of Psychiatry

Paola Giusti-Rodriguez, Ph.D.

The Giusti-Rodriguez Lab works at the intersection of neuroscience, human genetics, and functional genomics, and aims to maximize the tools and techniques of these fields to advance our understanding of the genetics of neuropsychiatric disorders. AI in genomics is growing rapidly, and deep learning methods have been applied to the analysis of diverse data types, including DNA and RNA sequencing, methylation, DNA accessibility and chromatin, and 3D genome organization. The Giusti-Rodríguez lab will generate diverse data types using mouse, postmortem human brain tissue, iPSCs, etc., and has access to many external datasets through existing collaborations and/or publicly available datasets. The Giusti-Rodríguez lab will apply machine learning and artificial intelligence approaches to multiomics data types relevant to understanding specific susceptibilities to psychiatric disorders and to parse out genetic underpinnings in individuals from diverse populations and complex admixture.


Department of Medicine

Pinaki Sarder, Ph.D.

Dr. Sarder’s lab develops novel computational methods to study and understand tissue micro-anatomy using multi-modal whole-slide microscopy images as well as associated molecular omics data. Our methods facilitate decision making in a clinical workflow (both for diagnosis and for predicting disease progression) and also allow the study of the fundamental systems biology of disease dynamics. Currently, our major focus is the study of chronic kidney diseases as well as ‘reference’ organ systems across scale.


Department of Medicine

Wei Shao, Ph.D., M.S.

This Ph.D. project aims to explore the development and application of generative models for advancing medical image analysis. The student will work on creating and refining algorithms for multi-modal data fusion, integrating diverse imaging modalities (e.g., MRI, CT, ultrasound) and non-imaging clinical data to enhance diagnostic accuracy and clinical decision-making. Leveraging state-of-the-art generative techniques, such as diffusion models and transformers, the project seeks to address key challenges in medical imaging, including synthesis, segmentation, and registration.


The research will focus on enabling robust, generalizable AI models that can operate across various imaging protocols and clinical contexts. By developing innovative methodologies for combining data from multiple sources, the project will contribute to precision medicine initiatives and improve outcomes in areas like cancer diagnosis, cardiovascular imaging, and more.


Department of Medicine

Benjamin Shickel, Ph.D.

The ERS-AI PhD student will be recruited into projects exploring the application of multi-modal foundation models for a variety of clinical applications and patient health modeling. Briefly, foundation models comprise a recent class of large-scale machine learning frameworks based on the Transformer model architecture that are designed to formulate scalable data-driven representations from voluminous data, merging AI principles of supervised, unsupervised, and self-supervised learning techniques; such data representations can be applied to several downstream AI tasks. Foundation models are currently popularized by innovations in natural language processing (NLP). The ERS-AI PhD student will research the translation of these discoveries into the healthcare domain by developing foundation models of patient health that integrate granular and temporal health data from multiple modalities (e.g., continuous and discrete electronic health record measurements, clinical notes, radiography, omics data) into unified health representations that can be applied to downstream clinical prediction tasks (e.g., sepsis, acute kidney injury, mortality). Methods to measure and improve the explainability, fairness, and causality of foundation models will be a large focus of the research projects. The first project to which the student will be recruited will involve the development of a Transformer foundation model for dynamic monitoring of acute kidney injury (AKI).


Department of Health Outcomes and Biomedical Informatics

Qianqian Song, Ph.D.

This project focuses on leveraging artificial intelligence (AI) and data science to advance precision oncology, with a particular emphasis on the integration and analysis of multimodal cancer-related data. These data types include genomic variants, transcriptomics, histopathology imaging, and clinical records. The student recruited to this project will be involved in the design, development, and benchmarking of advanced AI frameworks, including deep learning, graph transformer, and contrastive learning models, that are capable of extracting meaningful patterns from complex, high-dimensional, multimodal datasets.


A core aim of the project is to develop robust AI tools to support cancer subtype classification and treatment response prediction, ultimately contributing to more individualized therapeutic strategies. The student will gain practical experience in all aspects of the data science pipeline, including preprocessing and harmonizing large-scale datasets from disparate sources, building and fine-tuning predictive models using transformer-based architectures, and interpreting model outputs through attention mechanisms, saliency maps, and SHAP analysis. Additionally, the student will learn best practices in FAIR (Findable, Accessible, Interoperable, Reusable) data management, as well as how to navigate and utilize multimodal data and high-performance computing environments. This project is embedded in an active, multidisciplinary research environment that bridges AI, data science, and biomedical informatics, offering a highly relevant and enriching experience for students interested in AI applications in biomedical informatics.


Department of Pharmacology & Therapeutics

Nikhil Urs, Ph.D.

My research interests broadly cover dopamine neurotransmission in neurological and psychiatric disorders. My primary research focus is to learn more about the dopamine system by deciphering a) signaling pathways involved in dopamine (DA) neurotransmission, b) functional dopamine neuronal circuits, and c) how these integrate and manifest behaviorally in an organism. This integrated approach will in parallel allow us to fine-tune dopamine neurotransmission and devise novel drug- and gene-based therapeutic approaches to treat dopamine-related disorders such as PD and schizophrenia. One of the main projects in the lab studies cortical dopamine circuits in motivated behavior and how these circuits regulate striatal dopamine. Our goal is to manipulate these circuits and assess their effects on behavior. We will simultaneously measure calcium or dopamine dynamics in the brain during behavior using fiber photometry with the fluorescent biosensors GCaMP and dLight. The photometry data need to be extracted from the RZ10 photometry unit using Python and MATLAB, which requires coding knowledge. This is essential because we need to extract fluorescent signal data during particular behavioral events (cue, approach, reward, avoidance, etc.) over time, i.e., within a single training session or across multiple days of training.
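To make the event-aligned extraction concrete, here is a minimal sketch in plain Python. It assumes the fluorescence trace has already been exported from the photometry unit into a list of samples with a known sampling rate; it does not use the RZ10 or any vendor API. The function names, window lengths, and toy numbers are all hypothetical.

```python
def peri_event_windows(signal, sampling_rate_hz, event_times_s, pre_s=1.0, post_s=1.0):
    """Cut a fixed window of samples around each behavioral event.

    Events whose window would run past either end of the recording are skipped.
    """
    pre_n = int(round(pre_s * sampling_rate_hz))
    post_n = int(round(post_s * sampling_rate_hz))
    windows = []
    for t in event_times_s:
        center = int(round(t * sampling_rate_hz))
        start, stop = center - pre_n, center + post_n
        if start < 0 or stop > len(signal):
            continue
        windows.append(signal[start:stop])
    return windows

def baseline_normalize(window, n_baseline):
    """Fractional change from the pre-event mean (a dF/F-style measure).

    Assumes the baseline mean is non-zero.
    """
    baseline = sum(window[:n_baseline]) / n_baseline
    return [(v - baseline) / baseline for v in window]

def mean_trace(windows):
    """Average equal-length windows sample by sample."""
    return [sum(col) / len(col) for col in zip(*windows)]

# Toy recording (made-up numbers): 10 s at 10 Hz, flat baseline of 1.0,
# with a 0.5 s response after cues at 3.0 s and 6.0 s.
rate_hz = 10
signal = [1.0] * 100
for onset in (30, 60):
    for k in range(onset, onset + 5):
        signal[k] = 2.0

cue_times_s = [3.0, 6.0, 9.8]  # the last cue is too close to the end and is dropped
windows = peri_event_windows(signal, rate_hz, cue_times_s)
normalized = [baseline_normalize(w, n_baseline=10) for w in windows]
average_response = mean_trace(normalized)
```

The same windows could be grouped by event type (cue, approach, reward, avoidance) or by training day before averaging, depending on the question being asked.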

In addition, we will study the effects of these cortical circuits on motor learning and behavior, for which we will use DeepLabCut (http://www.mackenziemathislab.org/deeplabcut), an open-source software package that uses machine learning to track fine and gross motor movements in rodents.

ERS-AI scholars will be trained by us to learn and use Python/MATLAB and DeepLabCut as part of their research projects.


Department of Molecular Genetics & Microbiology

Eric Wang, Ph.D.

The brain is a complex network of multiple cell types, each with its own transcriptome and proteome. Modern technologies facilitate high-throughput, precise measurement of transcriptomes in particular; this is now commonly performed using bulk tissue, as well as at the single-cell and subcellular levels. These techniques are useful for studies of the brain, given the complex morphology of cell types such as neurons, whose gene products must be transported from nuclei to synapses, sometimes millimeters away. Many neurological and neurodegenerative diseases are caused by mutations in genes that cause downstream changes to RNA metabolism and intracellular transport. One of these is myotonic dystrophy, a repeat expansion disease with symptoms manifesting in muscle, heart, and brain tissues. Some of the symptoms potentially mediated by brain dysfunction include profound hypersomnolence, altered regulation of circadian rhythms, executive dysfunction, problems with learning/memory, and white matter atrophy. We currently do not understand which cell types and brain regions are affected in this disease and are studying post-mortem samples from myotonic dystrophy patients to better understand disease pathogenesis. We seek to profile transcriptomes and proteomes using bulk tissue, transcriptomes at the single-cell level (with a focus on RNA splicing isoforms), and transcriptomes with spatial information. In addition, we seek to better understand the variability in somatic repeat expansion across brain regions and cell types and will employ cutting-edge optical mapping approaches coupled to transcriptome profiling approaches to obtain this information. All of these techniques and studies require extensive computational analyses. We routinely write custom code to analyze these datasets and also leverage existing packages (e.g., in Python and R).
Artificial intelligence approaches such as Bayesian inference, mixture models, and linear regression will be employed in this project, and deep learning approaches will also be applied when appropriate. Overall, these efforts will not only provide insights into myotonic dystrophy pathogenesis, but also inform studies of repeat expansion disease and brain diseases in general.


Department of Health Outcomes and Biomedical Informatics

Yonghui Wu, Ph.D.

In the last two decades, the introduction of targeted anticancer therapies has revolutionized the treatment of hematological malignancies such as multiple myeloma, chronic myeloid leukemia, and solid malignancies such as breast and renal carcinoma. Contemporary cancer therapy has led to a 23% reduction in cancer-related mortality rate and a rapid increase in cancer survivorship in the last 15 years. However, some devastating side effects of these treatments have also resulted in increased morbidity and mortality. For example, cardiotoxicity is one of the well-documented adverse events of cancer treatments resulting either from accelerated development of cardiovascular diseases in cancer patients or from the direct effects of the treatment on the structure and function of the heart. The goal of this project is to develop predictive models for the identification of cancer patients with a high risk of cardiotoxicity to prevent or minimize the risk of cardiotoxicity in cancer treatments.


Department of Biochemistry & Molecular Biology

Mingyi Xie, Ph.D.

Gene expression, the flow of genetic information from DNA to messenger RNA (mRNA) to protein, involves delicate regulation by a group of small RNAs named microRNAs (miRNAs). The development of high-throughput next-generation sequencing technology and advances in artificial intelligence provide new opportunities for miRNA target identification. We aim to develop an innovative machine learning framework that efficiently predicts high-confidence miRNA-mRNA interaction pairs in cancer patients, using contrastive convolutional neural networks trained on a combination of heterogeneous RNA-seq data (miRNA, mRNA, and miRNA-mRNA hybrids). We also anticipate profiling and validating the effect of discovered miRNA-mRNA pairs in patient samples to facilitate the hypothesis generation process for potential cancer therapeutics. Collectively, our efforts will result in rapid and accurate identification of high-quality miRNA-mRNA pairs with our proposed model, which would accelerate the process of elucidating the underlying mechanisms of cancer progression and provide the basis for improving current therapeutic interventions. Additionally, apart from identifying the miRNA-mRNA pairs, our proposed framework has the potential to be applied to other types of cancer to facilitate the development of cancer therapeutics.


Department of Health Outcomes & Biomedical Informatics

Jie Xu, Ph.D.

The goal of this project is to develop machine learning methods for the identification of sub-phenotypes of Alzheimer’s disease (AD) and its related dementias (ADRD). Using electronic health records (EHRs) from patients diagnosed with AD/ADRD, we will retrospectively review their structured EHRs, clinical notes, and neuroimages and develop machine learning methods for connecting these data sources and computationally deriving AD/ADRD sub-phenotypes based on hierarchical clustering. Interfaces with Data Science/AI: Students are required to develop machine learning methods to connect different data modalities and develop AI methods to derive disease subtypes from large-scale health data.


Department of Health Outcomes & Biomedical Informatics

Rui Yin, Ph.D.

In this project, we will develop and validate research-grade computable phenotyping (CP) algorithms and tools, leveraging advanced natural language processing (NLP) methods, to accurately identify AD/ADRD drug repurposing study cohorts and then extract and standardize relevant patient characteristics (e.g., APOE) and outcomes (e.g., ADRD subtypes and severity) from real-world data (RWD). The algorithms will be developed and internally validated at OneFlorida and externally validated with INSIGHT data. This work will address misclassification errors and incomplete information through CP and clinical NLP. Neither NLP nor CP is a novel method; nevertheless, there has been no systematic investigation of their use for AD/ADRD drug repurposing. Our effort will be the first to make resources publicly available to support AD/ADRD drug repurposing research using RWD. With the CP/NLP pipeline, we can accurately identify study cohorts and extract drug exposures, outcomes, and other important confounders and potential effect modifiers, which enables more precise estimation of the treatment effects of candidate repurposing drugs from RWD.