The National Health Data Science Lab conducts interdisciplinary research at the intersection of computational science, genomics, and biomedicine. Our work focuses on developing reliable and reproducible computational methods for analyzing complex biomedical data and for supporting biologically meaningful interpretation of data-driven discoveries. The following research areas define the Lab's current scientific directions.
The Lab develops computational approaches for analyzing genomic and multi-omic data in complex human diseases. Our work aims to improve the interpretation of genetic risk factors through scalable analytical frameworks that integrate large-scale genomic datasets with advanced statistical and machine learning methods.
A central objective of this research direction is to improve the reliability and interpretability of genomic risk modeling, particularly in neurodegenerative and neuropsychiatric disorders, where genetic architecture is complex and heterogeneous. By developing robust computational methodologies, the Lab seeks to support earlier understanding of disease and more accurate characterization of disease-associated biological processes.
This research area focuses on methodological challenges associated with applying machine learning to complex biomedical data. The Lab develops computational approaches that emphasize robustness, reproducibility, and transparency of machine learning models, recognizing that methodological reliability is essential for the responsible use of artificial intelligence in scientific and healthcare research.
Research activities include developing scalable analytical workflows, systematically evaluating model behavior across datasets, and devising approaches that improve interpretability and uncertainty-aware evaluation. The goal is to establish computational methodologies that enable reproducible and scientifically reliable AI-driven analysis in biomedical research environments.
The Lab integrates genomic analysis with disease-relevant cellular systems to support biological interpretation of computational findings. This work includes the use of human induced pluripotent stem cell (iPSC)-derived neuronal and microglial models to investigate cellular mechanisms underlying genetic risk in neurological disorders.
By linking genomic variation to functional cellular phenotypes, this research direction aims to bridge the gap between genetic association studies and biological mechanism. Functional validation of computational findings supports improved understanding of disease pathways and contributes to the development of experimentally grounded hypotheses for future translational research.
Modern biomedical research increasingly relies on the integration of heterogeneous data sources, including genomic, molecular, and experimental datasets. The Lab develops approaches for integrating diverse data types within unified analytical frameworks to enable comprehensive analysis of complex biological systems.
This research direction focuses on designing scalable and reproducible data workflows that support interdisciplinary research and facilitate collaboration between computational scientists and experimental researchers. By improving data integration methodologies, the Lab seeks to enable more reliable and efficient data-driven biomedical discovery.