Internship Alumni
Year | Name | Affiliation/Mentors | Project |
---|---|---|---|
2025 | Arnab Aich | PhD Biostatistics, Florida State University; Mentor Yuan Zhang | As part of his internship, Arnab developed RMSTSS, a comprehensive R package and interactive Shiny web application designed to meet the growing demand for current power and sample size resources in clinical study design. This project is based on the Restricted Mean Survival Time (RMST) and offers an easy-to-understand and robust alternative to the traditional hazard ratio. The software suite enables research scientists to design complex studies by implementing advanced statistical methods. It accommodates a wide range of user needs, featuring both fast analytical computations and robust simulation-based bootstrapping techniques. Additionally, the accompanying Shiny program provides a user-friendly point-and-click interface, making these advanced methods accessible to non-programmers. Users can easily upload data, configure visual analysis settings, create interactive survival and power plots, and generate downloadable PDF analysis reports with just a single click. |
2025 | Abhirath Anand | MSc Biostatistics, University of Michigan; Mentors Gregory Farage and Saunak Sen | Abhirath worked on developing BigRiverMetabolomics.jl, a Julia package designed for end-to-end workflows with metabolomics matrices. This is an umbrella package that depends on two component packages that were also developed during the internship: BigRiverJunbi.jl and BigRiverMakie.jl. BigRiverJunbi.jl is designed for seamless data preparation with ‘omics data, and includes features like data imputation, normalization, transformation, and standardization. BigRiverMakie.jl uses the Makie.jl plotting ecosystem to create visualizations of the data and the resulting analyses. |
2025 | Shengqiang Chen | PhD Biostatistics, University of Memphis; Mentors Chi-Yang Chiu and Qi Zhao | During the internship, Shengqiang contributed to research investigating the role of metabolomics in the development and progression of sarcopenia, a condition characterized by the progressive loss of muscle mass, strength, and function in older adults. To explore this, the team employed a Seemingly Unrelated Regression (SUR) model to simultaneously evaluate three key indicators of sarcopenia. This modeling approach accounted for the potential correlation between outcome errors, enhancing statistical efficiency and providing a more comprehensive understanding of the potential role of short-chain fatty acids (SCFAs) in sarcopenia. |
2025 | Ziyi Zhang | MPH Data Science, UNC Chapel Hill; Mentors Gregory Farage and Saunak Sen | During her internship, Zoey worked on establishing a biomedical-focused Dataverse platform, a centralized, web-based resource designed to improve the discoverability and usability of publicly available health datasets. Her primary project involved building a structured public dataset inventory and conducting a user evaluation to ensure the platform’s accessibility, reproducibility, and equity in data use. In parallel, she initiated a new project on Triple Negative Breast Cancer (TNBC) DNA methylation, laying the groundwork for data downloading, processing, and statistical modeling to identify potential biomarkers and investigate epigenetic patterns. |
2024 | Durbadal Ghosh | PhD Biostatistics, Florida State University; Mentors Gregory Farage and Saunak Sen | During his internship, Durbadal contributed to BigRiverQTL.jl, a Julia package designed to streamline quantitative trait locus (QTL) analysis. BigRiverQTL.jl integrates multiple functionalities, including preprocessing, genome scans, and visualization tools, to facilitate efficient and accessible QTL and eQTL analysis within the Julia programming environment. He also contributed process improvements to FlxQTL.jl, a package for multivariate and longitudinal trait analyses. |
2024 | Shanti Sree-Edara | MPH Biostatistics and Epidemiology, University of Southern Mississippi; Mentors Kimberly Kelly and Chi-Yang Chiu | Shanti conducted a study on the impact of gift card challenges on the quality and quantity of research. She deployed a Qualtrics survey to NIH-funded researchers and other research groups, analyzed the data, performed cross-tabulation and ordinal regression, and presented the findings. At the end of her internship, she worked on a pre- and post-intervention Qualtrics survey for a skin cancer awareness project and conducted literature reviews. |
2024 | Yinan Chen | MSc Statistics, University of Illinois at Urbana-Champaign; Mentors Zhu Wang and Lauren Bell | Ivy examined trends in drug overdose deaths related to illicitly manufactured fentanyls (IMFs) among adolescents over time. She built a linear model to analyze the changes in drug use across this demographic. At the end of her project, she built an interactive web application in R Shiny to promote awareness of and public engagement with the issue of drug overdose deaths. |
2023 | Galvin Li | MS Statistics, University of Virginia; Mentors Chi-Yang Chiu and Feng Liu-Smith | NHANES data analysis to investigate the association of sex hormones with skin cancers. Galvin built multiple logistic regression models to determine the association of E2, T, and T/E2 levels with melanoma risk, adjusting for other relevant variables and examining potential gender-based differences. A similar analysis was performed using non-melanoma skin cancer as the outcome. |
2023 | Harper Kolehmainen | BS, Rhodes College; Mentors Gregory Farage and Saunak Sen | Interactive Interface for Genome Scans using Pluto.jl and BulkLMM.jl packages. Harper created an interface for the Pluto.jl and BulkLMM.jl packages in Julia. Additional contributions included documentation and methodological improvements as well as interface/API changes. |
2023 | Miyeon Yeon | PhD Candidate Biostatistics, Florida State University; Mentor Hyo Young Choi | Assessment of mRNA degradation in FFPE tissue. Miyeon examined paired RNA-seq data from FFPE and fresh frozen primary tumor tissue obtained from a subset of TCGA. She performed genome-wide comparisons of gene expression profiles between FFPE tissue and fresh frozen tissue using unsupervised and supervised clustering methods. The project looked at gene-specific as well as sample-specific degradation using SCISSOR and identified severely degraded samples in both cohorts, looking for associations with multiple factors such as total gene length, 3’ UTR length, GC content, etc. All analyses were performed in R/R Markdown. |
2023 | Siling Liu | PhD Student Statistics, Michigan State University, and MSc Computational Science and Engineering, Rice University; Mentor Qi Zhao | This internship focused on analyzing data from the CANDLE project entitled “Placental Epigenome-Wide Association Study of Early Childhood Body Mass Index Growth Trajectories and Overweight/Obesity Risk”. After completing her internship with us, she went on to the PhD program in Statistics at Michigan State University. |