Detecting potential outliers in longitudinal data with time-dependent covariates

Lazarus K. Mramba, Xiang Liu, Kristian F. Lynch, Jimin Yang, Carin Andrén Aronsson, Sandra Hummel, Jill M. Norris, Suvi M. Virtanen, Leena Hakola, Ulla M. Uusitalo, Jeffrey P. Krischer

Research output: Contribution to journal › Article › peer-review


Background: Outliers can influence regression model parameters and change the direction of the estimated effect, over-estimating or under-estimating the strength of the association between a response variable and an exposure of interest. Identifying visit-level outliers in longitudinal data with continuous time-dependent covariates is important when the distribution of such a variable is highly skewed. Objectives: The primary objective was to identify potential outliers at follow-up visits using the interquartile range (IQR) statistic and to assess their influence on estimated Cox regression parameters. Methods: The study was motivated by a large TEDDY dietary longitudinal and time-to-event dataset, with continuous time-varying vitamin B12 intake as the exposure of interest and development of islet autoimmunity (IA) as the response variable. An IQR algorithm was applied to the TEDDY dataset to detect potential outliers at each visit. To assess the impact of the detected outliers, the data were analyzed using the extended time-dependent Cox model with a robust sandwich estimator. Partial residual diagnostic plots were examined for highly influential outliers. Results: Extreme vitamin B12 observations from IA cases had a stronger influence on the Cox regression model than those from non-cases. Identified outliers changed the direction of the hazard ratios, the standard errors, or the strength of the association with the risk of developing IA. Conclusion: At the exploratory data analysis stage, the IQR algorithm can be used as a data quality control tool to identify potential outliers at the visit level, which can then be investigated further.
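The abstract does not spell out the exact IQR rule, so the following is only a minimal sketch of visit-level flagging under stated assumptions: it uses the common Tukey fences (values below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR within each visit), a hypothetical record layout of `(subject_id, visit, value)`, and made-up numbers rather than TEDDY data. The function name `flag_visit_outliers` and the multiplier `k` are illustrative choices, not taken from the article.

```python
from collections import defaultdict
from statistics import quantiles


def flag_visit_outliers(records, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] within each visit.

    records: iterable of (subject_id, visit, value) tuples (hypothetical layout).
    Returns a set of (subject_id, visit) pairs flagged as potential outliers.
    """
    # Group observations by visit so quartiles are computed per visit.
    by_visit = defaultdict(list)
    for subject_id, visit, value in records:
        by_visit[visit].append((subject_id, value))

    flagged = set()
    for visit, rows in by_visit.items():
        values = [v for _, v in rows]
        if len(values) < 4:
            continue  # too few observations to estimate quartiles sensibly
        q1, _, q3 = quantiles(values, n=4, method="inclusive")
        iqr = q3 - q1
        lower, upper = q1 - k * iqr, q3 + k * iqr
        for subject_id, value in rows:
            if value < lower or value > upper:
                flagged.add((subject_id, visit))
    return flagged


# Made-up illustrative intake values, not TEDDY data.
records = [
    ("s1", 1, 2.0), ("s2", 1, 2.5), ("s3", 1, 3.0), ("s4", 1, 3.5), ("s5", 1, 40.0),
    ("s1", 2, 2.2), ("s2", 2, 2.4), ("s3", 2, 2.6), ("s4", 2, 2.8), ("s5", 2, 3.0),
]
flagged = flag_visit_outliers(records)
print(flagged)
```

Flagged observations are candidates for further review rather than automatic exclusions, in line with the abstract's framing of the IQR algorithm as a data quality control tool.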

Original language: English
Journal: European Journal of Clinical Nutrition
Publication status: E-pub ahead of print - 2024

Subject classification (UKÄ)

  • Endocrinology and Diabetes

