TY - JOUR
T1 - Development of scoring system for risk stratification in clinical medicine
T2 - a step-by-step tutorial
AU - Zhang, Zhongheng
AU - Zhang, Haoyang
AU - Khanal, Mahesh Kumar
PY - 2017/11
Y1 - 2017/11
N2 - Risk scores play an important role in clinical medicine. With advances in information technology and the availability of electronic healthcare records, scoring systems can be developed for less common diseases and populations. The aim of this article is to provide a tutorial on how to develop and validate risk scores based on a virtual dataset by using R software. The generated dataset includes numeric and categorical variables. First, the numeric variables are converted to factor variables according to cutoff points identified by the LOESS smoother. Then risk points, which are related to the coefficients in logistic regression, are assigned to each level of the converted factor variables and of the other categorical variables. Finally, a total score is calculated for each subject to represent the predicted probability of the outcome event. The original dataset is split into training and validation subsets, and discrimination and calibration are evaluated in the validation subset. R code with explanations is presented in the main text.
AB - Risk scores play an important role in clinical medicine. With advances in information technology and the availability of electronic healthcare records, scoring systems can be developed for less common diseases and populations. The aim of this article is to provide a tutorial on how to develop and validate risk scores based on a virtual dataset by using R software. The generated dataset includes numeric and categorical variables. First, the numeric variables are converted to factor variables according to cutoff points identified by the LOESS smoother. Then risk points, which are related to the coefficients in logistic regression, are assigned to each level of the converted factor variables and of the other categorical variables. Finally, a total score is calculated for each subject to represent the predicted probability of the outcome event. The original dataset is split into training and validation subsets, and discrimination and calibration are evaluated in the validation subset. R code with explanations is presented in the main text.
U2 - 10.21037/atm.2017.08.22
DO - 10.21037/atm.2017.08.22
M3 - Article
C2 - 29201888
SN - 2305-5839
VL - 5
SP - 1
EP - 9
JO - Annals of Translational Medicine
JF - Annals of Translational Medicine
IS - 21
M1 - 436
ER -