We note that the distributions of these PRS are skewed for non EUR pops but will normalize for our analysis:
Using residual PRS
Code
library(ggridges)pcmod =lm(PRS ~ PC1 + PC2 + PC3 + PC4 + PC5 + PC6 + PC7 + PC8 + PC9 + PC10, data = totaldata)totaldata$predictscore =predict(object = pcmod, newdata = totaldata)totaldata$residscore = totaldata$PRS - totaldata$predictscoretotaldata$scaled_resid =scale(totaldata$residscore)## because only one populatinototaldata$pop_scaled_prs =scale(totaldata$residscore)totaldata$Population="Saudi"ggplot(totaldata, aes(x = residscore, y = Population, fill = Population)) +geom_density_ridges(alpha =0.7) +scale_fill_brewer(palette ="Set1") +labs(title ="Density Distribution of Sum Scores",x ="Sum Score",y ="Population")
Picking joint bandwidth of 1.71e-08
Code
ggplot(totaldata, aes(x = pop_scaled_prs, y = Population, fill = Population)) +geom_density_ridges(alpha =0.7) +scale_fill_brewer(palette ="Set1") +# Optional: Adds a nice theme suitable for ridgeline plotslabs(title ="Density Distribution of Sum Scores: Scaled for all",x ="Sum Score",y ="Population")
# A tibble: 1 × 5
Population `quantile(pop_scaled_prs, 0.25)` `median(pop_scaled_prs)`
<chr> <dbl> <dbl>
1 Saudi -0.663 0.0114
# ℹ 2 more variables: `mean(pop_scaled_prs)` <dbl>,
# `quantile(pop_scaled_prs, 0.75)` <dbl>
Table1
Code
library(table1)
Attaching package: 'table1'
The following objects are masked from 'package:base':
units, units<-
Code
dat <- totaldatadat$pop_scaled_prs <-as.numeric(as.character(round(dat$pop_scaled_prs,2)))dat <- totaldatadat$CAD_censor <-as.factor(dat$CAD_censor)label(dat$Sex) <-"Sex"label(dat$ldl1_adj) <-"LDL-C (adjusted for Statin)"label(dat$CAD_censor) <-"CAD Ever"label(dat$scaled_resid) <-"Overall Scaled Residual PRS"label(dat$pop_scaled_prs) <-"Pop Scaled Residual PRS"label(dat$Age) ="Age Enrolled"# Continue with creating the tabletable1(~Sex + ldl1_adj + CAD_censor + Age +round(pop_scaled_prs,4) + scaled_resid , data = dat)
Overall (N=5399)
Sex
Mean (SD)
0.643 (0.479)
Median [Min, Max]
1.00 [0, 1.00]
LDL-C (adjusted for Statin)
Mean (SD)
136 (53.4)
Median [Min, Max]
127 [3.87, 895]
Missing
952 (17.6%)
CAD Ever
0
2009 (37.2%)
1
3321 (61.5%)
Missing
69 (1.3%)
Age Enrolled
Mean (SD)
54.8 (14.8)
Median [Min, Max]
56.0 [13.0, 102]
Pop Scaled Residual PRS
Mean (SD)
-0.0000000185 (1.00)
Median [Min, Max]
0.0114 [-3.62, 3.38]
Overall Scaled Residual PRS
Mean (SD)
-0.0000000000000000354 (1.00)
Median [Min, Max]
0.0114 [-3.62, 3.38]
Impact of PRS
To assess the impact of LDL-C PRS on CAD risk, you need to establish a two-step relationship: first, between LDL-C PRS and LDL-C levels (per 40 mg/dL change), and then between LDL-C PRS and CAD outcomes. The approach allows you to understand how changes in LDL-C levels attributed to genetic risk (as quantified by PRS) relate to CAD risk.
Linear Regression for LDL-C Levels: Fit a linear regression model with LDL-C levels as the outcome and LDL-C PRS as the predictor. You might want to adjust for relevant covariates.
Rescale the Coefficient for LDL-C PRS: Adjust the coefficient from tAMR model to reflect a 40 mg/dL change in LDL-C.
Step 2: Associate Rescaled LDL-C PRS with CAD
Incorporate Rescaled LDL-C PRS into Cox Model: Use the rescaled LDL-C PRS from Step 1 as a predictor in a Cox proportional hazards model to analyze its relationship with CAD risk.
Code
library(dplyr)library(survival)# Assuming 'df' is your dataset# List of unique populationspop="Saudi"# Initialize an empty list for storing models and resultscox_models <-list()glm_models =list()results <-data.frame(Population =character(),PRS_Mean =numeric(),PRS_SD =numeric(),Percent_male =numeric(),LDL_Coef_CI_Lower =numeric(),LDL_Coef_CI_Upper =numeric(),Population_Scale_Factor =numeric(),LDL_Scaled_Coef_CI_Lower =numeric(),LDL_Scaled_Coef_CI_Upper =numeric(),HR =numeric(),HR_CI_Lower =numeric(),HR_CI_Upper =numeric(),OR =numeric(),OR_LCI =numeric(),OR_UCI =numeric(),stringsAsFactors =FALSE) pop_data=totaldata pop_data$PRS = pop_data$pop_scaled_prs##print("PRS Summary")#print(summary(pop_data$PRS)) ldl_model <-lm( ldl1_adj ~ PRS +as.numeric(Age) +as.factor(Sex) + PC1 + PC2 + PC3 + PC4 + PC5 + PC6 + PC7 + PC8 + PC9 + PC10,data = pop_data[!is.na(pop_data$ldl1_adj),] ) ldl_coef_ci <-confint(ldl_model, "PRS")#print(paste("LDL_coef_CI for population:", pop))if ("PRS"%in%names(coef(ldl_model))) {#print(confint(ldl_model, parm = "PRS")) } else {#print("PRS coefficient is not available in the model.") }