My analysis of the trial dataset
Import
library(here)
trial <- read.csv2(here("data/trial.csv"))
Description
| Characteristic | Overall, N = 200¹ | Drug A, N = 98¹ | Drug B, N = 102¹ | p-value² |
|---|---|---|---|---|
| age | 47.24 (14.31) | 47.01 (14.71) | 47.45 (14.01) | 0.8 |
| missing values | 11 | 7 | 4 | |
| marker | 0.92 (0.86) | 1.02 (0.89) | 0.82 (0.83) | 0.12 |
| missing values | 10 | 6 | 4 | |
| stage | | | | |
| T1 | 26.50% | 28.57% | 24.51% | |
| T2 | 27.00% | 25.51% | 28.43% | |
| T3 | 21.50% | 22.45% | 20.59% | |
| T4 | 25.00% | 23.47% | 26.47% | |
| grade | | | | |
| I | 34.00% | 35.71% | 32.35% | |
| II | 34.00% | 32.65% | 35.29% | |
| III | 32.00% | 31.63% | 32.35% | |
| response | 31.61% | 29.47% | 33.67% | 0.5 |
| missing values | 7 | 3 | 4 | |
| death | 56.00% | 53.06% | 58.82% | 0.4 |

¹ Mean (SD); %

² Two Sample t-test
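The age p-value in the table can be roughly checked from the summary statistics alone. The sketch below assumes that "Two Sample t-test" means the pooled-variance (Student) version, which is the label R prints when `var.equal = TRUE`. The helper name `t_test_from_summary` is hypothetical. The group sizes are taken as N minus the reported missing values. Because the inputs are the rounded means and SDs from the table, the result is only approximate.

```r
# Pooled-variance two-sample t-test computed from summary statistics.
# Assumption: the table's "Two Sample t-test" is the equal-variance version.
t_test_from_summary <- function(m1, s1, n1, m2, s2, n2) {
  df <- n1 + n2 - 2
  pooled_var <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / df
  se <- sqrt(pooled_var * (1 / n1 + 1 / n2))
  t_stat <- (m1 - m2) / se
  p_value <- 2 * pt(-abs(t_stat), df)  # two-sided p-value
  list(t = t_stat, df = df, p = p_value)
}

# Age: Drug A 47.01 (14.71) with 98 - 7 observed values,
#      Drug B 47.45 (14.01) with 102 - 4 observed values.
age_test <- t_test_from_summary(47.01, 14.71, 98 - 7, 47.45, 14.01, 102 - 4)
print(round(age_test$p, 1))
```

With one decimal of rounding, this should agree with the 0.8 reported for age.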
Modeling
Univariable and multivariable analyses
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -1.70335908 0.7132940 -2.3880184 0.01693950
## age 0.01911857 0.0118930 1.6075474 0.10793433
## marker 0.32829134 0.1956681 1.6777969 0.09338676
## stageT2 -0.78271069 0.4691961 -1.6681953 0.09527696
## stageT3 -0.13355845 0.4822316 -0.2769592 0.78181145
## stageT4 -0.42915078 0.4729451 -0.9074008 0.36419491
## gradeII 0.04267335 0.4273544 0.0998547 0.92045968
## gradeIII 0.05046867 0.4073378 0.1238988 0.90139538
| Dependent: response | | 0 | 1 | OR (univariable) | OR (multivariable) |
|---|---|---|---|---|---|
| age | Mean (SD) | 45.9 (14.4) | 49.8 (14.2) | 1.02 (1.00-1.04, p=0.095) | 1.02 (1.00-1.04, p=0.108) |
| marker | Mean (SD) | 0.8 (0.8) | 1.1 (0.9) | 1.35 (0.94-1.93, p=0.100) | 1.39 (0.94-2.05, p=0.093) |
| stage | T1 | 34 (65.4) | 18 (34.6) | - | - |
| | T2 | 39 (75.0) | 13 (25.0) | 0.63 (0.27-1.46, p=0.285) | 0.46 (0.18-1.13, p=0.095) |
| | T3 | 25 (62.5) | 15 (37.5) | 1.13 (0.48-2.68, p=0.775) | 0.87 (0.34-2.25, p=0.782) |
| | T4 | 34 (69.4) | 15 (30.6) | 0.83 (0.36-1.92, p=0.668) | 0.65 (0.25-1.63, p=0.364) |
| grade | I | 46 (68.7) | 21 (31.3) | - | - |
| | II | 44 (69.8) | 19 (30.2) | 0.95 (0.45-2.00, p=0.884) | 1.04 (0.45-2.42, p=0.920) |
| | III | 42 (66.7) | 21 (33.3) | 1.10 (0.52-2.29, p=0.808) | 1.05 (0.47-2.35, p=0.901) |

Number in dataframe = 200, Number in model = 173, Missing = 27, AIC = 222.6, C-statistic = 0.648, H&L = Chi-sq(8) 5.13 (p=0.743)
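The multivariable odds ratios in the table are the exponentiated estimates from the coefficient output. The sketch below shows that link for age and marker, using the estimates and standard errors copied from that output. It builds Wald intervals, which is an assumption: the source does not say how the table's intervals were computed, so the bounds may differ slightly in the second decimal (for marker, about 0.95 to 2.04 here versus 0.94-2.05 in the table).

```r
# Estimates and standard errors copied from the coefficient output.
coefs <- data.frame(
  term      = c("age", "marker"),
  estimate  = c(0.01911857, 0.32829134),
  std_error = c(0.0118930, 0.1956681)
)

z <- qnorm(0.975)  # critical value for a two-sided 95% interval

# Odds ratio = exp(log-odds estimate); Wald CI = exp(estimate -/+ z * SE).
coefs$or      <- exp(coefs$estimate)
coefs$ci_low  <- exp(coefs$estimate - z * coefs$std_error)
coefs$ci_high <- exp(coefs$estimate + z * coefs$std_error)

print(round(coefs[, c("or", "ci_low", "ci_high")], 2))
```

The point estimates should match the 1.02 and 1.39 shown in the OR (multivariable) column.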
Final model
| Characteristic | log(OR)¹ | 95% CI¹ | p-value |
|---|---|---|---|
| age | 0.02 | 0.00, 0.04 | 0.11 |
| marker | 0.28 | -0.09, 0.64 | 0.14 |

¹ OR = Odds Ratio, CI = Confidence Interval
The equation of the final model is:
\[ \begin{aligned} \log\left[ \frac { \widehat{P( \operatorname{response} = \operatorname{1} )} }{ 1 - \widehat{P( \operatorname{response} = \operatorname{1} )} } \right] &= -1.95 + 0.02(\operatorname{age}) + 0.28(\operatorname{marker}) \end{aligned} \]
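As a worked illustration, take a hypothetical patient aged 50 with a marker level of 1. These values are chosen only for the example and are not taken from the data. Writing \(\hat{p}\) as shorthand for the estimated probability of response, and plugging into the rounded coefficients, gives approximately:

\[ \begin{aligned} \log\left[ \frac{\hat{p}}{1 - \hat{p}} \right] &= -1.95 + 0.02(50) + 0.28(1) = -0.67 \\ \hat{p} &= \frac{1}{1 + e^{0.67}} \approx 0.34 \end{aligned} \]

Since the coefficients are rounded to two decimals, the resulting probability is only approximate.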
The results can be visualized below:
Results
## We fitted a logistic model (estimated using ML) to predict response with age and marker (formula: response ~ age + marker). The model's explanatory power is weak (Tjur's R2 = 0.03). The model's intercept, corresponding to age = 0 and marker = 0, is at -1.95 (95% CI [-3.22, -0.77], p = 0.002). Within this model:
##
## - The effect of age is statistically non-significant and positive (beta = 0.02, 95% CI [-3.86e-03, 0.04], p = 0.109; Std. beta = 0.27, 95% CI [-0.06, 0.62])
## - The effect of marker is statistically non-significant and positive (beta = 0.28, 95% CI [-0.09, 0.64], p = 0.138; Std. beta = 0.24, 95% CI [-0.08, 0.56])
##
## Standardized parameters were obtained by fitting the model on a standardized version of the dataset. 95% Confidence Intervals (CIs) and p-values were computed using