(Linear association between exposure and outcome; 10 marks)

We must decide whether it is reasonable to assume a linear association between each of the numerical exposure variables, age and haemoglobin, and the log odds of death.

a) Create a new variable in the dataset containing quintiles of age using the xtile command:

    xtile age_q5=age, nq(5)

Use Stata to plot the log odds of death versus age_q5. Please use the following Stata options:

    ciplot yscale(log) yscale(range(0.5 2)) ylabel(0.25 0.5 0.75 1 1.5 2)

[Note: for earlier versions of Stata you may need to replace "ciplot" above with "graph".]

Briefly summarise the plot, describing whether the association looks linear. (3 marks)

b) Using the variable age_q5, fit two separate simple logistic regression models: one with age_q5 as a categorical variable and one with age_q5 as a continuous variable. Compare the models using a likelihood ratio test and comment on whether the association between the log odds of death and age is linear. (2 marks)
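As a rough guide to the workflow for parts a) and b), the steps might be sketched as follows. This is only an outline under stated assumptions: the outcome variable is assumed to be a 0/1 indicator called death (the question does not name it), tabodds is assumed to be the command that the ciplot and graph options belong to, and the stored-estimate names m_cat and m_lin are illustrative.

```stata
* Assumption: death is a 0/1 outcome variable (hypothetical name).
* Create quintiles of age, as given in the question.
xtile age_q5 = age, nq(5)

* Part a): odds of death by age quintile with confidence intervals,
* plotted on a log scale. tabodds is assumed to be the intended command.
tabodds death age_q5, ciplot yscale(log) yscale(range(0.5 2)) ylabel(0.25 0.5 0.75 1 1.5 2)

* Part b): age_q5 entered as a categorical variable (4 indicator terms).
logit death i.age_q5
estimates store m_cat

* age_q5 entered as a continuous variable (single linear trend term).
logit death age_q5
estimates store m_lin

* Likelihood ratio test: the linear model is nested in the categorical
* model, so the test has 4 - 1 = 3 degrees of freedom.
lrtest m_cat m_lin
```

A small p-value from the likelihood ratio test would indicate that the categorical model fits better than the linear-trend model, that is, evidence of departure from linearity. A large p-value would give no evidence against a linear association on the log odds scale.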

