MATH39512 Survival Analysis for Actuarial Science: example sheet 3
*=easy, **=intermediate, ***=difficult
* Exercise 3.1
Let T1, ..., T5 be i.i.d. positive random variables, where the random variable Ti represents the genuine failure time of individual i. Let C1, ..., C5 be positive random variables, where the random variable Ci represents the censoring time of individual i. Assume we have the following outcomes of the random variables Ti and Ci for i = 1, ..., 5: (T1, T2, T3, T4, T5) = (5, 7, 3, 8, 5) and (C1, C2, C3, C4, C5) = (6, 6, 9, 8, 8).
(a) Assume that for each individual i we observe him/her from time 0 until the minimum of Ti and Ci and we record this minimum of Ti and Ci as his/her survival time. The survival time of individual i is considered to be censored if and only if Ci < Ti. Compute the Nelson-Aalen estimate of the cumulative hazard function corresponding to the above observations and plot your estimate.
(b) Let $N_t = \sum_{i=1}^{5} \mathbf{1}_{\{T_i \le t,\, T_i \le C_i\}}$. Draw the sample path of the process $N = \{N_t : t \ge 0\}$ that corresponds to the above observations. What does $N_t$ represent?
(c) Let $R_t = \sum_{i=1}^{5} \mathbf{1}_{\{T_i \ge t,\, C_i \ge t\}}$. Draw the sample path of the process $R = \{R_t : t \ge 0\}$ that corresponds to the above observations. What does $R_t$ represent?
(d) Draw the sample path of the process $A = \{A_t : t \ge 0\}$ defined by $A_t = \int_0^{t \wedge \tau_R} \frac{1}{R_s}\,\mathrm{d}N_s$ that corresponds to the above observations, where $\tau_R = \inf\{t > 0 : R_t = 0\}$. What can you conclude about $A_t$?
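As a computational cross-check for this exercise, here is a minimal Python sketch of the Nelson-Aalen estimate. It assumes that tied failure times are handled by adding (number of failures at t) / (number at risk just before t) at each distinct observed failure time, which is what integrating 1/R_s against dN_s produces. The names `nelson_aalen`, `times` and `observed` are illustrative and do not come from the course material.

```python
from fractions import Fraction


def nelson_aalen(times, observed):
    """Return [(t, A_t)] at each distinct observed failure time t.

    times[i]    : recorded survival time min(T_i, C_i)
    observed[i] : True if the failure was observed (T_i <= C_i)
    """
    estimate, cumulative = [], Fraction(0)
    for t in sorted({x for x, d in zip(times, observed) if d}):
        at_risk = sum(1 for x in times if x >= t)  # R_t
        jump = sum(1 for x, d in zip(times, observed) if d and x == t)  # jump of N at t
        cumulative += Fraction(jump, at_risk)
        estimate.append((t, cumulative))
    return estimate


# Outcomes given in Exercise 3.1.
T = [5, 7, 3, 8, 5]
C = [6, 6, 9, 8, 8]
times = [min(t, c) for t, c in zip(T, C)]
observed = [t <= c for t, c in zip(T, C)]

steps = nelson_aalen(times, observed)
for t, value in steps:
    print(t, value)
```

The estimate is a step function that is 0 before the first observed failure and constant between the listed times, so the returned pairs are enough to draw the plot by hand.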
** Exercise 3.2
Suppose we have the following survival data for two homogeneous groups, where + denotes a censored survival time:
Group a 15+ 24 30 40+ 42 43+
Group b 10 12+ 26 28 29 41
The hazard function of the survival time of an individual from group a, respectively b, is denoted by µa(t), respectively µb(t). We want to test whether the survival time distribution of an individual from group a is different from that of an individual from group b. To this end, we want to test, for a given t0 , the null hypothesis,
H0 : µa(t) = µb(t) for all t ∈ [0, t0 ∧ τR],
versus HA : µa(t) ≠ µb(t) for some t ∈ [0, t0 ∧ τR], where τR is the first time when there are no more individuals at risk in one of the two groups. Recall that the log-rank test says to reject H0 at significance level α if |Zt0 / √Vt0| > zα/2, where
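• $Z_t = \int_0^{t \wedge \tau_R} \frac{R_s^a R_s^b}{R_s^a + R_s^b}\,\mathrm{d}\bigl(A_a(s) - A_b(s)\bigr)$, with $R_t^j$ denoting the number of individuals at risk just before time $t$ in group $j$ and $A_j(t)$ denoting the Nelson-Aalen estimator at time $t$ of the cumulative hazard function of an individual from group $j$;
• $V_t = \int_0^{t \wedge \tau_R} \frac{R_s^a R_s^b}{(R_s^a + R_s^b)^2}\,\mathrm{d}\bigl(N_s^a + N_s^b\bigr)$, with $N_t^j$ denoting the number of individuals in group $j$ that are observed to have failed before or at time $t$;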
• zα > 0 is such that Φ(zα) = 1 − α, where Φ(·) is the cumulative distribution function of the standard normal distribution.
(a) Given the survival data at the beginning of the question, perform, by hand, the log-rank test with t0 = 50 and at significance level 0.05 and report your conclusions.
(b) (i) Show that in general (i.e. without making use of the above survival data) for any t ≥ 0,
(ii) Verify that the two equalities in (i) are correct for the particular survival data given at the beginning of the question
(iii) Explain in words why $\int_0^{t \wedge \tau_R} \frac{R_s^a}{R_s^a + R_s^b}\,\mathrm{d}\bigl(N_s^a + N_s^b\bigr)$ can, under the null hypothesis, be interpreted as the expected number of deaths by time t in group a given that we observe/know, at each time, the numbers at risk in each group and the aggregate number of deaths over both groups (but not the number of deaths in each group separately).
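To check a hand computation of the log-rank test, a minimal Python sketch is given here. It assumes the standard counting-process form of the statistic: at each observed failure time s up to min(t0, τR), Z gains R^a R^b / (R^a + R^b) times the jump of A_a − A_b, and V gains R^a R^b / (R^a + R^b)^2 per pooled failure. If the lecture notes use a different variance estimator, the `v` line needs adjusting. The function names and the (time, observed) data layout are illustrative choices.

```python
from fractions import Fraction
from math import sqrt
from statistics import NormalDist


def at_risk(group, s):
    """Number of individuals in `group` still under observation just before s."""
    return sum(1 for time, _ in group if time >= s)


def failures(group, s):
    """Number of observed (uncensored) failures in `group` exactly at time s."""
    return sum(1 for time, observed in group if observed and time == s)


def log_rank(group_a, group_b, t0):
    """Return (Z, V, Z / sqrt(V)) accumulated over [0, min(t0, tau_R)]."""
    z, v = Fraction(0), Fraction(0)
    failure_times = sorted(
        {time for time, observed in group_a + group_b if observed and time <= t0}
    )
    for s in failure_times:
        ra, rb = at_risk(group_a, s), at_risk(group_b, s)
        if ra == 0 or rb == 0:
            break  # s lies beyond tau_R: one group has nobody left at risk
        da, db = failures(group_a, s), failures(group_b, s)
        # Weight R^a R^b / (R^a + R^b) times the jump of A_a - A_b at s.
        z += Fraction(ra * rb, ra + rb) * (Fraction(da, ra) - Fraction(db, rb))
        # Assumed variance increment: R^a R^b / (R^a + R^b)^2 per pooled failure.
        v += Fraction(ra * rb * (da + db), (ra + rb) ** 2)
    return z, v, float(z) / sqrt(float(v))


# Data from the exercise as (time, failure observed?) pairs; "+" means censored.
group_a = [(15, False), (24, True), (30, True), (40, False), (42, True), (43, False)]
group_b = [(10, True), (12, False), (26, True), (28, True), (29, True), (41, True)]

z, v, statistic = log_rank(group_a, group_b, t0=50)
critical = NormalDist().inv_cdf(1 - 0.05 / 2)  # z_{alpha/2} for alpha = 0.05
reject = abs(statistic) > critical
print(z, v, statistic, critical, reject)
```

Exact fractions are used for Z and V so that the output can be compared term by term with a calculation done by hand.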
** Exercise 3.3
Consider the following observed values of the survival times of a group of independent homogeneous individuals, where + denotes a censored value: 1, 3, 3+, 4+, 5, 6, 6+, 6+, 8, 10+. We want to test whether the hazard rate is a constant given by λ = 0.15. Therefore we want to test the null hypothesis
H0 : µ(t) = λ for all t ∈ [0, t0] versus HA : µ(t) ≠ λ for some t ∈ [0, t0],
where λ = 0.15 and t0 = 10. The following 1-sample test says to reject H0 at significance level α if $\bigl|\tilde Z_{t_0} / \sqrt{\tilde V_{t_0}}\bigr| > z_{\alpha/2}$, where
• $\tilde Z_t = N_t - \lambda \int_0^t R_s\,\mathrm{d}s$, with $N_t$ denoting the number of individuals that are observed to have failed before or at time $t$ and with $R_t$ denoting the number of individuals at risk just before time $t$,
• $\tilde V_t = \lambda \int_0^t R_s\,\mathrm{d}s$,
• zα > 0 is such that Φ(zα) = 1 − α where Φ(·) is the cumulative distribution function of the standard normal distribution.
(a) Given the survival data at the beginning of the question, carry out this 1-sample test (with λ = 0.15 and t0 = 10) at significance level 0.05 and report your conclusions.
(b) Show that in general (i.e. independently of the survival data provided) we can rewrite the test statistic $\tilde Z_{t_0}$ as $\tilde Z_{t_0} = \int_0^{t_0} R_s\,\mathrm{d}\bigl(A(s) - A_0(s)\bigr)$, where $A(t)$ is the Nelson-Aalen estimator and $A_0(t)$ is the cumulative hazard function under the null hypothesis.
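The 1-sample statistic can be checked numerically with a short Python sketch. It relies on the identity that the integral of R_s over [0, t0] equals the total time at risk, i.e. the sum of min(x_i, t0) over the recorded survival times x_i, and it assumes the rejection rule has the same form as for the log-rank test, namely |Z̃ / √Ṽ| > z_{α/2}. The function name `one_sample_test` and its argument names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_test(times, observed, lam, t0, alpha=0.05):
    """Return (Z, V, Z / sqrt(V), critical value, reject?) for H0: hazard = lam."""
    n_t0 = sum(1 for x, d in zip(times, observed) if d and x <= t0)  # N_{t0}
    exposure = sum(min(x, t0) for x in times)  # integral of R_s over [0, t0]
    z = n_t0 - lam * exposure
    v = lam * exposure
    statistic = z / sqrt(v)
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return z, v, statistic, critical, abs(statistic) > critical


# Data from Exercise 3.3; a False flag marks a censored value ("+").
times = [1, 3, 3, 4, 5, 6, 6, 6, 8, 10]
observed = [True, True, False, False, True, True, False, False, True, False]

z, v, statistic, critical, reject = one_sample_test(times, observed, lam=0.15, t0=10)
print(z, v, statistic, critical, reject)
```

Writing the integral as a sum of individual exposure times avoids having to track the step function R_s explicitly.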
* Exercise 3.4
Consider the following observed values of the survival times of 12 individuals, where + denotes a censored value:
1.5+ 2 2.5 4.5 5+ 7+ 7.5+ 8 8.5+ 10 12+ 14.
(a) Compute the Nelson-Aalen estimate corresponding to the above data.
(b) Under certain conditions an unbiased estimate for the variance of the Nelson-Aalen estimator at time t is given by $\hat V(t) := \int_0^{t \wedge \tau_R} \frac{1}{R_s^2}\,\mathrm{d}N_s$, where $R_s$ is the number at risk just before time $s$ and $N_s$ is the number of individuals that have been observed to fail by time $s$. Compute $\hat V(t)$ for $t \in [0, 14]$ and plot the outcome.
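The variance estimate in (b) is again a step function that only jumps at observed failure times, so it can be tabulated with a few lines of Python. The sketch below assumes each distinct failure time s contributes (number of failures at s) / R_s^2, which is what the integral of 1/R_s^2 against dN_s gives. The name `nelson_aalen_variance` is illustrative.

```python
from fractions import Fraction


def nelson_aalen_variance(times, observed):
    """Return [(s, V(s))] at each distinct observed failure time s."""
    steps, total = [], Fraction(0)
    for s in sorted({x for x, d in zip(times, observed) if d}):
        at_risk = sum(1 for x in times if x >= s)  # R_s
        jump = sum(1 for x, d in zip(times, observed) if d and x == s)  # jump of N at s
        total += Fraction(jump, at_risk ** 2)
        steps.append((s, total))
    return steps


# Data from Exercise 3.4; a False flag marks a censored value ("+").
times = [1.5, 2, 2.5, 4.5, 5, 7, 7.5, 8, 8.5, 10, 12, 14]
observed = [False, True, True, True, False, False, False, True, False, True, False, True]

variance_steps = nelson_aalen_variance(times, observed)
for s, value in variance_steps:
    print(s, value, float(value))
```

Between the listed times the estimate stays constant, and it is 0 before the first observed failure, which is all that is needed for the plot on [0, 14].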