MATH377: Financial and Actuarial Modelling in R
Tutorial 4
Exercise 1. For the Chi-Squared distribution (root name chisq):
a) Use 3 degrees of freedom to generate a sample of size 1000.
b) Plot a histogram of your sample in a).
c) Write an R function to compute the loglikelihood function. Test your function with the simulated sample in a) and 2 degrees of freedom.
d) With your simulated sample in a), plot the loglikelihood function for parameter values between 1 and 4.
e) Add a vertical line to your plot indicating the original parameter value.
f) Compute the maximum likelihood estimator for your sample.
g) Add the estimated density function to your plot in b).
h) Find the quantile-matching estimator using the median (i.e., 50%).
i) Create a QQ-plot to assess visually the model fitted via quantile-matching.
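One possible way to approach parts a) to i) in base R is sketched below. The seed, the object names (x_sim, loglik_chisq, df_mle, df_qme), the search intervals, and the number of histogram bins are illustrative assumptions, not part of the exercise.

```r
set.seed(377)  # arbitrary seed, used only to make the sketch reproducible

# a) Sample of size 1000 from a chi-squared distribution with 3 degrees of freedom
x_sim <- rchisq(1000, df = 3)

# c) Log-likelihood as a function of the degrees of freedom
loglik_chisq <- function(df, data) {
  sum(dchisq(data, df = df, log = TRUE))
}
loglik_chisq(2, x_sim)  # test with 2 degrees of freedom

# d), e) Log-likelihood curve on [1, 4], with the value used in a) marked
df_grid <- seq(1, 4, by = 0.01)
ll_vals <- sapply(df_grid, loglik_chisq, data = x_sim)
plot(df_grid, ll_vals, type = "l",
     xlab = "Degrees of freedom", ylab = "Log-likelihood")
abline(v = 3, lty = 2)

# f) Maximum likelihood estimate by one-dimensional numerical maximisation
#    (the interval c(0.1, 10) is an assumed search range)
df_mle <- optimize(loglik_chisq, interval = c(0.1, 10),
                   data = x_sim, maximum = TRUE)$maximum

# b), g) Histogram on the density scale with the fitted density on top
hist(x_sim, freq = FALSE, breaks = 30, main = "", xlab = "x")
curve(dchisq(x, df = df_mle), add = TRUE, lwd = 2)

# h) Quantile matching: choose df so that the model median equals the sample median
df_qme <- uniroot(function(df) qchisq(0.5, df = df) - median(x_sim),
                  interval = c(0.1, 10))$root

# i) QQ-plot of the sample against the quantile-matched model
theo_q <- qchisq(ppoints(length(x_sim)), df = df_qme)
qqplot(theo_q, x_sim,
       xlab = "Theoretical quantiles", ylab = "Sample quantiles")
abline(0, 1)
```

Both estimates should land close to 3 for a sample of this size; points lying near the 45-degree line in the QQ-plot suggest an adequate fit.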
Exercise 2. In this exercise, we will compare the fit of two discrete distributions. Recall that the density function of a negative binomial random variable X with parameters α > 0 and p ∈ (0, 1] is given by
$$f(x) = \frac{\Gamma(x + \alpha)}{\Gamma(\alpha)\, x!}\, p^{\alpha} (1 - p)^{x}, \qquad x = 0, 1, 2, \ldots,$$
where Γ(·) denotes the Gamma function.
a) Create a sample of size 1000 from a negative binomial distributed random variable with parameters α = 15 and p = 0.6.
b) Create a histogram of your sample in a). Hint: Use something like breaks = 0:max(x) to show bins of size 1 if you use hist(). Alternatively, you can use plot() in combination with table().
c) Fit a Poisson model to your sample in a) via maximum likelihood estimation.
d) Fit a negative binomial model to your sample in a) via maximum likelihood estimation. Note: Check the help of nbinom and fitdist to see which arguments are being used.
e) Plot the fitted densities along with the histogram of the sample and conclude which model describes the data better. Hint: Use points() to plot the density evaluations.
f) Confirm your selection in e) by using an information criterion.
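A minimal base-R sketch of parts a) to f) follows. It swaps fitdist() for a direct call to optim() so that the block runs without extra packages; this is an assumption of the sketch, not the approach the exercise prescribes. The negative binomial is fitted in the (size, mu) parametrisation on the log scale and then converted back to p = size / (size + mu). The seed and all object names are illustrative.

```r
set.seed(377)  # arbitrary seed, used only to make the sketch reproducible

# a) Sample of size 1000 with alpha = 15 (size) and p = 0.6 (prob)
x_nb <- rnbinom(1000, size = 15, prob = 0.6)

# c) For the Poisson model, the MLE of the rate is the sample mean
lambda_hat <- mean(x_nb)

# d) Negative binomial MLE: minimise the negative log-likelihood.
#    Parameters are optimised on the log scale so they stay positive.
negll_nb <- function(theta, data) {
  -sum(dnbinom(data, size = exp(theta[1]), mu = exp(theta[2]), log = TRUE))
}
fit_nb <- optim(c(0, log(mean(x_nb))), negll_nb, data = x_nb, method = "BFGS")
size_hat <- exp(fit_nb$par[1])
mu_hat   <- exp(fit_nb$par[2])
p_hat    <- size_hat / (size_hat + mu_hat)  # back to the (alpha, p) form

# b), e) Empirical frequencies with both fitted densities overlaid
k <- 0:max(x_nb)
plot(table(x_nb) / length(x_nb), xlab = "x", ylab = "Relative frequency")
points(k, dpois(k, lambda = lambda_hat), col = "blue", pch = 16)
points(k, dnbinom(k, size = size_hat, prob = p_hat), col = "red", pch = 17)
legend("topright", legend = c("Poisson", "Negative binomial"),
       col = c("blue", "red"), pch = c(16, 17))

# f) AIC = -2 * log-likelihood + 2 * (number of parameters)
aic_pois <- -2 * sum(dpois(x_nb, lambda = lambda_hat, log = TRUE)) + 2 * 1
aic_nb   <- 2 * fit_nb$value + 2 * 2
c(Poisson = aic_pois, NegBinomial = aic_nb)
```

The model with the smaller AIC is preferred. Because the simulated data have variance larger than their mean, the negative binomial would be expected to come out ahead here.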
Exercise 3. Consider the Danish fire insurance data set danishuni. As in the lecture notes, take the losses above 1 million Danish kroner and subtract 1 million from all data points to bring the data to the origin.
a) Fit a lognormal distribution to the data via MLE.
b) Fit a loglogistic distribution to the data via MLE. Hint: This distribution is available in the actuar package under the root name llogis.
c) Compute the mean for both fitted distributions.
d) Compute the 95%, 99%, and 99.9% quantiles for both fitted distributions.
e) Which one of these two models seems to describe the data better? Remember, this data set is not easy to describe with a global model. Hence, neither of these two models may be a great overall choice.
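The steps in a) to e) could be sketched as follows, assuming the fitdistrplus package (for fitdist() and the danishuni data) and the actuar package (for the llogis functions) are installed. The starting values for the loglogistic fit and the object names are assumptions made for illustration.

```r
library(fitdistrplus)  # fitdist(), qqcomp(), and the danishuni data set
library(actuar)        # qllogis() and mllogis() for the loglogistic

data(danishuni)
# Losses are recorded in millions, so the threshold of 1 million is 1
loss <- danishuni$Loss[danishuni$Loss > 1] - 1

# a) Lognormal fit via MLE (fitdist uses MLE by default)
fit_ln <- fitdist(loss, "lnorm")

# b) Loglogistic fit via MLE; the starting values are an assumed choice
fit_ll <- fitdist(loss, "llogis", start = list(shape = 1, scale = 1))

meanlog_hat <- fit_ln$estimate[["meanlog"]]
sdlog_hat   <- fit_ln$estimate[["sdlog"]]
shape_hat   <- fit_ll$estimate[["shape"]]
scale_hat   <- fit_ll$estimate[["scale"]]

# c) Means of the fitted models
mean_ln <- exp(meanlog_hat + sdlog_hat^2 / 2)
# The loglogistic mean is finite only when shape > 1
mean_ll <- if (shape_hat > 1) {
  mllogis(1, shape = shape_hat, scale = scale_hat)
} else {
  Inf
}

# d) 95%, 99%, and 99.9% quantiles of the fitted models
probs <- c(0.95, 0.99, 0.999)
q_ln <- qlnorm(probs, meanlog = meanlog_hat, sdlog = sdlog_hat)
q_ll <- qllogis(probs, shape = shape_hat, scale = scale_hat)
rbind(lognormal = q_ln, loglogistic = q_ll)

# e) Compare the fits with AIC and a joint QQ-plot
c(lognormal = fit_ln$aic, loglogistic = fit_ll$aic)
qqcomp(list(fit_ln, fit_ll), legendtext = c("lognormal", "loglogistic"))
```

The quantiles are on the shifted scale, in millions; adding 1 returns them to the original loss scale.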