MA4601/MAT061 Stochastic Search and Optimisation
Assignment 5: Markov Decision Processes - Corrected
Due 12:00 mid-day, Thursday 7th May
In this assignment we will use Markov Decision Processes to develop an optimal strategy for
harvesting salmon. The problem is based on the paper “Optimal harvest strategies for salmon
in relation to environmental variability and uncertainty about production parameters” by C.J.
Walters (1975). A copy of the paper is available on the course web site.
You will need to submit two files: a programme file titled YOUR NAME programme.r (or .py,
.jl, etc.) and a report as a pdf file titled YOUR NAME report.pdf. Submission by email to
. The report should be presented as a stand-alone document that
can be understood without having to read your code. It should be no more than four pages
long.
Consider a salmon fishery. Each year we catch some number of fish and leave the rest to
spawn. The state of the system in any season is the size of the population x, and the action
taken is the size of the population left for spawning y. The immediate reward (without discount
factor) is then the number of fish caught, namely x − y. At the beginning of the next season
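the salmon population will be W where, for a random draw $\tilde{\alpha} \sim N(\alpha, \sigma^2)$ and population capacity m,
$$W \sim \left\lfloor y\, e^{\tilde{\alpha}(1 - y/m)} \right\rfloor.$$
Here $\lfloor x \rfloor$ is the floor of x, that is, the largest integer less than or equal to x.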
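As an illustration of how these dynamics might be turned into transition probabilities, the following Python sketch computes P(W = w | y) from the normal CDF of the random exponent term (mean α, standard deviation σ). It rests on assumptions that are not part of the assignment: the state space is truncated at a hypothetical bound x_max, with all probability mass above it lumped into the top state, and the function and argument names are invented for this example. It is one possible approach, not a prescribed one.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def transition_probs(y, alpha, sigma, m, x_max):
    """Return p with p[w] = P(W = w | y) for w = 0..x_max.

    Assumption: mass above x_max is lumped into p[x_max] (truncation).
    Requires sigma > 0.
    """
    p = [0.0] * (x_max + 1)
    if y == 0:
        p[0] = 1.0            # no spawners, no fish next season
        return p
    k = 1.0 - y / m           # multiplier of the random term in the exponent
    if k == 0.0:
        p[min(y, x_max)] = 1.0   # at y = m the exponent is 0, so W = y
        return p

    def prob_below(w):
        # P(y * exp(a * k) < w), where a ~ N(alpha, sigma^2)
        if w <= 0:
            return 0.0
        z = (math.log(w / y) / k - alpha) / sigma
        # the inequality on a flips direction when k < 0 (y above capacity)
        return normal_cdf(z) if k > 0 else 1.0 - normal_cdf(z)

    for w in range(x_max):
        p[w] = prob_below(w + 1) - prob_below(w)
    p[x_max] = 1.0 - prob_below(x_max)
    return p
```

A call such as `transition_probs(5, 1.0, 0.5, 20, 60)` uses purely illustrative parameter values; they are not taken from Walters (1975). The truncation bound should be chosen large enough that the lumped mass in the top state is negligible for the escapement levels of interest.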
For the assignment do the following:
1. From Walters (1975) determine values for α, σ and m. Note that they may depend on
the chosen unit of measurement.
2. For discount factor γ = 1 find the optimal policy for time horizon N = 0, 1, ..., 20.
Comment on any pattern(s) that you see.
3. For a variety of discount factors γ ∈ (0, 1) find the optimal policy for an infinite time
horizon. How do these policies compare to the finite time policies?
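For orientation, tasks 2 and 3 could be approached with backward induction and value iteration respectively. The Python sketch below is a minimal version under stated assumptions, none of which come from the assignment text: the state space is truncated to 0..x_max, `P[y][w]` is a hypothetical matrix holding P(W = w | y) on that truncated space, and the terminal value is taken to be zero (no value is placed on fish left at the end of the horizon). All function names are invented for this example.

```python
def bellman_step(P, V, gamma):
    """One Bellman backup for reward x - y and actions 0 <= y <= x.

    Returns (V_new, policy), where policy[x] is the best escapement y.
    """
    n = len(V)
    # g[y] = discounted expected future value of leaving y spawners, minus y.
    # The x part of the reward x - y does not depend on y, so it is added later.
    g = [gamma * sum(p * v for p, v in zip(P[y], V)) - y for y in range(n)]
    V_new, policy = [], []
    best_y = 0
    for x in range(n):
        if g[x] > g[best_y]:   # running argmax over y <= x; ties keep the smaller y
            best_y = x
        V_new.append(x + g[best_y])
        policy.append(best_y)
    return V_new, policy

def finite_horizon(P, gamma, N):
    """Backward induction. policies[k] is the policy with k + 1 seasons remaining."""
    V = [0.0] * len(P)         # assumed terminal value: zero
    policies = []
    for _ in range(N):
        V, policy = bellman_step(P, V, gamma)
        policies.append(policy)
    return V, policies

def infinite_horizon(P, gamma, tol=1e-8, max_iter=100000):
    """Value iteration for 0 < gamma < 1; stops when the sup-norm change is below tol."""
    V = [0.0] * len(P)
    for _ in range(max_iter):
        V_new, policy = bellman_step(P, V, gamma)
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new, policy
        V = V_new
    raise RuntimeError("value iteration did not converge")
```

Because the reward x − y separates into a part that depends only on x and a part that depends only on y, each backup needs a single pass over the states with a running maximum, rather than a full search over y for every x. On a toy deterministic model where the population doubles up to a cap of 4, so that `P[y][min(2 * y, 4)] = 1`, the policy with two seasons remaining leaves min(x, 2) fish to spawn.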
Marks will be allocated on the following basis:
50% Code correctness (how well does it work).
25% Quality of analysis (what have we learnt about harvesting salmon).
25% Clarity of report.