COMP0233
Planning Delivery Tours - Coursework 1 (Individual) - #001
1 Foreword and Summary
Please READ THIS ASSIGNMENT CAREFULLY. If you have any questions about the assignment, please post them on the Q&A forum in Moodle or email the module leaders. Do not post samples of your code concerning the assignment to public forums such as Moodle.
This assignment asks you to write some code for choosing the location for a new depot to minimise delivery times to the surrounding area. We will describe how the code must behave, but it is up to you to fill in the implementation. Besides this, you will also need to create some tests, and demonstrate your ability to use version control (via git).
The exercise will be semi-automatically marked, so it is very important that your solution adheres to the correct file and folder name convention and structure, as defined in the rubric below. An otherwise valid solution that doesn’t work with our marking tool will not be given credit.
For this assignment, you can only use:
• The Python standard library,
• numpy,
• matplotlib, and
• pytest.
Your code should work with Python 3.10 or newer. Some modules in the standard library that might be useful to keep in mind are:
• csv,
• string,
• timeit.
This document is laid out as follows:
• First, we provide the setting of the problem that you will be solving.
• Next, we outline the functionality that should be implemented, and any other tasks you should complete.
• Finally, to assist you in creating a good solution, we state the marking scheme we will use.
2 Setting
You have been contacted by Courier Ltd. (CLtd), a (fictional) delivery company who are looking for help in choos-ing where to construct their newest delivery depot. CLtd want you to help them decide where to build their delivery depot so they can best serve the major towns and cities, and have provided you with information on the location of these places. To provide them with an answer, you will need to write some code to make it easier to load, analyse, and visualise the data. We have provided you with a Jupyter notebook (deciding_depot_location.ipynb) containing the workflow you should be able to run once your code is complete.
CLtd have identified a number of locations that they want their new delivery depot to be able to serve, as well as a number of sites that would be suitable for building the depot on. They use the following terminology:
• Location means any town, city, or potential building site for the new depot. It is a broad term for any place of interest that CLtd has identified.
• Settlement will be used to mean “a location that is not a potential building site for the depot”. Think of these as towns, cities, or other industrial areas that will need deliveries sent to them.
• Depot (or “building site”) will be used to mean “a location that is a potential building site”. These are the places where the CLtd is thinking of potentially setting up their delivery depot.
• Region is a “collection of locations”. Locations that belong to the same region are typically related in some administrative manner.
• Country is a “collection of regions”. Like how England is a country, which has several counties (regions). A Country of course still has locations in it - those locations that belong to the regions.
CLtd want to build their new depot at a building site that allows them to provide fast delivery times to the surrounding settlements. They have decided that they want to select the building site that minimises the time taken for a delivery to travel from the depot, visit every settlement, then return to the depot. Essentially, CLtd want to minimise the time it would take a single delivery-horse to deliver to every settlement, in a single trip. It will be up to you to write some reliable, reproducible code to determine which location they should build at.
2.1 Format of the Data File(s)
The data files that CLtd expect to handle are all comma-separated value (.csv) files, and have a particular structure. Each location in the country is one row in the file, and you are given information about;
• Its name. Note that this comes under the column that is headed “location”.
• Its region. Regions will be important when it comes to working out how quickly a delivery can move between locations. Every location belongs to precisely one region.
• Its polar coordinates.
• Whether the location is being considered as a potential site for the new depot. This is reflected by the value in the depot column; which is either TRUE (if the location is being considered by the company as a depot) or FALSE (it is not).
You can read more about how the dataset was created in the appendix. We have provided you with the data file (locations.csv) concerning the country CLtd is interested in.
2.2 Travel Between Locations
In reality there are a vast number of routes that can be taken between any two given locations, and CLtd has not provided us with any information about the road or transport network. Instead, they have told us to assume that all locations are directly connected to each other.
Travelling between two locations takes an amount of time that depends on the physical distance between the locations, and whether or not they share the same region. Travel time in hours between two locations is given by the formula
(1)
where
• D the physical distance between the locations, in metres.
• S is the travel speed between the two locations, in metres per second.
• Rdiff (“Different Regions”) is equal to 1 if the locations belong to different regions, and 0 otherwise.
• Nlocs (“Locations in Destination Region”) is equal to the number of locations in the same region as the destination location (including the destination itself). This has the effect of making travel into busier regions more time consuming than leaving those regions, and represents the interregional “border control” that CLtd expect to encounter.
2.3 Tours
A tour is a sequence of journeys between locations that starts at one particular depot, visits every settlement exactly once, then returns to the depot it started at. Tours are written by listing the locations that the delivery-horse visits, in the order of travel. CLtd need us to find the depot that provides the shortest tour for the country data provided in locations.csv.
For example, consider the country defined below:
Within this country, the following are all tours:
• Brightwater -> Sapphire Plaza -> Cobalt Mine -> Cinnabar Beach -> Cardinal’s Rest -> Brightwater.
• Rosewood -> Sapphire Plaza -> Cinnabar Beach -> Cardinal’s Rest -> Cobalt Mine -> Rosewood.
• Bluebell Meadow -> Sapphire Plaza -> Cardinal’s Rest -> Cinnabar Beach -> Cobalt Mine -> Bluebell Meadow.
Also notice that, if there are s settlements within a country, there will be
possible tours that start at each depot. (s! is read as “s factorial”). So, if there are d depots in the country, there are (d × s!) possible tours that can be undertaken. In the example shown above with d = 3 depots and s = 3 settlements there are 3 × 3! = 18 possible tours. If we take the country in the locations.csv file as an example; this provides d = 5 depots and s = 14 settlements, which gives 4.35 × 1011 (just over 400 billion) possible tours - so you can quickly see why we need a computer to tell us which depot is the best one!
The time taken to complete a tour is calculated by summing the travel times of the individual journeys that make up the tour.
2.3.1 Computing Tours
As mentioned above, there are multiple tours available starting at any given depot in the country, so we need a way to compute tours given a starting depot. To compute an appropriate tour from each depot, CLtd would like you to use the Nearest-Neighbour Algorithm (NNA), which we outline now. We define
• The current location as the location we last moved to, and now want to move out of.
• The starting depot as the depot we want this tour to start and end at.
• The target settlement as the location we are considering moving to next.
The algorithm then proceeds as follows:
1. Set the current location to be the starting depot, and initialise the path as a list that only contains the starting depot.
2. Label all of the settlements we want to visit as part of the tour as unvisited.
3. Select the unvisited settlement that has the smallest travel time from the current location. Set it as the target settlement.
4. Add the target settlement to the end of the path. Set the current location to be the target settlement.
5. Mark the current location as visited.
6. If there are unvisited settlements remaining, go to step 3.
7. Add the starting depot to the end of the path.
Once the final step of the algorithm completes, the path is the computed tour. The travel time to complete the tour can be computed by summing the travel time of the individual journeys.
2.3.1.1 Edge Cases in the NNA Some operational notes on edge cases in the NNA:
• If there are no locations to visit in the first place, that is if in step 2 there are 0 locations marked as unvisited, then the tour is just given as the starting location (we don’t need to go anywhere). The travel time of the tour should be interpreted as 0.
• It is possible for there to be two locations that are equidistant from the current location, which both happen to have the minimal travel time. We will specify what to do in this case when we discuss your coding tasks.
2.3.1.2 Example Suppose that we have the following locations in a country with only a single region,
• Brightwater (depot)
• Bluebell Meadow (depot)
• Sapphire Plaza
• Cobalt Mine
• Azure Observatory
and the locations have the following travel times to get between each other;
We can actually illustrate this country using a network diagram, as in fig. 1.
In this example we will break any ties in our NNA by selecting the location that comes alphabetically before the other(s). Using the NNA to find a tour starting from Brightwater:
1. The path starts at Brightwater. Brightwater is the current location.
2. Sapphire Plaza, Cobalt Mine, and Azure Observatory are all unvisited settlements. Note that Bluebell Meadow is not a settlement, so we don’t want to visit it on this tour.
3. Sapphire Plaza and Cobalt Mine are both 5 hours travel time from Brightwater, which is the shortest travel time available from this location. Since Cobalt Mine is alphabetically before Sapphire Plaza, so Cobalt Mine is set as the target settlement.
Figure 1: The network representing the example country with a single region. Arrows indicate the connections between locations (note every location connects to every other), and the numbers beside the arrows represent the travel time.
4. The path becomes Brightwater -> Cobalt Mine. Cobalt Mine is marked as visited, and the current location moves to Cobalt Mine.
5. From Cobalt Mine, the nearest unvisited settlement is Azure Observatory with a travel time of 2, so we go there next.
6. Then, from Azure Observatory the only remaining unvisited settlement is Sapphire Plaza, which we go to next.
7. Finally, since we have no remaining unvisited settlements, we return to the starting depot (Brightwater).
The final tour is thus Brightwater -> Cobalt Mine -> Azure Observatory -> Sapphire Plaza -> Brightwater. The travel time of the tour is 5 + 2 + 4 + 5 = 16 hours.
If we had chosen to find a tour from Bluebell Meadow instead, we would have obtained the tour: Bluebell Meadow -> Sapphire Plaza -> Cobalt Mine -> Azure Observatory -> Bluebell Meadow. This would have given a total travel time of 2 + 3 + 2 + 3 = 12 hours. In this country, CLtd’s best site for their new depot would thus be at Bluebell Meadow, as it provides the shortest tour via the NNA.
2.4 Goal
Your goal for this assignment is to provide CLtd with some code that will enable them to determine the best site to place their new depot at, given the data provided in locations.csv. What you produce will also need to be general enough that CLtd can use it to perform. similar analysis in the future, on different (and potentially much bigger) datasets. As such, your code will need to be reliable, accurate, efficient, and version-controlled.
Our marking script. will be testing the functionality of your code over the locations.csv file, as well as other data files of the same format but which describe different countries. Therefore, do not assume that testing on the locations.csv file is sufficient proof that your code performs correctly.
3 Project-Wide Tasks
This section covers tasks that you’ll need to be doing throughout your time working on this assignment. They are not restricted to one function or one feature, and affect your submission as a whole.
You may find it useful to take a look at the getting started section before you actively start working on the tasks in this section.
3.1 Use of Git
To track your changes as and after you make them, you should work in a git repository. You should initialise this repository inside a folder named depot_locations. You will place the .py files containing your code in this folder. In addition to the code directly inside depot_locations, you should also track the content of both the data and report subfolders (once you create them).
You should make commits as you work through the assignment and add functionality, or complete the various written tasks. When you submit, your repository should contain everything needed to run the code and tests (see below), but no files that are not necessary. In particular, you should not commit “artefact” files that are produced by your code or IDE. Please refer to the course notes for information on how to exclude such files from your repository!
As this is an individual assignment, you may work on one or multiple branches, as you prefer. However note that we will only mark the latest commit on the main branch of the repository, and will run git switch main before grading, so be careful about leaving changes on other branches or uncommitted changes to the HEAD of main. Due to our automated marking tool, only work that has a valid git repository, and follows the folder and file structure described in the relevant section of this assignment, will receive credit.
You must use GitHub for your work, you need to use the repository that you’ll get access to by accepting this invitation. After you accept the permissions, this will create a repository named depot_locations- <gh_username>. To prevent plagiarism, avoid uploading your work to any public git repository. Failure to comply with this requirement would be considered academic misconduct.
You have to submit a link to that GitHub classroom repository to Moodle in order to be considered a valid submission. To do so, write the repository link into a plain text file called submission.txt and submit that file only.
3.1.1 Some Reminders about the Use of Git
• Make sure that you give your commits meaningful descriptions!
The goal is that someone (you or others) can look at your commit history in the future and get a rough idea of what changes have happened. The messages should guide them in finding when a particular change was made – for example, if they want to undo it or fix a related bug. Therefore, avoid vague messages (e.g., “Fixed a bug”, “Made some changes”) or ones that don’t describe the changes at all (e.g., “Finished section 1.2.1”). Prefer the use of concrete messages such as “Check the type of the arguments” or “Add tests for reading data”.
• Do not be afraid to roll back (git revert) your own commits!
A big part of why version control is useful - even in solo work - is the ability to revert back to something you know is working.
• If you accidentally commit an artefact file, and notice later, do not panic - there is no need to delete everything and start over.
Simply make a new commit that deletes the artefact from the repository, and give it a clear commit message such as: “deleted artefact <artefact-name> that was accidentally committed in <commit-sha>”. Do not conflate such commits with other changes though (that is, do not remove an artefact and in the same commit make additional changes). You will not be penalised for committing files that you later identify as artefacts and remove (though it might be better to use .gitignore to avoid accidentally committing them in the first place).
3.2 Testing
We expect you to provide unit tests for the code your write as part of the assignment tasks, which extends to any functions or methods that you break out into smaller chunks. You can write all of these at the end, but we recommend you write them as you go along.
You should provide all tests inside a test_country.py file, and use the pytest testing framework. test_country.py should include tests for functions from utilities.py, as well as the Location and Country classes.
Running pytest in your submission directory should find the tests inside test_country.py, and the tests should pass:
(my-environment) $ pwd
path/to/my/work/depot_locations/
(my-environment) $ pytest
platform. linux -- Python 3.11.5, pytest-7.4.0, pluggy-1.0.0
rootdir: path/to/my/work/depot_locations/
plugins: None
collected X items
Make sure that your submission includes all files that are needed to run the tests. You are welcome to use the locations.csv file in your test cases, or write your own custom country data for your tests. If you do include your own country data, make sure that you include the data files for your tests in your submission - see the section on your submission file format for where you should place these files.
NOTE: You should also be be aware that the locations.csv dataset is quite large - comprising 400 connections between 20 locations. As such, you might find it difficult to manually compute tours on this network for testing purposes.
3.2.1 Code that does not Require Testing
Code that does not require testing will be explicitly indicated in the assignment text, otherwise it should be assumed that tests are required as part of the coding task in question. You do not need to write tests for the functions and methods that are provided to you by us, and you will not receive credit for such tests if they are included.
Specifically, the following functions and methods do not require you to write tests for them:
• All functions within plotting_utilities.py.
• Location.__repr__
• Country.plot_country
• Country.plot_path
• regular_n_gon
You do not need to write unit tests for your execution_time.py script. You will not receive credit for such tests if they are included.
3.2.2 Designing and Writing your Tests
At a minimum, you should have at least one test for each of the units and/or cases below:
• Errors with appropriate error messages are thrown when invalid values are encountered during creation of Locations and Countrys. See the requirements for creating Locations and creating Countrys.
• Within the Location class;
∘ The distance_to function.
∘ The use of the == operator.
• For the Country class;
∘ The depots and settlements are correctly identified by the respective properties.
∘ The travel_time method functions as intended.
∘ fastest_trip_from works as intended, including when passed its optional argument.
∘ nn_tour correctly implements the NNA in a case you have manually verified.
∘ best_depot_site correctly identifies the best building site.
However, remember that when writing unit tests your objective is to check;
• the core functionality of the unit,
• error-cases,
• edge-cases,
• deliberate exceptions the unit might have to its normal behaviour.
For the longer, more complex methods and functions we ask you to write, you will need more than one test to provide adequate coverage. Fixtures and test parametrisations, as well as other pytest features, might be good ideas to look into.
Don’t forget that the purpose of unit tests is to “build up”; you do not need to test aspects of your methods that call your other methods / functions that already have unit tests! For example, best_depot_site is likely to depend on nn_tour. Provided your test coverage of nn_tour is sufficient, you don’t need to check that the best depot found the correct tour, since this is covered by nn_tours unit tests. You only need to verify that best_depot_site is returning the depot you expect assuming all the tours were computed correctly (the details of how it obtained those tours is then covered by your other methods and their tests)!
3.3 Written Answers, Code Styling, and Documentation
Some tasks will ask you to provide written answers. Place these answers inside the report/report.md file, clearly heading the answers using their section name in the assignment. Any plots you are asked to produce should be committed to the report folder too, saved as .png files.
You are expected to present your submission in a clear, readable fashion. You should stick to a consistent coding style. as you work, maybe even using a code linter to help you. We also expect you to write docstrings and leave appropriate comments in your codebase to help a new user understand how to use it.