Assignment 2
COMP9418 – Advanced Topics in Statistical Machine Learning
Last revision: Monday 2nd November, 2020 at 19:11
Assignment designed by Jeremy Gillen
Instructions
Submission deadline: Sunday, 22nd November 2020, at 18:00:00.
Late Submission Policy: The penalty is set at 20% per late day. This is a ceiling penalty, so if a group is
marked 60/100 and they submitted two days late, they still get 60/100.
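The ceiling rule can be checked with a few lines of arithmetic. This is only a sketch: it assumes the ceiling is 100 minus 20 for each late day, which matches the 60/100 example above, and the function name apply_late_ceiling is hypothetical. How partial days are counted is not stated here.

```python
def apply_late_ceiling(raw_mark, days_late, penalty_per_day=20, max_mark=100):
    """Cap a mark at the ceiling implied by the late penalty.

    Assumption: the ceiling drops by penalty_per_day for each late day
    and never goes below zero.
    """
    ceiling = max(0, max_mark - penalty_per_day * days_late)
    return min(raw_mark, ceiling)
```

Under this reading, a group marked 60/100 two days late keeps 60, while a group marked 90/100 two days late is capped at 60.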
Form of Submission: This is a group assignment. Each group can have up to two students. Write the
names and zIDs of each student at the top of solution.py and in your report. Only one member of the
group should submit the assignment.
There is a maximum file size cap of 5MB, so make sure your submission files do not in total exceed this size.
You are allowed to use any Python library used in the tutorial notebooks or given in the example code.
No other library will be accepted, particularly libraries for graph and Bayesian network representation and
operation. Also, you can reuse any piece of source code developed in the tutorials.
Submit your files using give. On a CSE Linux machine, type the following on the command-line:
$ give cs9418 ass2 solution.py report.pdf *.csv *.py
Zero or more csv files can be submitted to store the parameters of your model, to be loaded by solution.py
during testing. Zero or more python helper files may be included in the submission, if you want to organise
your code using multiple files.
Alternatively, you can submit your solution via WebCMS.
Recall the guidance regarding plagiarism in the course introduction: this applies to this assignment, and if
evidence of plagiarism is detected, it will result in penalties ranging from loss of marks to suspension.
Changelog
Oct 30th: Clarified that the cost is calculated using instantaneous counts of people in each room. Every
15 seconds, a snapshot of every room is taken at exactly the same moment and the people in each room are
counted, so someone who passes through several rooms within 15 seconds is counted in only one of them.
Clarified that the ground-truth number of people in each room is also an instantaneous value, and that
sensor data is not instantaneous, but robot reports are.
Nov 2nd: Added additional information: the number of people who come to the office each day varies
according to this distribution: num_people = round(Normal(mean=20, stddev=1)). This information was
obtained from records of the number of workers present each day.
Description
In this assignment, you will write a program that plays the part of a “smart building”. This program will be
given a real-time stream of sensor data from throughout the building, and use this data to decide whether
to turn on the lights in each room. Your goal is to minimise the cost of lighting in the building, while also
trying to make sure that the lights stay on if there are people in a room. Every 15 seconds, you will receive a
new data point and have to decide whether each light should be turned on or off. There are several types
of sensors in the building, each with different reliability and data output. You will be given a file called
data.csv containing one day of complete data with all sensor values and the number of people in each room.
This assignment can be approached in many different ways. We will not be giving any guidance on what
algorithms are most appropriate.
Your solution must include a Probabilistic Graphical Model as the core component. Other than that you are
free to use any algorithm as part of your approach, including any algorithm available in Python’s sklearn
library.
It is recommended you start this assignment by discussing several different possible approaches with your
partner. Make sure you discuss what information you have available, what information is uncertain, and
what assumptions it may be reasonable to make.
Every area on the floor plan is named either ‘outside’ or with a letter ‘r’, ‘c’, or ‘o’ followed by a number
(for example ‘r4’, ‘c2’, ‘o1’). ‘r’, ‘c’ and ‘o’ stand for room, corridor, and open area respectively.
Data
The file data.csv contains complete data that is representative of a typical weekday in the office building.
This data includes the output of each sensor as well as the true number of people in each room. This data
was generated using a simulation of the building, and your program will be tested against many days of data
generated by the same simulation. Because this data would be expensive to collect, you are only given 2400
complete data points, from a single workday. The simulation attempts to be a realistic approximation to
reality, so it includes many different types of noise and bias. You should treat this project as if the data came
from a real office building, and is to be tested on real data from that building. You can make any assumptions
that you think would be reasonable in the real world, and you should describe all assumptions in the report.
Part of your mark will be determined by the feasibility of your assumptions, if applied to the real world.
Added Nov 2nd: [The number of people who come to the office each day varies according to
this distribution: num_people = round(Normal(mean=20, stddev=1)). This information was
obtained from records of the number of workers present each day, and the empirical distribution
of num_people was found to be identical to round(Normal(mean=20, stddev=1)).]
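As an illustration of the distribution above, the following sketch draws daily headcounts using only the standard library. The function name sample_num_people is hypothetical, and how such a prior is used in a model (if at all) is left to you.

```python
import random


def sample_num_people(rng=random):
    """Draw one daily headcount as round(Normal(mean=20, stddev=1))."""
    return round(rng.gauss(20, 1))


# Example: draw a reproducible batch of simulated days.
rng = random.Random(0)
samples = [sample_num_people(rng) for _ in range(1000)]
```

Almost all draws fall within a few people of 20, so the total number of people in the building on a given day is known fairly tightly in advance.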
Data format specification
Sensor data
Your submission file must contain a function called get_action(sensor_data), which receives sensor data
in the following format:
sensor_data = {'reliable_sensor1': 'motion', 'reliable_sensor2': 'motion',
               'reliable_sensor3': 'motion', 'reliable_sensor4': 'motion',
               'unreliable_sensor1': 'motion', 'unreliable_sensor2': 'motion',
               'unreliable_sensor3': 'motion', 'unreliable_sensor4': 'motion',
               'door_sensor1': 0, 'door_sensor2': 0, 'door_sensor3': 0, 'door_sensor4': 0,
               'robot1': ('r1', 0), 'robot2': ('r16', 0),
               'time': datetime.time(8, 0), 'electricity_price': 0.81}
Added Oct 30th: [The motion and door sensors report on motion from the entire previous 15
seconds, but the robot reports an instantaneous count of the number of people.]
The possible values of each field in sensor_data are:
• reliable_sensors and unreliable_sensors can have the values [‘motion’, ‘no motion’]. All reliable_sensors
are of the same brand and are usually quite accurate. unreliable_sensors are a different type of motion
sensor, which you know tends to be a little less accurate.
• door_sensors count how many people passed through a door (in either direction), so it can be any
integer.
• The robot sensors are robots that wander around the building and count the number of people in each
room. The value is a 2-tuple of the current room and the number of people counted. For example, if the
robot goes into r4 and counts 8 people, it would have the value (‘r4’, 8). If it goes into ‘c2’ and no one is
present, it would have the value (‘c2’, 0).
• Any of the sensors may fail at any time, in which case they will have the value None. They may start
working again.
The value of time is a datetime.time object representing the current time. Data points are provided at
15-second resolution, i.e., your function will be called once for every 15-second interval from 8 am to 6 pm.
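Because any sensor may report None, it can help to normalise each reading before passing it to a model. The sketch below is one possible way to do that, not a required structure; the function name parse_sensor_data and the three returned dictionaries are hypothetical.

```python
import datetime


def parse_sensor_data(sensor_data):
    """Split one reading into motion flags, door counts and robot sightings.

    Failed sensors (value None) are kept as None for motion and door
    sensors and are simply skipped for robots.
    """
    motion, doors, robots = {}, {}, {}
    for name, value in sensor_data.items():
        if name.startswith(('reliable_sensor', 'unreliable_sensor')):
            # True for 'motion', False for 'no motion', None if the sensor failed.
            motion[name] = None if value is None else (value == 'motion')
        elif name.startswith('door_sensor'):
            doors[name] = value
        elif name.startswith('robot') and value is not None:
            room, count = value
            robots[room] = count
    return motion, doors, robots


# Example with a shortened reading (a real reading has all sensors).
reading = {'reliable_sensor1': 'motion', 'unreliable_sensor1': None,
           'door_sensor1': 2, 'robot1': ('r1', 0), 'robot2': None,
           'time': datetime.time(8, 0), 'electricity_price': 0.81}
motion, doors, robots = parse_sensor_data(reading)
```

Keeping failed motion sensors as None, rather than treating them as ‘no motion’, lets a model distinguish missing evidence from negative evidence.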
Training data
The file data.csv contains a column for each of the above sensors, as well as columns for each room, which
tell you the current number of people in that room. The columns of data.csv are the following and can be
divided into two groups:
1. Columns that represent readings from sensors, as described in the previous section: reliable_sensor1,
reliable_sensor2, reliable_sensor3, reliable_sensor4, unreliable_sensor1, unreliable_sensor2, unreliable_sensor3,
unreliable_sensor4, robot1, robot2, door_sensor1, door_sensor2, door_sensor3,
door_sensor4, time, electricity_price.
2. Columns that are present only in the training data and provide the ground truth with the number
of people in each room, corridor, open area, and outside the building: r1, r2, r3, r4, r5, r6, r7, r8,
r9, r10, r11, r12, r13, r14, r15, r16, r17, r18, r19, r20, r21, r22, r23, r24, r25, r26, r27, r28, r29, r30,
r31, r32, r33, r34, r35, c1, c2, c3, c4, o1, outside. Added Oct 30th: [This ground truth data
provides the instantaneous count of people per room, i.e. every 15 seconds, a snapshot
of each room is magically taken at exactly the same time, and the number of people in
each room is counted. If someone passes through multiple rooms within 15 seconds, they
will not increment the count in multiple rooms, only in one room.]
Note that the first column of data.csv is the index, and has no name.
You should use this data to learn the parameters of your model. Also, you can save the parameters to csv
files that can be loaded during testing.
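A minimal sketch of reading the training data with the standard csv module is shown below. It assumes that the robot columns are stored as the text form of the tuple, e.g. "('r1', 0)", and that failed sensors appear as an empty cell or the text None; check both assumptions against the actual file. The function name load_training_rows is hypothetical, and the inline sample stands in for data.csv.

```python
import ast
import csv
import io

# Ground-truth columns listed in the specification above.
GROUND_TRUTH_COLUMNS = (
    [f'r{i}' for i in range(1, 36)]
    + [f'c{i}' for i in range(1, 5)]
    + ['o1', 'outside']
)


def load_training_rows(file_obj):
    """Read rows from an open CSV file, dropping the unnamed index column."""
    rows = []
    for row in csv.DictReader(file_obj):
        row.pop('', None)  # the first column is the index and has no name
        for col in GROUND_TRUTH_COLUMNS:
            if col in row:
                row[col] = int(float(row[col]))
        for col in ('robot1', 'robot2'):
            text = row.get(col)
            if text is not None:
                # Assumption: robot cells hold text such as "('r1', 0)".
                row[col] = None if text in ('', 'None') else ast.literal_eval(text)
        rows.append(row)
    return rows


# Tiny inline sample in place of data.csv (only a few columns shown).
sample = io.StringIO(
    ',reliable_sensor1,robot1,r1,outside\n'
    '0,motion,"(\'r1\', 0)",2,18\n'
    '1,no motion,None,0,20\n'
)
rows = load_training_rows(sample)
```

With the real file, the same function would be called as load_training_rows(open('data.csv', newline='')).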
Action data
get_action() must return a dictionary with the following format. Note that every room whose name starts
with ‘r’ (r1 to r35) has lights that you can turn on or off. All other areas (corridors and the open area)
have lights that are permanently on, which you have no control over, and which do not affect the cost.
actions_dict = {'lights1': 'off', 'lights2': 'off', 'lights3': 'off',
                'lights4': 'off', 'lights5': 'off', 'lights6': 'off', 'lights7': 'off',
                'lights8': 'off', 'lights9': 'off', 'lights10': 'off', 'lights11': 'off',
                'lights12': 'off', 'lights13': 'off', 'lights14': 'off', 'lights15': 'off',
                'lights16': 'off', 'lights17': 'off', 'lights18': 'off', 'lights19': 'off',
                'lights20': 'off', 'lights21': 'off', 'lights22': 'off', 'lights23': 'off',
                'lights24': 'off', 'lights25': 'off', 'lights26': 'off', 'lights27': 'off',
                'lights28': 'off', 'lights29': 'off', 'lights30': 'off', 'lights31': 'off',
                'lights32': 'off', 'lights33': 'off', 'lights34': 'off', 'lights35': 'off'}
The outcome space of all actions is (‘on’, ‘off’).
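Rather than typing out all 35 keys, the dictionary can be built programmatically. The helper below is a hypothetical sketch (make_actions is not part of the provided code) that switches on only the rooms passed to it.

```python
NUM_ROOMS = 35  # rooms r1 to r35 have controllable lights


def make_actions(rooms_on=()):
    """Return the actions dictionary with the given room numbers switched on."""
    on = set(rooms_on)
    return {f'lights{i}': ('on' if i in on else 'off')
            for i in range(1, NUM_ROOMS + 1)}
```

For example, make_actions([3, 16]) turns on lights3 and lights16 and leaves every other light off.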
The provided example_solution.py contains a code stub that shows one way to set up your code.
Figure 1 shows the floor plan specification.
Cost specification
If a light is on in a room for 15 seconds, it usually costs you about 1 cent. The exact price of electricity goes
up and down, but luckily, the electricity provider lists the current price online, and this price is included
in the sensor_data. If there are people in a room and there is no light on, it costs you 4 cents per person
every 15 seconds, because of lost productivity. Added Oct 30: [The cost can be calculated exactly
using the complete training data, so it is also based on an instantaneous count of the number
of people in each room.]
Your goal is to minimise the total cost of lighting plus lost productivity, added up over the whole day. You do
not need to calculate this cost yourself: the testing code will calculate it using the actions returned by your
function and the true locations of people (unavailable to you). The file example_test.py shows exactly how
the cost is calculated.
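To make the trade-off concrete, here is a sketch of the per-step cost as described in this section. It is not the code in example_test.py, which remains the authoritative definition. It assumes that electricity_price is the cost in cents of one light for one 15-second step and that lost productivity is exactly 4 cents per person per step; the names step_cost and should_turn_on are hypothetical.

```python
def step_cost(actions, people, electricity_price, productivity_cost=4.0):
    """Cost in cents of one 15-second step over rooms r1 to r35.

    actions: dict such as {'lights1': 'on', ...}
    people:  dict of true counts such as {'r1': 2, ...}
    """
    cost = 0.0
    for i in range(1, 36):
        if actions[f'lights{i}'] == 'on':
            cost += electricity_price
        else:
            cost += productivity_cost * people[f'r{i}']
    return cost


def should_turn_on(expected_people, electricity_price, productivity_cost=4.0):
    """Turn a light on when the expected productivity loss exceeds its price."""
    return productivity_cost * expected_people > electricity_price
```

Under these assumptions, leaving a light off costs 4 cents times the expected number of people in the room, so with a price of 0.81 a light is worth switching on once the expected count exceeds roughly 0.2 people.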
Testing specification
Your program must be submitted as a python file called solution.py. During testing, solution.py will be
placed in a folder with test.py. A simpler version of test.py has been provided (called example_test.py),
so you can confirm that testing will work. A more elaborate version of test.py will be used to grade your
solution.
Report
Your report should cover the following points:
• What algorithms you used, a brief description of how they work, and their time complexity.
• A short justification of the methods you used (if you tried different variations, describe them).
• Any assumptions you made when creating your model.
The report must be less than 2000 words (around 4 pages of text). The only accepted format is PDF.
Marking Criteria
This assignment will be marked according to the following criteria:
1. 50% of the mark will be determined by the cost incurred by your code after several days of simulated
data. The mapping from cost to marks will be determined after the assignment has been submitted.
2. 20% of the mark will be determined by the description of the algorithms used, and a short justification
of the methods used.
3. 10% of the mark will be determined by a description of the assumptions and/or simplifications you
made in your model, and whether those assumptions would be effective in the real world.
4. 20% of the mark will be determined by the quality, readability and efficiency of the code.
Items 2 and 3 will be assessed using the report. Items 1 and 4 will be assessed using the Python file.
Figure 1: Floor plan (note that dotted grey lines denote the boundaries of areas when the boundary is
unclear).
Bonus Marks
Bonus marks will be given to the top 10 performing programs (10 percentage points for 1st place, 1 percentage
point for 10th place).