Natural Language Engineering:
Assessed Coursework 2
Submission format: You should submit one file that should either be a Python notebook
or a zip file containing a Python notebook and any other files (e.g., images
or Python files) that you want to include in the notebook.
Due date: Your work should be submitted on the module’s Canvas site before 4pm
on Thursday 26th November. This is Thursday of week 9. The standard late
penalties apply.
Return date: Marks and feedback will be provided on Canvas on Thursday December
17th for all submissions that are submitted by the due date.
Weighting: This assessment contributes 20% of the mark for the module.
Overview
For this assignment you are asked to complete a Python notebook (‘NLEassignment2.ipynb’)
which is provided with these guidelines. It is based on activities that you have already
completed in labs during weeks 1-7 of the module. Any code you have developed
during the labs can be submitted as part of your answers to the questions in the assignment.
To score highly on this assignment you will need to demonstrate that you:
• understand the theory and your code;
• can write and document high quality python code;
• can develop code further to solve related problems;
• can carry out experiments and display results in a coherent way;
• can analyse and interpret results; and
• can draw conclusions and understand limitations of the technology.
For this report you should submit a single Python notebook containing all of your answers
to all of the questions in ‘NLEassignment2.ipynb’. You may import from standard
libraries and the ‘sussex_nltk’ resources which you have been provided with. If
you wish to import any other code, it must be included in a zip file with your notebook.
It must be possible for the assessors to run your Python notebook.
Marking Criteria and Requirements
Your submission will be marked out of 100. The assignment question is broken down
into 5 parts. All parts should be answered, and the breakdown of marks between parts
is specified in the notebook. General and part-specific criteria are given below. Please
read these guidelines carefully and ask if you have any questions.
General: 20 marks available
20 marks are available for the overall quality of your assignment. When awarding
these marks the following general guidelines will be considered.
• In order to avoid misconduct, you should not talk about these coursework
questions with your peers. If you are not sure what a question is asking
you to do or have any other questions, please ask me or one of the Teaching
Assistants.
• Your report should be no more than 2000 words in length excluding code
and the content of graphs, tables and any references.
• You should specify the length of your report. 2000 is a strict limit.
• You should use a formal writing style.
• All graphs should have a title and have each axis clearly labelled.
• In all parts, marks will be awarded for the quality of your written answers
as well as your code.
• Written / textual answers MUST be included in Markdown cells. Otherwise,
you may score 0 for these answers.
• Code on its own does not count as an explanation or a discussion. Nor do
code comments. Code should be commented but explanation and discussion
MUST be given as text in Markdown cells (see previous point!).
• Do not add external text (e.g. code, output) as images.
• Your code must be applied to and your explanations must refer to the unique
set of examples generated by entering your candidate number at the top of
the notebook. This must be your own candidate number. Otherwise you
may score 0.
• You should submit your notebook with the code having been run (i.e., with
the output displayed rather than cleared).
• It must be possible for the assessors to run your Python notebook.
Part 1: 10 marks available
Run generate_features(sentences[:5]). With reference to the code and
the specific examples, explain how the output was generated. [10 marks]
The following breakdown of marks will be applied
• Correct general explanation [2 marks]
• Correct explanation which refers to examples in the output [4 marks]
• Correct explanation which refers to steps in the code [4 marks]
Part 2: 10 marks available
Write code to find the 1000 most frequently occurring words that are
in your sample AND have at least one noun sense according to WordNet.
[10 marks]
The following breakdown of marks will be applied
• Clear and effective use of code to find most frequently occurring words in
sample [3 marks]
• Clear and effective use of code to identify words with at least one noun
sense in WordNet [3 marks]
• Clear and effective use of code to combine the conditions and display the
required words [4 marks]
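As an illustration of combining the two conditions in Part 2, the following is a minimal sketch on an invented three-sentence sample. The names `TOY_NOUN_LEXICON`, `has_noun_sense` and `most_frequent_nouns` are hypothetical and not taken from the notebook, and the toy lexicon is only a stand-in for a WordNet lookup. In a notebook the test would more likely be something like checking whether NLTK's `wordnet.synsets(word, pos=wordnet.NOUN)` returns a non-empty list.

```python
from collections import Counter

# Hypothetical stand-in for a WordNet lookup: words assumed to have at least
# one noun sense. A real notebook would query WordNet instead of this set.
TOY_NOUN_LEXICON = {"dog", "cat", "run", "park"}


def has_noun_sense(word, noun_lexicon=TOY_NOUN_LEXICON):
    """Return True if the word has at least one noun sense (toy lookup)."""
    return word in noun_lexicon


def most_frequent_nouns(sentences, n, noun_test=has_noun_sense):
    """Return the n most frequent lower-cased alphabetic tokens passing noun_test."""
    counts = Counter(
        token.lower()
        for sentence in sentences
        for token in sentence
        if token.isalpha()
    )
    # most_common() sorts by descending frequency; filter first, then truncate,
    # so that the result holds n words that satisfy BOTH conditions.
    ranked = [(word, count) for word, count in counts.most_common() if noun_test(word)]
    return ranked[:n]


# Invented sample, already tokenised.
sample = [
    ["The", "dog", "chased", "the", "cat", "."],
    ["The", "cat", "ran", "to", "the", "park", "."],
    ["A", "dog", "and", "a", "cat", "run", "quickly", "."],
]
top_nouns = most_frequent_nouns(sample, n=3)
print(top_nouns)
```

The order of operations is the design choice worth noting: taking the top n words first and filtering afterwards would generally return fewer than n words.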
Part 3: 20 marks available
Consider the code above which outputs the path similarity score, the
Resnik similarity score and the Lin similarity score for a pair of concepts
in WordNet. Answer the following questions [20 marks]
The following breakdown of marks will be applied
Part a: Clear explanation of each of the similarity scores and what the number calculated
means [6 marks]
Part b: Clear and effective use of code to find the semantic similarity of a pair of
words [2 marks]
Part b: Clear and effective use of code to find semantic similarity with a parameter
to specify the measure of semantic similarity [2 marks]
Part b: Explanation and justification of the strategy used for words which have
multiple senses [2 marks]
Part c: Clear and effective use of code to find semantic similarity of every pair of
words [4 marks]
Part c: Justification of choice of semantic similarity measure [1 mark]
Part d: Clear and effective use of code to identify the 10 most similar words to the
most frequent word in the corpus [3 marks]
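To make the three scores in Part 3 concrete, here is a minimal sketch over a hand-built toy taxonomy. Everything in it is an assumption made for illustration: the taxonomy, the counts and all function names are invented, the hierarchy is a simple tree (WordNet allows a concept to have several parents), and information content is estimated from the toy counts rather than from a corpus-derived file. The definitions follow the usual ones: path similarity is 1 / (1 + shortest path length), Resnik similarity is the information content of the lowest common subsumer, and Lin similarity normalises the Resnik score by the information content of the two concepts. The final function shows one possible strategy for words with several senses, namely taking the maximum over all sense pairs.

```python
import math

# Toy is-a taxonomy (child -> parent) and concept counts. Both are invented
# for illustration and are not taken from WordNet.
PARENT = {"animal": "entity", "plant": "entity",
          "dog": "animal", "cat": "animal", "tree": "plant"}
COUNTS = {"entity": 0, "animal": 2, "plant": 1, "dog": 10, "cat": 6, "tree": 5}


def ancestors(concept):
    """Return [concept, parent, ..., root]."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def lowest_common_subsumer(c1, c2):
    """Return the deepest concept that subsumes both c1 and c2."""
    seen = set(ancestors(c1))
    return next(a for a in ancestors(c2) if a in seen)


def path_similarity(c1, c2):
    """1 / (1 + number of edges on the shortest path between c1 and c2)."""
    lcs = lowest_common_subsumer(c1, c2)
    edges = ancestors(c1).index(lcs) + ancestors(c2).index(lcs)
    return 1 / (1 + edges)


def information_content(concept):
    """-log P(concept), where a concept's count includes all its descendants."""
    total = sum(COUNTS.values())
    subsumed = sum(n for c, n in COUNTS.items() if concept in ancestors(c))
    return -math.log(subsumed / total)


def resnik_similarity(c1, c2):
    """Information content of the lowest common subsumer."""
    return information_content(lowest_common_subsumer(c1, c2))


def lin_similarity(c1, c2):
    """Resnik score normalised by the concepts' own information content.

    Undefined (division by zero) if both concepts are the root.
    """
    denominator = information_content(c1) + information_content(c2)
    return 2 * resnik_similarity(c1, c2) / denominator


MEASURES = {"path": path_similarity, "res": resnik_similarity, "lin": lin_similarity}


def word_similarity(senses1, senses2, measure="path"):
    """Score two words as the maximum similarity over all pairs of their senses."""
    similarity = MEASURES[measure]
    return max(similarity(s1, s2) for s1 in senses1 for s2 in senses2)


print(path_similarity("dog", "cat"))
print(resnik_similarity("dog", "cat"))
print(lin_similarity("dog", "cat"))
```

Path similarity and Lin similarity are bounded by 1, whereas the Resnik score has no fixed upper bound, which matters when scores from different measures are compared. Taking the maximum over sense pairs assumes that the closest pair of senses is the intended one; averaging over sense pairs is an alternative with different behaviour for highly ambiguous words.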
Part 4: 15 marks available
The construction and use of distributional vector representations to
find similar words [15 marks]
The following breakdown of marks will be applied
Part a: Clear and effective use of code to construct distributional vector representations
of words in the corpus with a parameter to specify context size. [4 marks]
Part a: Clear and correct explanation of how you calculate the value of association
between each word and each context feature [4 marks]
Part b: Correct use of code to construct representations of the 1000 words identified
in Part 2 with a window size of 1 [3 marks]
Part c: Clear and correct use of code and representations to find the 10 words which
are distributionally most similar to the most frequent word in the corpus. [4 marks]
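The steps in Part 4 can be sketched as follows, on an invented four-sentence corpus. This is only one possible design: it assumes that context features are the words within a symmetric window, that association is measured with positive pointwise mutual information (PPMI), and that vectors are compared with cosine similarity. None of these choices is prescribed by the brief, and the function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=1):
    """Count, for each word, the words seen within `window` tokens on either side."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        for i, word in enumerate(sentence):
            lo, hi = max(0, i - window), min(len(sentence), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][sentence[j]] += 1
    return vectors


def ppmi_vectors(vectors):
    """Re-weight raw co-occurrence counts as positive pointwise mutual information."""
    total = sum(sum(feats.values()) for feats in vectors.values())
    word_totals = {word: sum(feats.values()) for word, feats in vectors.items()}
    feat_totals = Counter()
    for feats in vectors.values():
        feat_totals.update(feats)
    weighted = {}
    for word, feats in vectors.items():
        weighted[word] = {}
        for feat, count in feats.items():
            # PMI = log2( P(word, feat) / (P(word) * P(feat)) )
            pmi = math.log2(count * total / (word_totals[word] * feat_totals[feat]))
            if pmi > 0:  # PPMI keeps only positive associations
                weighted[word][feat] = pmi
    return weighted


def cosine(v1, v2):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v2.get(feat, 0.0) for feat, value in v1.items())
    norm1 = math.sqrt(sum(value ** 2 for value in v1.values()))
    norm2 = math.sqrt(sum(value ** 2 for value in v2.values()))
    return dot / (norm1 * norm2) if norm1 and norm2 else 0.0


def most_similar(target, vectors, k=10):
    """Return the k words whose vectors have the highest cosine with the target's."""
    scores = [(w, cosine(vectors[target], v)) for w, v in vectors.items() if w != target]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:k]


# Invented corpus, already tokenised and lower-cased.
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "meows"],
    ["the", "dog", "sleeps"],
    ["the", "cat", "sleeps"],
]
counts = cooccurrence_vectors(corpus, window=1)
weighted = ppmi_vectors(counts)
print(most_similar("dog", weighted, k=3))
```

In this toy corpus "dog" and "cat" never co-occur, yet they come out as most similar because they share the contexts "the" and "sleeps". That is the distinction between co-occurrence and distributional similarity that a written explanation would need to draw out.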
Part 5: 25 marks available
Plan and carry out an investigation into the correlation between semantic
similarity according to WordNet and distributional similarity
with different context window sizes. You should make sure that you
include a graph of how correlation varies with context window size
and that you discuss your results. [25 marks]
The following breakdown of marks will be applied
• Description of plan of how to carry out the investigation [5 marks]
• Clear and effective use of code to carry out the investigation [3 marks]
• Correct calculation of correlation between WordNet similarity and distributional
similarity for at least one context window size [4 marks]
• Correct calculation of correlation between WordNet similarity and distributional
similarity for different window sizes [3 marks]
• Presentation of results [5 marks]
• Discussion of results / conclusions [5 marks]
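The correlation calculation in Part 5 could be sketched as below. The brief does not say which correlation coefficient to use, so Spearman's rank correlation is assumed here on the grounds that WordNet scores and distributional scores lie on different scales. All scores and window sizes are invented purely to exercise the code, and the function names are hypothetical.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            result[order[k]] = average_rank
        start = end + 1
    return result


def pearson(xs, ys):
    """Pearson correlation; assumes neither sequence is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank sequences."""
    return pearson(ranks(xs), ranks(ys))


# Invented scores for four word pairs: one WordNet-style similarity per pair,
# and one distributional similarity per pair for each hypothetical window size.
wordnet_scores = [0.90, 0.50, 0.30, 0.10]
distributional_scores = {
    1: [0.80, 0.60, 0.20, 0.10],
    3: [0.70, 0.20, 0.60, 0.10],
    5: [0.10, 0.30, 0.50, 0.90],
}
correlation_by_window = {
    window: spearman(wordnet_scores, scores)
    for window, scores in distributional_scores.items()
}
for window, rho in correlation_by_window.items():
    print(f"window={window}: rho={rho:.2f}")
```

The resulting mapping from window size to correlation is the data a line plot would show, with window size on the x-axis, correlation on the y-axis, and a title and axis labels as the general guidelines require.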