CIS – General Lecture Project
Objective:
In this project, you will employ Python programming to conduct an analysis of a text-based dataset using Natural Language Processing (NLP) techniques. You are required to prepare a report comprising 300-500 words alongside a Python script containing a minimum of 100 lines of code.
Assignment Description:
Dataset Selection:
Students are encouraged to select a text dataset that captures their interest. Below are some suggested datasets:
Movie Reviews: A collection of movie reviews (e.g., IMDB, Rotten Tomatoes).
Tweets: A set of tweets regarding a specific topic (e.g., political sentiment, public opinion on social issues).
Product Reviews: Customer feedback data from e-commerce platforms like Amazon or eBay.
News Articles: A collection of news articles on a particular subject (e.g., sports, politics, technology).
Books or Articles: A dataset of books or articles suitable for topic modeling or keyword extraction.
Python Code Requirements:
The analysis should consist of at least 100 lines of Python code.
The code must encompass:
o Text Preprocessing: Tokenization, removal of stopwords, stemming/lemmatization, and vectorization techniques such as TF-IDF or word embeddings.
o NLP and/or ML Techniques: Implement NLP and/or ML algorithms.
o Data Visualization: Illustrate the distribution of key terms or phrases.
o Basic Evaluation: Assess model performance utilizing accuracy, precision, recall, or other pertinent metrics.
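As an illustration of the preprocessing and vectorization requirement, here is a minimal sketch that uses only the Python standard library. The stopword list, the `crude_stem` helper, and the sample sentences are assumptions made up for this example; a real submission would more likely rely on NLTK or spaCy for stopwords and stemming, and on scikit-learn for TF-IDF, which uses a smoothed variant of the weighting shown here.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real libraries ship fuller lists).
STOPWORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "was", "this", "in"}


def crude_stem(token):
    """Strip a few common English suffixes; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Lowercase, tokenize on letters, drop stopwords, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    tokenized = [preprocess(doc) for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # guard against empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Made-up sample documents, for illustration only.
docs = [
    "The acting was amazing and the plot was gripping",
    "The plot was boring and the acting was terrible",
    "Amazing soundtrack, gripping plot",
]
vectors = tfidf(docs)
print(preprocess(docs[0]))
print(vectors[1])
```

With this unsmoothed weighting, a term that occurs in every document (here "plot") receives a weight of zero, which is one reason library implementations usually add smoothing.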
Written Report:
Word Count: 300-500 words
The report should include the following sections:
1. Introduction:
Briefly introduce the selected dataset and explain why it interests you.
Outline the problem being tackled using NLP, such as sentiment analysis or topic modeling.
2. Literature Review:
Present a concise overview of existing research or methodologies associated with your analysis or general NLP tasks, exploring areas like:
o The application of NLP in analyzing sentiment on social media.
o Surveys of sentiment analysis methodologies employing machine learning.
o Previous studies utilizing NLP in evaluating product reviews, movie critiques, or customer feedback.
o Common techniques or algorithms (e.g., Random Forest, Naive Bayes, SVM, or advanced deep learning models like BERT) employed in your analysis.
3. Methodology and Analysis:
Detail the steps undertaken to analyze the dataset:
o Text Preprocessing: What text-cleaning measures were implemented?
o Modeling: Which model(s) did you select for data analysis? Offer insights into the construction of your machine learning model (e.g., feature engineering, algorithmic choice).
Discuss the outcomes of your analysis:
o What trends were observed in the data?
o What predictions did your model yield (e.g., sentiment classification)?
o Did you uncover any noteworthy patterns or correlations in the textual data?
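To make the modeling and prediction questions concrete, the following is a minimal sketch of one possible approach: a multinomial Naive Bayes sentiment classifier with add-one smoothing, followed by accuracy, precision, and recall. The `NaiveBayes` class, the `evaluate` helper, and the toy token lists are all assumptions written for this example; the brief does not prescribe a model, and scikit-learn's `MultinomialNB` would be a more typical choice in practice.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, token_lists, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.term_counts = defaultdict(Counter)   # term counts per class
        for tokens, label in zip(token_lists, labels):
            self.term_counts[label].update(tokens)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict_one(self, tokens):
        n_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(doc_count / n_docs)  # log prior
            for token in tokens:
                if token in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, token_lists):
        return [self.predict_one(tokens) for tokens in token_lists]


def evaluate(y_true, y_pred, positive="pos"):
    """Accuracy, plus precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Toy, already-tokenized reviews (made up for illustration).
train_docs = [
    (["great", "acting", "loved", "plot"], "pos"),
    (["wonderful", "film", "great", "cast"], "pos"),
    (["boring", "plot", "terrible", "acting"], "neg"),
    (["awful", "film", "boring", "script"], "neg"),
]
test_docs = [
    (["great", "cast", "loved", "film"], "pos"),
    (["terrible", "boring", "script"], "neg"),
    (["wonderful", "plot"], "pos"),
]

model = NaiveBayes().fit([d for d, _ in train_docs], [l for _, l in train_docs])
predictions = model.predict([d for d, _ in test_docs])
metrics = evaluate([l for _, l in test_docs], predictions)
print(predictions, metrics)
```

A toy set this small classifies perfectly, so the scores say nothing about real performance; on a dataset of a few thousand items, a held-out test split would give a more honest estimate.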
4. Research Proposal:
Suggest potential avenues for future research or enhancements to your analysis:
Could the model be refined by integrating more sophisticated NLP techniques (e.g., transformer models like BERT)?
Could additional features (e.g., user metadata) be incorporated to boost prediction accuracy?
Evaluation Criteria:
(10%) Code Quality and Efficiency: The Python code should be well-written, clean, and efficient, with clear comments to explain each section.
(10%) Text Preprocessing: The quality and thoroughness of your text preprocessing steps will be evaluated.
(20%) Modeling and Analysis: How well you perform the analysis or other NLP tasks will be a key evaluation factor. This includes the appropriate choice of algorithm and the explanation of its use.
(10%) Visualization: The visualizations should clearly convey the results of your analysis, e.g. sentiment distributions, word clouds, or feature importance.
(50%) Written Report: The report should be well-structured, clear, and demonstrate a strong understanding of the task, literature, and findings.
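For the visualization criterion, one simple starting point is a frequency distribution of key terms. The sketch below counts tokens and renders a text-only bar chart so that it runs without extra packages; the `top_terms` and `text_bar_chart` helpers and the sample token lists are assumptions for illustration. The same (term, count) pairs could instead be passed to a plotting library such as matplotlib to produce a graphical chart for the report.

```python
from collections import Counter


def top_terms(token_lists, n=5):
    """Count tokens across all documents and return the n most frequent."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    return counts.most_common(n)


def text_bar_chart(term_counts, width=30):
    """Render (term, count) pairs as horizontal bars scaled to `width`."""
    if not term_counts:
        return ""
    longest = max(len(term) for term, _ in term_counts)
    peak = max(count for _, count in term_counts)
    lines = []
    for term, count in term_counts:
        bar = "#" * max(1, round(width * count / peak))
        lines.append(f"{term.ljust(longest)} | {bar} {count}")
    return "\n".join(lines)


# Made-up, already-preprocessed reviews.
reviews = [
    ["great", "plot", "great", "cast"],
    ["boring", "plot", "weak", "cast"],
    ["great", "soundtrack", "plot"],
]
chart = text_bar_chart(top_terms(reviews, n=3))
print(chart)
```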
Additional Guidelines:
Dataset Size: The dataset should have at least a few thousand data points (e.g., tweets, reviews, articles) for meaningful analysis.
Data Sources: Ensure the dataset is publicly available and that you clearly cite the source in the report.
Ethical Considerations: When using social media or other publicly available datasets, be sure to respect user privacy and follow ethical guidelines.
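One practical way to act on the privacy guideline, when working with tweets or similar posts, is to mask user handles and links before analysis or before quoting text in the report. The sketch below is only one possible approach; the `anonymize` function, the placeholder strings, and the sample tweet are assumptions for illustration, and masking handles alone does not guarantee anonymity.

```python
import re

# Patterns for user mentions and links; deliberately simple.
MENTION = re.compile(r"@\w+")
URL = re.compile(r"https?://\S+")


def anonymize(text):
    """Replace links with <URL> and user mentions with a generic @user."""
    text = URL.sub("<URL>", text)
    return MENTION.sub("@user", text)


sample = "@jane_doe loved it! see https://example.com/r/1 cc @bob"
print(anonymize(sample))
```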
Example Topics:
1. Sentiment Analysis of Movie Reviews:
Dataset: IMDB reviews or Rotten Tomatoes movie reviews.
Topic: Perform sentiment analysis on movie reviews to classify them as positive or negative and explore how sentiment correlates with movie success (box office performance).
2. Analysis of Tweets on a Political Topic:
Dataset: A collection of tweets from Twitter about a political issue or candidate.
Topic: Analyze the topics of tweets and identify public opinion trends regarding a political event or figure.
3. Product Review Sentiment Analysis:
Dataset: Product reviews from Amazon, eBay, or another e-commerce platform.
Topic: Analyze the sentiment of customer reviews to predict product success or identify areas for improvement.
4. Topic Modeling in News Articles:
Dataset: A collection of news articles on various topics.
Topic: Perform topic modeling to identify key themes in news coverage or detect emerging trends in current events.
By the end of this project, students will have hands-on experience with NLP techniques and machine learning models and will gain insights into how these methods can be applied to real-world data. The project will also help students improve their ability to interpret and communicate their findings effectively through code and a written report.
Please submit the following:
A 300-500 word report in PDF format
At least 100 lines of Python code in .py format (.ipynb is also acceptable)