UNSW Business School
COMM3501 Quantitative Business Analytics
A4 Individual Assignment (40%)
Due date: Monday 5th August 2024, 12:00 PM (noon), Week 11
1. Assignment overview
In this assessment, you will analyse a dataset with an emphasis on practical business analytics and
develop authentic outputs. The task aims to enhance your problem-solving skills in real-world
scenarios. It is also intended to develop your skills in research, critical thinking and problem
solving, your data analysis and programming skills, and your ability to communicate your ideas and
solutions concisely and coherently.
2. Assignment scenario
You are an analyst at a data analytics consulting firm. Your manager has tasked you with providing
a report to an American client. The client is a major U.S. wireless telecommunications company
which provides cellular telephone service. They require assistance in developing a statistical model
to predict customer churn, establishing a target customer profile for implementing a proactive
churn-management program, and rolling the solution out to their customer-facing call centres.
These days, the telecommunications industry faces fierce competition in satisfying its customers.
Churn is a marketing term referring to a current customer deciding to take their business
elsewhere; in the current context, switching from one mobile service provider to another. As with
many other sectors, churn is an important issue for the wireless telecommunications industry. For
this client, the role of the desired churn model is not only to accurately predict customer churn,
but also to understand customer behaviours.
3. Assignment details
3.1. Task details
Your main tasks will involve: data manipulation and cleaning; statistical modelling; writing a
technical report. Your client also wants a non-technical description of the characteristics of
customers that churned, to assist in the development of a risk-management strategy, i.e., a
proactive churn-management program.
In your report, your manager wants you to include: some details on your data manipulation,
cleaning, and descriptive analysis; a brief summary and comparison of the models you fitted; a
detailed description of your selected model/s and interpretation of the results; your main findings,
recommendations and conclusions.
The client is familiar with machine learning. All your modelling results should be included, mostly
in an appendix to the report.
In addition, among the 10,000 customers in the eval_data.csv evaluation dataset, you must
identify 3000 customers which you believe are most likely to churn.
See the submission details section and marking criteria section for more information.
3.2. Data Description
The data provides details of 30,000 customers in the training dataset, and 10,000 customers in the
evaluation dataset:
1. training_data.csv
2. eval_data.csv
The datasets can be downloaded from the Moodle website in the "A4 Individual Project – A4
Datasets" section.
For each of the observations in the training dataset, there is information on 44 attributes
describing the customer care service details, customer demography and personal details, etc.
These are described below.
Similar, but not identical, datasets are provided here. You may also wish to have a look at the
following analysis based on the Kaggle datasets to give you an idea: Churn Prediction (weblink).
This analysis is just a brief example and is not based on your datasets. Different and more variables
may be of interest for your analysis. Extra readings are given in the Resources section.
3.2.1. training_data.csv (Training dataset)
This dataset provides insights about the customers and whether they are churned customers.
Variable Name              Description
CustomerID                 A unique ID assigned to each customer/subscriber
Churn                      Is churned? (categorical: "no", "yes")
MonthlyRevenue             Mean monthly revenue for the company
MonthlyMinutes             Mean monthly minutes of use
TotalRecurringCharge       Mean total recurring charges (recurring billing)
OverageMinutes             Mean overage minutes of use
RoamingCalls               Mean number of roaming calls
DroppedCalls               Mean number of dropped voice calls
BlockedCalls               Mean number of blocked voice calls
UnansweredCalls            Mean number of unanswered voice calls
CustomerCareCalls          Mean number of customer care calls
ThreewayCalls              Mean number of three-way calls
OutboundCalls              Mean number of outbound voice calls
InboundCalls               Mean number of inbound voice calls
DroppedBlockedCalls        Mean number of dropped or blocked calls
CallForwardingCalls        Mean number of call forwarding calls
CallWaitingCalls           Mean number of call waiting calls
MonthsInService            Months in Service
ActiveSubs                 Number of Active Subscriptions
ServiceArea                Communications Service Area
Handsets                   Number of Handsets Issued
CurrentEquipmentDays       Number of days of the current equipment
AgeHH1                     Age of first Household member
AgeHH2                     Age of second Household member
ChildrenInHH               Presence of children in Household (yes or no)
HandsetRefurbished         Handset is refurbished (yes or no)
HandsetWebCapable          Handset is web capable (yes or no)
TruckOwner                 Subscriber owns a truck (yes or no)
RVOwner                    Subscriber owns a recreational vehicle (yes or no)
BuysViaMailOrder           Subscriber Buys via mail order (yes or no)
RespondsToMailOffers       Subscriber responds to mail offers (yes or no)
OptOutMailings             Subscriber opted out mailings option (yes or no)
OwnsComputer               Subscriber owns a computer (yes or no)
HasCreditCard              Subscriber has a credit card (yes or no)
RetentionCalls             Number of calls previously made to retention team
RetentionOffersAccepted    Number of previous retention offers accepted
ReferralsMadeBySubscriber  Number of referrals made by subscriber
IncomeGroup                Income group
OwnsMotorcycle             Subscriber owns a motorcycle (yes or no)
MadeCallToRetentionTeam    Customer has made call to retention team (yes or no)
CreditRating               Credit rating category
PrizmCode                  Living area
Occupation                 Occupation category
MaritalStatus              Married (yes or no or unknown)
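As a rough illustration of the data manipulation and cleaning step, the sketch below reads a few of the listed columns, converts the yes/no fields to booleans, and counts missing values per column. It uses only the Python standard library. The three sample rows, the lower-case value spellings, and the helper names (`parse_flag`, `parse_number`, `load_rows`, `missing_counts`) are assumptions made for illustration; the real files may encode values differently.

```python
import csv
import io

# Hypothetical three-row extract; the real values and their casing may differ.
SAMPLE = """CustomerID,Churn,MonthlyRevenue,ChildrenInHH,MaritalStatus
1,yes,24.50,no,yes
2,no,,yes,unknown
3,no,57.25,no,no
"""


def parse_flag(value):
    """Map yes/no text to True/False; anything else (blank, 'unknown') becomes None."""
    text = value.strip().lower()
    if text == "yes":
        return True
    if text == "no":
        return False
    return None


def parse_number(value):
    """Convert a numeric field to float, treating a blank field as missing (None)."""
    text = value.strip()
    return float(text) if text else None


def load_rows(handle):
    """Read the CSV and return a list of cleaned row dictionaries."""
    rows = []
    for raw in csv.DictReader(handle):
        rows.append({
            "CustomerID": raw["CustomerID"],
            "Churn": parse_flag(raw["Churn"]),
            "MonthlyRevenue": parse_number(raw["MonthlyRevenue"]),
            "ChildrenInHH": parse_flag(raw["ChildrenInHH"]),
            "MaritalStatus": parse_flag(raw["MaritalStatus"]),
        })
    return rows


def missing_counts(rows):
    """Count the missing (None) values in each column."""
    counts = {}
    for row in rows:
        for name, value in row.items():
            counts[name] = counts.get(name, 0) + (value is None)
    return counts


rows = load_rows(io.StringIO(SAMPLE))
print(missing_counts(rows))
```

For the actual dataset, the in-memory sample would be replaced by an opened training_data.csv, and how to treat the missing values (drop, impute, or keep as a separate category) remains a modelling decision to justify in the report.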
3.2.2. eval_data.csv (Evaluation dataset)
The evaluation dataset comprises 10,000 current customers. From these 10,000 customers, select
3000 which you believe are most likely to churn. This evaluation dataset has the same format as
the training dataset but doesn't include the column Churn. The true values for the column Churn
will be released after the due date of the assignment.
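One possible way to make this selection, assuming a fitted model has already produced a churn score for every customer in the evaluation dataset, is to rank customers by score and keep the top 3000. The sketch below shows only the ranking step; the function name `select_most_likely`, the example IDs, and the example scores are hypothetical.

```python
def select_most_likely(churn_scores, k=3000):
    """Return the k customer IDs with the highest predicted churn score.

    churn_scores maps CustomerID -> score (higher means more likely to churn).
    """
    if k > len(churn_scores):
        raise ValueError("k exceeds the number of scored customers")
    # Sort by score, highest first; ties are broken by CustomerID so that
    # the selection is reproducible from run to run.
    ranked = sorted(churn_scores.items(), key=lambda item: (-item[1], item[0]))
    return [customer_id for customer_id, _ in ranked[:k]]


# Hypothetical scores for five customers, selecting the top three.
scores = {101: 0.82, 102: 0.10, 103: 0.55, 104: 0.55, 105: 0.91}
print(select_most_likely(scores, k=3))
```

Because only the ranking matters here, the scores do not need to be calibrated probabilities; any model output that orders customers by churn risk can be used the same way.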
3.3. Software
You may choose which software package or program to use, e.g., R or Python. The code enabling
you to perform most of the computing can be found in the course learning activities.
3.4. Resources
- Extra information on the original dataset and on the context can be found here: link 1 and
link 2
- Data manipulation with R with the dplyr package (weblink)
- Tidy data in R (weblink)
- Exploratory Data Analysis with R (weblink)
- Data visualisation in R with ggplot2 for fancy plots (weblink)
- He and Garcia (2009), for strategies for dealing with imbalanced data in classification
problems
- Yadav and Roychoudhury (2018), for some strategies to deal with missing attribute values in
R (available on Moodle)
- If you are interested in using R Markdown, here is a guide for creating PDF documents
(weblink)
- For any code-related questions, google.com or stackoverflow.com are pretty helpful!
3.5. Marking criteria
You will be assessed against the following criteria:
1. Data manipulation, cleaning, and descriptive analysis
2. Modelling
3. Recommendations and discussion
4. Report writing
5. Predictive accuracy
The mark allocation and details for each marking criterion are given below and in the rubric. The
materials you submit should be your own. Familiarise yourself with the UNSW policies for
plagiarism before submitting.
3.5.1. Criteria 1-3
There are potentially multiple valid approaches to this task, so you must choose an approach that
is both justifiable and justified.
You may also wish to engage in extra research beyond the course content. Please feel free to do
so. Although the marks for each component of the assignment are capped, innovations are
encouraged.
Any assumptions must be clearly identified and justified, if used. Sufficient details, e.g.,
calculations and results, must be provided. Include an appendix to the report for non-essential but
useful results; however, the appendix will not be directly assessed. Ensure that the body of your
report is self-contained and addresses all marking criteria.
3.5.2. Criterion 4
Communication of quantitative results in a concise and easy-to-understand manner is a skill that is
vital in practice. As such, marks will be given for report writing. To maximize your marks for this
component, you may wish to consider issues such as: table size/readability, figure
axes/formatting, text readability, grammar/spelling, page layout, and referencing of external
sources.
Include a brief introduction section in your report.
A maximum page limit of 8 pages is applicable to the main body of the report. This limit includes
tables and graphs, but excludes the cover page, table of contents, references, and any appendices.
There is no limit to the length of the appendix. Exceeding the page limit will attract a proportional
penalty to the overall assignment mark. Your report must be a self-contained document (i.e., not
multiple files), with all pages in portrait format.
Consider how the overall look, feel and readability of your document is affected by choices like
margin size, line and paragraph spacing, typeface/font, and text size. If in doubt, don't stray too far
from the defaults in your word processor / typesetting program, or use something like the
following settings: margins of 2.54cm for each edge, 1.15 line spacing, Calibri size 11 text.
3.5.3. Criterion 5
Provide a comma-separated values (CSV) file following the format in the sample file provided on
Moodle (selected_customers_example_for_submission.csv), predicting the 3000
(out of 10,000) customers in the evaluation dataset which you believe are the most likely to churn.
See the submission section for details.
The accuracy of your predictions on the evaluation data will have a (minor) impact on your mark.
The marks you get for the accuracy criterion will be given by the following formula.
$$
\text{Marks} =
\begin{cases}
\dfrac{5}{R} \times \text{No. churned customers identified}, & \text{if No. churned customers identified} < R, \\[2ex]
5 + \dfrac{5}{M - R} \times (\text{No. churned customers identified} - R), & \text{if No. churned customers identified} \ge R,
\end{cases}
$$
where we will take $M$ as the maximum number of churned customers correctly identified by a
student in the class, and $R$ as the number of churned customers you would correctly identify on
average if your prediction algorithm were to just return a pure random sample of the 10,000
customers in the evaluation dataset. Therefore, if your prediction accuracy is below that expected
by random sampling, your mark for this component will scale from 0 to 5 based on how many
predictions were correct. If your prediction accuracy is above that expected by random sampling,
then your mark is scaled from 5 to 10 based on the accuracy.
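The scaling described above can be sketched as a small function. This is an illustrative reading that assumes the scaling is linear within each of the two ranges; it is not the official marking code. The function and parameter names are hypothetical, and the example figures (2,900 churners in the evaluation data, a class best of 1,500) are invented purely to show the arithmetic.

```python
def expected_random_hits(n_selected, n_churners, n_customers):
    """Churners found on average when n_selected customers are drawn at random."""
    return n_selected * n_churners / n_customers


def accuracy_marks(n_identified, random_baseline, class_best):
    """Scale the accuracy mark: 0 to 5 below the random baseline, 5 to 10 above it.

    Assumes linear scaling within each range.
    """
    if class_best <= random_baseline:
        # The brief does not say what happens in this case, so refuse to guess.
        raise ValueError("class_best must exceed random_baseline")
    if n_identified < random_baseline:
        return 5 * n_identified / random_baseline
    return 5 + 5 * (n_identified - random_baseline) / (class_best - random_baseline)


# Hypothetical figures: 2,900 churners among 10,000 customers, class best of 1,500.
baseline = expected_random_hits(3000, 2900, 10000)
for hits in (435, 870, 1185, 1500):
    print(hits, accuracy_marks(hits, baseline, 1500))
```

With these invented figures the random baseline is 870 correct identifications, so a selection that finds 870 churners sits exactly at the boundary between the two ranges.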
4. Assignment submissions
Your final submission should include:
1) A technical report in .docx or .pdf format
2) Your sample of predicted churn customers in a CSV file named
selected_customers_yourStudentzID.csv *
3) Reproducible code with brief instructions on how to use it, e.g., R script/s with
comments (this item will not be assessed).
Upload your final submission using the submission links on Moodle. Check your report displays
properly on-screen once it is submitted.
* If your zID were z1234567, you would call the file selected_customers_z1234567.csv
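A minimal sketch of producing the file name and writing the selection is shown below. The layout of the CSV (a single `CustomerID` column with a header row) is an assumption made for illustration; the sample file on Moodle defines the actual format and should be followed instead. The helper names and the check that a zID is "z" followed by seven digits are likewise assumptions, based only on the z1234567 example.

```python
import csv
import io
import re


def submission_filename(zid):
    """Build the file name from a zID such as 'z1234567'."""
    # Assumed pattern: the letter z followed by seven digits.
    if not re.fullmatch(r"z\d{7}", zid):
        raise ValueError(f"unexpected zID format: {zid!r}")
    return f"selected_customers_{zid}.csv"


def write_selection(customer_ids, out_file):
    """Write the selected customer IDs, one per row, under an assumed header."""
    writer = csv.writer(out_file)
    writer.writerow(["CustomerID"])  # assumed column name; check the sample file
    for customer_id in customer_ids:
        writer.writerow([customer_id])


# Demonstration with an in-memory buffer and two hypothetical IDs.
buffer = io.StringIO()
write_selection([101, 102], buffer)
print(submission_filename("z1234567"))
print(buffer.getvalue())
```

To write a real file, the buffer would be replaced by `open(submission_filename(zid), "w", newline="")`, which is the form the `csv` module documentation recommends for output files.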
5. References
He, Haibo, and Edwardo A. Garcia. 2009. Learning from imbalanced data. IEEE Transactions on
Knowledge and Data Engineering 21 (9): 1263–84. https://doi.org/10.1109/TKDE.2008.239.
Yadav, Madan Lal, and Basav Roychoudhury. 2018. Handling missing values: A study of popular
imputation packages in R. Knowledge-Based Systems 160 (April): 104–18.
https://doi.org/10.1016/j.knosys.2018.06.012.