CSE 231 Spring 2021
Computer Project #07
Assignment Overview
This assignment focuses on the implementation of Python programs to read files and process data by
using lists and functions.
It is worth 55 points (5.5% of course grade) and must be completed no later than 11:59 PM on
Monday, March 15.
Assignment Deliverable
The deliverable for this assignment is the following file:
proj07.py – the source code for your Python program
Be sure to use the specified file name and to submit it for grading via Mimir before the project
deadline.
Assignment Background
One commonly hears references to “the one percent,” meaning the people whose income is in the
top 1% of incomes. What is the data behind that number, and where do others fall? Using the
National Average Wage Index (AWI), an index used by the Social Security Administration to gauge
an individual's earnings for the purpose of calculating their retirement benefit, we can answer such
questions.
In this project, you will process AWI data. Example data for 2019 is provided in the file
year2019.txt (2019 is the most recent year of complete data). The data is a table with the first
row as the title and the second row defining the data fields; remaining rows are data. The URL for
the data is: https://www.ssa.gov/cgi-bin/netcomp.cgi?year=2019
Here is the second line of data from the file followed by descriptions of the data. Notice that some
data are ints and some are floats:
5,000.00 — 9,999.99 12,620,757 32,801,513 19.37150 93,403,927,820.81 7,400.82
Column 0 is the bottom of this income range.
Column 1 is the dash separating the bottom of the range from the top (see note below).
Column 2 is the top of this income range (see note below).
Column 3 is the number of individuals in the income range.
Column 4 is the cumulative number of individuals in this income range and all lower ranges.
Column 5 is the Column 4 value represented as a cumulative percentage of all individuals.
Column 6 is the combined income of all the individuals in this range of income.
Column 7 is the average income of individuals in this range of income.
Note: The final row of the file is different than all the others. You must account for that.
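As a rough illustration of the column numbering, the sample line above can be split on whitespace. This is only a sketch: it assumes the fields in the data file are separated by whitespace, which the handout does not state, and it writes the dash as an escape sequence purely to keep the literal unambiguous.

```python
# Sample data line from the handout; \u2014 is the dash that forms Column 1.
line = ("5,000.00 \u2014 9,999.99 12,620,757 32,801,513 "
        "19.37150 93,403,927,820.81 7,400.82")

# Assumption: columns are separated by whitespace.
columns = line.split()

# Show each column next to its index so it can be matched to the list above.
for index, value in enumerate(columns):
    print("Column {}: {}".format(index, value))
```

Every value is still a string with commas at this point, so it has to be converted before any arithmetic is done.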
Assignment Specifications
The program must provide the following functions to extract some statistics.
a) def open_file():
Prompts the user to enter a year for the data file and checks that the year is a number
between 1990 and 2019 (both inclusive). If the year is valid, the program tries to open
the data file named ‘yearXXXX.txt’, where XXXX is the year. An appropriate error message
is shown in each failure case: the invalid-year error if the input is not a number between
1990 and 2019 inclusive, and the file error if the year is valid but the file does not
exist. This function loops until it receives proper input and successfully opens the file.
It returns a file pointer and the year. Hint: use string concatenation to construct the
file name.
i. Parameters: None
ii. Display: prompt and error message
iii. Return: file pointer and int
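One possible shape for this loop is sketched below. It is not the reference solution: the prompt and error strings are copied from the sample output in this handout, while the use of `while True` with `continue` and the choice of `FileNotFoundError` are assumptions about style and about how a missing file shows up.

```python
def open_file():
    """Prompt for a year until it is valid and its data file opens.

    Returns the open file pointer and the year as an int.
    """
    prompt = "Enter a year where 1990 <= year <= 2019: "
    while True:
        year_str = input(prompt)
        try:
            year = int(year_str)
        except ValueError:
            # Input such as "xxx" is not a number at all.
            print("Error in year. Please try again.")
            continue
        if not 1990 <= year <= 2019:
            print("Error in year. Please try again.")
            continue
        # String concatenation builds the name, e.g. "year2019.txt".
        file_name = "year" + str(year) + ".txt"
        try:
            fp = open(file_name)
        except FileNotFoundError:
            print("Error in file name:", file_name, "Please try again.")
            continue
        return fp, year
```

Checking the year before touching the file keeps the two error messages separate, which is what Test 3 in the sample output exercises.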
b) def handle_commas(s, T):
The parameters are s, a string, and T, a string. The expected values of T are 'int' and
'float'; any other value returns None. If the value of T is 'int', the string s (with its
commas removed) is converted to an int and that int value is returned; similarly for
'float'. If s cannot be converted to an int or float, None is returned (hint: use
try-except).
Note: this is the same function we had in Project 5.
i. Parameters: str, str
ii. Display: nothing
iii. Returns: int or float or None
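A minimal sketch of this conversion follows. It assumes that removing every comma with `str.replace` before converting is acceptable; the handout does not say how the Project 5 version was written.

```python
def handle_commas(s, T):
    """Convert s to an int or a float after removing commas.

    Returns None if T is not 'int' or 'float', or if the conversion fails.
    """
    s = s.replace(",", "")
    try:
        if T == "int":
            return int(s)
        if T == "float":
            return float(s)
    except ValueError:
        # For example 'aaa', or '1234.56' when an int was requested.
        return None
    return None  # T was neither 'int' nor 'float'


print(handle_commas("1,234", "int"))
print(handle_commas("1,234.56", "float"))
print(handle_commas("1,234.56", "int"))
```

The calls mirror the handle_commas function test cases listed later in this handout.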
c) def read_file(fp):
The function uses the file pointer parameter to read the data file. This function returns a list
of tuples where each tuple is the data on one line of the file, and is a mix of ints and floats as
follows:
tup = ((float, float), int, int, float, float, float)
the tuple is filled with the following data:
( (column 0, column 2), column 3, column 4, column 5, column 6, column 7)
Note that the numbers have commas that you should handle (Hint: use the handle_commas
function). There are also two header lines to skip. Also, the last line of the file has words
where data is supposed to be. Find which column this affects, and record that column as
None.
i. Parameter: file pointer
ii. Display: nothing
iii. Return: list of tuples
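The reading step might look like the sketch below. Several things here are assumptions rather than facts from the handout: that columns are whitespace-separated, that short or blank lines can simply be skipped, and the exact wording of the final line in the demo text ("and over"), which is reconstructed to be consistent with the expected 2014 output. A compact `handle_commas` is repeated so the block runs on its own.

```python
import io


def handle_commas(s, T):
    """Compact version of the converter described in item b)."""
    try:
        return {"int": int, "float": float}[T](s.replace(",", ""))
    except (KeyError, ValueError):
        return None


def read_file(fp):
    """Return a list of tuples, one per data line of the open file fp."""
    data_list = []
    fp.readline()  # skip the title line
    fp.readline()  # skip the line that names the data fields
    for line in fp:
        cols = line.split()  # assumption: whitespace-separated columns
        if len(cols) < 8:
            continue  # assumption: ignore blank or incomplete lines
        row = ((handle_commas(cols[0], "float"),
                handle_commas(cols[2], "float")),  # None if this is a word
               handle_commas(cols[3], "int"),
               handle_commas(cols[4], "int"),
               handle_commas(cols[5], "float"),
               handle_commas(cols[6], "float"),
               handle_commas(cols[7], "float"))
        data_list.append(row)
    return data_list


# Demo text: placeholder headers, the handout's sample line, and an
# ASSUMED layout for the final line of the file.
sample = io.StringIO(
    "title line (placeholder)\n"
    "field names (placeholder)\n"
    "5,000.00 \u2014 9,999.99 12,620,757 32,801,513 19.37150 "
    "93,403,927,820.81 7,400.82\n"
    "50,000,000.00 and over 134 158,186,786 100.00000 "
    "11,564,829,969.82 86,304,701.27\n"
)
rows = read_file(sample)
print(rows)
```

Because `handle_commas` already returns None for text that is not a number, the final line needs no special branch under these assumptions.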
d) def get_range(data_list, percent):
Takes a list of data (output from the read_file function) and a percent and returns data
for the first data line whose cumulative percentage (Column 5 in the data file) is greater than
or equal to the percent parameter. The function should return a tuple of the salary range
(Columns 0 and 2 in the file data) the cumulative percentage value (Column 5 in the data
file) and the average income (Column 7 in the data file):
( (column 0, column 2), column 5, column 7)
For testing using the 2014 data and a percent value of 90 your function will return
((90000.0, 94999.99), 90.80624, 92420.5)
i. Parameters: list of tuples, float
ii. Display: nothing
iii. Return: tuple
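The search described above can be sketched as a simple linear scan. The three rows are copied from the expected 2014 read_file output in this handout; returning None when no line qualifies is an assumption, since the handout does not cover that case.

```python
def get_range(data_list, percent):
    """Return ((low, high), cumulative percent, average income) for the
    first line whose cumulative percent is >= percent."""
    for row in data_list:
        if row[3] >= percent:
            return row[0], row[3], row[5]
    return None  # assumption: nothing qualified (not specified in the handout)


# Three consecutive lines from the 2014 data shown in this handout.
rows_2014 = [
    ((85000.0, 89999.99), 1975400, 141929101, 89.72248, 172719042418.7, 87434.97),
    ((90000.0, 94999.99), 1714370, 143643471, 90.80624, 158442931588.44, 92420.5),
    ((95000.0, 99999.99), 1486636, 145130107, 91.74604, 144858203365.61, 97440.26),
]
print(get_range(rows_2014, 90))
```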
e) def get_percent(data_list, income):
Takes a list of data (output from the read_file function) and an income and returns the
income range (Columns 0 and 2 in the file) that contains the specified income, together with
the corresponding cumulative percentage (Column 5 in the file):
( (column 0, column 2), column 5 )
For testing using the 2014 data and an income value of 150,000 your function will return
((150000.0, 154999.99), 96.87301)
i. Parameters: list of tuples, float
ii. Display: nothing
iii. Return: tuple
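A hedged sketch of the lookup is given below. It treats the final line, whose top is recorded as None, as open-ended, and it returns None when no range contains the income; both choices are assumptions. The rows are copied from the expected 2014 output in this handout.

```python
def get_percent(data_list, income):
    """Return ((low, high), cumulative percent) for the range holding income."""
    for row in data_list:
        low, high = row[0]
        # The final line has no top value (None), so it has no upper limit.
        if income >= low and (high is None or income <= high):
            return row[0], row[3]
    return None  # assumption: income is not inside any listed range


# A few lines from the 2014 data shown in this handout.
rows_2014 = [
    ((145000.0, 149999.99), 419003, 152855720, 96.62989, 61787674520.19, 147463.56),
    ((150000.0, 154999.99), 384581, 153240301, 96.87301, 58607775121.57, 152393.84),
    ((155000.0, 159999.99), 335391, 153575692, 97.08503, 52801735517.69, 157433.37),
    ((50000000.0, None), 134, 158186786, 100.0, 11564829969.82, 86304701.27),
]
print(get_percent(rows_2014, 150000))
```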
f) def find_average(data_list):
Takes a list of data (output from the read_file function) and returns the average salary.
Round the result to cents (i.e. two decimal places) before returning the value.
Hints:
i. This is NOT (!) the average of the last column of data. It is not mathematically valid to
find an average by averaging averages; for example, in this case there are
many more individuals in the lowest category than in the highest category.
ii. How many wage earners are considered in finding the average (denominator)? There
are a couple of ways to determine this. I think the easiest uses the “cumulative number”
column (Column 4 in the file), but using Column 3 is not hard and may make more
sense to some students.
iii. How does one find the total dollar value of income (numerator)? Notice that Column 6
in the file is “the combined income of all the individuals in this range of income.”
For testing your function notice that for the 2014 data the average should be $44,569.20.
That value is listed on the web page referenced above.
iv. Parameters: list of tuples
v. Display: nothing
vi. Return: float # rounded to two decimal places
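The hints amount to dividing total income by total earners. The sketch below takes the Column 4 route for the denominator. The two rows are made-up numbers, chosen only so the arithmetic is easy to check by hand; they are not real AWI data.

```python
def find_average(data_list):
    """Return total income divided by total earners, rounded to cents."""
    total_income = sum(row[4] for row in data_list)  # Column 6 in the file
    total_people = data_list[-1][2]  # Column 4 in the file, from the last line
    return round(total_income / total_people, 2)


# Made-up illustration: 3 people averaging 2,000 and 1 person earning 8,000.
toy_rows = [
    ((0.01, 4999.99), 3, 3, 75.0, 6000.0, 2000.0),
    ((5000.0, 9999.99), 1, 4, 100.0, 8000.0, 8000.0),
]
print(find_average(toy_rows))
```

Averaging the last column of the toy rows would give 5,000, while the weighted result is 3,500, which is the point of hint i.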
g) def find_median(data_list):
Takes a list of data (output from the read_file function) and returns the median income.
Unfortunately, this file of data is not sufficient to find the true median so we need to
approximate it (at least 50%).
i. Here is the rule we will use: find the data line whose cumulative percentage (Column 5)
is closest to 50% and return its average income (Column 7). If two data lines are equally
close, return the smaller.
ii. Hint: Python’s abs() function (absolute value) is potentially useful here.
iii. Hint: your get_range() function should be useful here. The get_range()
function returns the first tuple where the cumulative percentage is greater than or equal to
a particular percentage. For the median the percentage is 50%.
iv. For testing your function, using our rule, the median income for the 2014 data is
$27,457.00
v. Parameters: list of tuples
vi. Display: nothing
vii. Return: float
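Rule i can be sketched directly with `abs()`, without going through get_range(). The strict comparison keeps the earlier line on a tie, which is the one with the smaller income because the data is in ascending order. The first three rows are copied from the expected 2014 output; the tie rows are made up.

```python
def find_median(data_list):
    """Return the average income of the line whose cumulative percent
    is closest to 50."""
    best = data_list[0]
    for row in data_list[1:]:
        # Strict < keeps the earlier (smaller-income) line when two tie.
        if abs(row[3] - 50) < abs(best[3] - 50):
            best = row
    return best[5]


# Lines around the 50% mark from the 2014 data shown in this handout.
rows_2014 = [
    ((20000.0, 24999.99), 10918555, 71176882, 44.99547, 245317570246.88, 22467.95),
    ((25000.0, 29999.99), 10192863, 81369745, 51.43903, 279865461187.05, 27457.0),
    ((30000.0, 34999.99), 9487840, 90857585, 57.4369, 307828947411.16, 32444.58),
]
print(find_median(rows_2014))

# Made-up rows that are equally far from 50, to show the tie rule.
tie_rows = [
    ((0.01, 4999.99), 45, 45, 45.0, 90000.0, 2000.0),
    ((5000.0, 9999.99), 10, 55, 55.0, 70000.0, 7000.0),
]
print(find_median(tie_rows))
```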
h) def do_plot(x_vals, y_vals, year), provided by us, takes two equal-length lists of
numbers and plots them. You have to fill in the two labels (replace each empty string with the
appropriate string). Note that if you plot the whole file of data, the income ranges are so
skewed that the result is a nearly vertical plot at the leftmost edge, so close to the edge that
you cannot see it; it looks like nothing was plotted. Plotting the lowest 40
income ranges results in a more easily readable plot.
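Preparing the two lists for do_plot() could look like the sketch below. `plot_values` is a hypothetical helper name, not part of the assignment, and the stand-in rows are generated only to show the slicing. The choice of Column 0 for x and Column 5 for y follows step f) of main(); do_plot() itself is supplied with the project and is not reproduced here.

```python
def plot_values(data_list, count=40):
    """Hypothetical helper: build equal-length x and y lists from the
    lowest `count` income ranges."""
    subset = data_list[:count]
    x_vals = [row[0][0] for row in subset]  # bottom of range (Column 0)
    y_vals = [row[3] for row in subset]     # cumulative percent (Column 5)
    return x_vals, y_vals


# Generated stand-in rows (not real data) with the read_file tuple shape.
fake_rows = [((i * 5000.0, i * 5000.0 + 4999.99), 1, i + 1, float(i), 0.0, 0.0)
             for i in range(60)]
x_vals, y_vals = plot_values(fake_rows)
print(len(x_vals), len(y_vals))
```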
i) def main():
a) Open the file
b) Print the year.
c) Read the file
d) Print the average income.
e) Print the median income.
f) Prompt for plotting (yes/no).
If yes, plot the data: cumulative percentage (Column 5 in the file (y values)) vs. income
(Column 0 in the file (x values)). Call the do_plot() function to plot the data. Plot the
lowest 40 income ranges.
g) Loop, prompting for either “r” for range, “p” for percent, or nothing
i. r: prompt for a percent and output the income that is below that percent. The percent
needs to be valid (between 0 and 100 inclusive). Hint: Call the get_range()
function to get the range of income at that percentage. The bottom of that income
range is what we are looking for.
ii. p: prompt for an income and output the percent that earned more. The income needs
to be valid (positive). Hint: Call the get_percent() function to get the
corresponding cumulative percentage.
iii. if only a carriage-return is entered, halt the program
This is a new and different requirement. Hint: if someone simply hits the Enter key,
what will be the value input?
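The stopping rule in step g) can be sketched as a loop that ends on an empty string, since `input()` returns `""` when only Enter is pressed. `menu_loop`, `on_range` and `on_percent` are hypothetical names used for illustration; in a real solution the two branches would prompt for a value, validate it, and call get_range() or get_percent().

```python
PROMPT = "Enter a choice to get (r)ange, (p)ercent, or nothing to stop: "


def menu_loop(on_range, on_percent):
    """Hypothetical helper: call on_range() for 'r' and on_percent() for 'p'
    until the user enters nothing."""
    choice = input(PROMPT)
    while choice != "":  # "" means only the Enter key was pressed
        if choice == "r":
            on_range()
        elif choice == "p":
            on_percent()
        # Any other text is ignored here (an assumption) and we prompt again.
        choice = input(PROMPT)
```

Reading the first choice before the loop and again at the bottom avoids needing a `break` statement.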
Assignment Notes
1. Items 1-9 of the Coding Standard will be enforced for this project.
2. Files for year2000.txt, year2014.txt and year2019.txt are provided so that you
can test your program.
3. For output you need to insert commas. The format specification supports this: e.g., if you
would have formatted a floating-point value without commas as {:<12.2f}, you can simply insert
a comma before the dot, as in {:<12,.2f}.
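A short illustration of the comma option, using the 2019 average from the sample output. The square brackets in the second line are only there to make the padding visible.

```python
average = 51916.27

# A comma before the dot asks for thousands separators.
message = "The average income was ${:,.2f}".format(average)
print(message)

# The same option combined with left alignment in a field of width 12.
padded = "[{:<12,.2f}]".format(average)
print(padded)
```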
Sample Output
Test 1
Enter a year where 1990 <= year <= 2019: 2019
For the year 2019:
The average income was $51,916.27
The median income was $32,452.59
Do you want to plot the data (yes/no): no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 90
90.00% of incomes are below $100,000.00 .
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: 100000
An income of $100,000.00 is in the top 90.01% of incomes.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:
Test 2 (no plotting)
Enter a year where 1990 <= year <= 2019: 2000
For the year 2000:
The average income was $30,846.09
The median income was $22,458.80
Do you want to plot the data (yes/no): no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 40
40.00% of incomes are below $15,000.00 .
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: 50000
An income of $50,000.00 is in the top 87.41% of incomes.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:
Test 2 (plotting)
Enter a year where 1990 <= year <= 2019: 2000
For the year 2000:
The average income was $30,846.09
The median income was $22,458.80
Do you want to plot the data (yes/no): yes
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:
Test 3
Enter a year where 1990 <= year <= 2019: xxx
Error in year. Please try again.
Enter a year where 1990 <= year <= 2019: 1900
Error in year. Please try again.
Enter a year where 1990 <= year <= 2019: 1999
Error in file name: year1999.txt Please try again.
Enter a year where 1990 <= year <= 2019: 2014
For the year 2014:
The average income was $44,569.20
The median income was $27,457.00
Do you want to plot the data (yes/no): no
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: r
Enter a percent: 70
70.00% of incomes are below $45,000.00 .
Enter a choice to get (r)ange, (p)ercent, or nothing to stop: p
Enter an income: 150000
An income of $150,000.00 is in the top 96.87% of incomes.
Enter a choice to get (r)ange, (p)ercent, or nothing to stop:
Function Test: read_file
year2014.txt
[((0.01, 4999.99), 22574440, 22574440, 14.27075, 46647919125.68, 2066.4),
((5000.0, 9999.99), 13848841, 36423281, 23.02549, 102586913092.61, 7407.62),
((10000.0, 14999.99), 12329270, 48752551, 30.81961, 153566802438.45, 12455.47),
((15000.0, 19999.99), 11505776, 60258327, 38.09315, 200878198035.07, 17458.9),
((20000.0, 24999.99), 10918555, 71176882, 44.99547, 245317570246.88, 22467.95),
((25000.0, 29999.99), 10192863, 81369745, 51.43903, 279865461187.05, 27457.0),
((30000.0, 34999.99), 9487840, 90857585, 57.4369, 307828947411.16, 32444.58),
((35000.0, 39999.99), 8578215, 99435800, 62.85974, 321200755103.44, 37443.78),
((40000.0, 44999.99), 7553972, 106989772, 67.63509, 320563569965.15, 42436.43),
((45000.0, 49999.99), 6542882, 113532654, 71.77126, 310391706424.23, 47439.6),
((50000.0, 54999.99), 5723269, 119255923, 75.38931, 300016377448.51, 52420.46),
((55000.0, 59999.99), 4846517, 124102440, 78.4531, 278354367841.41, 57433.9),
((60000.0, 64999.99), 4201232, 128303672, 81.10897, 262203932128.68, 62411.2),
((65000.0, 69999.99), 3573471, 131877143, 83.36799, 240948179180.4, 67426.93),
((70000.0, 74999.99), 3094739, 134971882, 85.32437, 224145278103.36, 72427.85),
((75000.0, 79999.99), 2684481, 137656363, 87.0214, 207853372824.62, 77427.77),
((80000.0, 84999.99), 2297338, 139953701, 88.4737, 189370862869.17, 82430.56),
((85000.0, 89999.99), 1975400, 141929101, 89.72248, 172719042418.7, 87434.97),
((90000.0, 94999.99), 1714370, 143643471, 90.80624, 158442931588.44, 92420.5),
((95000.0, 99999.99), 1486636, 145130107, 91.74604, 144858203365.61, 97440.26),
((100000.0, 104999.99), 1309068, 146439175, 92.57358, 134083282259.67, 102426.52),
((105000.0, 109999.99), 1117128, 147556303, 93.27979, 120020513136.11, 107436.67),
((110000.0, 114999.99), 977055, 148533358, 93.89745, 109855105705.14, 112434.93),
((115000.0, 119999.99), 865889, 149399247, 94.44483, 101693061676.62, 117443.53),
((120000.0, 124999.99), 773339, 150172586, 94.93371, 94660281091.31, 122404.64),
((125000.0, 129999.99), 673971, 150846557, 95.35977, 85886152964.93, 127433.01),
((130000.0, 134999.99), 595827, 151442384, 95.73643, 78899843713.01, 132420.73),
((135000.0, 139999.99), 527341, 151969725, 96.0698, 72476546845.3, 137437.72),
((140000.0, 144999.99), 466992, 152436717, 96.36501, 66519743635.12, 142443.0),
((145000.0, 149999.99), 419003, 152855720, 96.62989, 61787674520.19, 147463.56),
((150000.0, 154999.99), 384581, 153240301, 96.87301, 58607775121.57, 152393.84),
((155000.0, 159999.99), 335391, 153575692, 97.08503, 52801735517.69, 157433.37),
((160000.0, 164999.99), 296048, 153871740, 97.27218, 48087213596.86, 162430.46),
((165000.0, 169999.99), 265309, 154137049, 97.4399, 44426198104.69, 167450.78),
((170000.0, 174999.99), 239515, 154376564, 97.59131, 41304379348.95, 172450.07),
((175000.0, 179999.99), 216255, 154592819, 97.72802, 38370042895.27, 177429.62),
((180000.0, 184999.99), 200592, 154793411, 97.85483, 36588064085.78, 182400.42),
((185000.0, 189999.99), 179005, 154972416, 97.96799, 33554727208.93, 187451.34),
((190000.0, 194999.99), 165277, 155137693, 98.07247, 31807897759.84, 192452.05),
((195000.0, 199999.99), 154070, 155291763, 98.16987, 30425466536.83, 197478.2),
((200000.0, 249999.99), 1039897, 156331660, 98.82726, 230863458226.21, 222006.08),
((250000.0, 299999.99), 565105, 156896765, 99.1845, 153945762663.99, 272419.75),
((300000.0, 349999.99), 333584, 157230349, 99.39537, 107708119615.81, 322881.55),
((350000.0, 399999.99), 219923, 157450272, 99.5344, 82117070706.61, 373390.1),
((400000.0, 449999.99), 151162, 157601434, 99.62996, 63997346472.5, 423369.28),
((450000.0, 499999.99), 108881, 157710315, 99.69879, 51583042398.64, 473756.14),
((500000.0, 999999.99), 345935, 158056250, 99.91748, 230331407862.96, 665822.79),
((1000000.0, 1499999.99), 65548, 158121798, 99.95892, 78672933288.58, 1200233.92),
((1500000.0, 1999999.99), 24140, 158145938, 99.97418, 41431838733.52, 1716314.78),
((2000000.0, 2499999.99), 12137, 158158075, 99.98185, 26997226154.27, 2224373.91),
((2500000.0, 2999999.99), 6871, 158164946, 99.98619, 18747446313.27, 2728488.77),
((3000000.0, 3499999.99), 4799, 158169745, 99.98923, 15507304422.66, 3231361.62),
((3500000.0, 3999999.99), 3258, 158173003, 99.99129, 12166741762.34, 3734420.43),
((4000000.0, 4499999.99), 2353, 158175356, 99.99277, 9970953222.98, 4237549.18),
((4500000.0, 4999999.99), 1822, 158177178, 99.99393, 8633941395.34, 4738716.46),
((5000000.0, 9999999.99), 6468, 158183646, 99.99802, 43887775808.42, 6785370.41),
((10000000.0, 19999999.99), 2230, 158185876, 99.99942, 30065006121.19, 13482065.53),
((20000000.0, 49999999.99), 776, 158186652, 99.99992, 22450911983.01, 28931587.61),
((50000000.0, None), 134, 158186786, 100.0, 11564829969.82, 86304701.27)]
Function Test: find_average
Instructor: 44569.2
Student: 44569.2
Function Test: find_median
year2014.txt
Instructor: 27457.0
Student: 27457.0
--------------------
year2019.txt
Instructor: 32452.59
Student: 32452.59
Function Test: get_range
year2014.txt; get_range(data,90)
Instructor: ((90000.0, 94999.99), 90.80624, 92420.5)
Student: ((90000.0, 94999.99), 90.80624, 92420.5)
--------------------
year2014.txt; get_range(data,50)
Instructor: ((25000.0, 29999.99), 51.43903, 27457.0)
Student: ((25000.0, 29999.99), 51.43903, 27457.0)
--------------------
year2000.txt; get_range(data,90)
Instructor: ((60000.0, 64999.99), 91.31401, 62377.2)
Student: ((60000.0, 64999.99), 91.31401, 62377.2)
Function Test: get_percent
year2014.txt; get_percent(data,150000)
Instructor: ((150000.0, 154999.99), 96.87301)
Student: ((150000.0, 154999.99), 96.87301)
--------------------
year2014.txt; get_percent(data,50000)
Instructor: ((50000.0, 54999.99), 75.38931)
Student: ((50000.0, 54999.99), 75.38931)
--------------------
year2000.txt; get_percent(data,150000)
Instructor: ((150000.0, 154999.99), 98.72567)
Student: ((150000.0, 154999.99), 98.72567)
Function Test: handle_commas
s,T: 5 int
Instructor: 5
Student : 5
--------------------
s,T: 5.3 float
Instructor: 5.3
Student : 5.3
--------------------
s,T: 1,234 int
Instructor: 1234
Student : 1234
--------------------
s,T: 1,234.56 float
Instructor: 1234.56
Student : 1234.56
--------------------
s,T: 5.3 xxx
Instructor: None
Student : None
--------------------
s,T: aaa int
Instructor: None
Student : None
--------------------
s,T: 1,234.56 int
Instructor: None
Student : None
=====================================================
Scoring Rubric
Computer Project #07 Scoring Summary
General Requirements
______ 5 pts Coding Standard 1-9
(descriptive comments, function header, etc...)
Implementation:
__0__ (5 pts) open_file (manual grading)
__0__ (3 pts) Function Test handle_commas
__0__ (8 pts) Function Test read_file
__0__ (5 pts) Function Test find_average
__0__ (6 pts) Function Test find_median
__0__ (5 pts) Function Test get_range
__0__ (5 pts) Function Test get_percent
__0__ (5 pts) Test 1
__0__ (2 pts) Test 2 (no plotting)
__0__ (2 pts) Test 2 (plotting) (manual grading)
__0__ (4 pts) Test 3