MSIN0116 Business Strategy and Analytics
Content of this assessment brief
Section | Content
A | Core information
B | Coursework brief and requirements
C | Module learning outcomes covered in this assessment
D | Groupwork instructions (if applicable)
E | How your work is assessed
F | Additional information
Section A: Core information
Submission date | 30/06/2024
Submission time | 23.59
Assessment is marked out of: | 100
% weighting of this assessment within total module mark | 60%
Maximum word count/page length/duration | 3000
Footnotes, appendices, tables, figures, diagrams, charts included in/excluded from word count/page length? | Including everything
Bibliographies, reference lists included in/excluded from word count/page length? | Including everything
Penalty for exceeding word count/page length | Penalty for exceeding word count will be a deduction of 10 percentage points (capped at 40% for Levels 4, 5 and 6, and 50% for Level 7). Refer to Academic Manual Section 3: Module Assessment - 3.13 Word Counts.
Penalty for late submission | Standard UCL penalties apply. Students should refer to https://www.ucl.ac.uk/academic-manual/chapters/chapter-4-assessment-framework-taught-programmes/section-3-module-assessment#3.12
Artificial Intelligence (AI) category | Assistive
Submitting your assessment | Submission via Moodle. Submit one single PDF file.
Anonymity of identity. Normally, all submissions are anonymous unless the nature of the submission is such that anonymity is not appropriate, illustratively as in presentations or where minutes of group meetings are required as part of a group work submission | The nature of this assessment is such that anonymity is required.
Section B: Assessment Brief and Requirements
The individual assignment consists of four parts, each released after a different lecture:
• Assignment 1: Case Study: Car Recall Data (Lecture 2)
• Assignment 2: Case Study: New York City Moves To Data-Driven Crime Fighting (Lecture 3)
• Assignment 3: Case Study: Rio Olympic Games (Lecture 4)
• Assignment 4: Case Study: Online Auctions (Lecture 6)
Assignment 1 Car Recall Data
Every year, automobile firms recall a large number of cars for safety reasons. In the U.S., all car recall data is recorded and stored by the National Highway Traffic Safety Administration (NHTSA), part of the Department of Transportation. This dataset is available to the public and contains all NHTSA safety-related defect and compliance campaigns since 1967. The data is in text format with 89,431 records. All the records are TAB delimited, and all dates are in YYYYMMDD format. The data includes 24 variables, listed in the table below.
As a data analyst, you are interested in the car recalls related to the Ford Focus and the Honda Accord. To analyse the data, please follow the steps below.
1. Download the car recall data from Moodle.
2. Convert the text file to a data file that you can analyse (e.g., in Stata, SAS, Excel, SPSS or Python).
3. For Ford Focus and Honda Accord respectively, find out how many records in the data set are related to these two models.
4. For Ford Focus and Honda Accord respectively, how many of the recalls were initiated by the manufacturer (MFR), the Office of Vehicle Safety Compliance (OVSC) or the Office of Defects Investigation (ODI)? Tabulate the results in a table with the frequency and percentage.
5. For Ford Focus and Honda Accord respectively, draw a bar chart showing the number of cars affected for each model year. (Hint: use the variables “YEARTXT” and “POTAFF”.)
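Steps 2 to 5 can be sketched in Python using only the standard library. This is a minimal illustration, not a model answer. The column names MAKETXT, MODELTXT and INFLUENCED_BY are assumptions about the file layout (only YEARTXT and POTAFF are named in this brief), and the three sample rows are invented so that the snippet runs without the Moodle file.

```python
import csv
import io
from collections import Counter, defaultdict

# Assumed column subset. The real file has 24 TAB-delimited fields, so the
# full list of names from the variable table would be needed in practice.
COLUMNS = ["MAKETXT", "MODELTXT", "YEARTXT", "POTAFF", "INFLUENCED_BY"]

# Invented sample rows standing in for the Moodle file.
SAMPLE = (
    "FORD\tFOCUS\t2012\t1000\tMFR\n"
    "FORD\tFOCUS\t2012\t500\tODI\n"
    "HONDA\tACCORD\t2003\t2000\tMFR\n"
)


def load_records(handle, columns=COLUMNS):
    """Read TAB-delimited lines into a list of dicts keyed by column name."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [dict(zip(columns, row)) for row in reader]


def select_model(records, make, model):
    """Keep records whose make and model match exactly (case-insensitive)."""
    return [
        r for r in records
        if r["MAKETXT"].strip().upper() == make
        and r["MODELTXT"].strip().upper() == model
    ]


def initiator_table(records):
    """Return {initiator: (frequency, percentage)} for the given records."""
    counts = Counter(r["INFLUENCED_BY"].strip() for r in records)
    total = sum(counts.values())
    return {k: (n, 100.0 * n / total) for k, n in counts.items()}


def affected_by_year(records):
    """Sum POTAFF per model year; blank POTAFF values are treated as 0."""
    totals = defaultdict(int)
    for r in records:
        totals[r["YEARTXT"]] += int(r["POTAFF"] or 0)
    return dict(sorted(totals.items()))


records = load_records(io.StringIO(SAMPLE))
focus = select_model(records, "FORD", "FOCUS")
print(len(focus))
print(initiator_table(focus))
print(affected_by_year(focus))
```

The per-year totals returned by `affected_by_year` are the values a bar chart would plot. Whether POTAFF should be summed over every record or de-duplicated per recall campaign depends on how the file repeats campaigns, which should be checked against the variable table.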
In 2016, Vauxhall decided to recall all its Zafira B models in the UK, involving 234,938 cars manufactured from 2009 to 2014. The root cause behind this large recall is that all affected cars used the same faulty thermal fuse, which may cause a fire. In fact, sharing the same component across different products is a common practice in the manufacturing industry. However, when the shared component fails, firms have to recall a large number of products, resulting in very high costs and a negative impact on the firms’ public image.
6. Using the car recall data, verify that component sharing indeed exists. (Hint: you can create a new variable called “sharing”, which counts how many different car models use the same defective component, then draw a histogram of “sharing”. You can also study how many cars are recalled for each defective component.)
7. Briefly discuss the advantages and disadvantages of component sharing in the context of product recalls (Word limit: 300). You can make use of the reference listed below.
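The “sharing” variable suggested in the hint for step 6 can be sketched as follows. This is only one possible reading of the hint: it assumes a component is identified by a name column (hypothetically COMPNAME) and a model by a make and model pair, and the rows are invented for illustration. A recall campaign number might identify a specific defective part more precisely than a component name.

```python
from collections import defaultdict

# Invented (component, make, model) triples. In the real data these would
# come from assumed columns such as COMPNAME, MAKETXT and MODELTXT.
SAMPLE = [
    ("AIR BAGS", "FORD", "FOCUS"),
    ("AIR BAGS", "HONDA", "ACCORD"),
    ("AIR BAGS", "FORD", "FOCUS"),  # a repeated model is counted once
    ("FUEL SYSTEM", "FORD", "FOCUS"),
]


def sharing_counts(rows):
    """Map each defective component to the number of distinct models using it."""
    models = defaultdict(set)
    for component, make, model in rows:
        models[component].add((make, model))
    return {component: len(ms) for component, ms in models.items()}


def histogram(values):
    """Count how many components fall on each sharing value."""
    bins = defaultdict(int)
    for v in values:
        bins[v] += 1
    return dict(sorted(bins.items()))


sharing = sharing_counts(SAMPLE)
print(sharing)
print(histogram(sharing.values()))
```

A histogram with mass above 1 indicates that at least some defective components appear in more than one model.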
Assignment 2 Data-Driven Crime Fighting Goes Global
Nowhere have declining crime rates been as dramatic as in New York City. As reflected in the reported rates of the most serious types of crime, the city in 2015 was as safe as it had been since statistics have been kept. Crimes during the preceding few years have also been historically low.
Why is this happening? Experts point to a number of factors, including demographic trends, the proliferation of surveillance cameras, and increased incarceration rates. But New York City would also argue it is because of its proactive crime prevention program along with the district attorney’s and police force’s willingness to aggressively deploy information technology.
There has been a revolution in the use of Big Data for retailing and sports (e.g., baseball and the film ‘Moneyball’) as well as for police work. New York City has been at the forefront in intensively using data for crime fighting, and its CompStat crime-mapping program has been replicated by other cities.
CompStat features a comprehensive, citywide database that records all reported crimes or complaints, arrests, and summonses in each of the city's 76 precincts, including their time and location. The CompStat system analyzes the data and produces a weekly report on crime complaint and arrest activity at the precinct, patrol borough, and citywide levels. CompStat data can be displayed on maps showing crime and arrest locations, crime hot spots, and other relevant information to help precinct commanders and NYPD's senior leadership quickly identify patterns and trends and develop a targeted strategy for fighting crime, such as dispatching more foot patrols to high-crime neighbourhoods.
Dealing with more than 105,000 cases per year in Manhattan, New York's district attorneys did not have enough information to make fine-grained decisions about charges, bail, pleas, or sentences. They couldn't quickly separate minor delinquents from serious offenders.
In 2010 New York created a Crime Strategies Unit (CSU) to identify and address crime issues and target priority offenders for aggressive prosecution. Rather than information being left on thousands of legal pads in the offices of hundreds of assistant district attorneys, CSU gathers and maps crime data for Manhattan's 22 precincts to visually depict criminal activity based on multiple identifiers such as gang affiliation and type of crime. Police commanders supply a list of each precinct's 25 worst offenders, which is added to a searchable database that now includes more than 9,000 chronic offenders. A large percentage are recidivists who have been repeatedly convicted of grand larceny, active gang members, and other priority targets. These are the people law enforcement wants to know about if they are arrested.
This database is used for an arrest alert system. When someone considered a priority defendant is picked up (even on a minor charge or parole violation) or arrested in another borough of the city, any interested prosecutor, parole officer, or police intelligence officer is automatically sent a detailed e-mail. The system can use the database to send arrest alerts for a particular defendant, a particular gang, or a particular neighbourhood or housing project, and the database can be sorted to highlight patterns of crime ranging from bicycle theft to homicide.
The alert system helps assistant district attorneys ensure that charging decisions, bail applications, and sentencing recommendations address that defendant’s impact on criminal activity in the community. The information gathered by CSU and disseminated through the arrest alert system differentiates among those for whom incarceration is an imperative from a community-safety standpoint and those defendants for whom alternatives to incarceration are appropriate and will not negatively affect overall community safety. If someone leaves a gang, goes to prison for a long time, moves out of the city or New York state, or dies, the data in the arrest alert system are edited accordingly.
Information developed by CSU helped the city's Violent Criminal Enterprises Unit break up the most violent of Manhattan's 30 gangs. Since 2011, 17 gangs have been dismantled.
Using Big Data and analytics to predict not only where crime will occur, but who will likely commit a crime, has spread to cities across the globe, in the UK, Germany, France, Singapore and elsewhere. In the UK, Kent Police have been using "pre-crime" software since 2015. The proprietary software, called PredPol, analyzes a historical database of crimes using date, place, time, and category of offense. PredPol then generates daily schedules for the deployment of police to the most crime-prone areas of the city. PredPol does not predict who will likely commit a crime, but instead where the crimes are likely to happen based on past data. Using decades' worth of crime reports, the PredPol system identifies areas with high probabilities of various types of crime and creates maps of the city with color-coded boxes indicating the areas to focus on.
It's just a short step to predicting who is most likely to commit a crime, or a terrorist act. Predicting who will commit a crime requires even bigger Big Data than criminal records and crime locations. Law enforcement systems being developed now parallel those used by large hotel chains that collect detailed data on their customers' personal preferences, and even their facial images. Using surveillance cameras throughout a city, along with real-time analytics, will allow police to identify where former, or suspected, criminals are located and traveling. These tracking data will be combined with surveillance of social media interactions of the persons involved. The idea is to allocate police to those areas where "crime-prone" people are located. In 2016, the UK adopted the Investigatory Powers Bill, which legalizes a global web and telecommunications surveillance system, and a government database that stores the web history of every citizen. This data and analysis could be used to identify people who are most likely to commit a crime or plot a terrorist attack. Civil liberties groups around the globe are concerned that these systems operate without judicial or public oversight and can easily be abused by authorities.
Case Study Questions
1. What are the benefits of intelligence-driven prosecution for crime fighters and the general public?
2. What problems does this approach to crime fighting pose?
3. What management, organization, and technology issues should be considered when setting up information systems for intelligence-driven prosecution?
(Source: Laudon, K.C. and Laudon, J.P., 2004. Management information systems: Managing the digital firm. Pearson Educación.)
Assignment 3 Rio Olympic Games
In 2016, the Summer Olympic Games were held in Rio. We are interested in the performance of the countries in these Games. Please investigate the following questions. The data file, “Rio Olympic Medal Data.xlsx”, can be downloaded from Moodle.
1. Researchers argue that a country’s GDP can impact the number of total medals and the number of gold medals. Please use a regression model to study the effect of a 1% change in GDP on the number of total medals and gold medals, respectively. Report and explain your results.
2. Other researchers think that besides GDP, a country’s Population may also influence its performance in the 2016 Rio Olympic Games, so both GDP and Population need to be considered in the regression. Please study the influence of Population and of a 1% change in GDP on total medals and gold medals, respectively. Report and explain your results.
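One common way to express “the effect of a 1% change in GDP” is a level-log model, medals = a + b ln(GDP), in which a 1% increase in GDP is associated with a change of b × ln(1.01), or approximately b/100, medals. The sketch below fits that model by ordinary least squares. The functional form is an assumption (this brief does not prescribe one), the figures are invented, and the helper name `ols_simple` is hypothetical.

```python
import math


def ols_simple(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Invented figures for illustration only (GDP in arbitrary units).
gdp = [100.0, 1000.0, 10000.0, 100000.0]
medals = [2.0, 6.0, 10.0, 14.0]

# Regress medals on the natural log of GDP.
log_gdp = [math.log(g) for g in gdp]
a, b = ols_simple(log_gdp, medals)

# In a level-log model, a 1% rise in GDP shifts medals by b * ln(1.01),
# which is approximately b / 100.
effect_of_one_percent = b * math.log(1.01)
print(round(b, 4), round(effect_of_one_percent, 4))
```

Adding Population as a second regressor, as in question 2, requires multiple regression, which a statistics package handles directly; the interpretation of the coefficient on ln(GDP) stays the same, holding Population fixed.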
Courneya and Carron (1992) propose that home advantage exists in sports competition, namely the home team usually performs much better. Moreover, for Olympic games, researchers argue that such home advantage may even last until the next Olympic games.
3. Assume that you are now focusing on the performance of China (2008 home team) and Team GB (2012 home team) in the Summer Olympic Games from 1984 (Los Angeles) to 2016 (Rio). Based on the historical data, for each of the two countries respectively, draw bar charts of the total medals, the gold medals, and the country’s rankings based on total medals and gold medals to visualise the two countries’ performance from 1984 to 2016. Can you observe the home advantage and its lasting effect? Discuss your results.
4. Can you verify your results about home advantage in the Tokyo 2020 Olympic Games?
(Hint: Olympic Games data can be easily found online: http://www.medalspercapita.com/#golds:1984)
Further reading
• Courneya, K.S. & Carron, A.V. (1992). The home advantage in sport competitions: A literature review. Journal of Sport and Exercise Psychology, 14(1), 13-27.
• Forrest, D., Sanz, I., & Tena, J. D. D. (2010). Forecasting national team medal totals at the Summer Olympic Games. International Journal of Forecasting, 26(3), 576-588.
• Shibli, S., Gratton, C., & Bingham, J. (2012). A forecast of the performance of Great Britain and Northern Ireland in the London 2012 Olympic Games. Managing Leisure, 17(2-3), 274-290.
Assignment 4 Online Auctions
Online keyword auctions in search engines have become a billion-dollar business. As a data analyst, you are interested in the online auctions data. Please download the data and answer the following questions.
(Students are encouraged to use Stata to finish this mini case, especially Q4. Stata is available via Desktop@UCL Anywhere. More information can be found here: https://www.ucl.ac.uk/isd/services/computers/remote-access/desktopucl-anywhere)
1. For the variable PRICE, report the descriptive statistics (i.e., mean, median, mode, range, minimum and maximum).
2. Find out how many bids are placed manually and by an automatic bidding program. Tabulate the results in a table with the frequency and percentage.
3. Count how many different ACCOUNT_ID values there are in the data. Each ACCOUNT_ID may bid for multiple phrases. Find the ACCOUNT_ID that places bids on the most phrases. How many different PHRASE_ID values does this ACCOUNT_ID bid for? What are they?
4. Count how many different PHRASE_ID values there are in the data. Draw a histogram based on the average price of each PHRASE_ID.
5. For PHRASE_ID=1, visualize the price trend over time for the bidders with ACCOUNT_ID 741 and 4265. What can you infer from the trends of the two bidders? (Hint: focus on PHRASE_ID=1 and draw a scatter plot of the bidding price over time for bidders 741 and 4265, respectively.)
6. Read Example 1 in the reading material, “Stata-Kmeans Clustering”. Now, focus on PHRASE_ID=1, follow the same steps as in Example 1, and perform clustering analysis for the variables PRICE and ACCOUNT_ID using the online auction data.
• Generate a summary table and draw a graph matrix for the two variables: PRICE and ACCOUNT_ID.
• Perform clustering analysis and set the number of clusters to 3.
• Show a graph matrix for the clusters, i.e., label the data points using the cluster number.
• Interpret the graph matrix, i.e., explain why the data form clusters in the way shown in the graph.
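For readers who want to see what a k-means procedure does internally, the clustering step in question 6 can be sketched in plain Python. This is a generic implementation of Lloyd's algorithm on invented (PRICE, ACCOUNT_ID) pairs, not a reproduction of the Stata example in the reading material. The deterministic starting centroids are an assumption made so the result is repeatable.

```python
def kmeans(points, k, iterations=100):
    """Lloyd's algorithm on 2-D points; returns (labels, centroids).

    Starting centroids are spread evenly over the sorted points so the
    result is deterministic; statistical packages often start randomly.
    """
    ordered = sorted(points)
    step = (len(ordered) - 1) / (k - 1) if k > 1 else 0
    centroids = [ordered[int(i * step)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        new_labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[c])
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids


# Invented (PRICE, ACCOUNT_ID) pairs for illustration only.
points = [
    (0.10, 741), (0.12, 741),
    (0.50, 4265), (0.55, 4265),
    (1.20, 9000), (1.25, 9000),
]
labels, centroids = kmeans(points, k=3)
print(labels)
```

Euclidean distance is sensitive to scale, so a variable with a much larger numeric range than the other tends to dominate the cluster assignment. This is worth keeping in mind when interpreting the graph matrix.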
Data description
This dataset includes advertiser bid data in the following format:
FILE: ( LINE '\n' ) +
LINE: TIMESTAMP '\t' PHRASE_ID '\t' ACCOUNT_ID '\t' PRICE '\t' AUTO
TIMESTAMP: MM/DD/YYYY HH:MM:SS
PHRASE_ID: Int
ACCOUNT_ID: Int
PRICE: Float
AUTO: 0 or 1
For the AUTO field, 0 means that the bid was placed manually and 1 means that the bid was placed by an automatic bidding program. Bids are given for 15-minute increments. The original data set has 18 million records. Only the first 1 million records are provided in this excerpt of the data set. Price is denominated in US dollars.
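The line format above maps directly onto a small parser. The sketch below, which uses only the Python standard library, parses lines in the documented format and computes the quantities asked for in questions 1 to 3. The three sample lines are invented so that the snippet runs without the data file, and the function name `parse_line` is hypothetical.

```python
import statistics
from collections import Counter, defaultdict
from datetime import datetime

# Invented lines in the documented format:
# TIMESTAMP \t PHRASE_ID \t ACCOUNT_ID \t PRICE \t AUTO
SAMPLE = (
    "01/01/2000 00:00:00\t1\t741\t0.50\t0\n"
    "01/01/2000 00:15:00\t1\t4265\t0.75\t1\n"
    "01/01/2000 00:30:00\t2\t741\t0.50\t1\n"
)


def parse_line(line):
    """Split one TAB-delimited bid line into typed fields."""
    timestamp, phrase_id, account_id, price, auto = line.rstrip("\n").split("\t")
    return {
        "TIMESTAMP": datetime.strptime(timestamp, "%m/%d/%Y %H:%M:%S"),
        "PHRASE_ID": int(phrase_id),
        "ACCOUNT_ID": int(account_id),
        "PRICE": float(price),
        "AUTO": int(auto),
    }


bids = [parse_line(line) for line in SAMPLE.splitlines()]
prices = [b["PRICE"] for b in bids]

# Question 1: descriptive statistics for PRICE.
summary = {
    "mean": statistics.mean(prices),
    "median": statistics.median(prices),
    "mode": statistics.mode(prices),
    "min": min(prices),
    "max": max(prices),
    "range": max(prices) - min(prices),
}

# Question 2: frequency and percentage of manual (0) and automatic (1) bids.
auto_counts = Counter(b["AUTO"] for b in bids)
auto_table = {k: (n, 100.0 * n / len(bids)) for k, n in auto_counts.items()}

# Question 3: distinct phrases per account, and the account with the most.
phrases_by_account = defaultdict(set)
for b in bids:
    phrases_by_account[b["ACCOUNT_ID"]].add(b["PHRASE_ID"])
top_account = max(phrases_by_account, key=lambda a: len(phrases_by_account[a]))

print(summary)
print(auto_table)
print(len(phrases_by_account), top_account, sorted(phrases_by_account[top_account]))
```

With 1 million records it may be preferable to read the file line by line rather than loading it into a single string, but the parsing logic is unchanged.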
Section C: Module Learning Outcomes covered in this Assessment
This assessment contributes towards the achievement of the following stated module Learning Outcomes as highlighted below:
Business Strategy and Analytics is one of the two parts of the core module MSIN0116 Decision-making and Analytics. This module exposes students to key issues in developing and executing business strategy with a focus on the applications of data analytics, i.e., in the technology and digital industries. It explores how to analyse the external environment, the internal processes of the organisation, and the economics and strategy of technology business models. This module also investigates how leading companies and organisations are using data and analytics to improve their strategy processes as well as fundamental tenets of data and analytics, such as data summarisation and visualisation, data-based decision making, experiments, and counterfactual thinking.