BBS 39 FT
FIN3020S
Introduction to Machine Learning
Copyright April 2024
WELCOME MESSAGE
I am delighted to introduce you to the Introduction to Machine Learning module.
This module is a management-level introduction to the topic of machine learning informed alert models, with a focus on the deployment and accountability of these models to mitigate operational risks and other fundamental risks in banking and finance. Operational risk machine learning informed alert models are deployed, for the most part, to detect, prevent and even predict suspicious and possibly fraudulent transactions. Instances include credit card fraud, elder financial exploitation and money laundering. These alert models are also deployed to mitigate or prevent cyber security crime and to inform credit extension decisions. You will learn why these types of risks are centrally important in modern banking and how to monitor, mitigate and manage these risks.
The module will show students how to make sense of impactful, vast and complex data sets, i.e. “Big Data”, with a view to better understanding and anticipating operational risks that have emerged in the field of banking. Vast data sets are of no value to an organisation without algorithmic work to elicit meaning from the data. Machine learning will be introduced via the software RapidMiner. The alert models and monitoring skills introduced in this module are in high demand in industry. The module’s content is important not only for those students who wish to progress their careers in the financial services industry. It is also designed for those planning to pursue work in other industry sectors, and who wish to avail of and develop their quantitative risk management skill sets to mitigate a variety of risks.
On successful completion of this module, you will have achieved two outcomes: (1) developed a critical understanding of machine learning informed alert models in a banking context and (2) learned how to evaluate the performance, ethics and reliability of these machine learning informed alert models.
This module will require your available free time for study over the coming weeks. I will do all that I can to help and support you. A student’s success in this course will depend, to a large extent, on the time and effort he or she devotes to the course materials. It is highly recommended that students maintain a consistent routine of study and complete the course assignments at the earliest opportunity. Many management decisions are based on quantitative analysis, and students should endeavour to understand the meaning behind the numbers, which in turn assists in making financial decisions and mitigating risks.
Please read this study guide carefully. You should find the answer to any question you might have about the course in this document. However, should you require any clarification on any matter pertaining to this module, don’t hesitate to contact me.
I am looking forward to meeting you.
PART 1: INTRODUCTION
This Study Guide is designed to provide you with details of this module, the learning outcomes, and the delivery and assessment arrangements. The Study Guide consists of 6 parts.
Part 1 gives background details on the subject area and sets out the broad aims of the module.
Part 2 consists of the module outline. This part explains (a) the module learning outcomes, (b) the themes and topics to be explored and (c) the learning supports to be used.
Part 3 gives details of the module delivery arrangements. It sets out the session arrangements and the expectations in relation to your prior preparation and student engagement.
Part 4 provides details of the assessment techniques used in this module, explaining the assessment components and their rationale.
Part 5 explains the UCD grading policy and gives grade descriptors, drawing on the university document, for each assessment component of the module.
Part 6 provides extended details of the themes and topics covered in this module, including weblinks and references.
Background Details
Introduction to Machine Learning is an increasingly important part of any syllabus of a contemporary finance degree. This module seeks to provide broad theoretical and practical knowledge of the principles, tools and methodologies used by financial managers at financial institutions from the machine learning ‘toolbox’ today. Further, the module seeks to develop a student’s ability to critically assess, evaluate, and analyse financial and operational risks. The knowledge that students gain in this course leads to careers in the finance and treasury departments of a corporation or in risk management in banking. Advance preparation for classes and workshops will be an important feature of this module, with readings, references and a slide deck for reflection assigned to students prior to the commencement of lectures.
Module Aims
The aim of this module is to provide students with an overview of the theory and practice of how a financial institution can manage risks.
On successful completion of this module, you will have achieved two outcomes: (1) developed a critical understanding of machine learning informed alert models in a banking context and (2) learned how to evaluate the performance, ethics and reliability of these machine learning informed alert models.
Participants will be expected to contribute critical reflection of their experiences along with collaborative interpretation of such experiences in both classroom and study group settings. The module draws on student prior learning and work experience and combines insights from basic mathematics, accounting, economics, statistics, and other areas. The assessment tasks for this module have been designed with this in mind as detailed later in the study guide. Many of the quantitative aspects covered in this module can be completed using RapidMiner.
Programme Goals
Table 1 – Programme Goals Table
Programme Code: FIN3020S
Programme Title: BBS
Pathway: Finance
Programme Goals
Specify the overall programme goal and insert a one-line description of each goal.
|
Programme Learning Outcomes
Specify the learning outcomes associated with each programme goal.
On successful completion of the programme students should be able to:
|
FIN3020S
Introduction to Machine Learning
Module Components
Indicate which aspects of the module will help accomplish the relevant Programme Goal
|
1)
|
Programme Goal 1:
Informed Thinkers: Our graduates will be knowledgeable on management theory and will be able to apply this theory to business problems (Knowledge).
|
Programme Learning Outcome 1a:
Explain current theoretical underpinnings of what drives various banking risks.
|
Lectures and group assignment
|
Programme Learning Outcome 1b:
Apply appropriate methods, tools and techniques for identifying, analysing and resolving banking problems within functional and across functional banking areas.
|
Lectures, team assignment, and final exam.
|
2)
|
Programme Goal 2:
Communication, Analytical and Critical Thinking Skills: Our graduates will have well developed skills of communication, analysis and critical thinking (Skills and Competencies).
|
Programme Learning Outcome 2a:
Prepare a short presentation (written and/or oral) on a banking machine learning informed alert model proposal.
|
Team assignment
In class activities
|
Programme Learning Outcome 2b:
Analyse specific banking use cases and formulate a report detailing the issues and recommended actions.
|
Individual written examination
In class activities
|
Programme Learning Outcome 2c:
Conduct secondary research on machine learning in finance-related issues and report on the findings and draw appropriate conclusions.
|
Individual written examination
Team assignment
|
3)
|
Programme Goal 3:
Personal and Professional Development: Our graduates will demonstrate a commitment to personal and professional excellence and development (Skills, Competencies and Attitudes).
|
Programme Learning Outcome 3a:
Develop collaborative learning and team-work skills by engaging in module-related team activities.
|
In-class activities
Team assignment
|
Programme Learning Outcome 3b:
Demonstrate capacity for problem solving collaboratively and individually.
|
In-class activities
Team assignment
|
4)
|
Programme Goal 4:
Ethical Awareness: Our graduates will demonstrate an awareness of ethical issues in business and their impact on society (Attitudes).
|
Programme Learning Outcome 4a:
Demonstrate an awareness of ethical values and business issues concerning machine learning informed alert models and the advancement of the broader societal ‘good’.
|
Team assignment
|
Programme Learning Outcome 4b:
Illustrate an understanding of how business decisions might influence society and the wider community at large.
|
Team assignment
|
PART 2: MODULE OUTLINE
Module Title: Introduction to Machine Learning
Module Code: FIN3020S
No. of ECTS: 10.0
Module Learning Outcomes
On completing the Introduction to Machine Learning module, students will be expected to be able to:
1. Understand what a machine learning informed alert model is and how one can use it to better inform certain types of decisions to mitigate operational and other banking risks.
2. Be aware of various alert model algorithms to elicit information from data with a view to better informing decisions in banking and financial services settings.
3. Have learned how to evaluate the performance and reliability of alert models, whether proprietorial alert models or otherwise.
4. Be aware of ethical aspects of data mining and alert model deployment as well as the economic, institutional and regulatory contexts of these models in respect to various applications.
Workshops and topics
The module covers multiple topics across eleven workshops.
Workshop 1 (Remote): Introduction to Machine Learning (with Prof Muckley)
· Overview of the Module
· A schema to introduce machine learning informed alert models
· Mini Lecture 1: AI Use cases across industry sectors and in banking and finance
· Introduction to RapidMiner Tutorial (1)
Workshop 2 (with Prof Muckley)
· Revision of key aspects in workshop 1
· Introduction to RapidMiner Tutorial (2)
· Mini Lecture 2: Special Focus in Elder Financial Exploitation and AI
· Mini Lecture 3: What is the machine learning innovation?
· Mini Lecture 4: What is the role of management?
· Mini Lecture 5: Why do machine learning models fail?
· Preparation for Team Assignment
Workshop 3 (with Prof Muckley)
· Revision of key aspects in workshops 1-2
· Introduction to RapidMiner Tutorial (3)
· Mini Lecture 6: Two Illustrative algorithms: KNN and Decision Trees
· Mini Lecture 7: Data, Software, Terminology and Career Opportunities
· Mini Lecture 8: Performance Evaluation 1 of 2
· Preparation for Team Assignment
Workshop 4 (with Prof Muckley)
· Revision of key aspects in workshops 1-3
· Introduction to RapidMiner Tutorial (4)
· Mini Lecture 9: Performance evaluation 2 of 2
· Ethics of machine learning
· Data visualisation
· Preparation for Team Assignment
Workshop 5 (with Prof Muckley)
· Revision of key aspects in workshops 1-4
· Introduction to RapidMiner Tutorial (5)
· Assessment: Practice Team presentation
Workshop 6 (with Mr Yusra Anas)
· Revision of workshop 1
Workshop 7 (with Mr Yusra Anas)
· Revision of workshop 2
Workshop 8 (with Mr Yusra Anas)
· Revision of workshop 3
Workshop 9 (with Mr Yusra Anas)
· Revision and formal presentations of group assignment.
Workshop 10 (with Mr Yusra Anas)
· Revision of workshops
Workshop 11 (Remote): Conclusion (with Prof Muckley)
· Revision of key aspects in workshops 1-5
PART 3: MODULE DELIVERY SCHEDULE
The module delivery relies on students’ ability to engage in prior preparation, to seek confirmation and clarification as appropriate and to be actively engaged during the sessions.
Session | Date | Time (Singapore) | Lecturer
1 | 25 Apr 2024* | 5.00pm to 6.00pm | Prof Muckley
2 | 13 May 2024 | 1.00pm to 4.00pm | Prof Muckley
3 | 14 May 2024 | 9.30am to 12.00pm | Prof Muckley
4 | 14 May 2024 | 1.00pm to 4.00pm | Prof Muckley
5 | 15 May 2024 | 9.30am to 12.00pm | Prof Muckley
6 | 15 May 2024 | 1.00pm to 4.00pm | Prof Muckley
7 | 16 May 2024 | 9.30am to 12.00pm | Prof Muckley
8 | 16 May 2024 | 1.00pm to 4.00pm | Prof Muckley
9 | 17 May 2024 | 9.30am to 12.00pm | Prof Muckley
10 | 23 May 2024 | 12.00pm to 3.00pm | Mr Yusra Anas
11 | 24 May 2024 | 8.30am to 11.30am | Mr Yusra Anas
12 | 30 May 2024 | 8.30am to 11.30am | Mr Yusra Anas
13 | 31 May 2024 | 8.30am to 11.30am | Mr Yusra Anas
14 | 6 Jun 2024 | 8.30am to 11.30am | Mr Yusra Anas
15 | 7 Jun 2024 | 8.30am to 11.30am | Mr Yusra Anas
16 | 10 Jun 2024 | 9.00am to 11.00am | Mr Yusra Anas
17 | 11 Jun 2024* | 5.00pm to 6.00pm | Prof Muckley
*This class is conducted online.
Class schedule
Preparation Required in Advance of Sessions / Seminars
Students are expected to read each of the assigned references in advance of each lecture.
Student Engagement
During the sessions, students are expected to be able to discuss issues arising from the assigned topics and readings as scheduled above. Session participation is a vital element in the design of this module, and students’ learning is enhanced by it. Therefore, all students are expected to engage in class discussion and debate in order to facilitate the formation of their critical judgements.
To support your learning, a deck of slides will be available which (on certain occasions) may need to be upgraded / modified during or following the sessions depending on the issues raised.
Students are advised to consider using virtual platforms for student-to-student work or team assignment preparation.
Office Hours Arrangements
I will be available both before and after class on each day should you wish to meet me individually to discuss any aspect of this module. I am also available during class breaks, or you can email me. You are welcome to contact me after the last class as well.
PART 4: ASSESSMENT DETAILS
This module has two assessment components with specific weightings and marks awarded totalling 100. These can be seen below:
Assignments
Table 3: Assessment Details
Assessment Components | Weighting | Individual / Team
Continuous Assessment | 40% | Team
Final Exam | 60% | Individual
Note: the Assessment Breakdown for your module may have additional components or have a specific weighting for several components. You can confirm this with CDL at any time.
The purpose of each assessment is as follows:
· Continuous Assessment (team): aims to apply the knowledge acquired in the module to practical problems. This is a team project that is designed to assess your knowledge of key theoretical issues discussed during class and how this theory is put into practice by a corporation.
· Final individual written exam: aims to test student understanding of the theoretical and quantitative aspects of the module.
Students are expected to complete the assignment and ensure that it is submitted by the specified date. All submissions must be typed, well laid out and written in an academic style, with appropriate headings (introduction, main part and concluding comments) and sections. Please ensure that all submissions are entirely your own work. For UCD’s policy on plagiarism, click on the link below (please see Appendix 2 for further information on Plagiarism and the policy on the Late Submission of Coursework):
http://libguides.ucd.ie/academicintegrity
Module Assessment Components
In the following pages, further details of each assessment component are presented along with expectations in relation to prior preparation and completion.
(i) Continuous Assessment Team Assignment One (40% of the total grade)
Written assignment (strict maximum 2,500 words including tables, graphs and references): worth 40% of overall module grade. All assignments must be typed and submitted via the module Brightspace page. Late assignments will not be accepted. Please use the Harvard system of referencing. A link to the Harvard referencing style can be found here: https://libguides.ucd.ie/harvardstyle.
A team work report (no more than 300 words) summarising each member’s contribution should be included. All assignments submitted will be automatically processed through ‘Urkund’ (the plagiarism detection software in use throughout the university).
This assignment is designed to assess your knowledge of key issues discussed in class.
Instructions
You should work in a group of 4-5 students, and each group should nominate an individual to submit a single project report. Maximum word-count for the report is 2,000 words. The front page of the submitted assignment should detail the membership of the group (i.e., students’ names and numbers) as well as the assignment title and the assignment word count.
Please submit this project via the module Brightspace page no later than two days before Workshop 10. A project presentation with a duration of 10 minutes will take place in class, in Workshop 10. Details of the presentation are given in question (k) below. State ‘Group No.’ and ‘Telemarketing and machine learning’ in the subject of the mail, and cc: all group members. Attach any programme code (e.g., RapidMiner processes) and related materials to the email. Please do not collaborate across groups on this assignment.
Assessment and grades
You will be assessed on your ability to respond to questions raised below, i.e., to use machine learning informed alert model algorithms, critically evaluate the performance of these methods, and coherently report your findings. This project counts for 40% of your overall module grade.
Assignment Context
An important source of income for banks is the term deposit, i.e., deposits by customers at a fixed rate for a fixed time. This capital can be used to disburse loans at a higher interest rate. The bank, hence, uses marketing techniques to target customers to save via term deposits, for example email, advertisement, telephonic and digital marketing. Telephonic marketing (i.e., phone calls) remains an effective way to acquire term deposit customers, especially if enabled with machine learning. Banks can use data and machine learning informed alert models to identify customers who are more likely to save via a term deposit, and to inform a telephonic marketing campaign accordingly.
The dataset, in this assignment, is related to the direct telemarketing campaigns (phone calls) of a European banking institution. The classification goal is to predict if the customer will subscribe to a term deposit (Term Deposit = 1). Tapping into the repertoire of your Machine Learning modelling, evaluation and deployment knowledge, provide recommendations to the bank’s Retail Marketing department to achieve its goal.
Questions
(a) Perform, justify and describe an exploratory analysis of the dataset, e.g., of descriptive statistics, outliers, and correlations of interest. (2 marks)
(b) Fit two machine learning models (e.g., a logistic regression, random forest, decision tree, k-NN etc) on the dataset, with Term Deposit as the outcome variable of interest. (2 marks)
(c) To report the models’ performances, compute confusion matrices with Term Deposit=1 as the positive class. How do the True Positive and False Positive rates vary over the two models? (2 marks)
(d) Would you recommend for deployment one of these machine learning models, to a bank’s marketing campaign, based on these performance metrics? (4 marks)
(e) Partition the dataset randomly into training (70%) and test (30%) samples. Fit your models on the training dataset and report the performance of the models in the test set. (2 marks)
(f) Conduct a cross-validation test of your selected models’ out-of-sample prediction performance. (2 marks)
(g) Plot the ROC for your models on a graph and compute the respective AUCs (Area Under the Curve) performance metrics. (2 marks)
(h) Explain the information relayed in your answers to questions (e), (f) and (g). (4 marks)
(i) Comment on whether there is an important ethical aspect, which should be considered, in relation to the deployment of your preferred model. (2 marks)
(j) Report any other performance evaluation and discussion that you view as useful to the bank, in its aim to determine factors pertaining to the term deposit uptake rate. (3 marks)
(k) Each group should design and report a short deck of (maximum 6) slides containing suitable Figures to communicate the principal findings above to a board of directors. Your slide deck should be concise and clear and must be carefully designed to offer insights for busy decision-makers with diverse skillsets and levels of numerical ability. The provided answers for these questions should be tangible, concrete and memorable, while also portraying some of the richness of the underlying dataset. The core challenge is to surface insightful answers to these questions that maintain legibility for busy executives without eliding or distorting the complexity of the numerical data. (15 marks)
Total: 40 marks
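For orientation on question (c), the following is a minimal sketch of how a confusion matrix and the True Positive and False Positive rates are computed when Term Deposit = 1 is the positive class. It is written in plain Python purely for illustration; the module itself uses RapidMiner, and this is not the module's prescribed method. The function names (`confusion_counts`, `rates`) and the ten customer labels are hypothetical and made up for this example; they are not drawn from the assignment dataset.

```python
from collections import Counter


def confusion_counts(actual, predicted, positive=1):
    """Count TP, FP, TN and FN, treating `positive` as the positive class."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


def rates(counts):
    """Return (TPR, FPR): TPR = TP / (TP + FN), FPR = FP / (FP + TN)."""
    positives = counts["TP"] + counts["FN"]
    negatives = counts["FP"] + counts["TN"]
    tpr = counts["TP"] / positives if positives else 0.0
    fpr = counts["FP"] / negatives if negatives else 0.0
    return tpr, fpr


# Made-up labels for ten customers (1 = subscribed to a term deposit).
actual = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
# Made-up predictions from two hypothetical models.
model_a = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]
model_b = [1, 1, 1, 1, 0, 0, 1, 0, 1, 0]

for name, predicted in [("Model A", model_a), ("Model B", model_b)]:
    counts = confusion_counts(actual, predicted)
    tpr, fpr = rates(counts)
    print(name, dict(counts), f"TPR={tpr:.2f}", f"FPR={fpr:.2f}")
```

In this toy comparison the second model finds every subscriber but at the cost of more false alarms, which is the kind of trade-off that questions (c) and (d) ask you to weigh for a marketing campaign.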
Notes:
Consider utilising additional references outside of the course material. Ensure that you can justify all decisions made in the assignment, using quantitative and qualitative rationale where possible. To attempt the above questions, you may wish to use the RapidMiner ‘Help’ functionality: (1) review the RapidMiner tutorials and (2) work through certain examples in RapidMiner Academy’s ‘Get Started’ learning path (e.g., Importing Data in RapidMiner, Testing a Model, Validating a Model). Finally, if you import data to RapidMiner, keep the categorization of the data structure intact and verify that the meaning of labels on data remains intact. Good luck!
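As background to question (g), the sketch below shows one common way an ROC curve and its AUC can be derived from predicted scores: rank the customers by score, lower the decision threshold step by step, and integrate the resulting curve with the trapezoidal rule. It is plain Python for illustration only, not RapidMiner output and not the module's prescribed method. The function names (`roc_points`, `auc`), the labels and the scores are hypothetical and made up for this example.

```python
def roc_points(actual, scores, positive=1):
    """Return (FPR, TPR) points obtained by lowering the decision threshold."""
    total_pos = sum(1 for a in actual if a == positive)
    total_neg = len(actual) - total_pos
    if total_pos == 0 or total_neg == 0:
        raise ValueError("ROC needs at least one positive and one negative case")
    # Rank cases from the highest to the lowest predicted score.
    ranked = sorted(zip(scores, actual), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == positive:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last case sharing this score (handles ties).
        last_of_score = i == len(ranked) - 1 or ranked[i + 1][0] != score
        if last_of_score:
            points.append((fp / total_neg, tp / total_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Made-up labels (1 = subscribed) and made-up model scores for ten customers.
actual = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.2, 0.6, 0.8, 0.1, 0.3, 0.4, 0.5, 0.2, 0.1]

points = roc_points(actual, scores)
print(points)
print(f"AUC = {auc(points):.3f}")
```

An AUC of 0.5 corresponds to ranking customers no better than chance and 1.0 to a perfect ranking, so the metric summarises performance across all thresholds rather than at a single cut-off.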