COMP9727: Recommender Systems
Project: Project Pitch and Design
Due Date (Design): Week 5, Sunday, June 30, 5:00 p.m.
Due Date (Pitch): Week 5, Friday, June 28, 5:00 p.m.
Value: 20% (Design)
The project design is Part 1 of the major assessment item for this course, a team-based project in which students work in teams of 3–4 (possibly from different tutorial groups) to build and evaluate a recommender system in a domain of their choice. The project design is an individual report consisting of a project proposal, which is also presented briefly to the class as a “pitch” to assist in the formation of teams. This means that not everyone’s individual proposal will be implemented, but the idea is to form teams of students with similar proposals that can be combined. There are no marks assigned to the pitch.
The project design should address all aspects of building and evaluating a recommender system, ideally with time estimates and milestones of anticipated progress that take into account the time available. Note that the design is only a proposal: while some preliminary exploratory data analysis is needed, no experimentation is required at this stage. Part 2 of the project is to build and evaluate the recommender system as a team, and Part 3 (an individual report drawing on the team’s work) involves analysis, evaluation and interpretation (possibly a small user study). It is desirable that teams have a mix of skills and experience, for example machine learning, neural networks/deep learning and/or human-computer interaction.
Note that standard UNSW late penalties apply.
Project Design
The project design should be a concrete proposal addressing the following.
1. Scope. (5 marks) What is the domain of the recommender system, and who are its intended users? How many items will be presented to the users at one time, and through what sort of user interface (e.g. web or mobile)? The design should include some simple mockups of a possible user interface and some description of the user interaction (including how user feedback is to be obtained and used). The user interface does not need to be built for the project implementation, but you should be able to simulate user interactions in a user study, similar to the assignment. If the system is to be used in a dynamic scenario, explain how the recommendation model will be updated and how cold start problems will be addressed. Briefly consider the business model and how the recommendations might generate revenue. Important: You cannot choose content-based movie recommendation as a topic, and you cannot choose any topic or method that you are using, or have used, in a major project in any other course.
2. Dataset(s). (5 marks) The aim is to use a realistic dataset for this recommendation problem. However, you need a dataset of sufficiently good quality and quantity in order to build a good recommendation model (compare Tutorial 3, where the data is of neither high quality nor high quantity). You can choose a dataset from sites such as Kaggle or Hugging Face (or elsewhere), but be aware of the limitations of these datasets: (i) the dataset may be limited, and not reflect all the actual data from the original site (the dataset may represent only a subset of users and/or interactions), (ii) the dataset may be unrealistic (already sanitized) and so may miss the full complexity of the real problem, (iii) these sites are set up to host “competitions” that usually involve developing models for predefined tasks with predefined metrics or training/test set splits, etc., and thus cover only one aspect of recommendation, and encourage models that overfit the given data for that given scenario. Especially note that performing well on a prediction task such as those hosted on Kaggle is not sufficient or even necessary for a good recommender system. Note also that you cannot propose or use solution code from any such sites for use in your project, though you can use libraries that implement general machine learning methods, such as scikit-learn, surprise, keras, etc.
3. Method(s). (5 marks) As in the assignment, the aim is to propose a number of methods that you think are appropriate for the recommendation problem and user base as outlined above, that will also work with the chosen dataset(s), and justify why you think these methods will be suitable. Consider the different types of recommender system: content-based, collaborative filtering, knowledge-based recommender systems and hybrid recommender systems (and if relevant, sequential and context-aware recommendation). You should propose a basic approach with one or more variants, as appropriate, perhaps three related methods or several ways of combining methods to produce a hybrid recommender system, but bear in mind that your plan should allow you enough time to evaluate the methods and the recommender system. It is not necessary at this stage to have done any experimentation with the methods, and it is possible for you later to try different methods in the actual project.
4. Evaluation. (5 marks) Discuss suitable metrics for assessing the performance of both the recommendation model and the resulting recommender system, taking into account how and when recommendations will be presented to users. There should be a variety of metrics for both the recommendation model (metrics that focus on historical data) and the recommender system (metrics that focus on user feedback and/or interactions). It may be appropriate to use top-N metrics (for some N) and/or per-user metrics, i.e. metrics averaged over a set of users. If there are multiple metrics, say which are the most important or discuss the tradeoffs between the different metrics and how you will choose the “best” model/system using the chosen metrics. Also consider the computational requirements of the recommender system and especially any requirement for recommendations to be computed in near-real time, or for the model to be dynamically updated/retrained. Also outline the design of a user study (involving more than the assignment) for evaluating the recommender system with real users, through a simulated user interface. Explain what user feedback you will solicit about the recommender system (for example, through a questionnaire). Note that this should be a very informal study of a kind not requiring ethics approval.
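As an illustration of the top-N and per-user metrics mentioned under Evaluation, the following is a minimal sketch only, not a required or recommended implementation. It computes precision@N and recall@N for each user and averages them over users. The function names, the toy user and item IDs, and the choice of N = 2 are all hypothetical and chosen purely for illustration.

```python
from statistics import mean


def precision_recall_at_n(recommended, relevant, n):
    """Return (precision@n, recall@n) for a single user.

    recommended: ranked list of item IDs produced by the model
    relevant:    set of held-out item IDs the user actually interacted with
    """
    top_n = recommended[:n]
    hits = sum(1 for item in top_n if item in relevant)
    precision = hits / n
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def mean_metrics_at_n(recs_by_user, held_out_by_user, n):
    """Average precision@n and recall@n over users with held-out items."""
    scores = [
        precision_recall_at_n(recs_by_user.get(user, []), relevant, n)
        for user, relevant in held_out_by_user.items()
        if relevant  # skip users with nothing held out
    ]
    if not scores:
        return 0.0, 0.0
    return mean(p for p, _ in scores), mean(r for _, r in scores)


# Hypothetical toy data: ranked recommendations and held-out items per user.
recs_by_user = {"u1": ["a", "b", "c", "d"], "u2": ["c", "e", "a", "f"]}
held_out_by_user = {"u1": {"a", "d"}, "u2": {"e"}}

mean_precision, mean_recall = mean_metrics_at_n(recs_by_user, held_out_by_user, n=2)
print(f"precision@2 = {mean_precision:.2f}, recall@2 = {mean_recall:.2f}")
```

Metrics of this kind are computed from historical data, so they assess the recommendation model only; assessing the recommender system still requires user feedback of the sort gathered in the user study.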
Pitch
In the lecture sessions and some of the tutorial sessions in Week 5, students will present to the class a 1-minute “pitch” of up to 3 slides outlining the basic ideas of the recommender system as proposed in the report. The lecture theatre sessions will be hybrid so students can attend on Zoom; however, those present in the lecture theatre will be given the opportunity to present first. Important: For those pitches held in the lecture theatre during scheduled class time, the presentations will be recorded. This is standard practice for any assessment item.
Submission and Assessment
• Please include your name and zid at the start of the documents.
• Submit your design document and pitch as .pdf files using the following commands:
give cs9727 design design.pdf
give cs9727 pitch pitch.pdf
You can check that your submissions have been received using the commands:
9727 classrun -check design
9727 classrun -check pitch
• Assessment criteria include completeness of proposal and presentation quality.
Plagiarism
Remember that ALL work submitted for this assignment must be your own work and no sharing or copying is allowed. You may discuss the assignment with other students but must not collaborate on developing your project design or pitch. You may use datasets from the Internet only with suitable attribution of the source. You may not use ChatGPT or any similar software to generate any part of your design or pitch. Do not use public repositories on sites such as GitHub or file sharing sites such as Google Drive to save any part of your work: make sure your code repository or cloud storage is private and do not share any links. This also applies after you have finished the course, as we do not want next year’s students accessing your solution, and plagiarism penalties can still apply after the course has finished.
All submitted assignments will be run through plagiarism detection software to detect similarities to other submissions, including from past years. You should carefully read the UNSW policy on academic integrity and plagiarism (linked from the course web page), noting, in particular, that collusion (working together on an assignment, or sharing parts of assignment solutions) is a form of plagiarism.
Finally, do not use any contract cheating “academies” or online “tutoring” services. This counts as serious misconduct with heavy penalties up to automatic failure of the course with 0 marks, and expulsion from the university for repeat offenders.