### Key facts

- Instructor: William L. Hamilton
- Email: wlh@cs.mcgill.ca
- Term: Winter 2021
- Prerequisites: COMP 251, MATH 222, MATH 223, and MATH 323
- Restriction: Not open to students who are taking or have taken COMP 551

### Description

The course will introduce the core concepts of machine learning, with an emphasis on the computational, statistical, and mathematical foundations of the field. We will study models for both supervised learning and unsupervised learning, introducing these models alongside foundational machine learning concepts, such as maximum likelihood estimation, regularization, information theory, and gradient-based optimization. The course concludes with a brief introduction to neural networks and deep learning.

### Should you take COMP 451 or COMP 551?

COMP 451 is intended as a first course in machine learning for undergraduate students in a computer science program who plan to pursue further research or academic study on advanced machine learning topics. It emphasizes the statistical, computational, and mathematical foundations of machine learning, with a focus on formal definitions and mathematical proofs. It provides the theoretical foundations that students need to succeed in 500-level and graduate-level machine learning courses. In contrast, COMP 551 is intended for undergraduate and graduate students in computer science, as well as students in other programs with sufficient computer programming and mathematical background, who want an introduction to the practical side of machine learning. COMP 551 emphasizes good methods and practices for the deployment of real systems (e.g., software design principles, validation methods, and handling of large datasets).

In short, a student interested in taking more advanced machine learning courses (e.g., COMP 652, COMP 579, or special topics courses) would benefit substantially from taking COMP 451, while COMP 551 provides a practical and rigorous standalone introduction to machine learning.

### List of topics (subject to minor changes)

- Supervised learning
- Maximum likelihood
- Naive Bayes
- Logistic regression
- Gradient descent
- Linear regression
- Regularization
- Information theory
- Decision trees
- Representation learning / PCA
- Clustering
- Latent variable models
- Neural networks
- Boltzmann machines