Names and Numbers:
Overview:
Access to computing resources anywhere and at any time has become a reality.
Companies use the cloud not only as Infrastructure as a Service (IaaS) but
also for the Platform as a Service (PaaS) offerings of the cloud providers.
We upload our documents to the cloud, play games online, make phone
calls via the Internet, deliver our classes via video
conferencing, and run large-scale data analytics on public and private
clouds. Business applications rely on distributed component-based
architectures, often connected through web-service interfaces. There are also
plenty of examples of homogeneous distributed system (DS) implementations
(e.g., distributed file systems) that facilitate scalability beyond
conventional hardware limitations.
Building such complex distributed infrastructures is challenging and requires
considering many different aspects, such as communication, data management,
asynchronous behavior of the different components in the system,
architectural considerations, and failures.
Over the last 20 years, many fundamental building blocks have been developed
that form the backbone of current distributed infrastructure.
These building blocks are the main focus of this course.
Topics that will likely be covered:
- Overview
- Communication Paradigms
  - basic network protocols
  - synchronous and asynchronous communication
  - client / server communication basics
  - RMI: remote method invocation
  - group communication
  - ...
- Synchronization
  - physical clocks
  - logical clocks
- Distributed Data Management
  - data partitioning strategies
  - distributed computation (map-reduce vs. distributed SQL)
  - distributed transactions and commit protocols
- Data replication
  - scalability vs. fault-tolerance vs. performance
  - consistency levels
  - Paxos
- Performance (how to measure it and how to understand the results)
- Possible Case Studies (the exact ones will be determined later)
  - Google File System
  - ZooKeeper
  - Blockchain
  - Distributed Machine Learning
Flavor of the course:
You will work both theoretically and practically. For each topic covered
in the course, we will first look at the foundations, develop some
basic algorithms, discuss trade-offs of possible solutions, and then
look at how the issues are handled in real systems. There will be
significant programming tasks where you will implement some of the
algorithms and explore a distributed system/application in practice.
You are expected to actively participate in class discussions.
Learning Outcomes:
By the end of the course, you will be able to
- recognize the fundamental challenges encountered in distributed environments and the principal building blocks to address them
- compare the possible solutions to these problems and analyze their trade-offs
- develop algorithms for concrete problem scenarios
- apply the approaches and mechanisms introduced to new problem scenarios
- implement distributed algorithms and evaluate their performance
Delivery format:
This is an in-person class. Active participation in class is expected and helps create an engaging learning environment.
Prerequisites:
COMP-251 and COMP-310 (or equivalent). Knowledge of computer network
principles will be very useful, but a networks course is not a prerequisite. Otherwise, given that this is a 500-level course, it is expected that you have already taken a fair number of CS courses.
Marking Scheme:
The exact marking scheme will be provided on the first day of class and will be available via myCourses.
It will be a combination of written assignments, programming tasks, quizzes, and two midterms.
Both programming tasks and written assignments will be done in groups.
Recommended Textbook:
Distributed Systems: Concepts and Design by G. Coulouris,
J. Dollimore, T. Kindberg and G. Blair. Addison-Wesley, 5th ed.
This is a good textbook, but it is starting to become a bit outdated. The fundamentals are well covered, but the advanced topics somewhat less so. Still, if you have not yet taken many systems courses or done systems-related work, this textbook is a very good read.
A note on academic integrity
McGill University values academic integrity. Therefore, all students must understand
the meaning and consequences of cheating, plagiarism and other academic offences
under the Code of Student Conduct and Disciplinary Procedures. (Approved by Senate on 29 January 2003)
(See McGill’s guide to academic honesty for more information.)
L'université McGill attache une haute importance à l’honnêteté académique. Il incombe par conséquent à tous les étudiants de comprendre ce que l'on entend par tricherie, plagiat et autres infractions académiques, ainsi que les conséquences que peuvent avoir de telles actions, selon le
Code de conduite de l'étudiant et procédures disciplinaires. (Énoncé approuvé par le Sénat le 29 janvier 2003) (pour de plus amples renseignements, veuillez consulter
le guide pour l’honnêteté académique de McGill.)
French/English
In accord with McGill University's Charter of Students' Rights, students in this
course have the right to submit in English or in French any written work that is to
be graded. This does not apply to courses in which acquiring proficiency in a language is one of the objectives. (Approved by Senate on 21 January 2009)
Conformément à la Charte des droits de l’étudiant de l’Université McGill, chaque étudiant a le droit de soumettre en français ou en anglais tout travail écrit devant être noté, sauf dans le cas des cours dont l’un des objets est la maîtrise d’une langue. (Énoncé approuvé par le Sénat le 21 janvier 2009)