COMP-614: Distributed Data Management
Project Guidelines
Outline
The project work is an essential part of this course. Its aim is to get you started with independent research and work in the area of distributed systems and distributed information systems. During the first part of the project you will do a literature search on a specific topic and write a survey report about your readings (around 15 to 25 pages, 12pt, 1.5 spacing). You will present the survey to the class in the form of a lecture with discussion. During the second part of the project you should work out a new problem and its solution within this research area. There are three alternatives:
- A research proposal for a relatively complex problem. I do not expect a complete solution but a description of the approach you would take to obtain the solution. Your work should demonstrate an understanding of the research area and enough insight into the problem that, given enough time (2 to 5 more months), you could carry it to its logical conclusion and complete the research. The deliverable is a report.
- A complete solution to a small, more specialized problem. The deliverable is a report.
- An implementation and evaluation of algorithms presented in some research paper. The deliverable is a report with an overview of the implementation and the evaluation results.
The resulting report should also be around 15 to 25 pages.
Schedule
- Survey lectures start at the beginning of February
- 17-February: survey report due
- 17-March: 1-3 page plan of what you want to do in your research report
- 10-April: research report due
For the talk schedule, see mycourses.
On Pursuing Research
Literature Search
Below you will find several possible research topics from which you can pick one. For each of the topics I will later provide at least three research papers that can serve as starting points for your literature study. If you find other papers on the topic that you think are better, feel free to switch. For your talk, and as the basis for your own study, feel free to look at more papers or even to choose different ones.
You should look at the proceedings of the following conferences: for database-oriented topics, ICDE, VLDB, SIGMOD, ...; for more distributed-systems topics, Middleware, ICDCS, DSN, SOSP. You do not need to look at journals, but if you see an interesting-sounding journal paper, the following journals are good: the IEEE and ACM Transactions on ... journals (Transactions on Database Systems, Transactions on Computer Systems, etc.), Information Systems, IEEE TKDE, and the VLDB Journal.
A good starting point for your own literature search is Google Scholar or Michael Ley's Computer Science Bibliography. The former is maintained automatically and hence contains a lot of duplicate information; the latter is more structured and consistent. Citeseer is also a good starting point. You will find many papers online. You will also have access to many of the research papers through the McGill
library system. A more recent resource is
McGill has subscribed to the digital libraries of ACM, IEEE, and Springer. You can go directly to their web pages and will have access if you connect from a computer within McGill.
Research Topics
- Data Partitioning (13-Feb)
  - C. Curino, Y. Zhang, E. Jones, and S. Madden. Schism: a workload-driven approach to database replication and partitioning. In VLDB, 2010.
  - A. Pavlo, C. Curino, and S. Zdonik. Skew-aware automatic database partitioning in shared-nothing, parallel OLTP systems. In SIGMOD, 2012.
  - R. Taft, E. Mansour, M. Serafini, J. Duggan, A. Elmore, A. Aboulnaga, A. Pavlo, and M. Stonebraker. E-Store: Fine-grained elastic partitioning for distributed transaction processing systems. In VLDB, 2014.
- Logging Architectures (15-Feb)
  - S. Ghanbari, A. B. Hashemi, and C. Amza. Stage-aware anomaly detection through tracking log points. In ACM/IFIP/USENIX International Middleware Conference, pages 253–264, 2014.
  - G. Lee, J. J. Lin, C. Liu, A. Lorek, and D. V. Ryaboy. The unified logging infrastructure for data analytics at Twitter. PVLDB, 5(12):1771–1780, 2012.
  - A. Rabkin and R. H. Katz. Chukwa: A system for reliable large-scale log collection. In Uncovering the Secrets of System Administration: LISA Conf., 2010.
- Scalable Publish/Subscribe Architectures (20-Feb)
  - Barazzutti, R.; Heinze, T.; Martin, A.; Onica, E.; Felber, P.; Fetzer, C.; Jerzak, Z.; Pasin, M. & Riviere, E. Elastic Scaling of a High-Throughput Content-Based Publish/Subscribe Engine. ICDCS, 2014, 567-576.
  - Zhao, Y.; Kim, K. & Venkatasubramanian, N. DYNATOPS: A Dynamic Topic-based Publish/Subscribe Architecture. DEBS, 2013, 75-86.
  - Setty, V.; Van Steen, M.; Vitenberg, R. & Voulgaris, S. PolderCast: Fast, Robust, and Scalable Architecture for P2P Topic-based Pub/Sub. International Middleware Conference (Middleware), 2012, 271-291.
- Database Systems and Data Mining (22-Feb) (selection of three still to be done)
- Query Processing on Modern Hardware (15-Mar)
- Smart Cities (20-Mar) (selection of three still to be done)
  - Valérie Issarny, Vivien Mallet, Kinh Nguyen, Pierre-Guillaume Raverdy, Fadwa Rebhi, Raphael Ventura: Dos and Don'ts in Mobile Phone Sensing Middleware: Learning from a Large-Scale Experiment. ACM Middleware 2016.
  - Joy Dutta, Chandreyee Chowdhury, Sarbani Roy, Asif Iqbal Middya, Firoj Gazi: Towards Smart City: Sensing Air Quality in City based on Opportunistic Crowd-sensing. ICDCN 2017: 42
  - Marco Bonola, Lorenzo Bracciale, Pierpaolo Loreti, Raul Amici, Antonello Rabuffi, Giuseppe Bianchi: Opportunistic communication in smart city: Experimental insight with small-scale taxi fleets as data carriers. Ad Hoc Networks 43: 43-55 (2016)
  - Jean-Gabriel Krieg, Gentian Jakllari, Hadrien Toma, André-Luc Beylot: Unlocking the smartphone's senses for smart city parking. ICC 2016: 1-7
  - Yi Gao, Wei Dong, Kai Guo, Xue Liu, Yuan Chen, Xiaojin Liu, Jiajun Bu, Chun Chen: Mosaic: A low-cost mobile sensing system for urban air quality monitoring. INFOCOM 2016: 1-9
  - Xi Chen, Linghe Kong, Xue Liu, Lei Rao, Fan Bai, Qiao Xiang: How cars talk louder, clearer and fairer: Optimizing the communication performance of connected vehicles via online synchronous control. INFOCOM 2016: 1-9
- Data Consistency (22-Mar)
  - Marcos K. Aguilera, Douglas B. Terry: The Many Faces of Consistency. 3-13, Data Engineering Bulletin, 39(1), 2016
  - Divy Agrawal, Amr El Abbadi, Kenneth Salem: A Taxonomy of Partitioned Replicated Cloud-based Database Systems. 4-9, Data Engineering Bulletin, 38(1), 2015