HELLO, I AM NIRMAL KANAGASABAI







- 2018 - Graduate Excellence Fellowship Award
- 2018 - Graduate Research Enhancement and Travel (GREAT) Award
- 2015 - Infosys Performance Appreciation - 'Spot Award'
- 2015 - Govt. of Puducherry Prize
- 2014 - Best All-Rounder (Mr. Amrita Memorial Award)
- 2014 - TCS Best Student Award
- 2013 - Best Paper Award
- 2009 - Srinivasa Ramanujam Award
- 2008 - Child Scientist Award (Indian Science Congress)
Graduate Excellence Fellowship (GEF) Award
Awarded by the School of Computer Science to students who pursue Masters by Research (Merit-Based).
Graduate Research Enhancement and Travel (GREAT) Award
Awarded by the School of Computer Science to support research travel (conference presentation).
Infosys Performance Appreciation - 'Spot Award'
Conferred in recognition of quick turnaround time and high-quality defect fixes
Govt. of Puducherry Prize
Awarded a merit scholarship for securing first rank across all years in the university examination held during the period of study (B.Tech.) conducted by Pondicherry University
Best All-Rounder (Mr. Amrita Memorial Award)
Awarded for meritorious performance in both co-curricular and extra-curricular activities during the course of study at Pondicherry Engineering College (PEC), India
TCS Best Student Award
Adjudged by Tata Consultancy Services (TCS), India for academic excellence during the course of study at Pondicherry Engineering College (PEC), India
Best Paper Award
Conferred for a paper published at the National Conference on Bio-Inspired Science and Technology titled, ‘Forensic Fingerprint Enhancement through Morphological Image Processing’ for its complexity and diverse use of applications
Srinivasa Ramanujam Award
Awarded by the Srinivasa Ramanujam Mathematics Academy for securing a distinction
Child Scientist Award (Indian Science Congress)
Represented the Union Territory of Puducherry at the 15th National Children’s Science Congress and 95th Indian Science Congress for a project work on Bio-Diversity
- 2018 - Towards Online, Collaborative, Multi-View Modelling
- 2018 - Sentiment Analysis: It's Complicated!
  - The inclusion of a new class of sentiment ('complicated') to categorize texts in the short-term sentiment analysis framework
  - Improved quality of automated sentiment analysis systems
  - Built a new, publicly available Twitter Sentiment Analysis (TSA) dataset of over 7,000 tweets over various categories that were annotated with 5x coverage
- 2016 - An Efficient Algorithm for Fast Block Motion Estimation in HEVC
- 2014 - Simplified Block Matching Algorithm for Fast Motion Estimation in HEVC
- 2014 - Enhancements of Latent Fingerprints using Morphological Filters
- 2013 - Spam Termination and Establishing Private Search Logs
- 2013 - Forensic Fingerprint Enhancement through Morphological Image Processing
Towards Online, Collaborative, Multi-View Modelling
Presented at SAM 2018 and published in the proceedings of the conference.
The paper addresses the challenges encountered while implementing an online, collaborative, multi-view modelling application. Two new algorithms, the 'Highlight Propagation Algorithm' and the 'Delayed Deletion Algorithm', were proposed to complement the optimistic concurrency control approach implemented in contemporary real-time collaborative editors.
Sentiment Analysis: It's Complicated!
Presented at NAACL HLT 2018 and published in the proceedings of the conference.
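The paper addresses the inclusion of a new class of sentiment ('complicated'), the quality of automated sentiment analysis systems, and a new, publicly available Twitter Sentiment Analysis (TSA) dataset.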
An Efficient Algorithm for Fast Block Motion Estimation in High Efficiency Video Coding
Published as a Book Chapter in "Emerging Technologies in Intelligent Applications for Image and Video Processing"
The algorithm involves a direction-based approach on several distinctly identified combinations of search points. The project was split into Prediction and Refinement modules. Five different schemes (EME 1 to 5) were introduced based on the combinations of search points and were tested on a number of standard video sequences to predict the motion vector of the candidate block. The results showed that EME offered a faster search, reducing search time.
Simplified Block Matching Algorithm for Fast Motion Estimation in HEVC
Presented at the 4th IEEE International Conference on Recent Trends in Information Technology
Motion Estimation is an indispensable module in the design of a video encoder. It employs a Block Matching algorithm, which searches for a candidate block across the entire search window of the reference frame and takes up to 80% of the total video encoding time. To increase efficiency, several algorithms are employed to minimize the computational time involved in block matching. The paper presents an efficient approach that can be applied to the existing Block Matching search techniques in HEVC and that outperforms various Block Matching algorithms.
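To illustrate the exhaustive search that fast algorithms try to avoid, here is a minimal sketch of full-search block matching using the sum of absolute differences (SAD) as the cost. The function names, the SAD cost and the toy frames are assumptions made for illustration; this is not the approach proposed in the paper.

```python
def sad(cur, ref, bx, by, rx, ry, size):
    """Sum of absolute differences between a block in cur and one in ref."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(cur[by + dy][bx + dx] - ref[ry + dy][rx + dx])
    return total


def full_search(cur, ref, bx, by, size, search_range):
    """Return (motion_vector, cost) of the best match for the block at (bx, by)."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, bx, by, size)
    for my in range(-search_range, search_range + 1):
        for mx in range(-search_range, search_range + 1):
            rx, ry = bx + mx, by + my
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, size)
            if cost < best_cost:
                best_mv, best_cost = (mx, my), cost
    return best_mv, best_cost


# Toy 6x6 frames: a bright 2x2 patch moves one pixel right and one pixel down.
ref = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for y in range(2):
    for x in range(2):
        ref[1 + y][1 + x] = 200
        cur[2 + y][2 + x] = 200

mv, cost = full_search(cur, ref, bx=2, by=2, size=2, search_range=2)
```

Every candidate position in the window is evaluated here, which is why the cost grows with the square of the search range; fast schemes evaluate only a chosen subset of search points.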
Enhancements of Latent Fingerprints using Morphological Filters
Published in the International Journal of Engineering Research and Technology (ISSN 2278-0181)
Spam Termination and Establishing Private Search Logs
Published in the Second National Conference on Information Technology (NCIT 2013)
Forensic Fingerprint Enhancement through Morphological Image Processing
Presented in the National Conference on Bio-Inspired Systems and Technologies
- Software Engineering
- Distributed Systems
- Data Science
- Natural Language Processing (NLP)
- Applied Machine Learning
- COMP 512 - Distributed Systems
- COMP 614 - Distributed Data Management
- COMP 599 - Real-World Computing
- COMP 551 - Applied Machine Learning
- COMP 767 - Social Media Informatics
- COMP 533 - Model-Driven Software Development
- IT T35 - Data Structures
- IT T44 - Design and Analysis of Algorithms
- IT T63 - Database Management Systems
- IT E82 - Data Mining and Warehousing
- IT T73 - Component Technology
- IT T82 - Distributed Computing
- IT T81 - Service-Oriented Architecture
PROFESSIONAL AND ACADEMIC EXPERIENCE
McGill University
Research & Teaching Assistant
RA @ Software Engineering Lab
- Implementation of CollabCORE - Online, Collaborative Version of TouchCORE
- Persisting Models in Cloud Data Stores
- Estimating User Presence Awareness through a 'Highlight Propagation Algorithm'
- Handling cascading deletes and invalidations in a collaborative setting through a 'Delayed Deletion Algorithm'
TA - COMP 361 & COMP 533
- Reviewing the projects that the students undertake
- Holding regular meetings to evaluate the students' progress
- Grading their assignments and exams
- Holding office hours
Nuance Communications
Software Engg. Intern
I worked on Nuance's Mobility platform, specifically on Dragon-Drive Connected Services (DDCS), a cloud-based service for providing content to the connected car. The focus was on server-side development for the weather-related content service.
Infosys Limited
Software Engineer
At Infosys, I worked as a Java and Web Application Developer in the Financial Services Domain. I was part of the team that developed and managed the Client Due-Diligence (CDD), Client Lifecycle Management (CLM) and Know Your Customer (KYC) applications belonging to the Markets and Institutional Banking (MIB) wing of the Royal Bank of Scotland (RBS).
Roles and Responsibilities
- Client Interaction, requirements gathering and elicitation
- Creation of design prototypes
- Implementation using MVC Architectural pattern - Spring, Hibernate and Oracle
- Responsible for problem, incident and change management for production bugs
EDUCATION
CollabCORE
Master's Thesis - CollabCORE
Supervisors:
- Prof. Jörg Kienzle
- Prof. Omar Alam

- Online Collaborative Modelling
- Optimistic Concurrency Control
- Version Controlling
- Persisting the Models in Cloud data stores
Twitter Sentiment Analysis
Social Media Informatics - Twitter Sentiment Analysis
Data Collection:
- Twitter's Streaming API
- Keyword Filters: #StrangersThings, #Weather, #USAirlines
Tweets Pre-Processing:
- Conversion of Tweet texts into Lower-case
- Tokenizing the sentences (using NLTK Tokenizer)
- Removing Twitter Usernames & URLs (using RE)
- Using the Stop Words available in the English language dictionary
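The pre-processing steps above can be sketched as follows. This is a minimal, standard-library-only illustration: the regular-expression tokenizer stands in for the NLTK tokenizer used in the project, and the stop-word set, the pattern names and the sample tweet are assumptions, not the project's actual configuration.

```python
import re

# Tiny illustrative stop-word set; the project used an English stop-word list.
STOP_WORDS = {"a", "an", "the", "is", "in", "on", "at", "to", "and", "of", "for"}

USERNAME_RE = re.compile(r"@\w+")
URL_RE = re.compile(r"https?://\S+")
# Words, optionally with a leading '#' or an inner apostrophe (e.g. "don't").
TOKEN_RE = re.compile(r"#?\w+(?:'\w+)?")


def preprocess(tweet):
    """Lower-case a tweet, strip URLs and usernames, tokenize, drop stop words."""
    text = tweet.lower()
    # URLs and usernames are removed before tokenizing so they are not split up.
    text = URL_RE.sub(" ", text)
    text = USERNAME_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess(
    "@weatherguy The storm is HERE in Montreal! https://example.com/radar #Weather"
)
```

Hashtags are kept as single tokens in this sketch, since they were also the keywords used to collect the tweets.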
Tweets Labelling (Data Coding): CrowdFlower
Sentiment Analysis:
- SentiWordNet (Lexical Resource used for 'Opinion Mining')
- VADER - Valence Aware Dictionary and sEntiment Reasoner (Lexical and Rule-based Sentiment Analysis tool, commonly used to analyze sentiments expressed on Social Media platforms)
Differential Privacy
Machine Learning - Differential Privacy in Machine Learning
This project is a comparative study of Differential Privacy in Machine Learning. Two different Differential Privacy techniques for protecting sensitive data were studied and applied to multiple data-sets, including:
- MNIST (Modified National Institute of Standards and Technology)
- SVHN (Street View House Numbers)
- CIFAR-10 (Canadian Institute for Advanced Research)
Reddit Topic Classification
Machine Learning - Reddit Topic Classification
Supervised Machine Learning methods (Random Forest and Stochastic Gradient Descent classifiers) were used to classify short conversations extracted from Reddit into 8 classes based on conversation topic (Hockey, Movies, NBA, News, NFL, Politics, Soccer and WorldNews).
Data Cleaning and Feature Extraction:
- Label Encoding (Fit and Transform) & Decoding (Inverse Transform) using Scikit-Learn's preprocessing LabelEncoder
- Lemmatization (using WordNetLemmatizer) - NLTK Package (Done to increase the accuracy)
- Term Frequency-Inverse Document Frequency (TF-IDF) approach - feature weighting
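The label encoding and TF-IDF weighting steps can be sketched in plain Python as below. This is an assumption-laden illustration rather than the project's code: it uses the textbook TF-IDF formula (term frequency times log of inverse document frequency), not Scikit-Learn's smoothed variant, and the helper names and toy documents are hypothetical.

```python
import math
from collections import Counter


def fit_labels(labels):
    """Map each distinct label to an integer id (sorted for determinism)."""
    classes = sorted(set(labels))
    return classes, {label: i for i, label in enumerate(classes)}


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = [["goal", "puck", "ice"], ["goal", "penalty", "kick"], ["movie", "actor", "film"]]
labels = ["hockey", "soccer", "movies"]

classes, to_id = fit_labels(labels)
encoded = [to_id[label] for label in labels]   # transform
decoded = [classes[i] for i in encoded]        # inverse transform
weights = tf_idf(docs)
```

A term such as "goal", which appears in two of the three documents, receives a lower weight than "puck", which appears in only one; that is the effect the feature weighting relies on.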
Scraping Static and Dynamic Webpages
Social Media Informatics - Scraping Static and Dynamic Webpages
Two different types of web-pages (Static and Dynamic) were chosen and different strategies were used to scrape content from them. Also, a small analysis was done on the scraped data and the results are presented.
- Static Website: Who's Dated Who
- Dynamic Website: Empeopled and Boardest
- Who's Dated Who, despite being a static website, used Infinite Scrolling to list the URLs of all its web-pages, and the other two dynamic websites also had Infinite Scrolling (where new posts are loaded dynamically when the user scrolls down to the bottom of the page)
- BeautifulSoup, a Python library for screen scraping, was used for the static web-pages
- Selenium WebDriver was used to automatically control the browser and scroll until the last element loaded
Twitter's Streaming & REST APIs
Social Media Informatics - Twitter's Streaming & REST APIs
A comparative study on Twitter's REST and Streaming APIs and observing the Bias in Twitter Data Collection
- User's Popularity: "No. of Followers he/she possesses"
- Keyword Filter: '#Mexico', '#Earthquake'
- No. of Tweets collected using Streaming API: 50,375
- No. of Tweets from same set of users: 16,345
- No. of Unique User IDs: 34,025
- Sampling: 2 * 10000 User_IDs
Network Analysis
Social Media Informatics - Network Analysis
Analyzing relationships in the Who-Dated-Who database, evaluating the abundance of overlapping dating partners and measuring the frequency of overlapping daters.
- Data Collection: Infinite Scrolling and Static Scraping
- Network Analysis: NetworkX & GEPHI
- NetworkX, a Python package, was used for the creation, manipulation and study of the structure and dynamics of the network
- GEPHI, an open-source graph visualization and exploration tool, was used
Component-based Distributed Information System
Distributed Systems - Component-based Distributed Travel Reservation System
A component-based distributed travel reservation system was developed using both TCP and RMI. Additional functionality added to the RMI-flavored system includes:
- Distribution and Scalability
- Transactions and Distributed Concurrency Control
- Reliability using 2-Phase Commit Protocol
- Performance Analysis
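The 2-Phase Commit Protocol mentioned above can be sketched as follows. This is a simplified, in-process illustration under stated assumptions: the `Participant` class, the resource manager names and the `can_commit` flag are hypothetical, and the sketch leaves out the logging, timeouts and crash recovery that a real deployment needs.

```python
class Participant:
    """A resource manager that votes on, then commits or aborts, a transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Phase 1: vote YES only if the local work can be made durable.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    """Return True if the transaction committed on every participant."""
    # Phase 1 (voting): ask every participant to prepare.
    votes = [p.prepare() for p in participants]
    # Phase 2 (decision): commit only on a unanimous YES, otherwise abort all.
    if all(votes):
        for p in participants:
            p.commit()
        return True
    for p in participants:
        p.abort()
    return False


# Hypothetical resource managers for a travel reservation.
happy = [Participant("flights"), Participant("cars"), Participant("rooms")]
committed = two_phase_commit(happy)

failing = [Participant("flights"), Participant("cars", can_commit=False)]
aborted = not two_phase_commit(failing)
```

The key property shown is atomicity: a single NO vote aborts the transaction on every participant, including those that had already voted YES.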
Linear Regression - Predicting Completion Times
Machine Learning - Predicting Participants' Completion Times in 2017 Miami Marathon using Linear Regression
The dataset was collected from Athlinks and consisted of detailed profiles of participants who had enrolled for the Miami Marathon in the past, their finishing times, etc.
- Feature Selection: Age (Continuous), Gender (Binary), AvgTime[Year] (Continuous) - Completion time in seconds for the entire marathon, AvgTimeForAllMarathons (Continuous) - Completion time in seconds across all years with data for the given participant, TotalNoOfRaces (Continuous) - Equivalent to the number of races participated in prior to this Marathon
- Addressing Missing Data: (Done through Predictive Mean Matching) - By substituting the missing data with the person's mean (Average) time across all years
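The missing-data substitution and the regression step can be sketched as below. This is a deliberately reduced illustration: it fits a single feature with ordinary least squares, whereas the project used several features, and the function names and all numbers are made up for the example.

```python
def impute_with_person_mean(times_by_year):
    """Replace None entries with the mean of the participant's known times."""
    known = [t for t in times_by_year.values() if t is not None]
    if not known:
        return dict(times_by_year)  # nothing to base an estimate on
    mean_time = sum(known) / len(known)
    return {year: (mean_time if t is None else t) for year, t in times_by_year.items()}


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical participant with no recorded time for 2015 (times in seconds).
runner = {2014: 15000.0, 2015: None, 2016: 14000.0}
filled = impute_with_person_mean(runner)

# Hypothetical data: average past time (x) against 2017 completion time (y).
slope, intercept = fit_line([14000.0, 15000.0, 16000.0], [14100.0, 15100.0, 16100.0])
predicted = slope * 14500.0 + intercept
```

With these toy numbers the missing 2015 entry becomes the participant's mean of the other two years, and the fitted line simply adds a constant offset to the average past time.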
High-Efficiency Video Coding
Bachelor's Thesis - An efficient algorithm for Fast-Block Motion Estimation in High-Efficiency Video Coding
The algorithm involves a direction-based approach on several distinctly identified combinations of search points. The project was split into Prediction and Refinement modules. Five different schemes (EME 1 to 5) were introduced based on the combinations of search points and were tested on a number of standard video sequences to predict the motion vector of the candidate block. The results showed that EME offered a faster search, reducing search time.
- Presented at the 4th IEEE International Conference on Recent Trends in Information Technology
- Published as a Book Chapter in "Emerging Technologies in Intelligent Applications for Image and Video Processing"
Miscellaneous Projects
Websites & Other small projects - Miscellaneous Projects
Contains a list of websites and projects that have been carried out (during both Bachelor's and Master's)
- Dynamic and Responsive Websites: Designed eight websites using HTML5, CSS3 and JavaScript
- Modelling Re-usable concerns for singleton approaches
- Industry Project: NIMBUS Drive (File De-Duplication in Cloud Storage) - Unisys Cloud 20/20 V5
- Industry Project: Employee Record Management System (DIET, Puducherry, India)
Morphological Image Processing
Image Processing - Enhancement of latent fingerprints through Morphological Image Processing
Using Morphological Image Processing filters like 'Erosion', 'Dilation', 'Opening' and 'Closing' to enhance latent fingerprints.
- Presented and won Best Paper award in the National Conference on Bio-Inspired Systems and Technologies
- Published in the "International Journal of Engineering Research and Technology"
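The four filters named above can be sketched on a binary image as follows. This is a minimal pure-Python illustration assuming a square structuring element and treating pixels outside the image as background; the function names and the toy image are assumptions, and it is not the enhancement pipeline from the paper.

```python
def erode(image, k=3):
    """Binary erosion with a k x k square structuring element."""
    h, w, r = len(image), len(image[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel survives only if every neighbour under the element is set;
            # positions outside the image count as background.
            out[y][x] = int(all(
                0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            ))
    return out


def dilate(image, k=3):
    """Binary dilation with a k x k square structuring element."""
    h, w, r = len(image), len(image[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # A pixel is set if any neighbour under the element is set.
            out[y][x] = int(any(
                0 <= y + dy < h and 0 <= x + dx < w and image[y + dy][x + dx]
                for dy in range(-r, r + 1) for dx in range(-r, r + 1)
            ))
    return out


def opening(image, k=3):
    """Erosion followed by dilation: removes small specks."""
    return dilate(erode(image, k), k)


def closing(image, k=3):
    """Dilation followed by erosion: fills small holes."""
    return erode(dilate(image, k), k)


# A 3x3 ridge fragment plus an isolated noise pixel in the corner.
noisy = [
    [1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
opened = opening(noisy)

# The same fragment with a one-pixel hole in the middle.
holed = [row[:] for row in opened]
holed[2][2] = 0
closed = closing(holed)
```

Opening removes the stray corner pixel while keeping the 3x3 fragment, and closing fills the one-pixel hole, which is the kind of clean-up these filters offer for ridge patterns.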
Dragon Connected Services (DCS)
Nuance Communications, Canada - Dragon Connected Services (DCS)
During my Summer Internship (2017), I worked on Dragon Connected Services (DCS), a cloud-based service for providing content to the connected car. I worked on server-side development for weather-related content services.
Financial Services: KYC & Client Due-Diligence
Infosys Limited, India - Financial Services
Client: Royal Bank of Scotland
Was a part of the Know Your Customer (KYC) and Client Due-Diligence team
Roles and Responsibilities:
- Client Interaction, requirements gathering and elicitation
- Creation of design prototypes
- Implementation using MVC Architectural pattern
- Responsible for problem, incident and change management for production bugs