Using Deep Learning to Digitize Humans from Webcams

Paul Kruszewski - Wrnch.com

Dec. 8, 2017, 2:30 p.m. - Dec. 8, 2017, 3:30 p.m.

ENGMC 103

Hosted by: Clark Verbrugge


In order for machines (computers, robots, drones, intelligent buildings, etc.) to serve humans better, they need to interact with humans in a more natural way. Verbal communication offers one interface, but the majority of human communication is actually non-verbal and mainly visual. Our goal is to teach machines to read human body language by digitizing humans into cyberspace using ordinary RGB cameras. We present the "wrnch engine," an AI system composed of a series of interconnected CNNs and RNNs that takes in RGB video and extracts human-related information, including 3D pose, 3D shape, and recognized gestures and activities. We will walk through our learning pipeline, which features a unique synthetic data generation system based on a video game engine. Real-time demos will be presented.
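To make the CNN-plus-RNN architecture concrete, here is a minimal sketch of that kind of pipeline. It is not the wrnch engine itself, whose layers and training details are not described here: a small hypothetical CNN encodes each RGB frame into a feature vector and a per-frame 3D pose estimate, and a GRU aggregates the frame features over time into an activity label. All module names, layer sizes, joint counts, and class counts are illustrative assumptions; PyTorch is assumed.

import torch
import torch.nn as nn

NUM_JOINTS = 17        # assumed skeleton size (e.g. COCO-style keypoints)
NUM_ACTIVITIES = 10    # assumed number of activity classes

class FrameEncoder(nn.Module):
    """CNN that maps one RGB frame to a feature vector and a 3D pose estimate."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),                  # global average pooling
        )
        self.to_feat = nn.Linear(64, feat_dim)
        self.to_pose = nn.Linear(feat_dim, NUM_JOINTS * 3)  # x, y, z per joint

    def forward(self, frame):
        h = self.backbone(frame).flatten(1)           # (batch, 64)
        feat = torch.relu(self.to_feat(h))            # per-frame feature
        pose = self.to_pose(feat).view(-1, NUM_JOINTS, 3)
        return feat, pose

class ActivityHead(nn.Module):
    """RNN that aggregates per-frame features into an activity label."""
    def __init__(self, feat_dim=128, hidden=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.classify = nn.Linear(hidden, NUM_ACTIVITIES)

    def forward(self, feats):                  # feats: (batch, time, feat_dim)
        _, last_hidden = self.rnn(feats)       # last_hidden: (1, batch, hidden)
        return self.classify(last_hidden[-1])  # logits: (batch, NUM_ACTIVITIES)

if __name__ == "__main__":
    encoder, head = FrameEncoder(), ActivityHead()
    video = torch.randn(2, 8, 3, 128, 128)     # 2 clips, 8 RGB frames each
    b, t = video.shape[:2]
    feats, poses = encoder(video.flatten(0, 1))   # run the CNN on every frame
    poses = poses.view(b, t, NUM_JOINTS, 3)       # per-frame 3D poses
    logits = head(feats.view(b, t, -1))           # RNN over each clip
    print(poses.shape, logits.shape)              # (2, 8, 17, 3) (2, 10)

In a production system the frame encoder would be a far deeper backbone trained on large (partly synthetic) datasets, but the data flow shown here, per-frame CNN features feeding a temporal RNN, matches the shape of the pipeline the abstract describes.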

Paul is a serial AI entrepreneur who has been hustling and hacking since he was 12, when he leveraged a $250 livestock sale into a $1000 TRS-80 Color Computer to program video games and calculate pi to as many digits as possible. He went on to obtain a Ph.D. in computer science from McGill University, during which time the book "The Algorithmic Beauty of Plants" hooked him on computer graphics and AI. He founded "AI.implant," a company applying AI (flocking behaviours and path finding) to create and simulate huge crowds of interacting autonomous characters, and "GRIP," a company using AI (behaviour trees) to create high-fidelity autonomous characters capable of rich and complex behaviours. Most recently, he founded "wrnch" to use AI (deep learning and computer vision) to enable computers to read human body language.