In my research, I am broadly interested in the intersection of computer vision, natural language processing, and psychology: I aim to build intelligent agents that understand the visual world beyond recognition (labels) or captioning (sentences), without requiring explicit human supervision through expensive annotations.
This entails developing approaches such as:
self-supervised predictive learning for video event segmentation
commonsense reasoning to ground perception in prior knowledge
generative modeling for building knowledge from the ground up
Much of my group's current work focuses on analyzing, modeling, and synthesizing complex video scenes and the semantic structure that can describe them. I also work on applying machine learning to other domains, such as IoT security, and on running deep learning algorithms on constrained platforms like FPGAs.
Latent Space Modeling for Cloning Encrypted PUF-based Authentication. IFIP International Internet of Things (IoT) Conference, Fall 2019
The Role of Commonsense Reasoning in Visual Understanding. Oklahoma State University. Fall 2018
Going Deeper with Semantics: Exploiting Semantic Contextualization for Interpretation of Human Activity in Videos. Technical Seminar Series, Statistical Shape Analysis and Modeling Group, Florida State University. Fall 2018
Video Event Understanding with Pattern Theory. Robotics Technical Seminar Series, Department of Mechanical Engineering, University of South Florida. Spring 2018
Inherently Explainable Model for Video Activity Recognition. AAAI Workshop on Reasoning and Learning for Human-Machine Dialogues, 2018
Leveraging ConceptNet to Reduce Training Requirements for Video Descriptions. Seminar in AI, University of South Florida. Spring 2017
Towards a Knowledge-based Approach to Video Comprehension. Conference on Computer and Robot Vision (CRV), Spring 2017