Research

Topics

Deep Learning Theory

Deep Learning Theory studies the mathematical principles that underpin modern neural networks and their capabilities. The field investigates the nonconvex optimization problems that arise when training deep networks with billions of parameters, where finding global minima of highly complex loss landscapes is theoretically hard. Learning dynamics research explores how network architectures and training protocols affect convergence, stability, and performance over the course of training. The concept of implicit bias helps explain why overparameterized networks tend to converge to specific solutions even though infinitely many parameter settings fit the training data. Generalization research addresses the fundamental question of why deep networks perform well on unseen data despite their vast capacity to overfit, developing theoretical frameworks that connect architecture design, optimization algorithms, and statistical learning principles.
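One small, well-understood instance of implicit bias is gradient descent on an underdetermined least-squares problem: started from zero, it converges to the minimum-norm solution among the infinitely many that fit the data. The sketch below illustrates this on a toy problem with one equation and two unknowns. The data `a` and `b`, the step size, and the function name `gradient_descent` are illustrative assumptions, not taken from any publication listed on this page.

```python
def gradient_descent(a, b, steps=200, lr=0.1):
    """Minimize 0.5 * (a . w - b)^2 over w, starting from w = 0."""
    w = [0.0] * len(a)
    for _ in range(steps):
        residual = sum(ai * wi for ai, wi in zip(a, w)) - b
        # The gradient is residual * a, so the iterates never leave span(a).
        w = [wi - lr * residual * ai for ai, wi in zip(a, w)]
    return w

a, b = [1.0, 2.0], 5.0  # one equation, two unknowns: infinitely many exact fits
w_gd = gradient_descent(a, b)

# Closed-form minimum-norm solution: w* = a * b / ||a||^2
norm_sq = sum(ai * ai for ai in a)
w_min_norm = [ai * b / norm_sq for ai in a]

print(w_gd, w_min_norm)
```

Any vector of the form `w_min_norm + t * [2, -1]` also fits the single equation exactly, yet gradient descent from zero selects the shortest one. For deep nonlinear networks the analogous question is much harder and remains an active research topic.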

Trustworthy Machine Learning

Trustworthy Machine Learning focuses on developing reliable and accountable AI systems that can be safely deployed in critical real-world applications. Interpretability research aims to create models and methods that allow humans to understand how AI systems reach particular decisions, addressing the “black box” problem through techniques like feature attribution, concept-based explanations, and model distillation. Robustness investigations develop algorithms and frameworks that maintain performance under various challenges, including adversarial attacks (subtle input manipulations designed to fool models), distribution shifts (when deployment data differs from training data), and noisy or incomplete inputs that might occur in practical scenarios. Together, these components establish the theoretical and practical foundations needed to develop AI systems that can be trusted with high-stakes decisions in healthcare, transportation, security, and other critical domains.

Parsimonious Representation Learning

Parsimonious Representation Learning focuses on discovering compact, efficient ways to represent complex data while preserving essential information. Matrix factorization techniques decompose high-dimensional data matrices into lower-dimensional components, revealing latent structures and enabling applications like recommendation systems and dimensionality reduction. Subspace clustering methods identify and group data points that lie near lower-dimensional linear or affine subspaces within the ambient space, allowing for more accurate clustering of high-dimensional data with complex geometric structures. Manifold learning approaches discover nonlinear, low-dimensional structures that capture the intrinsic geometry of data, under the assumption that high-dimensional observations often lie on or near a lower-dimensional manifold. This enables more effective visualization, compression, and feature extraction while respecting the underlying data geometry.
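As a minimal illustration of matrix factorization, the sketch below approximates a matrix by a single outer product u v^T using alternating least squares. The function name `rank_one_factorization`, the toy matrix `M`, and the all-ones initialization are assumptions chosen for illustration; practical methods use higher ranks, regularization, and handle missing entries.

```python
def rank_one_factorization(M, iters=50):
    """Approximate M (a list of rows) by the outer product u v^T.

    Alternates two closed-form least-squares updates. Assumes M is nonzero
    and not orthogonal to the all-ones initialization of u.
    """
    rows, cols = len(M), len(M[0])
    u = [1.0] * rows
    v = [0.0] * cols
    for _ in range(iters):
        uu = sum(x * x for x in u)
        v = [sum(M[i][j] * u[i] for i in range(rows)) / uu for j in range(cols)]
        vv = sum(x * x for x in v)
        u = [sum(M[i][j] * v[j] for j in range(cols)) / vv for i in range(rows)]
    return u, v

# A 3 x 2 matrix that is exactly rank one: outer([1, 2, 3], [4, 5]).
M = [[4.0, 5.0], [8.0, 10.0], [12.0, 15.0]]
u, v = rank_one_factorization(M)

# Largest entrywise reconstruction error of u v^T against M.
max_error = max(abs(M[i][j] - u[i] * v[j]) for i in range(3) for j in range(2))
print(u, v, max_error)
```

The six entries of `M` are summarized by five numbers here; the saving grows quickly with matrix size, which is what makes low-rank models attractive for large data.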

Continual Learning

Continual Learning addresses the challenge of developing machine learning systems that can acquire knowledge incrementally over time without forgetting previously learned information, a capability that comes naturally to humans but poses significant difficulties for artificial systems. This field explores strategies to overcome catastrophic forgetting, where neural networks tend to overwrite earlier knowledge when trained on new tasks, through techniques like regularization methods that identify and protect important parameters, replay mechanisms that strategically revisit past experiences, and architectural approaches that allocate specific network components to different tasks. Continual learning research spans theoretical investigations of knowledge transfer and interference, algorithmic innovations for balancing stability and plasticity, and practical applications in scenarios where models must adapt to changing environments or sequentially presented tasks, such as in robotics, personalized recommendation systems, and healthcare monitoring.

Optimization

Optimization research in machine learning develops mathematical frameworks and algorithms to efficiently find optimal parameters or solutions across diverse learning problems. Optimization on manifolds extends traditional optimization techniques to handle constraints where solutions must lie on curved mathematical spaces, enabling applications in computer vision, robotics, and scientific computing. Optimization for learning focuses on developing specialized algorithms tailored to the unique challenges of training machine learning models, addressing issues like saddle points, local minima, and the interplay between optimization dynamics and generalization performance. The intersection of optimization and dynamical systems provides theoretical tools to analyze convergence properties and training trajectories, treating optimization algorithms as discrete or continuous dynamical systems. Distributed optimization techniques enable training models across multiple machines or devices while minimizing communication costs, becoming increasingly important for large-scale learning problems and federated learning scenarios where data privacy is paramount.
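To make optimization on manifolds concrete, the sketch below maximizes x^T A x over the unit sphere with a simple Riemannian gradient method: project the Euclidean gradient onto the tangent space at x, take a step, and retract back to the sphere by normalizing. The matrix `A`, the starting point, the step size, and the name `sphere_gradient_ascent` are illustrative assumptions.

```python
import math

def sphere_gradient_ascent(A, x, steps=200, lr=0.1):
    """Maximize x^T A x subject to ||x|| = 1 (A symmetric)."""
    n = len(x)
    for _ in range(steps):
        Ax = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]
        grad = [2.0 * v for v in Ax]  # Euclidean gradient of x^T A x
        radial = sum(g * xi for g, xi in zip(grad, x))
        # Riemannian gradient: remove the component normal to the sphere.
        rgrad = [g - radial * xi for g, xi in zip(grad, x)]
        y = [xi + lr * r for xi, r in zip(x, rgrad)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]  # retraction: renormalize onto the sphere
    return x

A = [[2.0, 0.0], [0.0, 1.0]]
x_opt = sphere_gradient_ascent(A, [0.6, 0.8])
print(x_opt)
```

On this toy problem the iterates approach the leading eigenvector (1, 0) of `A`. The same project-step-retract pattern carries over to other manifolds, such as rotation groups, with a different projection and retraction.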

3D Vision

3D Vision research focuses on enabling computers to understand and reconstruct the three-dimensional world from visual data. Structure from Motion techniques recover both camera poses and 3D scene geometry from sequences of 2D images by identifying corresponding points across frames and solving geometric optimization problems. Motion segmentation methods separate multiple moving objects in dynamic scenes, distinguishing independent motion patterns from camera-induced apparent motion, which is crucial for autonomous navigation and video analysis. 3D scene analysis encompasses a broader set of techniques for understanding spatial relationships, object arrangements, and scene semantics in three dimensions, including depth estimation, volumetric reconstruction, and scene parsing that enables applications ranging from augmented reality and robotics to architectural modeling and autonomous driving systems.

Video

Video research in computer vision addresses the challenges of analyzing and generating temporal visual content with coherent spatial-temporal relationships. Video generation techniques create realistic or stylized video sequences using generative models that capture both appearance and motion dynamics, with applications in entertainment, simulation, and data augmentation. Action recognition methods identify human activities in video by modeling temporal patterns and motion cues, while action detection further localizes when and where specific activities occur within longer, untrimmed videos. Action segmentation extends these capabilities by precisely delineating the temporal boundaries between different activities in continuous video streams, breaking complex sequences into meaningful segments. Together, these video understanding technologies enable applications ranging from surveillance and sports analytics to human-computer interaction and automated video indexing.

Image

Image-focused computer vision research develops algorithms for understanding and manipulating still visual content across various levels of abstraction. Image generation techniques create novel visual content through generative adversarial networks, diffusion models, and other approaches that model the underlying distribution of natural or domain-specific images. Object detection methods identify and localize multiple objects within images, providing bounding boxes and class labels that enable scene understanding for applications like autonomous driving and retail analytics. Pose estimation techniques recover the spatial configuration of articulated objects, particularly human bodies or hands, enabling applications in animation, gesture recognition, and human activity analysis. Object and semantic segmentation approaches partition images into semantically meaningful regions by classifying each pixel, providing fine-grained scene decomposition that supports applications ranging from medical image analysis to computational photography and augmented reality.

Vision and Language

Vision and Language research bridges visual perception and natural language understanding to create systems that can reason about images and text in an integrated manner. Visual Question Answering develops models that can respond to natural language questions about image content, requiring multi-modal reasoning that connects visual features with linguistic concepts. Visual Grounding techniques locate objects or regions in images based on natural language descriptions, enabling applications like interactive image editing and robotic manipulation guided by verbal commands. Scene interpretation methods extract structured representations of visual scenes, identifying objects, their attributes, and their relationships to support higher-level reasoning. Image captioning systems generate natural language descriptions of visual content, requiring both visual understanding and linguistic generation capabilities to produce relevant, accurate, and contextually appropriate textual summaries of images for applications in accessibility, content indexing, and multimodal communication.

Biomedical Image Analysis

Biomedical Image Analysis employs computer vision and machine learning techniques to interpret and extract clinically relevant information from medical imaging data. Diffusion MRI analysis methods process specialized magnetic resonance signals to map tissue microstructure and neural fiber pathways in the brain, enabling studies of connectivity patterns in healthy development and neurological disorders. Explainable radiology research develops interpretable AI systems for medical image interpretation that not only provide diagnostic predictions but also justify their conclusions with visual evidence and reasoning that clinicians can verify and trust. Microscopy image analysis techniques automatically process cellular and tissue images at various scales, enabling quantification of morphological features, tracking of cellular dynamics, and identification of pathological patterns that support both clinical diagnostics and basic biological research, ultimately enhancing precision medicine through quantitative biomarkers and computational pathology.

Computer Vision for Health

Computer Vision for Health applies visual understanding technologies to healthcare challenges, creating systems that monitor, assess, and support human wellbeing. Surgical activity analysis techniques automatically recognize phases, gestures, and instrument usage during medical procedures through video analysis, enabling applications in surgical training, workflow optimization, and intraoperative decision support. Movement diagnosis systems use computer vision to quantify and characterize motor behaviors relevant to neurological and developmental conditions, providing objective assessment tools for conditions like autism spectrum disorder, where subtle movement patterns may serve as early biomarkers. Similar techniques support therapeutic monitoring in Tourette syndrome by quantifying tic frequency and severity, while rehabilitation applications track patient movements during physical therapy to provide feedback on exercise quality, measure progress over time, and personalize treatment protocols. These vision-based health systems reduce assessment subjectivity and increase accessibility of specialized healthcare expertise.

Hybrid Systems

Hybrid Systems research addresses dynamical systems that combine continuous evolution with discrete state transitions, creating mathematical frameworks for systems that switch between different operating modes. Observability studies in this domain investigate when and how the internal states of hybrid systems can be reconstructed from external measurements, which is crucial for monitoring and controlling complex systems like power grids with switching topologies or robotic systems with contact dynamics. Identification methods develop techniques to construct mathematical models of hybrid systems from experimental data, learning both the continuous dynamics within each mode and the discrete switching logic between modes. These theoretical foundations support applications in cyber-physical systems, including autonomous vehicles that switch between different control laws, smart manufacturing systems with multiple operating regimes, and biomedical devices like artificial pancreas systems that must adjust their behavior based on discrete physiological states.

Multi-agent Systems

Multi-agent Systems research studies collections of autonomous entities that interact with each other and their environment, developing frameworks for coordination, competition, and emergent behavior. Pursuit-evasion games model strategic interactions between pursuing and evading agents, addressing questions of optimal strategies, capture conditions, and equilibrium solutions with applications in security, robotics, and computational modeling of biological systems. Consensus on manifolds extends traditional agreement protocols to scenarios where agents must coordinate on curved mathematical spaces like rotation groups or spheres, which arise naturally in applications like satellite attitude synchronization, distributed camera networks, and coordinated motion planning. This field combines ideas from game theory, control theory, and distributed computing to develop theoretical guarantees and practical algorithms for emerging technologies like drone swarms, autonomous vehicle teams, and distributed robotic systems that must cooperatively solve complex tasks.
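A simple example of consensus on a manifold is agreement on the unit circle, where each agent holds an angle and repeatedly moves toward its neighbors along the circle. The sketch below assumes all-to-all communication, synchronous updates, and initial angles that lie within a common half-circle; the function name `circle_consensus`, the step size, and the initial angles are hypothetical choices for illustration.

```python
import math

def circle_consensus(thetas, steps=200, eps=0.1):
    """Synchronous consensus iteration for angles on the unit circle."""
    for _ in range(steps):
        # Each agent moves along the circle toward every other agent;
        # the sine term respects the wrap-around geometry of angles.
        thetas = [
            t + eps * sum(math.sin(s - t) for s in thetas)
            for t in thetas
        ]
    return thetas

final = circle_consensus([0.0, 0.5, 1.0])
print(final)
```

Because the pairwise sine terms cancel, the sum of the angles is preserved, so in this example the agents agree on the initial average of 0.5 radians. On the circle and on other curved spaces, convergence generally depends on the initial spread and the communication graph, which is part of what makes the manifold setting harder than the Euclidean one.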

Linear Systems

Linear Systems theory provides fundamental tools for analyzing and designing systems governed by linear differential or difference equations, forming the foundation for many control and signal processing applications. Geometric approaches examine system properties through the lens of linear subspaces and transformations, revealing intrinsic structural features that inform controller design and system analysis. Sparsity considerations address scenarios where system matrices have many zero entries due to physical constraints or limited interactions between components, leading to computationally efficient algorithms and insights for large-scale systems like power networks or neural connectivity models. Observability research investigates conditions under which a system’s internal states can be reconstructed from measured outputs, including minimal sensor placement, robustness to noise, and reconstruction algorithms that enable state estimation and monitoring in applications ranging from autonomous vehicles to industrial process control and infrastructure management.
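For a linear time-invariant system with state matrix A and output matrix C, the classical Kalman rank condition says the state is observable exactly when the stacked matrix [C; CA; ...; CA^(n-1)] has rank n. The sketch below checks this condition for a double integrator. The helper names `mat_mul`, `rank`, and `is_observable` are illustrative, and the rank routine is a basic Gaussian elimination meant for small, well-scaled examples only.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def rank(M, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        if r == len(M):
            break
        pivot = max(range(r, len(M)), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue  # no usable pivot in this column
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def is_observable(A, C):
    """Kalman rank test: rank of [C; CA; ...; CA^(n-1)] equals n."""
    n = len(A)
    block, obs = C, []
    for _ in range(n):
        obs.extend(block)
        block = mat_mul(block, A)
    return rank(obs) == n

A = [[0.0, 1.0], [0.0, 0.0]]  # double integrator: state = (position, velocity)
print(is_observable(A, [[1.0, 0.0]]))  # sensor measures position
print(is_observable(A, [[0.0, 1.0]]))  # sensor measures velocity only
```

This prints `True` and then `False`: velocity can be recovered from position measurements, but the initial position cannot be recovered from velocity alone. Sensor placement questions ask which choices of `C` pass this kind of test with as few sensors as possible.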

Publications by Topic

60 entries « 1 of 2 »

2022

Liangzu Peng; Manolis C. Tsakiris; René Vidal

ARCS: Accurate Rotation and Correspondences Search Proceedings Article

In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.

Tags: Computer Vision

Liangzu Peng; Mahyar Fazlyab; René Vidal

Semidefinite relaxations of truncated least-squares in robust rotation search: Tight or not Proceedings Article

In: European Conference on Computer Vision, 2022.

Tags: Computer Vision, Trustworthy Machine Learning

2021

Shangzhi Zhang; Chong You; René Vidal; Chun-Guang Li

Learning a Self-Expressive Network for Subspace Clustering Proceedings Article

In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021.

Tags: Computer Vision

2020

J. Bruna; E. Haber; G. Kutyniok; R. Vidal; T. Pock

Special Issue on the Mathematical Foundations of Deep Learning in Imaging Science Journal Article

In: Journal of Mathematical Imaging and Vision, vol. 62, pp. 277-278, 2020.

Tags: Computer Vision, Machine Learning

Tianjiao Ding; Yunchen Yang; Zhihui Zhu; Daniel P Robinson; René Vidal; Laurent Kneip; Manolis C Tsakiris

Robust Homography Estimation via Dual Principal Component Pursuit Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 6080–6089, 2020.

Tags: 3D Vision, Computer Vision, Linear System, Optimization, Trustworthy Machine Learning

H. Lobel; R. Vidal; A. Soto

CompactNets: Compact Hierarchical Compositional Networks for Visual Recognition Journal Article

In: Computer Vision and Image Understanding, vol. 191, 2020.

Tags: Computer Vision, Image, Parsimonious Representation Learning

E. Mavroudi; B. B. Haro; R. Vidal

Representation Learning on Visual-Symbolic Graphs for Video Understanding Proceedings Article

In: European Conference on Computer Vision, 2020.

Tags: Computer Vision, Machine Learning, Video

2019

Connor Lane; Benjamin D. Haeffele; René Vidal

Adaptive online $k$-subspaces with cooperative re-initialization Proceedings Article

In: IEEE International Conference on Computer Vision Workshops, 2019.

Tags: Computer Vision

E. Mavroudi; B. B. Haro; R. Vidal

Neural Message Passing on Hybrid Spatio-Temporal Visual and Symbolic Graphs for Video Understanding Journal Article

In: arXiv, vol. abs/1905.07385, 2019.

Tags: Computer Vision, Video

Connor Lane; Ron Boger; Chong You; Manolis Tsakiris; Benjamin Haeffele; René Vidal

Classifying and Comparing Approaches to Subspace Clustering with Missing Data Proceedings Article

In: IEEE International Conference on Computer Vision Workshops, 2019.

Tags: Computer Vision

2018

S. Mahendran; H. Ali; R. Vidal

A mixed classification-regression framework for 3D pose estimation from 2D images Proceedings Article

In: British Machine Vision Conference, 2018.

Tags: Computer Vision

S. Mahendran; H. Ali; R. Vidal

Convolutional Networks for Object Category and 3D Pose Estimation from 2D Images Proceedings Article

In: European Conference on Computer Vision, 2018.

Tags: Computer Vision

E. Mavroudi; D. Bhaskara; S. Sefati; H. Ali; R. Vidal

End-to-End Fine-Grained Action Segmentation and Recognition Using Conditional Random Field Models and Discriminative Sparse Coding Proceedings Article

In: IEEE Winter Conference on Applications of Computer Vision (WACV), 2018.

Tags: Computer Vision

2017

B. Afsari; R. Vidal

Bundle Reduction and the Alignment Distance on Spaces of State-Space LTI Systems Journal Article

In: IEEE Transactions on Automatic Control, vol. 62, no. 8, pp. 3804-3819, 2017.

Tags: Computer Vision

S. Mahendran; H. Ali; R. Vidal

3D Pose Regression using Convolutional Neural Networks Proceedings Article

In: IEEE International Conference on Computer Vision Workshop on Recovering 6D Object Pose, 2017.

Tags: Computer Vision

E. Mavroudi; L. Tao; R. Vidal

Deep Moving Poselets for Video Based Action Recognition Proceedings Article

In: IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 111–120, 2017.

Tags: Computer Vision

2016

F. Yellin; B. Haeffele; R. Vidal

System and method for object detection in holographic lens-free imaging by convolutional dictionary learning and encoding Miscellaneous

US Patent WO2018085657A1, 2016.

Tags: Computer Vision, Machine Learning

C. Lea; A. Reiter; R. Vidal; G. D. Hager

Segmental Spatiotemporal CNNs for Fine-grained Action Segmentation Proceedings Article

In: European Conference on Computer Vision, 2016.

Tags: Computer Vision

2015

Manolis C Tsakiris; René Vidal

Dual principal component pursuit Proceedings Article

In: Proceedings of the IEEE International Conference on Computer Vision Workshops, 2015.

Tags: Computer Vision

E. Jahangiri; R. Vidal; L. Younes; D. Geman

Object-Level Generative Models for 3D Scene Understanding Proceedings Article

In: SUNw: Scene Understanding Workshop, 2015.

Tags: Computer Vision

C. Lea; G. D. Hager; R. Vidal

An Improved Model for Segmentation and Recognition of Fine-Grained Activities with Application to Surgical Training Tasks Proceedings Article

In: IEEE Winter Conference on Applications of Computer Vision, pp. 1123–1129, 2015.

Tags: Computer Vision

H. Lobel; R. Vidal; A. Soto

Learning Shared, Discriminative, and Compact Representations for Visual Recognition Journal Article

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, iss. 99, no. 11, pp. 2218-2231, 2015.

Tags: Computer Vision, Machine Learning, Parsimonious Representation Learning

E. Schwab; M. A. Yassa; M. Weiner; R. Vidal

Using Automatic HARDI Feature Selection, Registration, and Atlas Building to Characterize the Neuroanatomy of Beta-Amyloid Pathology Proceedings Article

In: MICCAI Workshop on Computational Diffusion MRI, 2015.

Tags: Biomedical Image Analysis, Computer Vision

M. C. Tsakiris; R. Vidal

Dual Principal Component Pursuit Proceedings Article

In: ICCV Workshop on Robust Subspace Learning and Computer Vision, pp. 10–18, 2015.

Tags: Computer Vision, Linear System, Machine Learning, Trustworthy Machine Learning

M. C. Tsakiris; R. Vidal

Filtrated Spectral Algebraic Subspace Clustering Proceedings Article

In: ICCV Workshop on Robust Subspace Learning and Computer Vision, pp. 28–36, 2015.

Tags: Computer Vision, Linear System, Machine Learning, Trustworthy Machine Learning

2014

F. Ofli; R. Chaudhry; G. Kurillo; R. Vidal; R. Bajcsy

Sequence of the Most Informative Joints (SMIJ): A New Representation for Human Skeletal Action Recognition Journal Article

In: Journal of Visual Communication and Image Representation, vol. 25, no. 1, pp. 24-38, 2014.

Tags: Computer Vision, Image, Video

D. Rother; S. Schütz; R. Vidal

Hypothesize and Bound: A Computational Focus of Attention Mechanism for Simultaneous N-D Segmentation, Pose Estimation and Classification Using Shape Priors Journal Article

In: International Journal of Computer Vision, 2014.

Tags: 3D Vision, Computer Vision, Machine Learning

L. Tao; F. Porikli; R. Vidal

Sparse Dictionaries for Semantic Segmentation Proceedings Article

In: European Conference on Computer Vision, 2014.

Tags: Computer Vision, Parsimonious Representation Learning

2013

R. Chaudhry; G. Hager; R. Vidal

Dynamic Template Tracking and Recognition Journal Article

In: International Journal of Computer Vision, vol. 105, no. 1, pp. 19-48, 2013.

Tags: Computer Vision, Dynamical Systems, Video

A. Jain; S. Chatterjee; R. Vidal

Coarse-to-fine Semantic Video Segmentation using Supervoxel Trees Proceedings Article

In: IEEE International Conference on Computer Vision, 2013.

Tags: Computer Vision

H-A. Lobel; A. Soto; R. Vidal

Hierarchical Joint Max-Margin Learning of Mid and Top Level Representations for Visual Recognition Proceedings Article

In: IEEE International Conference on Computer Vision, 2013.

Tags: Computer Vision, Machine Learning

F. Ofli; R. Chaudhry; G. Kurillo; R. Vidal; R. Bajcsy

Berkeley MHAD: A Comprehensive Multimodal Human Action Database Proceedings Article

In: IEEE Workshop on Applications of Computer Vision, 2013.

Tags: Computer Vision

E. Yoruk; R. Vidal

A 3D Wireframe Model for Efficient Object Localization and Pose Estimation Proceedings Article

In: ICCV Workshop on 3D Representation and Recognition, 2013.

Tags: 3D Vision, Computer Vision

2012

B. Afsari; R. Chaudhry; A. Ravichandran; R. Vidal

Group Action Induced Distances for Averaging and Clustering Linear Dynamical Systems with Applications to the Analysis of Dynamic Visual Scenes Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, 2012.

Tags: Computer Vision

E. Elhamifar; G. Sapiro; R. Vidal

See All by Looking at A Few: Sparse Modeling for Finding Representative Objects Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, 2012.

Tags: Computer Vision

A. Jain; L. Zappella; P. McClure; R. Vidal

Visual Dictionary Learning for Joint Object Categorization and Segmentation Proceedings Article

In: European Conference on Computer Vision, 2012.

Tags: Computer Vision, Machine Learning

D. Perrone; A. Ravichandran; R. Vidal; P. Favaro

Image Priors for Image Deblurring with Uncertain Blur Proceedings Article

In: British Machine Vision Conference, 2012.

Tags: Computer Vision

2011

D. Singaraju; L. Grady; A. Sinop; R. Vidal

Continuous Valued MRFs for Image Segmentation Book Section

In: Advances in Markov Random Fields for Vision and Image Processing, MIT Press, 2011.

Tags: Computer Vision, Image

R. Tron; R. Vidal

Distributed Computer Vision Algorithms Journal Article

In: IEEE Signal Processing Magazine, vol. 28, no. 2, pp. 32–45, 2011.

Tags: Computer Vision, Multi-agent Systems

E. Elhamifar; R. Vidal

Robust Classification using Structured Sparse Representation Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, 2011.

Tags: Computer Vision

Paolo Favaro; René Vidal; Avinash Ravichandran

A Closed Form Solution to Robust Subspace Estimation and Clustering Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 1801-1807, 2011.

Tags: Computer Vision

A. Ravichandran; R. Vidal

Video Registration Using Dynamic Textures Journal Article

In: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, no. 1, pp. 158–171, 2011.

Tags: Computer Vision, Dynamical Systems, Video

D. Rother; R. Vidal

A Hypothesize-and-Bound Algorithm for Simultaneous Object Classification, Pose Estimation and 3D Reconstruction from a Single 2D Image Proceedings Article

In: ICCV Workshop on 3D Representation and Recognition, 2011.

Tags: 3D Vision, Computer Vision

R. Tron; R. Vidal

Distributed Computer Vision Algorithms Through Distributed Averaging Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, 2011.

Tags: Computer Vision, Multi-agent Systems

2009

R. Chaudhry; A. Ravichandran; G. Hager; R. Vidal

Histograms of Oriented Optical Flow and Binet-Cauchy Kernels on Nonlinear Dynamical Systems for the Recognition of Human Actions Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, 2009.

Tags: Computer Vision

Ehsan Elhamifar; René Vidal

Sparse Subspace Clustering Proceedings Article

In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 2790-2797, 2009.

Tags: Computer Vision

2008

A. Ravichandran; R. Vidal

Video Registration using Dynamic Textures Proceedings Article

In: European Conference on Computer Vision, 2008.

Tags: Computer Vision, Dynamical Systems, Video

R. Tron; R. Vidal

Distributed Face Recognition via Consensus on SE(3) Proceedings Article

In: Workshop on Omnidirectional Vision, 2008.

Tags: Computer Vision

2007

H. E. Cetingül; R. Chaudhry; R. Vidal

A System Theoretic Approach to Synthesis and Classification of Lip Articulation Proceedings Article

In: International Workshop on Dynamical Vision, 2007.

Tags: Computer Vision

A. Ravichandran; R. Vidal

Mosaicing Nonrigid Dynamical Scenes Proceedings Article

In: International Workshop on Dynamical Vision, 2007.

Tags: Computer Vision, Dynamical Systems
