Topics

Deep Learning Theory

Deep Learning Theory studies the mathematical principles that underpin modern neural networks and their capabilities. The field investigates the nonconvex optimization used to train deep networks with billions of parameters, where finding global minima of highly complex loss landscapes is theoretically hard. Learning dynamics research explores how different network architectures and training protocols affect convergence, stability, and performance over time. The concept of implicit bias helps explain why overparameterized networks tend to converge to specific solutions even though infinitely many parameter settings fit the training data. Generalization research asks why deep networks perform well on unseen data despite their vast capacity to overfit, and develops theoretical frameworks that connect architecture design, optimization algorithms, and statistical learning principles.
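As a small illustration of implicit bias, the sketch below runs plain gradient descent on an underdetermined least-squares problem (one equation, two unknowns). The problem, step size, and function name are assumptions chosen for illustration and are not taken from any particular paper. Starting from zero, the iterates stay in the span of the data vector, so they converge to the minimum-norm interpolating solution rather than to any of the other solutions that fit the data.

```python
def gradient_descent_from_zero(a, y, lr=0.05, steps=2000):
    """Run gradient descent on 0.5 * (a.w - y)^2 starting from w = 0."""
    w = [0.0] * len(a)
    for _ in range(steps):
        residual = sum(ai * wi for ai, wi in zip(a, w)) - y
        # The gradient of the squared loss with respect to w is residual * a.
        w = [wi - lr * residual * ai for wi, ai in zip(w, a)]
    return w


# One equation, two unknowns: infinitely many w satisfy a.w = y.
a, y = [1.0, 2.0], 5.0
w_gd = gradient_descent_from_zero(a, y)

# Closed-form minimum-norm interpolant: a * y / ||a||^2.
norm_sq = sum(ai * ai for ai in a)
w_min_norm = [ai * y / norm_sq for ai in a]

print(w_gd, w_min_norm)
```

Starting from a different initial point would generally lead to a different interpolating solution, which is why the initialization is part of the bias.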

Trustworthy Machine Learning

Trustworthy Machine Learning focuses on developing reliable and accountable AI systems that can be safely deployed in critical real-world applications. Interpretability research aims to create models and methods that allow humans to understand how AI systems reach particular decisions, addressing the “black box” problem through techniques like feature attribution, concept-based explanations, and model distillation. Robustness investigations develop algorithms and frameworks that maintain performance under various challenges, including adversarial attacks (subtle input manipulations designed to fool models), distribution shifts (when deployment data differs from training data), and noisy or incomplete inputs that might occur in practical scenarios. Together, these components establish the theoretical and practical foundations needed to develop AI systems that can be trusted with high-stakes decisions in healthcare, transportation, security, and other critical domains.
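To make the notion of an adversarial attack concrete, here is a minimal sketch of a sign-gradient perturbation applied to a linear classifier. The weights, input, and budget eps are made-up values for illustration. For a linear score the gradient with respect to the input is simply the weight vector, so moving every coordinate by eps against the label lowers the score by eps times the L1 norm of the weights.

```python
def sign(v):
    """Return -1, 0, or +1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def linear_score(w, b, x):
    """Score of a linear classifier; the predicted label is the sign of the score."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign_gradient_attack(w, x, label, eps):
    """Move each input coordinate by eps in the direction that lowers label * score."""
    # For a linear score, the gradient with respect to x is w.
    return [xi - eps * label * sign(wi) for xi, wi in zip(x, w)]


# Hypothetical model and input, with label in {-1, +1}.
w, b = [1.0, -2.0, 0.5], 0.1
x, label = [0.3, -0.1, 0.2], 1

clean = linear_score(w, b, x)
x_adv = sign_gradient_attack(w, x, label, eps=0.25)
adversarial = linear_score(w, b, x_adv)

print(clean, adversarial)
```

In this example no coordinate moves by more than 0.25, yet the sign of the score flips, which is the kind of small, targeted change that robustness research tries to defend against.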

Parsimonious Representation Learning

Parsimonious Representation Learning focuses on discovering compact, efficient ways to represent complex data while preserving essential information. Matrix factorization techniques decompose high-dimensional data matrices into lower-dimensional components, revealing latent structures and enabling applications like recommendation systems and dimensionality reduction. Subspace clustering methods identify and group data points that lie near lower-dimensional linear or affine subspaces of the ambient space, allowing for more accurate clustering of high-dimensional data with complex geometric structures. Manifold learning approaches assume that high-dimensional observations often lie on or near a lower-dimensional manifold, and they discover nonlinear, low-dimensional structures that capture this intrinsic geometry. This enables more effective visualization, compression, and feature extraction while respecting the underlying data geometry.
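The simplest instance of matrix factorization is a rank-one model, where a matrix M is approximated by the outer product of two vectors u and v. The sketch below fits such a model by alternating least squares. The function name, the all-ones initialization, and the toy matrix are assumptions for illustration, and the code does not guard against the degenerate case where an update produces a zero vector.

```python
def rank_one_als(M, iters=50):
    """Approximate M (a list of rows) by the outer product of u and v."""
    rows, cols = len(M), len(M[0])
    u = [0.0] * rows
    v = [1.0] * cols
    for _ in range(iters):
        # With v fixed, the least-squares optimal u is M v / (v.v).
        vv = sum(x * x for x in v)
        u = [sum(M[i][j] * v[j] for j in range(cols)) / vv for i in range(rows)]
        # With u fixed, the least-squares optimal v is M^T u / (u.u).
        uu = sum(x * x for x in u)
        v = [sum(M[i][j] * u[i] for i in range(rows)) / uu for j in range(cols)]
    return u, v


# Toy matrix that is exactly rank one: the outer product of [1, 2, 3] and [4, 5].
M = [[4.0, 5.0], [8.0, 10.0], [12.0, 15.0]]
u, v = rank_one_als(M)
reconstruction = [[ui * vj for vj in v] for ui in u]
max_error = max(
    abs(M[i][j] - reconstruction[i][j]) for i in range(len(M)) for j in range(len(M[0]))
)
print(max_error)
```

The six entries of M are stored here with five numbers (three in u, two in v). Higher-rank factorizations follow the same pattern with matrices in place of the two vectors.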

Continual Learning

Continual Learning addresses the challenge of developing machine learning systems that can acquire knowledge incrementally over time without forgetting previously learned information, a capability that comes naturally to humans but poses significant difficulties for artificial systems. This field explores strategies to overcome catastrophic forgetting, where neural networks tend to overwrite earlier knowledge when trained on new tasks. These strategies include regularization methods that identify and protect important parameters, replay mechanisms that strategically revisit past experiences, and architectural approaches that allocate specific network components to different tasks. Continual learning research spans theoretical investigations of knowledge transfer and interference, algorithmic innovations for balancing stability and plasticity, and practical applications in scenarios where models must adapt to changing environments or sequentially presented tasks, such as in robotics, personalized recommendation systems, and healthcare monitoring.
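One common way to implement a replay mechanism is a fixed-size memory filled by reservoir sampling, so that every example seen so far has the same chance of being retained. The sketch below is an illustrative assumption, not the specific method of any publication listed here. The class name and the toy two-task stream are hypothetical.

```python
import random


class ReservoirReplayBuffer:
    """Fixed-size memory in which every example seen so far is kept with equal probability."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        """Offer one new example to the buffer."""
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        """Draw up to k stored examples to mix into the current training batch."""
        return self.rng.sample(self.items, min(k, len(self.items)))


# Hypothetical stream: 20 examples from one task followed by 20 from another.
stream = [("task_a", i) for i in range(20)] + [("task_b", i) for i in range(20)]
buffer = ReservoirReplayBuffer(capacity=5, seed=0)
for example in stream:
    buffer.add(example)
batch = buffer.sample(3)
print(buffer.items, batch)
```

During training on a new task, a sampled batch like this would be interleaved with fresh data so that earlier tasks keep contributing to the gradient.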

Optimization

Optimization research in machine learning develops mathematical frameworks and algorithms to efficiently find optimal parameters or solutions across diverse learning problems. Optimization on manifolds extends traditional optimization techniques to handle constraints where solutions must lie on curved mathematical spaces, enabling applications in computer vision, robotics, and scientific computing. Optimization for learning focuses on developing specialized algorithms tailored to the unique challenges of training machine learning models, addressing issues like saddle points, local minima, and the interplay between optimization dynamics and generalization performance. The intersection of optimization and dynamical systems provides theoretical tools to analyze convergence properties and training trajectories, treating optimization algorithms as discrete or continuous dynamical systems. Distributed optimization techniques enable training models across multiple machines or devices while minimizing communication costs, becoming increasingly important for large-scale learning problems and federated learning scenarios where data privacy is paramount.
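A standard textbook example of optimization on a manifold is maximizing the quadratic form x^T A x over the unit sphere, whose maximizer is a leading eigenvector of the symmetric matrix A. The sketch below uses Riemannian gradient ascent: the Euclidean gradient is projected onto the tangent space at x, and the update is mapped back to the sphere by normalization (a retraction). The matrix, step size, and function names are assumptions for illustration.

```python
import math


def matvec(A, x):
    """Multiply the matrix A (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def normalize(x):
    """Scale x to unit Euclidean norm."""
    n = math.sqrt(sum(xi * xi for xi in x))
    return [xi / n for xi in x]


def sphere_gradient_ascent(A, x0, step=0.1, iters=500):
    """Maximize x^T A x over the unit sphere for a symmetric matrix A."""
    x = normalize(x0)
    for _ in range(iters):
        g = [2.0 * v for v in matvec(A, x)]  # Euclidean gradient of x^T A x
        radial = sum(gi * xi for gi, xi in zip(g, x))
        # Riemannian gradient: remove the component of g along x.
        tangent = [gi - radial * xi for gi, xi in zip(g, x)]
        # Retraction: take the step in the ambient space, then renormalize.
        x = normalize([xi + step * ti for xi, ti in zip(x, tangent)])
    return x


A = [[2.0, 0.0], [0.0, 1.0]]
x_star = sphere_gradient_ascent(A, [1.0, 1.0])
print(x_star)
```

For this diagonal A the iterates approach the first coordinate axis, the eigenvector associated with the largest eigenvalue.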

3D Vision

3D Vision research focuses on enabling computers to understand and reconstruct the three-dimensional world from visual data. Structure from Motion techniques recover both camera poses and 3D scene geometry from sequences of 2D images by identifying corresponding points across frames and solving geometric optimization problems. Motion segmentation methods separate multiple moving objects in dynamic scenes, distinguishing independent motion patterns from camera-induced apparent motion, which is crucial for autonomous navigation and video analysis. 3D scene analysis encompasses a broader set of techniques for understanding spatial relationships, object arrangements, and scene semantics in three dimensions, including depth estimation, volumetric reconstruction, and scene parsing that enables applications ranging from augmented reality and robotics to architectural modeling and autonomous driving systems.

Video

Video research in computer vision addresses the challenges of analyzing and generating temporal visual content with coherent spatial-temporal relationships. Video generation techniques create realistic or stylized video sequences using generative models that capture both appearance and motion dynamics, with applications in entertainment, simulation, and data augmentation. Action recognition methods identify human activities in video by modeling temporal patterns and motion cues, while action detection further localizes when and where specific activities occur within longer, untrimmed videos. Action segmentation extends these capabilities by precisely delineating the temporal boundaries between different activities in continuous video streams, breaking complex sequences into meaningful segments. Together, these video understanding technologies enable applications ranging from surveillance and sports analytics to human-computer interaction and automated video indexing.
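Action segmentation output is often represented as one label per frame, and the temporal boundaries are then read off wherever the label changes. The helper below is a minimal sketch of that conversion. The function name and the example labels are hypothetical.

```python
def labels_to_segments(frame_labels):
    """Group consecutive identical frame labels into (label, start, end) tuples.

    The end index is exclusive, so a segment covers frames start .. end - 1.
    """
    segments = []
    start = 0
    for i in range(1, len(frame_labels) + 1):
        # Close the current segment at the end of the video or when the label changes.
        if i == len(frame_labels) or frame_labels[i] != frame_labels[start]:
            segments.append((frame_labels[start], start, i))
            start = i
    return segments


frames = ["reach", "reach", "cut", "cut", "cut", "idle"]
segments = labels_to_segments(frames)
print(segments)
# [('reach', 0, 2), ('cut', 2, 5), ('idle', 5, 6)]
```

Action detection in untrimmed video can reuse the same representation by keeping only the segments whose label is an action of interest.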

Image

Image-focused computer vision research develops algorithms for understanding and manipulating still visual content across various levels of abstraction. Image generation techniques create novel visual content through generative adversarial networks, diffusion models, and other approaches that model the underlying distribution of natural or domain-specific images. Object detection methods identify and localize multiple objects within images, providing bounding boxes and class labels that enable scene understanding for applications like autonomous driving and retail analytics. Pose estimation techniques recover the spatial configuration of articulated objects, particularly human bodies or hands, enabling applications in animation, gesture recognition, and human activity analysis. Object and semantic segmentation approaches partition images into semantically meaningful regions by classifying each pixel, providing fine-grained scene decomposition that supports applications ranging from medical image analysis to computational photography and augmented reality.
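Bounding-box predictions from object detectors are commonly compared with ground truth using intersection over union (IoU). The sketch below assumes axis-aligned boxes given as (x_min, y_min, x_max, y_max). That coordinate convention and the function name are assumptions for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x_min, y_min, x_max, y_max)."""
    # Corners of the overlap rectangle, if the boxes overlap at all.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))    # overlap area 1, union 7
disjoint = iou((0, 0, 1, 1), (2, 2, 3, 3))   # no overlap
print(overlap, disjoint)
```

A detection is typically counted as correct when its IoU with a ground-truth box of the same class exceeds a chosen threshold.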

Vision and Language

Vision and Language research bridges visual perception and natural language understanding to create systems that can reason about images and text in an integrated manner. Visual Question Answering develops models that can respond to natural language questions about image content, requiring multi-modal reasoning that connects visual features with linguistic concepts. Visual Grounding techniques locate objects or regions in images based on natural language descriptions, enabling applications like interactive image editing and robotic manipulation guided by verbal commands. Scene interpretation methods extract structured representations of visual scenes, identifying objects, their attributes, and their relationships to support higher-level reasoning. Image captioning systems generate natural language descriptions of visual content, requiring both visual understanding and linguistic generation capabilities to produce relevant, accurate, and contextually appropriate textual summaries of images for applications in accessibility, content indexing, and multimodal communication.

Biomedical Image Analysis

Biomedical Image Analysis employs computer vision and machine learning techniques to interpret and extract clinically relevant information from medical imaging data. Diffusion MRI analysis methods process specialized magnetic resonance signals to map tissue microstructure and neural fiber pathways in the brain, enabling studies of connectivity patterns in healthy development and neurological disorders. Explainable radiology research develops interpretable AI systems for medical image interpretation that not only provide diagnostic predictions but also justify their conclusions with visual evidence and reasoning that clinicians can verify and trust. Microscopy image analysis techniques automatically process cellular and tissue images at various scales, enabling quantification of morphological features, tracking of cellular dynamics, and identification of pathological patterns. These capabilities support both clinical diagnostics and basic biological research, and they contribute to precision medicine through quantitative biomarkers and computational pathology.

Computer Vision for Health

Computer Vision for Health applies visual understanding technologies to healthcare challenges, creating systems that monitor, assess, and support human wellbeing. Surgical activity analysis techniques automatically recognize phases, gestures, and instrument usage during medical procedures through video analysis, enabling applications in surgical training, workflow optimization, and intraoperative decision support. Movement diagnosis systems use computer vision to quantify and characterize motor behaviors relevant to neurological and developmental conditions, providing objective assessment tools for autism spectrum disorder, for example, where subtle movement patterns may serve as early biomarkers. Similar techniques support therapeutic monitoring in Tourette syndrome by quantifying tic frequency and severity. Rehabilitation applications track patient movements during physical therapy to provide feedback on exercise quality, measure progress over time, and personalize treatment protocols. These vision-based health systems reduce assessment subjectivity and increase accessibility of specialized healthcare expertise.

Hybrid Systems

Hybrid Systems research addresses dynamical systems that combine continuous evolution with discrete state transitions, creating mathematical frameworks for systems that switch between different operating modes. Observability studies in this domain investigate when and how the internal states of hybrid systems can be reconstructed from external measurements, which is crucial for monitoring and controlling complex systems like power grids with switching topologies or robotic systems with contact dynamics. Identification methods develop techniques to construct mathematical models of hybrid systems from experimental data, learning both the continuous dynamics within each mode and the discrete switching logic between modes. These theoretical foundations support applications in cyber-physical systems, including autonomous vehicles that switch between different control laws, smart manufacturing systems with multiple operating regimes, and biomedical devices like artificial pancreas systems that must adjust their behavior based on discrete physiological states.
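A thermostat is a classic toy hybrid system: the temperature evolves continuously, while the heater switches between two discrete modes when guard conditions are met. The sketch below simulates it with forward Euler steps. The heating and cooling rates, the thresholds, and the function name are made-up values for illustration.

```python
def simulate_thermostat(temp=18.0, low=19.0, high=21.0, dt=0.01, steps=1000):
    """Simulate a two-mode thermostat; return the temperature trace and the switch count."""
    heater_on = True
    trace, switches = [temp], 0
    for _ in range(steps):
        # Continuous flow: the rate of change depends on the active mode.
        rate = 2.0 if heater_on else -1.0
        temp += rate * dt
        # Discrete transition: a guard condition flips the mode.
        if heater_on and temp >= high:
            heater_on, switches = False, switches + 1
        elif not heater_on and temp <= low:
            heater_on, switches = True, switches + 1
        trace.append(temp)
    return trace, switches


trace, switches = simulate_thermostat()
print(min(trace[200:]), max(trace), switches)
```

Identification would go the other way: given only a trace like this one, recover the two rates and the switching thresholds. Observability asks whether the active mode can be inferred from the measured temperature alone.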

Multi-agent Systems

Multi-agent Systems research studies collections of autonomous entities that interact with each other and their environment, developing frameworks for coordination, competition, and emergent behavior. Pursuit-evasion games model strategic interactions between pursuing and evading agents, addressing questions of optimal strategies, capture conditions, and equilibrium solutions with applications in security, robotics, and computational modeling of biological systems. Consensus on manifolds extends traditional agreement protocols to scenarios where agents must coordinate on curved mathematical spaces like rotation groups or spheres, which arise naturally in applications like satellite attitude synchronization, distributed camera networks, and coordinated motion planning. This field combines ideas from game theory, control theory, and distributed computing to develop theoretical guarantees and practical algorithms for emerging technologies like drone swarms, autonomous vehicle teams, and distributed robotic systems that must cooperatively solve complex tasks.
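As a minimal sketch of consensus on a manifold, the code below lets agents that each hold a unit vector (a point on a sphere) move toward the average of all agents in the ambient space and then renormalize. The complete communication graph, the gain eps, and the initial headings are assumptions for illustration. This simple scheme is only expected to behave well when the agents start close together, for example within a common open hemisphere.

```python
import math


def normalize(v):
    """Scale v to unit Euclidean norm."""
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]


def sphere_consensus(points, eps=0.2, iters=200):
    """Synchronous consensus for unit vectors on a complete communication graph."""
    pts = [normalize(p) for p in points]
    n = len(pts)
    for _ in range(iters):
        total = [sum(p[k] for p in pts) for k in range(len(pts[0]))]
        # Each agent moves toward the others in the ambient space, then projects back.
        pts = [
            normalize([pk + eps * (tk - n * pk) for pk, tk in zip(p, total)])
            for p in pts
        ]
    return pts


# Three agents on the unit circle with headings 0, 0.5, and 1.0 radians.
initial = [[math.cos(a), math.sin(a)] for a in (0.0, 0.5, 1.0)]
final = sphere_consensus(initial)
print(final)
```

The same pattern applies to rotation groups, with normalization replaced by a projection or retraction suited to that manifold.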

Linear Systems

Linear Systems theory provides fundamental tools for analyzing and designing systems governed by linear differential or difference equations, forming the foundation for many control and signal processing applications. Geometric approaches examine system properties through the lens of linear subspaces and transformations, revealing intrinsic structural features that inform controller design and system analysis. Sparsity considerations address scenarios where system matrices have many zero entries due to physical constraints or limited interactions between components, leading to computationally efficient algorithms and insights for large-scale systems like power networks or neural connectivity models. Observability research investigates conditions under which a system’s internal states can be reconstructed from measured outputs, including minimal sensor placement, robustness to noise, and reconstruction algorithms that enable state estimation and monitoring in applications ranging from autonomous vehicles to industrial process control and infrastructure management.
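For a linear time-invariant system with state matrix A and output matrix C, the classical Kalman rank condition says the state can be reconstructed from the outputs exactly when the stacked matrix [C; CA; ...; CA^(n-1)] has rank n. The sketch below checks this condition in plain Python. The helper names, the rank tolerance, and the double-integrator example are assumptions for illustration.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]


def rank(M, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    M = [[float(v) for v in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        pivot = max(range(r, len(M)), key=lambda i: abs(M[i][col]), default=None)
        if pivot is None or abs(M[pivot][col]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][col] / M[r][col]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def is_observable(A, C):
    """Kalman rank test: stack C, CA, ..., CA^(n-1) and compare the rank with n."""
    n = len(A)
    stacked, block = [], C
    for _ in range(n):
        stacked.extend(block)
        block = matmul(block, A)
    return rank(stacked) == n


# Double integrator with state (position, velocity).
A = [[0, 1], [0, 0]]
position_only = is_observable(A, [[1, 0]])   # measure position
velocity_only = is_observable(A, [[0, 1]])   # measure velocity
print(position_only, velocity_only)
```

Measuring position lets velocity be inferred from how position changes, whereas measuring only velocity leaves the absolute position undetermined.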

Publication by Topics

3D Vision, AI in Medicine, Biomedical Image Analysis, Computer Vision, Deep Learning Theory, Dynamic textures, Dynamical Systems, Hybrid system identification, Hybrid Systems, Image, Linear System, Machine Learning, Multi-agent Systems, Optimization, Parsimonious Representation Learning, Trustworthy Machine Learning, Video, Vision and Language

2022

Liangzu Peng; Manolis C. Tsakiris; René Vidal

ARCS: Accurate Rotation and Correspondences Search (Proceedings Article)

In: IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022.
