Research — VIDAL Lab

§ 03·1

Research areas & topics


i.

Machine Learning

Deep learning theory, trustworthy AI, generative models, parsimonious representation learning, continual learning, and optimization for non-convex loss landscapes.

Deep theory
Trustworthy
Generative
Optimization
Continual

ii.

Computer Vision

Vision–language models, human motion understanding, video analysis, image analysis, and geometric 3D vision — with an emphasis on interpretability and generalization.

VLMs
Motion
3D
Video
Interpretable

iii.

Dynamics & Robotics

Methods for analyzing and learning linear, hybrid, and multi-agent systems — from theoretical guarantees to deployment on physical platforms.

Hybrid
Multi-agent
Control
Safety

iv.

AI for Healthcare

Translating advances in AI into clinical systems for radiology, cardiology, neurology, hematology, and surgery — together with clinicians at Penn Medicine and beyond.

Radiology
Cardiology
Surgery
Neurology

Deep Learning Theory

We study the mathematical principles behind modern neural networks. Our lab focuses on understanding the optimization landscape and generalization properties of positively homogeneous networks, analyzing the learning dynamics and implicit bias of gradient-based methods, and providing principled explanations of deep learning phenomena such as neural collapse, low-rank adaptation, and learning on the edge of stability.

Trustworthy AI

We develop AI systems that are interpretable, robust, and reliable for high-stakes applications. Our information pursuit framework enables explainable-by-design models that make predictions from a sequence of informative question–answer pairs. We also analyze when robust classifiers are guaranteed to exist and design adversarial attacks that stress-test visual classifiers and large language models.
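To make the idea of predicting from a sequence of informative question–answer pairs concrete, here is a minimal sketch of a greedy query-selection loop. It is an illustration under stated assumptions, not the lab's implementation: the toy table of binary queries, the names `entropy`, `information_gain`, and `pursue`, and the use of empirical counts to estimate mutual information are all hypothetical choices made for this example.

```python
import math
from collections import Counter

# Hypothetical toy data: each row holds binary answers to three queries.
QUERIES = ["has wings?", "lays eggs?", "lives in water?"]
ROWS = [[1, 1, 0], [1, 1, 0], [0, 1, 1], [0, 1, 1], [0, 0, 0], [0, 0, 0]]
LABELS = ["bird", "bird", "fish", "fish", "mammal", "mammal"]


def entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(rows, labels, q):
    """Empirical mutual information between the answer to query q and the label."""
    n = len(rows)
    conditional = 0.0
    for answer in (0, 1):
        subset = [y for x, y in zip(rows, labels) if x[q] == answer]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def pursue(rows, labels, answer_fn, threshold=0.99):
    """Ask the most informative remaining query until the label posterior is confident.

    answer_fn(q) returns the observed answer (0 or 1) to query q for the test input.
    Returns the predicted label and the list of (query index, answer) pairs asked.
    """
    asked = []
    remaining = list(range(len(rows[0])))
    while remaining:
        # Stop once the empirical posterior of the top label is confident enough.
        top_count = Counter(labels).most_common(1)[0][1]
        if top_count / len(labels) >= threshold:
            break
        # Greedy step: pick the query with the largest information gain.
        q = max(remaining, key=lambda j: information_gain(rows, labels, j))
        a = answer_fn(q)
        asked.append((q, a))
        remaining.remove(q)
        # Condition on the new answer by keeping only consistent examples.
        kept = [(x, y) for x, y in zip(rows, labels) if x[q] == a]
        if not kept:  # answer pattern never seen in the table; stop early
            break
        rows = [x for x, _ in kept]
        labels = [y for _, y in kept]
    prediction = Counter(labels).most_common(1)[0][0]
    return prediction, asked
```

Calling `pursue(ROWS, LABELS, lambda q: [0, 1, 1][q])` returns a prediction together with the question–answer chain that led to it; that chain is what serves as the explanation. Practical systems would replace the count-based estimate with a learned model of the answers and labels.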

Deep Generative Models

We explore the theory and practice of deep generative models — convergence guarantees for generative-model inversion, transformer-based diffusion, flow matching aligned with the multimodal structure of real data, and accurate diffusion-based image restoration.

Parsimonious Representation Learning

We focus on discovering sparse and low-rank structure in high-dimensional data — Generalized PCA (GPCA), Sparse Subspace Clustering (SSC), Dual Principal Component Pursuit (DPCP), and nonconvex matrix and tensor factorization with global-optimality guarantees.

Continual Learning

We build models that learn tasks sequentially without forgetting. Frameworks like the Ideal Continual Learner (ICL) and LoRanPAC unify existing methods, provide theoretical guarantees, and maintain stability over long task sequences.

Optimization

We analyze convergence and global optimality of optimization algorithms for machine learning — subspace clustering, nonconvex matrix and tensor factorization, dictionary learning, generative-model inversion, and deep-learning training. We also study distributed optimization on Riemannian manifolds and accelerated methods.

Vision–Language Models

We develop vision–language models that connect visual data with semantic concepts, including concept-based interpretable classification, knowledge-pursuit prompting for zero-shot multimodal synthesis, and concept-based image-editing methods.

Human Motion Analysis

We develop methods for understanding human motion from video and skeletal data — action recognition, detection, segmentation, pose estimation, and multimodal motion representations supporting recognition, retrieval, and generation.

Video Analysis

We model dynamic visual phenomena in video — dynamic textures via dynamical systems, semantic video segmentation, robust video classification, and perpetual generation of dynamic scenes with 3D consistency.

Image Analysis

Algorithms for image segmentation, alpha matting, semantic segmentation with structured outputs, visual representation learning, and image deblurring with both classical and diffusion-based priors.

Geometric Vision

Geometric methods for recovering 3D structure and motion — multibody motion segmentation, the Hopkins 155 benchmark, robust object pose estimation, and optimization-based geometric estimation via DPCP and IRLS.

Learning and Analyzing Linear, Bilinear and Hybrid Dynamical Systems

Conditions for observability and algorithms for system identification of hybrid systems, realization theory for stochastic jump-Markov and bilinear systems, and identification of linear systems with sparse inputs.

Geometry and Distances of Spaces of Dynamical Systems

Principled distances and similarity measures for comparing dynamical systems, including Binet–Cauchy kernels, group-action-induced distances, and alignment distances, with applications to video comparison and action recognition.

Dynamical-Systems Perspectives on Optimization

Optimization through the lens of dynamical systems — accelerated gradient descent, ADMM, conformal symplectic and relativistic optimization, nonsmooth dynamical systems, and proximal splitting methods.

Distributed Optimization and Consensus on Manifolds

Geometric methods for distributed optimization and consensus — Riemannian motion estimation on the essential manifold, Riemannian centers of mass, and consensus algorithms with theoretical guarantees.

Robotics and Autonomous Systems

Geometric and control-theoretic methods for autonomous robots — vision-based control for autonomous helicopter landing and game-theoretic pursuit–evasion strategies.

Diffusion MRI Analysis

HARDI reconstruction, restoration, registration, and segmentation via information geometry, sparse representation, and non-convex optimization — enabling biomarker discovery for neurological disorders.

Surgical Activity Recognition and Skill Assessment

Bags of spatiotemporal features, sparse HMMs, CRFs, and spatiotemporal deep architectures for surgical-workflow understanding, plus the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS) benchmark.

Stem Cell-Derived Cardiomyocytes

Clustering and classification of cardiac cells based on cell morphology and contractile dynamics — shape analysis via metamorphosis models and recurrent networks for temporal contraction patterns.

Computational Microscopy & Point-of-Care Diagnostics

Holographic reconstruction, hybrid physics-based and deep-learning phase retrieval, and encoder–decoder cell-detection networks for point-of-care diagnostics, including automated complete blood count (CBC) and lensless urinalysis.

Vision for Neurological & Developmental Disorders

Computer-vision methods for analyzing human movement — infant action recognition, motor-tic detection for Tourette assessment, and the CAMI motor-imitation framework as a biomarker for autism.

Medical Vision–Language & Foundation Models

Multimodal AI combining medical images and clinical text — extracting structured medical facts from radiology reports, grounding clinical concepts in chest X-rays, and building interpretable foundation models for radiology and cardiology.

Full publication list on Google Scholar

§ 03·2

Publications by topic


Affiliated centers

IDEAS Center, ASSET Center, GRASP Lab, Penn Engineering, Department of ESE, Penn Medicine

Looking to collaborate?

We welcome collaborations with clinicians, scientists, and engineers tackling problems where mathematical foundations matter.
