CVPR 2012 TUTORIAL/SHORT-COURSE

June 21, Providence, Rhode Island

ON

Differential Geometric Methods for Shape Analysis and Activity Recognition

 

           

Presented By:

 

Fatih Porikli, Distinguished Member Research Staff, MERL Research / Technical Staff; fatih@merl.com

Anuj Srivastava, Professor, Department of Statistics, Florida State University; anuj@stat.fsu.edu

Pavan Turaga, Assistant Professor, Electrical, Computer, and Energy Engineering, Arizona State University; pturaga@asu.edu

Ashok Veeraraghavan, Assistant Professor, Department of Electrical Engineering, Rice University; vashok@rice.edu

                 

 

 

PROGRAM SCHEDULE (PDF Files of the Presentations)

 

TIME | PRESENTER | TOPIC
8:45 – 9:00 | Fatih Porikli | Motivation for using differential geometry in computer vision applications: Part 1 (http://stat.fsu.edu/~anuj/CVPR_Tutorial/Part1.pdf)
9:00 – 9:30 | Anuj Srivastava | Background material from differential geometry: Part 2 (http://stat.fsu.edu/~anuj/CVPR_Tutorial/Part2.pdf)
9:30 – 10:15 | Anuj Srivastava | Shape analysis of objects: Part 3 (http://stat.fsu.edu/~anuj/CVPR_Tutorial/Part3.pdf)
10:15 – 10:30 | | Coffee Break
10:30 – 11:30 | Ashok Veeraraghavan, Pavan Turaga | Activity Recognition Using Manifolds: Part 4 (http://stat.fsu.edu/~anuj/CVPR_Tutorial/Part4.pdf)
11:30 – 12:15 | Fatih Porikli | Application to Tracking/Pedestrian Detection: Part 5 (http://stat.fsu.edu/~anuj/CVPR_Tutorial/Part5.pdf)

 

 

 

 

 

COURSE DESCRIPTION

 

General Background:

 

Nonlinear manifolds arise naturally in problems whose constraints restrict the solution domain to an interesting, structured set. The differential geometry of these constrained spaces, or manifolds, guides us toward more efficient solutions. Besides being mathematically appealing, solutions based on the geometry of the underlying manifold are often faster and more stable than their constrained-optimization counterparts. This fact has been exploited in many branches of science and engineering to develop methodologies, algorithms, and systems.

 

In this tutorial, we will focus on several manifolds, including shape manifolds of planar closed curves, Grassmann manifolds, and manifolds of covariance matrices and affine transformations. In each case, we will provide the mathematical background and demonstrate the use of these manifolds in shape analysis, activity classification, and pedestrian tracking applications.

 

 

This tutorial will focus on the following items:

Shape Analysis of Contours in Video Frames

We will start with general goals and challenges faced in shape analysis, followed by a summary of the basic ideas, strengths and limitations, and applications of different mathematical representations used in shape analysis of 2D and 3D objects. These representations include point sets (landmark-based shape analysis and active shape models), curves, surfaces, level sets, deformable templates, and medial representations.

 

Then, we will take a closer look at shape analysis of parameterized curves, where the central issue is comparing the shapes of curves while treating their parameterizations as nuisance variables. Since the common tool for removing nuisance variables is an algebraic one, we will introduce the concept of quotient spaces of manifolds under the actions of Lie groups. We will discuss the common choices of Riemannian metrics and the tools for computing geodesic paths and geodesic distances for several of these shape representations. We will also introduce the path-straightening algorithm for computing geodesic paths in such shape manifolds.

 

Then, we will study the use of Riemannian frameworks in statistical modeling of variability within shape classes. We will discuss commonly used ideas such as Karcher mean computation, tangent PCA (TPCA), Gaussian or mixture-of-Gaussian models on TPCA, principal geodesic analysis, and hypothesis testing for shape classification using such shape models.
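To make the idea of a Karcher mean concrete, here is a minimal sketch on the unit sphere, which stands in for a shape manifold only because its exponential and logarithm maps have simple closed forms. The function names (sphere_log, sphere_exp, karcher_mean), the step size, and the stopping tolerance are illustrative assumptions, not taken from the tutorial slides. The iteration averages the data in the tangent space at the current estimate and then maps that average back to the manifold.

```python
import numpy as np

def sphere_log(p, q):
    """Log map on the unit sphere: tangent vector at p pointing toward q."""
    cos_theta = np.clip(np.dot(p, q), -1.0, 1.0)
    v = q - cos_theta * p              # component of q orthogonal to p
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:                 # q equals p (or is antipodal): no unique direction
        return np.zeros_like(p)
    return np.arccos(cos_theta) * v / norm_v

def sphere_exp(p, v):
    """Exp map on the unit sphere: follow the geodesic from p along v."""
    norm_v = np.linalg.norm(v)
    if norm_v < 1e-12:
        return p
    return np.cos(norm_v) * p + np.sin(norm_v) * v / norm_v

def karcher_mean(points, step=1.0, tol=1e-10, max_iter=100):
    """Gradient descent for the minimizer of the sum of squared geodesic distances."""
    points = [np.asarray(q, dtype=float) for q in points]
    mu = points[0] / np.linalg.norm(points[0])   # assumed initialization: first sample
    for _ in range(max_iter):
        # Average of the data as seen from the tangent space at mu.
        grad = np.mean([sphere_log(mu, q) for q in points], axis=0)
        if np.linalg.norm(grad) < tol:
            break
        mu = sphere_exp(mu, step * grad)
    return mu

# Four points placed symmetrically around the north pole.
pts = [np.array([np.sin(0.3) * np.cos(a), np.sin(0.3) * np.sin(a), np.cos(0.3)])
       for a in (0.0, np.pi / 2, np.pi, 3 * np.pi / 2)]
mean = karcher_mean(pts)
```

The same loop carries over to other manifolds once their own exponential and logarithm maps are substituted; the sketch assumes the data are concentrated enough for the mean to be unique.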

 

 

 

 

 

 

 

Activity Representation using Stochastic Processes on Shape Manifolds

In this part, we will discuss how several interesting features and models that are commonly used in activity analysis can be studied in the Riemannian setting. We will look at representations such as the shape of the human contour, landmark configurations of the human body, and stick-figure models, together with their associated manifold representations. Then, we will motivate the use of Riemannian geometric tools to study human activity sequences evolving on these varied manifold-valued domains. We will also consider compact parametric models that are commonly used in image and video analysis tasks, such as parametric linear dynamical models, covariance matrices of image patches, and subspace models of image collections.
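As an illustration of how a subspace model can be compared geometrically, the sketch below computes a geodesic distance between two subspaces viewed as points on a Grassmann manifold, using principal angles. The function name grassmann_distance and the choice of the arc-length distance are assumptions made for illustration; the tutorial text does not specify which Grassmann metric is used. The inputs are assumed to have full column rank.

```python
import numpy as np

def grassmann_distance(x, y):
    """Arc-length distance between the column spans of x and y.

    x, y: (n, k) arrays whose columns span k-dimensional subspaces of R^n.
    """
    # Orthonormal bases for the two subspaces.
    qx, _ = np.linalg.qr(np.asarray(x, dtype=float))
    qy, _ = np.linalg.qr(np.asarray(y, dtype=float))
    # Singular values of qx^T qy are the cosines of the principal angles.
    cosines = np.linalg.svd(qx.T @ qy, compute_uv=False)
    angles = np.arccos(np.clip(cosines, 0.0, 1.0))
    return float(np.linalg.norm(angles))

# Example: two planes in R^3 that share one direction and differ in the other.
plane_xy = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
plane_xz = np.array([[1.0, 0.0], [0.0, 0.0], [0.0, 1.0]])
d = grassmann_distance(plane_xy, plane_xz)
```

The distance depends only on the subspaces, not on the bases chosen to represent them, which is the property that makes the Grassmann manifold a natural home for such models.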

 

We will further discuss various classes of algorithms that can be extended to the study of human activities, and the approximations that enable fast and efficient activity recognition algorithms to be devised. We will discuss how several important classes of statistical and machine-learning algorithms, such as hidden Markov models, dynamic time warping, hypothesis testing, regression, and database organization, can be extended to deal with the underlying manifold structures of such datasets. We will present motivating applications from human activity discovery, recognition, search and retrieval, and summarization to illustrate several of these concepts.

 

In order to motivate rate-invariant activity recognition, we will introduce the notion of warping functions, and metrics that remove the execution rate variability in activity analysis.
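The effect of removing execution-rate variability can be sketched with classical dynamic time warping on Euclidean feature sequences. This is a discrete stand-in chosen for illustration, not the warping-function framework or the metrics presented in the tutorial; the function name dtw_distance and the toy sequences are assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping cost between two sequences of feature vectors."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a.reshape(len(a), -1)          # treat scalars as 1-D feature vectors
    b = b.reshape(len(b), -1)
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a, advance in b, or advance in both.
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return float(cost[n, m])

# The same "activity" performed at normal speed and at half speed.
fast = [0, 1, 2, 3]
slow = [0, 0, 1, 1, 2, 2, 3, 3]
same_activity = dtw_distance(fast, slow)          # the warping absorbs the rate change
different = dtw_distance(fast, [0, 1, 2, 4])      # a genuinely different trajectory
```

On manifold-valued sequences, the Euclidean norm in the inner loop would be replaced by the appropriate geodesic distance.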

           

 

 

 

 

 

 

 

Applications in Tracking/Pedestrian Detection

 

 

Finally, we will present an important application area involving visual analytics, where one attempts to detect, track, and recognize objects of interest from multiple videos and, more generally, to interpret object behaviors and actions. We will describe the three main steps of visual analytics: detecting objects and agents, tracking such objects and indicators from frame to frame, and evaluating tracking results to describe and infer semantic events and latent phenomena. The main challenge is variability. A visual detection and tracking system needs to generalize across huge variations in object appearance due, for instance, to viewpoint, pose, facial expression, lighting conditions, imaging quality, or occlusion, while remaining specific enough that it does not claim everything it sees is an object of interest. In addition, these tasks should preferably be performed in real time on conventional computing platforms.

 

The mathematical formulation of certain natural phenomena in video analytics exhibits group structure on topological spaces that resemble Euclidean space only on a small enough scale, which prevents the use of conventional inference methods that require global vector norms. More specifically, such underlying notions emerge in differentiable parameter spaces. We will introduce two Riemannian manifolds, the set of affine transformations and the set of covariance matrices, and their applications in distance computation, motion estimation, object detection, and recognition problems.
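To illustrate why a global vector norm is not the natural way to compare covariance matrices, the sketch below computes the affine-invariant Riemannian distance between two symmetric positive-definite matrices, d(A, B) = sqrt(sum_i log^2 lambda_i), where the lambda_i are the eigenvalues of A^(-1/2) B A^(-1/2). This is one common metric on the covariance manifold and is assumed here for illustration; the function names are hypothetical.

```python
import numpy as np

def spd_inv_sqrt(a):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, v = np.linalg.eigh(a)
    return (v / np.sqrt(w)) @ v.T      # v @ diag(1/sqrt(w)) @ v.T

def affine_invariant_distance(a, b):
    """Geodesic distance between SPD matrices under the affine-invariant metric."""
    s = spd_inv_sqrt(np.asarray(a, dtype=float))
    m = s @ np.asarray(b, dtype=float) @ s
    m = 0.5 * (m + m.T)                # symmetrize against round-off
    lam = np.linalg.eigvalsh(m)
    return float(np.sqrt(np.sum(np.log(lam) ** 2)))

A = np.array([[2.0, 0.5], [0.5, 1.0]])
B = np.array([[1.0, 0.2], [0.2, 3.0]])
G = np.array([[1.0, 2.0], [0.0, 1.0]])   # any invertible change of coordinates

d_ab = affine_invariant_distance(A, B)
d_transformed = affine_invariant_distance(G @ A @ G.T, G @ B @ G.T)
```

The two distances agree up to round-off, which shows that the metric is unchanged by an invertible linear change of feature coordinates, a property the Frobenius norm of A - B does not have.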

 

 

 

    

 

 

 

RELEVANT BIBLIOGRAPHY

 

 

We have prepared a list of relevant papers that use tools from differential geometry to solve problems in shape analysis and activity recognition. Here is a file with the list of papers. If you do not have access to one of the papers, please feel free to send an email to one of us.

List of Papers

 

 

BIOSKETCHES

Fatih Porikli

Fatih Porikli is a Distinguished Scientist at Mitsubishi Electric Research Labs (MERL), Cambridge, USA. He received his PhD from NYU Poly, NY. Before joining MERL in 2000, he developed satellite imaging solutions at Hughes Research Labs and 3D capture and display systems at AT&T Research Labs. His work covers areas including computer vision, machine learning, compressive sensing, video surveillance, multimedia denoising, biomedical vision, radar signal processing, and online learning, with over 100 papers and 60 patents. He received the R&D100 2006 Award in the Scientist of the Year category (select group of winners), in addition to 3 Best Paper Awards and 5 professional prizes. He serves as an Associate Editor of IEEE Signal Processing Magazine (impact factor: 6.0), SIAM Imaging Sciences (2/236 math journal rank), Springer Machine Vision Applications, and EURASIP Image and Video Processing Journal, among others. He served as the General Chair of IEEE AVSS 2010 and on the organizing committees of several other IEEE conferences.

 

Anuj Srivastava

Anuj Srivastava is a Professor of Statistics at Florida State University. He obtained his PhD degree in Electrical Engineering from Washington University in St. Louis in 1996 and was a visiting research associate at the Division of Applied Mathematics at Brown University during 1996-1997. He joined the Department of Statistics at Florida State University in 1997 as an Assistant Professor. He was promoted to Associate Professor in 2003 and to full Professor in 2007. He has been a visiting Professor at INRIA, France and the University of Lille, France. His areas of research include statistics on nonlinear manifolds, statistical image understanding, functional analysis, and statistical shape theory. He has published more than 150 papers in refereed journals and proceedings of refereed international conferences. He has been an associate editor for the Journal of Statistical Planning and Inference (2000-06), the IEEE Transactions on Signal Processing (2004-06), and the IEEE Transactions on Pattern Analysis and Machine Intelligence (2005-2007).

 

Pavan Turaga

Pavan Turaga is an Assistant Professor in the Schools of Arts, Media and Engineering, and Electrical, Computer, and Energy Engineering at Arizona State University. He received the B.Tech. degree from the Indian Institute of Technology Guwahati, India, in 2004, and the M.S. and Ph.D. degrees in electrical engineering from the University of Maryland, College Park, in 2008 and 2009, respectively. He then spent 2 years as a Research Associate at the Center for Automation Research, UMD. His research interests are in statistics and machine learning with applications to computer vision and pattern analysis. His research work includes human activity analysis from videos, video summarization, dynamic scene analysis, and statistical inference on manifolds for these applications. He was awarded the Distinguished Dissertation Fellowship by UMD in 2009 and was selected to participate in the Emerging Leaders in Multimedia Workshop by IBM, New York, in 2008.

 

Ashok Veeraraghavan

Ashok Veeraraghavan received the B.Tech. degree in electrical engineering from the Indian Institute of Technology, Madras, in 2002, and the M.S. and Ph.D. degrees from the Department of Electrical and Computer Engineering, University of Maryland, College Park, in 2004 and 2008, respectively. He was a Research Scientist at Mitsubishi Electric Research Labs, Cambridge, MA during 2008-2011 and is currently an Assistant Professor of Electrical Engineering at Rice University. His research interests are signal, image, and video processing, computer vision, computational photography, compressed sensing, and pattern recognition.