CVPR 2012 TUTORIAL/SHORT-COURSE
June 21, Providence, Rhode Island
ON
Differential Geometric Methods for Shape Analysis and Activity Recognition

Presented By:
Fatih Porikli, Distinguished Member Research Staff, MERL Research / Technical Staff; fatih@merl.com
Anuj Srivastava, Professor, Department of Statistics, Florida State University; anuj@stat.fsu.edu
Pavan Turaga, Assistant Professor, Electrical, Computer, and Energy Engineering, Arizona State University; pturaga@asu.edu
Ashok Veeraraghavan, Assistant Professor, Department of Electrical Engineering, Rice University; vashok@rice.edu
PROGRAM SCHEDULE (PDF Files of the Presentations)
TIME | PRESENTER | TOPIC
8:45 – 9:00 | Fatih Porikli | Motivation for using differential geometry in computer vision applications
9:00 – 9:30 | Anuj Srivastava | Background material from differential geometry
9:30 – 10:15 | Anuj Srivastava | Shape analysis of objects
10:15 – 10:30 | | Coffee Break
10:30 – 11:30 | Ashok Veeraraghavan, Pavan Turaga | Activity Recognition Using Manifolds
11:30 – 12:15 | Fatih Porikli | Application to Tracking/Pedestrian Detection
COURSE DESCRIPTION
General Background:
Nonlinear manifolds arise naturally in problems whose constraints restrict the
solution domain to an interesting, structured set. The differential
geometry of these constrained spaces, or manifolds, guides us to more
efficient solutions. Besides being mathematically appealing, solutions
based on the geometry of the underlying manifolds are often faster and more
stable than their constrained optimization counterparts. This fact has been
exploited in many branches of science and engineering to develop
methodologies, algorithms, and systems.
In this tutorial, we will focus on several manifolds,
including shape manifolds of planar closed curves, Grassmann
manifolds, and manifolds of covariance matrices and affine transformations. In
each case, we will provide a mathematical background and demonstrate the use of
these manifolds in shape analysis, activity classification, and pedestrian
tracking applications.
This tutorial is going to focus on the following
items:
Shape Analysis of Contours in Video Frames
We will start with the general goals and challenges of shape analysis,
followed by a summary of the basic ideas, strengths and limitations, and
applications of the different mathematical representations used in shape
analysis of 2D and 3D objects. These representations include point sets
(landmark-based shape analysis and active shape models), curves, surfaces,
level sets, deformable templates, and medial representations. Then,
we will take a closer look at shape analysis of parameterized curves, where the
central issue is to analyze the shapes of the curves while treating their
parameterizations as nuisance variables. Since the common tool for removing
nuisance variables is an algebraic one, we will introduce the concept of
quotient spaces of manifolds under the actions of Lie groups. We
will discuss the common choices of Riemannian metrics and the tools for
computing geodesic paths and geodesic distances for several of these
shape representations. We will also introduce the path-straightening
algorithm for computing geodesic paths in such shape manifolds. Then,
we will study the use of Riemannian frameworks in statistical modeling of
variability within shape classes. We will discuss commonly used ideas such as
Karcher mean computation, tangent PCA (TPCA),
Gaussian or mixture-of-Gaussian models on TPCA, principal geodesic analysis,
and hypothesis testing for shape classification using such shape models.
Activity Representation using Stochastic Processes on Shape Manifolds
In this part, we will discuss
how several interesting features and models that are commonly used in
activity analysis can be studied in the Riemannian setting. We will look at
representations such as the shape of the human contour, landmark
configurations of the human body, and stick figure models, along with their
associated manifold representations. Then, we will motivate the use of Riemannian
geometric tools to study human activity sequences evolving on these varied
manifold-valued domains. We will also consider compact parametric models that
are commonly used in image and video analysis tasks, such as parametric linear
dynamical models, covariance matrices of image patches, and subspace models
of image collections. We will further discuss various
classes of algorithms that can be extended to the study of human activities,
and the approximations that enable fast and efficient activity recognition
algorithms to be devised. We will discuss how several important classes of
statistical and machine-learning algorithms, such as hidden Markov models,
dynamic time warping, hypothesis testing, regression, and database
organization, can be extended to deal with the underlying manifold
structures of such datasets. We will present motivating applications from
human activity discovery, recognition, search and retrieval, and
summarization to illustrate several of these concepts. To motivate
rate-invariant activity recognition, we will introduce the notion of warping
functions, and metrics that remove the execution-rate variability in activity
analysis.
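To make the idea of removing execution-rate variability concrete, the following is a minimal sketch of classical dynamic time warping, one of the algorithms named above, with the per-frame cost replaced by a geodesic distance. It is a toy under stated assumptions: the "manifold" is the unit circle, and `circle_point`, `circle_dist`, and `dtw_distance` are hypothetical names introduced here. It is not the warping-function metric the presenters develop.

```python
import math

def circle_point(angle):
    """A point on the unit circle, standing in for one manifold-valued frame."""
    return (math.cos(angle), math.sin(angle))

def circle_dist(p, q):
    """Geodesic (arc-length) distance between two unit vectors in the plane."""
    cross = p[0] * q[1] - p[1] * q[0]
    dot = p[0] * q[0] + p[1] * q[1]
    return math.atan2(abs(cross), dot)

def dtw_distance(seq_a, seq_b, dist):
    """Minimum accumulated frame-to-frame cost over monotone alignments."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: advance a only, advance b only, advance both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

# The same trajectory executed at half speed (every frame repeated twice).
walk = [circle_point(t) for t in (0.0, 0.5, 1.0, 1.5)]
slow_walk = [p for p in walk for _ in range(2)]
# A different trajectory that moves the opposite way around the circle.
other = [circle_point(t) for t in (0.0, -0.5, -1.0, -1.5)]

same_activity = dtw_distance(walk, slow_walk, circle_dist)
different_activity = dtw_distance(walk, other, circle_dist)
print(same_activity, different_activity)
```

In this toy case the slowed-down sequence aligns with zero cost while the reversed one does not, which is the behavior a rate-invariant comparison is meant to provide. Any other geodesic distance can be passed in as `dist`.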
Applications in Tracking/Pedestrian Detection
Finally, we will present an important application area, visual
analytics, where one attempts to detect, track, and recognize objects of
interest from multiple videos and, more generally, to interpret object
behaviors and actions. We will describe the three main steps
of visual analytics: detection of objects and agents, tracking of such
objects and indicators from frame to frame, and evaluation of tracking results
to describe and infer semantic events and latent phenomena. The main
challenge is the problem of variability. A visual detection and tracking
system needs to generalize across huge variations in object appearance due,
for instance, to viewpoint, pose, facial expressions, lighting conditions,
imaging quality, or occlusions, while remaining specific enough not to claim
that everything it sees is an object of interest. In addition, these tasks should
preferably be performed in real time on conventional computing platforms.
The mathematical formulations of certain natural phenomena in video analytics
exhibit group structure on topological spaces that resemble Euclidean
space only at a small enough scale, which rules out
conventional inference methods that require global vector norms. More
specifically, such underlying notions emerge in differentiable parameter
spaces. We will introduce two Riemannian manifolds, the set of affine
transformations and the set of covariance matrices, and their applications in
distance computation, motion estimation, object detection, and
recognition problems.
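As an illustration of distance computation on the manifold of covariance matrices, the sketch below evaluates the affine-invariant Riemannian distance, d(A, B) = sqrt(sum_i log^2 lambda_i), where lambda_i are the eigenvalues of A^{-1}B. The text above does not say which metric is used, so this choice is an assumption. The example is restricted to 2x2 matrices so that the eigenvalues have a closed form, and the helper names are hypothetical.

```python
import math

def spd_distance_2x2(a, b):
    """Affine-invariant distance between two 2x2 symmetric positive definite matrices."""
    (a00, a01), (a10, a11) = a
    (b00, b01), (b10, b11) = b
    det_a = a00 * a11 - a01 * a10
    det_b = b00 * b11 - b01 * b10
    if a00 <= 0 or b00 <= 0 or det_a <= 0 or det_b <= 0:
        raise ValueError("both inputs must be symmetric positive definite")
    # Trace and determinant of inv(A) @ B, using inv(A) = [[a11, -a01], [-a10, a00]] / det_a.
    trace = (a11 * b00 - a01 * b10 - a10 * b01 + a00 * b11) / det_a
    det = det_b / det_a
    # Eigenvalues of inv(A) @ B are real and positive; clamp rounding error.
    disc = max(trace * trace - 4.0 * det, 0.0)
    lam_max = 0.5 * (trace + math.sqrt(disc))
    lam_min = det / lam_max  # avoids cancellation in (trace - sqrt(disc)) / 2
    return math.sqrt(math.log(lam_max) ** 2 + math.log(lam_min) ** 2)

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def congruence(g, m):
    """Return G M G^T, the action of an invertible matrix G on a covariance M."""
    g_t = [[g[j][i] for j in range(2)] for i in range(2)]
    return matmul(matmul(g, m), g_t)

A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, 0.2], [0.2, 0.5]]
G = [[1.5, -0.4], [0.3, 2.0]]  # any invertible matrix

d_before = spd_distance_2x2(A, B)
d_after = spd_distance_2x2(congruence(G, A), congruence(G, B))
print(d_before, d_after)  # the two values agree up to rounding
```

The final two lines check the property that gives the metric its name: applying the same invertible linear change of coordinates to both covariance matrices leaves their distance unchanged, which a plain Euclidean norm of A - B does not guarantee.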
RELEVANT BIBLIOGRAPHY
We have prepared a list of some relevant papers that use tools from
differential geometry for solving problems in shape analysis and activity
recognition. Here is a file with the list of papers. In case you do not have
access to one of the papers, please feel free to send an email to one of us.
BIOSKETCHES
Fatih Porikli
Fatih Porikli is a Distinguished Scientist at Mitsubishi Electric Research Labs (MERL), Cambridge, USA. He received his PhD from NYU Poly, NY, and before joining MERL in 2000 he developed satellite imaging solutions at Hughes Research Labs and 3D capture and display systems at AT&T Research Labs. His work covers areas including computer vision, machine learning, compressive sensing, video surveillance, multimedia denoising, biomedical vision, radar signal processing, and online learning, with over 100 papers and 60 patents. He received the R&D100 2006 Award in the Scientist of the Year category (select group of winners), in addition to 3 Best Paper Awards and 5 professional prizes. He serves as an Associate Editor of IEEE Signal Processing Magazine (impact factor: 6.0), SIAM Imaging Sciences (2/236 math journal rank), Springer Machine Vision Applications, and EURASIP Image and Video Processing Journal, among others. He served as the General Chair of IEEE AVSS 2010 and on the organizing committees of several other IEEE conferences.
Anuj Srivastava
Anuj Srivastava is a Professor of Statistics at Florida
State University. He obtained his PhD degree in Electrical Engineering from
Washington University in St. Louis in 1996 and was a visiting research
associate at the Division of Applied Mathematics at Brown University during
1996-1997. He joined the Department of Statistics at Florida State
University in 1997 as an Assistant Professor. He was promoted to Associate
Professor in 2003 and to full Professor in 2007. He has
been a visiting Professor at INRIA, France, and the University of Lille, France.
His areas of research include statistics on nonlinear manifolds, statistical
image understanding, functional analysis, and statistical shape theory. He has
published more than 150 papers in refereed journals and proceedings of refereed
international conferences. He has been an associate editor for the Journal of
Statistical Planning and Inference (2000-06), the IEEE Transactions on
Signal Processing (2004-06), and the IEEE Transactions on Pattern Analysis and
Machine Intelligence (2005-2007).
Pavan Turaga
Pavan Turaga is an Assistant Professor in the schools of Arts,
Media, Engineering, and Electrical and Computer Engineering at Arizona State
University. He received the B.Tech. degree from the Indian
Institute of Technology Guwahati, India, in 2004, and the M.S. and Ph.D.
degrees in electrical engineering from the University of Maryland, College Park,
in 2008 and 2009, respectively. He then spent 2 years as a Research Associate at
the Center for Automation Research, UMD. His research interests are in statistics
and machine learning with applications to computer vision and pattern analysis.
His research work includes human activity analysis from videos, video
summarization, dynamic scene analysis, and statistical inference on manifolds
for these applications. He was awarded the Distinguished Dissertation
Fellowship by UMD in 2009, and was selected to participate in the Emerging
Leaders in Multimedia Workshop by IBM, New York, in 2008.
Ashok Veeraraghavan
Ashok Veeraraghavan received the B.Tech. degree in electrical engineering from
the Indian Institute of Technology, Madras, in 2002, and the M.S. and Ph.D.
degrees from the Department of Electrical and Computer Engineering, University
of Maryland, College Park, in 2004 and 2008, respectively. He was a Research
Scientist at Mitsubishi Electric Research Labs, Cambridge, MA during 2008-2011
and is currently an Assistant Professor of Electrical Engineering at Rice
University. His research interests are signal, image, and video processing,
computer vision, computational photography, compressed sensing, and pattern
recognition.