Abstract
Static and temporally varying 3D invariants are proposed for capturing the spatio-temporal dynamics of a general human action, enabling its representation in a compact, view-invariant manner. Two variants of the representation are presented and studied: (1) a restricted-3D version, whose theory and implementation are simple and efficient but which applies only to a restricted class of human actions, and (2) a full-3D version, whose theory and implementation are more complex but which applies to any general human action. A detailed analysis of the two representations is presented. We show why a straightforward implementation of the key ideas does not work well in the general case, and present strategies designed to overcome inherent weaknesses in the approach. The result is an approach to human action modeling and recognition that is not only invariant to viewpoint, but also robust enough to handle different people, different speeds of action (and hence, frame rates), and minor variability in a given action, while encoding sufficient distinction among actions. Results on 2D projections of human motion-capture data and on manually segmented real image sequences demonstrate the effectiveness of the approach.
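As a toy illustration of the kind of view-invariant quantity the abstract refers to, the sketch below computes the classical cross-ratio of four collinear image points, a standard projective invariant that is unchanged under perspective projection of the underlying 3D points from any viewpoint. This is not the paper's model-based or mutual invariants, which are not detailed in the abstract; the function name and point ordering are illustrative assumptions only.

```python
import numpy as np

def cross_ratio(p1, p2, p3, p4):
    """Cross-ratio of four (approximately) collinear 2D image points.

    For collinear 3D points, this value is preserved under any
    perspective projection, i.e. it is independent of camera viewpoint.
    Illustrative sketch only; not the invariants used in the paper.
    """
    pts = np.asarray([p1, p2, p3, p4], dtype=float)
    d = pts[3] - pts[0]                 # direction of the image line
    d /= np.linalg.norm(d)
    t = (pts - pts[0]) @ d              # signed positions along the line
    # CR(p1, p2; p3, p4) = (t3 - t1)(t4 - t2) / ((t3 - t2)(t4 - t1))
    return ((t[2] - t[0]) * (t[3] - t[1])) / ((t[2] - t[1]) * (t[3] - t[0]))
```

Because the cross-ratio depends only on the projective structure of the points, the same value is obtained from images of the same collinear configuration taken from different cameras, which is the sense of "view invariance" used above.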
Original language | English (US) |
---|---|
Pages (from-to) | 294-324 |
Number of pages | 31 |
Journal | Computer Vision and Image Understanding |
Volume | 98 |
Issue number | 2 |
DOIs | |
State | Published - May 2005 |
Externally published | Yes |
Keywords
- Human action recognition
- Model-based invariants
- Mutual invariants
- View invariance
ASJC Scopus subject areas
- Software
- Signal Processing
- Computer Vision and Pattern Recognition