Embedding Task Structure for Action Detection

Michael Peven, Gregory D. Hager

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We present a straightforward, flexible method to enhance the accuracy and quality of action detection by expressing temporal and structural relationships of actions in the loss function of a deep network. We describe ways to represent otherwise implicit structure in video data and demonstrate how these structures reflect natural biases that improve network training. Our experiments show that our approach improves both accuracy and edit-distance of action recognition and detection models over a baseline. Our framework leads to improvements over prior work and obtains state-of-the-art results on multiple benchmarks. The code is available.

Original language: English (US)
Title of host publication: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 6590-6599
Number of pages: 10
ISBN (Electronic): 9798350318920
State: Published - Jan 3 2024
Event: 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024 - Waikoloa, United States
Duration: Jan 4 2024 → Jan 8 2024

Publication series

Name: Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024

Conference

Conference: 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024
Country/Territory: United States
City: Waikoloa
Period: 1/4/24 → 1/8/24

Keywords

  • Algorithms: Machine learning architectures, formulations, and algorithms
  • Algorithms: Video recognition and understanding

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Science Applications
  • Computer Vision and Pattern Recognition
