How Effective are Self-supervised Models for Contact Identification in Videos

Malitha Gunawardhana*, Limalka Sadith, Liel David, Daniel Harari, Muhammad Haris Khan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Applying Self-Supervised Learning (SSL) models to video content has opened an active field of study with both complex challenges and unique opportunities. Despite the growing body of research, the ability of SSL models to detect physical contact in videos remains largely unexplored, particularly the effectiveness of downstream supervision methods such as linear probing and full fine-tuning. This work aims to bridge this gap by employing eight different convolutional neural network (CNN)-based video SSL models to identify instances of physical contact within video sequences. The Something-Something v2 (SSv2) and Epic-Kitchen (EK-100) datasets were chosen for the evaluation because these models have shown promising results on UCF101 and HMDB51 but have had limited prior assessment on SSv2 and EK-100. Additionally, these datasets feature diverse environments and scenarios, which are essential for testing the robustness and accuracy of video-based models. The study examines not only the effectiveness of each model in recognizing physical contact but also its performance on the action recognition downstream task, contributing insights into the adaptability of SSL models in interpreting complex, dynamic visual information.

Original language: English
Title of host publication: Human Activity Recognition and Anomaly Detection - 4th International Workshop, DL-HAR 2024, and 1st International Workshop, ADFM 2024, Held in Conjunction with IJCAI 2024, Revised Selected Papers
Editors: Kuan-Chuan Peng, Yizhou Wang, Ziyue Li, Zhenghua Chen, Min Wu, Jianfei Yang, Sungho Suh
Publisher: Springer Science and Business Media B.V.
Pages: 117-131
Number of pages: 15
ISBN (Print): 9789819790029
DOIs
Publication status: Published - 2025
Event: 4th International Workshop on Deep Learning for Human Activity Recognition, DL-HAR 2024, and 1st International Workshop on Anomaly Detection with Foundation Models, ADFM 2024, Held in Conjunction with the International Joint Conference on AI, IJCAI 2024 - Jeju, Korea, Republic of
Duration: 3 Aug 2024 → 9 Aug 2024

Publication series

Series: Communications in Computer and Information Science
Volume: 2201 CCIS
ISSN: 1865-0929

Conference

Conference: 4th International Workshop on Deep Learning for Human Activity Recognition, DL-HAR 2024, and 1st International Workshop on Anomaly Detection with Foundation Models, ADFM 2024, Held in Conjunction with the International Joint Conference on AI, IJCAI 2024
Country/Territory: Korea, Republic of
City: Jeju
Period: 3/8/24 → 9/8/24

All Science Journal Classification (ASJC) codes

  • General Computer Science
  • General Mathematics
