Video 360 Analytics for First Responders

People

Timeline

  • Fall 2022–present

Project Description

  • 360° and 2D Video Analytics in Network- and Energy-Constrained Environments: Real-time video analytics enables rapid, automated content understanding, significantly reducing the time required to search through captured footage. However, many real-world scenarios that could benefit from real-time analytics face constraints that remain under-explored. This project focuses on one such use case: firefighter training. Firefighters rely on video for decision-making, post-mission feedback, and developing new training scenarios. However, deploying cameras in outdoor training environments presents two major challenges: limited network connectivity, because cameras are far from Wi-Fi infrastructure, and restricted energy availability, because cameras run on battery power. To investigate the network limitations, we conducted field tests at the Illinois Fire Service Institute (IFSI) to analyze connectivity from various locations on the grounds. We also developed a streaming framework supporting multiple video codecs and systematically evaluated their impact on bandwidth usage and 360° video streaming performance under real-world conditions at IFSI. To address energy-efficient video processing, we developed EcoLens, a system that dynamically optimizes processing configurations to minimize the camera's energy consumption while preserving the video features essential for deep learning inference. We first conducted an extensive offline evaluation of configurations comprising device CPU frequency, frame-filtering features, difference thresholds, and video bitrates to establish a priori knowledge of their impact on energy consumption and inference accuracy. Leveraging this insight, we introduced an online system that employs multi-objective Bayesian optimization to intelligently explore and adapt configurations in real time. Our approach continuously refines processing settings to meet a target inference accuracy with minimal edge-device energy expenditure. Experimental results demonstrate the system's effectiveness in reducing video processing energy use while maintaining high analytical performance, offering a practical solution for smart devices and edge computing applications. A conceptual sketch of this online configuration search appears after the project list.

  • Efficient 360-Degree Action Detection and Summarization Framework: Effective training and debriefing are critical in high-stakes, mission-critical environments such as disaster response, military simulations, and industrial safety, where precision and minimizing errors are paramount. Traditional post-training analysis relies on manually reviewing 2D videos, a time-consuming process that lacks comprehensive situational awareness. To address these limitations, we introduce ACT360, a system that leverages 360-degree videos and machine learning for automated action detection and structured debriefing. ACT360 integrates 360YOWO, an enhanced You Only Watch Once (YOWO) model with spatial attention and equirectangular-aware convolution (EAC) to mitigate panoramic video distortions. To enable deployment in resource-constrained environments, we apply quantization and model pruning, reducing the model size by 74% while maintaining robust accuracy (an mAP drop of only 1.5 points, from 0.865 to 0.850) and improving inference speed. We validate our approach on a publicly available dataset of 55 labeled 360-degree videos covering seven key operational actions, recorded across various real-world training sessions and environmental conditions. Additionally, ACT360 integrates 360AIE (Action Insight Explorer), a web-based interface for automatic action detection, retrieval, and textual summarization using large language models (LLMs), significantly enhancing post-incident analysis efficiency. ACT360 serves as a generalized framework for mission-critical debriefing, incorporating EAC, spatial attention, summarization, and model optimization. These innovations apply to any training environment requiring lightweight action detection and structured post-exercise analysis. A sketch of the wrap-around padding idea behind EAC appears after the project list.
  • ST-360: Spatial–Temporal Filtering-Based Low-Latency 360-Degree Video Analytics Framework: Recent advances in computer vision algorithms and video streaming technologies have facilitated the development of edge-server-based video analytics systems, enabling them to process sophisticated real-world tasks such as traffic surveillance and workspace monitoring. Meanwhile, thanks to their omnidirectional recording capability, 360-degree cameras have been proposed to replace traditional cameras in video analytics systems to offer enhanced situational awareness. Yet we found that providing an efficient 360-degree video analytics framework is a non-trivial task. Due to the higher resolution and geometric distortion of 360-degree videos, existing video analytics pipelines fail to meet the performance requirements for end-to-end latency and query accuracy. To address these challenges, we introduce ST-360, a framework specifically designed for 360-degree video analytics. It features a spatial-temporal filtering algorithm that reduces both data transmission and computational workloads. Evaluation of ST-360 on a unique dataset of 360-degree first-responder videos shows that it yields accurate query results with a 50% reduction in end-to-end latency compared to state-of-the-art methods. An illustrative sketch of camera-side spatial-temporal filtering appears after the project list.
  • Latency-Aware 360-Degree Video Analytics Framework for First Responders Situational Awareness: First responders operate in hazardous conditions with unpredictable risks. To better prepare for the demands of the job, first-responder trainees conduct training exercises that are recorded and reviewed by instructors, who check the recordings for objects that indicate risk (e.g., a firefighter with an unfastened gas mask). However, the traditional reviewing process is inefficient because the recordings are unanalyzed and situational awareness is limited. A better reviewing experience calls for a latency-aware Viewing and Query Service (VQS) that supports object search, which can be achieved using video object detection algorithms. Meanwhile, 360-degree cameras provide an unrestricted field of view of the training environment. Yet this medium poses a major challenge: low-latency, high-accuracy 360-degree object detection is difficult due to the higher resolution and geometric distortion. We present the Responders-360 system architecture designed for 360-degree object detection and propose a Dynamic Selection algorithm that optimizes computation resources while yielding accurate 360-degree object inference. Results on a unique dataset collected from a firefighting training institute show that Responders-360 achieves a 4x speedup and a 25% reduction in memory usage compared with state-of-the-art methods.
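
The EcoLens search loop described above can be pictured with a short code sketch. The snippet below is a minimal illustration under stated assumptions, not the published implementation: it assumes a small discrete configuration space (CPU frequency, frame-difference threshold, bitrate), uses a synthetic profile_config() stand-in for deploying a configuration and measuring energy and accuracy on the device, and replaces the full multi-objective Bayesian optimization with two independent Gaussian-process surrogates and a simple feasibility-aware acquisition rule (lowest optimistic energy among configurations predicted to meet the accuracy target).

    # Minimal sketch of an EcoLens-style online configuration search.
    # Assumptions: discrete knobs, a synthetic profile_config(), and a
    # simplified acquisition rule standing in for multi-objective BO.
    from itertools import product
    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    CPU_FREQS = [0.6e9, 1.0e9, 1.4e9]       # Hz (illustrative values)
    DIFF_THRESHOLDS = [0.01, 0.05, 0.10]    # frame-difference thresholds
    BITRATES = [1e6, 2e6, 4e6]              # bps
    CANDIDATES = np.array(list(product(CPU_FREQS, DIFF_THRESHOLDS, BITRATES)))
    TARGET_ACC = 0.85                        # assumed accuracy target

    def profile_config(cfg):
        """Stand-in for running the pipeline under cfg on the camera and
        measuring (energy_J, accuracy); toy model for illustration only."""
        freq, thresh, bitrate = cfg
        energy = 0.5 * freq / 1e9 + 0.3 * bitrate / 1e6 - 2.0 * thresh
        acc = 0.70 + 0.05 * freq / 1e9 + 0.04 * bitrate / 1e6 - 0.8 * thresh
        return energy, min(acc, 1.0)

    def search(n_init=5, n_iters=20, seed=0):
        rng = np.random.default_rng(seed)
        tried = list(rng.choice(len(CANDIDATES), size=n_init, replace=False))
        energy, acc = map(list, zip(*(profile_config(CANDIDATES[i]) for i in tried)))
        gp_energy = GaussianProcessRegressor(normalize_y=True)
        gp_acc = GaussianProcessRegressor(normalize_y=True)
        for _ in range(n_iters):
            X = CANDIDATES[tried]
            gp_energy.fit(X, energy)
            gp_acc.fit(X, acc)
            mu_e, sd_e = gp_energy.predict(CANDIDATES, return_std=True)
            mu_a, sd_a = gp_acc.predict(CANDIDATES, return_std=True)
            # Among configs optimistically predicted to meet the accuracy
            # target, probe the one with the lowest optimistic energy.
            score = np.where(mu_a + sd_a >= TARGET_ACC, mu_e - sd_e, np.inf)
            score[tried] = np.inf                # do not re-profile a config
            nxt = int(np.argmin(score))
            e, a = profile_config(CANDIDATES[nxt])
            tried.append(nxt); energy.append(e); acc.append(a)
        ok = [i for i, a in enumerate(acc) if a >= TARGET_ACC]
        return CANDIDATES[tried[min(ok, key=lambda i: energy[i])]] if ok else None

    print(search())  # lowest-energy configuration found that meets the target

In the actual system, each probe corresponds to running the camera pipeline under that configuration and measuring real energy and accuracy, which is why a sample-efficient Bayesian-optimization search is attractive in this setting.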

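The equirectangular-aware convolution (EAC) in 360YOWO targets the distortions of panoramic frames; one well-known ingredient of such layers is that the left and right borders of an equirectangular image are physically adjacent. The PyTorch layer below is a hedged sketch of that idea only (horizontal wrap-around padding plus vertical replication before a standard convolution); the actual EAC layer and the spatial-attention module in ACT360 may differ in detail.

    # Sketch of an "equirectangular-aware" convolution: wrap-around padding in
    # longitude (width) and replication at the poles (height) before a normal
    # convolution. Illustrative only; not the ACT360/360YOWO implementation.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class EquirectConv2d(nn.Module):
        def __init__(self, in_ch, out_ch, kernel_size=3):
            super().__init__()
            self.pad = kernel_size // 2
            self.conv = nn.Conv2d(in_ch, out_ch, kernel_size, padding=0)

        def forward(self, x):                  # x: (N, C, H, W) equirectangular
            x = F.pad(x, (self.pad, self.pad, 0, 0), mode="circular")    # wrap W
            x = F.pad(x, (0, 0, self.pad, self.pad), mode="replicate")   # clamp H
            return self.conv(x)

    # Example: four 480x960 equirectangular frames, 3 -> 16 channels.
    out = EquirectConv2d(3, 16)(torch.randn(4, 3, 480, 960))
    print(out.shape)  # torch.Size([4, 16, 480, 960])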

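The spatial-temporal filtering idea in ST-360 can be illustrated with a simple camera-side routine. The sketch below reflects assumptions rather than the ST-360 code: a temporal filter skips frames that barely differ from the last transmitted frame, and a spatial filter keeps only the equirectangular tiles in which the change is concentrated, so both the data sent to the edge server and the detector workload shrink.

    # Illustrative spatial-temporal filter (not the ST-360 implementation):
    # drop near-duplicate frames, then keep only the tiles that changed.
    import numpy as np

    def spatiotemporal_filter(frames, temporal_thresh=8.0, tile_thresh=12.0,
                              grid=(4, 8)):
        """Yield (frame_index, active_tiles) for frames worth transmitting.
        `frames` is an iterable of grayscale H x W numpy arrays."""
        prev = None
        rows, cols = grid
        for idx, frame in enumerate(frames):
            if prev is None:                      # always send the first frame
                prev = frame
                yield idx, [(r, c) for r in range(rows) for c in range(cols)]
                continue
            diff = np.abs(frame.astype(np.float32) - prev.astype(np.float32))
            if diff.mean() < temporal_thresh:     # temporal filter: skip frame
                continue
            prev = frame
            th, tw = frame.shape[0] // rows, frame.shape[1] // cols
            active = [(r, c)
                      for r in range(rows) for c in range(cols)
                      if diff[r*th:(r+1)*th, c*tw:(c+1)*tw].mean() >= tile_thresh]
            yield idx, active                     # spatial filter: changed tiles only

Only the tiles returned for a frame would then be encoded and streamed to the edge server for 360-degree object detection.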

Publications

  • Benjamin Civjan, Bo Chen, Rui-Xiao Zhang, Klara Nahrstedt, EcoLens: Leveraging Multi-Objective Bayesian Optimization for Energy-Efficient Video Processing on Edge Devices, 11th IEEE International Conference on Smart Computing (SMARTCOMP), 2025. 
  • Aditi Tiwari, Klara Nahrstedt, ACT360: An Efficient 360-Degree Action Detection and Summarization Framework for Mission-Critical Training and Debriefing, 11th IEEE International Conference on Smart Computing (SMARTCOMP), 2025.
  • Jiaxi Li, Jingwei Liao, Bo Chen, Anh Nguyen, Aditi Tiwari, Qian Zhou, Zhisheng Yan, Klara Nahrstedt, ST-360: Spatial–Temporal Filtering-Based Low-Latency 360-Degree Video Analytics Framework, ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), September 2024.
  • Jiaxi Li, Jingwei Liao, Bo Chen, Anh Nguyen, Aditi Tiwari, Qian Zhou, Zhisheng Yan, Klara Nahrstedt, Latency-Aware 360-Degree Video Analytics Framework for First Responders Situational Awareness, Proceedings of the 33rd Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV '23), ACM, 2023, pp. 8–14.

Funding

This project is supported by the National Science Foundation (NSF). 

Code 

EcoLens: https://github.com/bencivjan/ecolens

IFSI Onsite Experiments: https://github.com/bencivjan/distributed-360-streaming