Introduction

As the number of buildings grows, incidents of falling from heights have become increasingly frequent. Accurately detecting individuals at the edges of buildings in surveillance videos is crucial for issuing early warnings. However, this task is challenging due to varying lighting conditions, heavy occlusion, and the small size of person instances captured at long distances. There is therefore strong demand for methods for Person Detection at the Edges of Buildings (PDEB) in surveillance videos. In this work, we construct a new dataset, named EBPersons, specifically for PDEB. EBPersons consists of 1,314 videos shot in over 300 scenes under various lighting conditions. Moreover, to establish a benchmark, we propose a new approach for the PDEB task that exploits the scale distribution of the training data and the temporal context in videos. We conduct extensive experiments on EBPersons to compare our method with other detectors, including generic object detectors, pedestrian detectors, and video object detectors. The results show that our method outperforms these detectors, providing an effective baseline for future research on PDEB.

Examples