Efficient Labelling of Pedestrian Supervisions


  • Kyaw Kyaw Htike, UCSI University


Object detection is fundamental to intelligent visual perception by computers because objects are the basic building blocks of higher-level image understanding. Among the many categories of objects in the real world, pedestrians are among the most important because of the potential benefits of successful pedestrian detection. State-of-the-art pedestrian detectors are often trained with supervised machine learning algorithms, which require costly and tedious manual annotation of pedestrians in the form of precise bounding boxes. In this paper, a novel weakly supervised learning algorithm is proposed that trains a pedestrian detector using only annotations of the estimated centres of pedestrians instead of bounding boxes. The algorithm uses a pedestrian prior learnt in an unsupervised way from the video, and this prior is fused with the given weak supervision information in a systematic manner. By evaluating on publicly available datasets, we demonstrate that our weakly supervised algorithm reduces the cost of manual annotation of pedestrians by a factor of more than four while achieving performance similar to that of a pedestrian detector trained with standard bounding box annotations.
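
To make the idea of fusing a centre annotation with a pedestrian prior concrete, here is a minimal sketch. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It assumes, purely for illustration, that the prior supplies candidate boxes with a score in (0, 1], and that an annotator's click is modelled as a Gaussian around the true centre. The function name `fuse_centre_with_prior`, the candidate format, and the `sigma` parameter are all hypothetical.

```python
import math


def fuse_centre_with_prior(centre, candidates, sigma=10.0):
    """Pick the candidate box that best agrees with an annotated centre.

    centre:     (cx, cy) estimated pedestrian centre in pixels.
    candidates: list of (x, y, w, h, prior) tuples; `prior` is a hypothetical
                score in (0, 1] from an unsupervised pedestrian prior.
    sigma:      assumed tolerance in pixels between click and true centre.
    Returns (best_box, best_score), or (None, -1.0) if no candidates.
    """
    cx, cy = centre
    best_box, best_score = None, -1.0
    for x, y, w, h, prior in candidates:
        # Centre of the candidate box.
        bx, by = x + w / 2.0, y + h / 2.0
        # Squared distance between box centre and annotated centre.
        d2 = (bx - cx) ** 2 + (by - cy) ** 2
        # Gaussian agreement between the weak annotation and the box.
        centre_likelihood = math.exp(-d2 / (2.0 * sigma ** 2))
        # Illustrative fusion rule: product of the two cues.
        score = prior * centre_likelihood
        if score > best_score:
            best_box, best_score = (x, y, w, h), score
    return best_box, best_score


if __name__ == "__main__":
    candidates = [
        (0, 0, 20, 40, 0.9),    # strong prior, far from the click
        (40, 10, 20, 40, 0.6),  # moderate prior, centred on the click
        (45, 12, 20, 40, 0.3),  # weak prior, near the click
    ]
    box, score = fuse_centre_with_prior((50, 30), candidates)
    print(box, round(score, 3))
```

In this toy setting the box with the strongest prior loses to a box that agrees with the click, which is the kind of behaviour one would expect from a cue fusion step. The boxes selected this way could then stand in for manual bounding boxes when training a detector.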


Object detection, Pedestrian detection, Weakly supervised learning, Cue fusion

Author Biography

Kyaw Kyaw Htike, UCSI University

Kyaw Kyaw Htike is currently an Assistant Professor at the School of Information Technology, UCSI University, Malaysia. He holds a PhD (2014) in Computer Vision from the University of Leeds, UK. His thesis was on developing novel algorithms for domain adaptation applied to pedestrian detection. He spent a year working as a Research Fellow at the University of Leeds on the project "Learning to Recognise Dynamic Visual Content from Broadcast Footage" in collaboration with the University of Oxford and the University of Surrey. His research interests include object detection, feature extraction, and integrating different supervision levels in the learning process to solve image recognition problems.



