A video dataset designed for studying fine-grained categorisation of pedestrians is introduced. Pedestrians were recorded “in-the-wild” from a moving vehicle. Annotations include bounding boxes, tracks, 14 keypoints with occlusion information, and the fine-grained categories of age (5 classes), sex (2 classes), weight (3 classes) and clothing style (4 classes). In total, there are 27,454 bounding box and pose labels across 4,222 tracks. The dataset is designed to train and test algorithms for fine-grained categorisation of people; it is also useful for benchmarking tracking, detection and pose estimation of pedestrians. State-of-the-art algorithms for fine-grained classification and pose estimation were tested on the dataset, and the results are reported as a performance baseline.
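To make the annotation contents concrete, the following is a minimal sketch of how one label could be represented in Python. It is not the dataset's actual file format: the class name `PedestrianLabel`, all field names, the (x, y, width, height) box convention and the use of integer class indices are assumptions made for illustration. Only the counts (14 keypoints; 5, 2, 3 and 4 classes) come from the description above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Counts taken from the dataset description.
NUM_KEYPOINTS = 14
NUM_CLASSES = {"age": 5, "sex": 2, "weight": 3, "clothing": 4}


@dataclass
class PedestrianLabel:
    """Hypothetical container for one annotated pedestrian in one frame."""

    track_id: int                            # assumed: identifier of the track
    frame: int                               # assumed: frame index in the video
    bbox: Tuple[float, float, float, float]  # assumed order: x, y, width, height
    keypoints: List[Tuple[float, float]]     # 14 (x, y) positions
    occluded: List[bool]                     # one occlusion flag per keypoint
    age: int                                 # class index in [0, 5)
    sex: int                                 # class index in [0, 2)
    weight: int                              # class index in [0, 3)
    clothing: int                            # class index in [0, 4)

    def __post_init__(self):
        # Check that there is exactly one position and one flag per keypoint.
        if len(self.keypoints) != NUM_KEYPOINTS:
            raise ValueError("expected 14 keypoints")
        if len(self.occluded) != NUM_KEYPOINTS:
            raise ValueError("expected 14 occlusion flags")
        # Check that every category index is within its class count.
        for name, count in NUM_CLASSES.items():
            value = getattr(self, name)
            if not 0 <= value < count:
                raise ValueError(f"{name} index {value} is outside [0, {count})")
```

Validating the counts at construction time is a design choice of this sketch: it catches a mismatched keypoint list or an out-of-range category early when converting from the real annotation files.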
Download the set of images that have been annotated in the videos. Download the annotations and load them with Python's pickle module. Use the code to view the annotations overlaid on each frame. The original videos are also available.
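Loading the annotations can be sketched as follows. This is a minimal example under stated assumptions: the file name `annotations.pkl` is hypothetical, the helper names `load_annotations` and `summarise` are illustrative, and nothing is assumed about the internal layout of the pickled object, so the sketch only reports its top-level type and size. The `encoding="latin1"` argument is a precaution in case the file was written under Python 2. Because unpickling can execute arbitrary code, only load files from a source you trust.

```python
import pickle
from pathlib import Path


def load_annotations(path):
    """Load a pickled annotation object from `path`."""
    with Path(path).open("rb") as f:
        # encoding="latin1" lets Python 3 read pickles written by Python 2;
        # it is harmless for pickles written by Python 3.
        return pickle.load(f, encoding="latin1")


def summarise(annotations):
    """Return the top-level type name and, if it has one, the length."""
    size = len(annotations) if hasattr(annotations, "__len__") else None
    return type(annotations).__name__, size


# Example usage (hypothetical file name, adjust to the downloaded file):
# annotations = load_annotations("annotations.pkl")
# print(summarise(annotations))
```

Inspecting the top-level type first is a cautious way to discover the structure before indexing into it.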