Introduction

What are the typical ranges of motion for human arms? What types of leg movements tend to correlate with specific shoulder positions? How can we expect the arms to move given the current body pose? Our goal is to address these questions by recovering a set of "basis poses" that summarize the variability of movements in a given collection of static poses captured from images at various viewing angles.

One of the main difficulties of studying human movement is that it is a priori unrestricted, except for physically imposed joint-angle limits, which have been studied in medical textbooks, typically for a limited number of configurations. Furthermore, human movement may be distinguished into movemes, actions, and activities depending on structure, complexity, and duration. Movemes refer to the simplest meaningful pattern of motion: a short, target-oriented trajectory that cannot be further decomposed, e.g. "reach", "grasp", "step", "kick".

Extensive studies have been carried out on human action and activity recognition; however, little attention has been paid to movemes, since human behavior is difficult to analyze at such a fine scale of dynamics. In this paper, our primary goal is to learn a basis space that smoothly captures movemes from a collection of two-dimensional images, although the learned representation can also aid higher-level reasoning.


Contributions

Given a collection of static joint locations from images taken at any angle of view, we learn a factorization into a basis pose matrix U and a coefficient matrix V. The learned basis poses U are rotation-invariant and can be applied globally across the range of viewing angles. A sparse linear combination of the learned basis accurately reconstructs the pose of a human involved in an action at any angle of view, including poses not contained in the training set. In summary, the main contributions of our paper are:
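
To make the factorization concrete, the sketch below reconstructs a single 2D pose from a sparse coefficient vector and the basis pose matrix. It is a minimal NumPy illustration, assuming the basis poses are stored as 3D joint displacements that are rotated about the vertical axis by the viewing angle and then orthographically projected to 2D; the exact parameterization used in the paper may differ, and the names below (reconstruct_pose_2d, U, v) are illustrative only.

    import numpy as np

    def rotation_about_y(angle_rad):
        # 3x3 rotation about the vertical (y) axis by the given angle.
        c, s = np.cos(angle_rad), np.sin(angle_rad)
        return np.array([[c, 0.0, s],
                         [0.0, 1.0, 0.0],
                         [-s, 0.0, c]])

    def reconstruct_pose_2d(U, v, viewing_angle_rad):
        # U: (3*J, K) basis pose matrix (J joints, K basis poses).
        # v: (K,) sparse coefficient vector for one pose.
        J = U.shape[0] // 3
        pose_3d = (U @ v).reshape(J, 3)                    # linear combination of basis poses
        rotated = pose_3d @ rotation_about_y(viewing_angle_rad).T
        return rotated[:, :2]                              # orthographic projection to 2D

    # Toy example: 14 joints, 20 basis poses, only 3 active coefficients.
    rng = np.random.default_rng(0)
    U = rng.normal(size=(3 * 14, 20))
    v = np.zeros(20)
    v[[2, 7, 11]] = rng.normal(size=3)
    pose_2d = reconstruct_pose_2d(U, v, np.deg2rad(45.0))  # (14, 2) array of joint locations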


Results

We analyze the flexibility and usefulness of the proposed model in a variety of application domains and experiments. In particular, we evaluate (i) the performance of the learned representation for supervised learning tasks such as activity classification; (ii) whether the learned representation captures enough semantics for meaningful manifold traversal and visualization; and (iii) the robustness to initialization and the generalization error. Finally, we provide a qualitative visualization of the learned embedding of the manifold of human motion. Collectively, the results suggest that our approach is effective at capturing rotation-invariant semantics of the underlying data.
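
As an illustration of evaluation (i), the learned per-pose coefficients can be used directly as features for a standard linear classifier. The snippet below is only a sketch with synthetic placeholder arrays (V, labels) standing in for the learned coefficient matrix and the activity labels; it is not the exact evaluation pipeline used in the paper.

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score

    # Placeholder data: in practice V holds the learned coefficients (one row
    # per pose) and `labels` the activity label of the corresponding image.
    rng = np.random.default_rng(0)
    V = rng.normal(size=(500, 20))           # N = 500 poses, K = 20 basis poses
    labels = rng.integers(0, 5, size=500)    # 5 hypothetical activity classes

    clf = LogisticRegression(max_iter=1000)
    scores = cross_val_score(clf, V, labels, cv=5)
    print("mean cross-validated accuracy: %.3f" % scores.mean())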


Comprehensive LSP (CO_LSP) Dataset

We extended the LSP dataset with annotations of the viewing angle each depicted subject is facing. We also include the 3D locations of the 14 body joints, obtained by running an off-the-shelf 2D-to-3D pose estimation algorithm [2], along with the 2D annotations from the original LSP dataset [1]. We redistribute these comprehensive annotations in JSON format as the Comprehensive LSP (CO_LSP) Dataset, along with a Python API to load, manipulate, and visualize the annotations.
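
For orientation, here is a minimal sketch of reading the annotations with Python's standard json module. The file name and field names below (keypoints_2d, keypoints_3d, view_angle) are illustrative placeholders rather than the actual schema, which is documented in the README and exposed through the Python API.

    import json

    # Hypothetical file name and field names; see the README for the real schema.
    with open("co_lsp_annotations.json") as f:
        annotations = json.load(f)

    first = annotations[0]                  # assuming a list with one entry per image
    keypoints_2d = first["keypoints_2d"]    # 14 (x, y) joint locations from LSP [1]
    keypoints_3d = first["keypoints_3d"]    # 14 (x, y, z) locations lifted with [2]
    view_angle = first["view_angle"]        # viewing angle of the subject's torso

    print(len(keypoints_2d), len(keypoints_3d), view_angle)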


We collected high-quality viewing-angle annotations for each pose in LSP. Although these annotations are not necessary for training, we used them to demonstrate the robustness of our model to poor angle initialization, and that it can in fact recover the ground-truth value. Three annotators evaluated each image and were instructed to indicate, through the annotation GUI shown in the figure, the direction the torso of the depicted subject was facing. The standard deviation of the reported angle of view, averaged over the whole dataset, is 12 degrees, and more than half of the images have a deviation of less than 10 degrees, indicating very high annotator agreement on the task.
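
For reference, one way to measure per-image agreement on angles that wrap around 360 degrees is the circular standard deviation; the short sketch below illustrates it on a hypothetical image with three annotations, and is not necessarily the exact statistic behind the numbers above.

    import numpy as np

    def circular_std_deg(angles_deg):
        # Circular standard deviation, in degrees, of a set of angles.
        a = np.deg2rad(np.asarray(angles_deg, dtype=float))
        # Length of the mean of the unit vectors pointing at each angle.
        r = np.hypot(np.mean(np.cos(a)), np.mean(np.sin(a)))
        return np.rad2deg(np.sqrt(-2.0 * np.log(r)))

    # Three annotators reporting the torso direction for the same image.
    print(circular_std_deg([350.0, 5.0, 12.0]))  # small despite wrapping past 0 degrees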

Example visualization for an annotation in the dataset:

Notes:


Download

We provide a Python implementation of our algorithm and the three new sets of annotations for the LSP dataset in JSON format. Full details are described in the Dataset section and in the README files.

Notes:


Cite

If you find our paper, data, or code useful in your work, please cite:

@inproceedings{DBLP:conf/icdm/RonchiKY16,
author = {Matteo Ruggero Ronchi and Joon Sik Kim and Yisong Yue},
title = {A Rotation Invariant Latent Factor Model for Moveme Discovery from Static Poses},
booktitle = {IEEE 16th International Conference on Data Mining, {ICDM} 2016, December 12-15, 2016, Barcelona, Spain},
pages = {1179--1184},
year = {2016},
crossref = {DBLP:conf/icdm/2016},
url = {https://doi.org/10.1109/ICDM.2016.0156},
doi = {10.1109/ICDM.2016.0156},
timestamp = {Mon, 11 Feb 2019 17:32:48 +0100},
biburl = {https://dblp.org/rec/bib/conf/icdm/RonchiKY16},
bibsource = {dblp computer science bibliography, https://dblp.org}
}


Contact

© 2016, Matteo Ruggero Ronchi, Joon Sik Kim, and Yisong Yue

