We are More than Our Joints

Predicting How 3D Bodies Move

"We are more than our joints", or MOJO for short, is a solution to stochastic motion prediction of expressive 3D bodies. Given a short motion from the past, MOJO generates diverse plausible motions in the near future.

[Teaser figure: given a short past motion, MOJO predicts diverse plausible future body motions.]



Abstract

A key step towards understanding human behavior is the prediction of 3D human motion. Successful solutions have many applications in human tracking, HCI, and graphics. Most previous work focuses on predicting a time series of future 3D joint locations given a sequence of 3D joints from the past. This Euclidean formulation generally works better than predicting pose in terms of joint rotations. Body joint locations, however, do not fully constrain 3D human pose, leaving degrees of freedom (like rotation about a limb) undefined, making it hard to animate a realistic human from only the joints. Note that the 3D joints can be viewed as a sparse point cloud. Thus the problem of human motion prediction can be seen as a problem of point cloud prediction. With this observation, we instead predict a sparse set of locations on the body surface that correspond to motion capture markers. Given such markers, we fit a parametric body model to recover the 3D body shape and pose of the person. These sparse surface markers also carry detailed information about human movement that is not present in the joints, increasing the naturalness of the predicted motions. Using the AMASS dataset, we train MOJO (More than Our JOints), a novel variational autoencoder with a latent DCT space that generates motions from latent frequencies. MOJO preserves the full temporal resolution of the input motion, and sampling from the latent frequencies explicitly introduces high-frequency components into the generated motion. We note that motion prediction methods accumulate errors over time, resulting in joints or markers that diverge from true human bodies. To address this, we exploit the body model and fit SMPL-X to the predictions at each time step, projecting the solution back onto the space of valid bodies. These valid markers are then propagated in time. Quantitative and qualitative experiments show that our approach produces state-of-the-art results and realistic 3D body animations.
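
The per-step reprojection described above can be sketched as a simple autoregressive rollout. The helpers below (predict_next_markers, fit_body_model, extract_markers) are hypothetical stand-ins for the learned predictor and the SMPL-X fitting stage, not MOJO's actual API:

    import numpy as np

    def rollout_with_reprojection(motion_seed, predict_next_markers,
                                  fit_body_model, extract_markers, horizon=60):
        # motion_seed: list of (num_markers, 3) marker frames from the past.
        history = list(motion_seed)
        bodies = []
        for _ in range(horizon):
            # One step of the (hypothetical) learned predictor: raw markers
            # that may have drifted away from any valid human body.
            raw_markers = predict_next_markers(np.stack(history))
            # Fit the body model (e.g. SMPL-X) to project the prediction
            # back onto the space of valid bodies ...
            body = fit_body_model(raw_markers)
            # ... and read the markers back off the fitted body surface.
            valid_markers = extract_markers(body)
            # Propagate the *valid* markers, so per-step fitting errors
            # cannot accumulate over the recurrence.
            history.append(valid_markers)
            bodies.append(body)
        return bodies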



MOJO in a nutshell:

  • A body surface marker-based representation. Compared to joint locations, body surface markers carry richer information about body shape and constrain more of the body's degrees of freedom. Compared to joint rotations, markers live in Euclidean space, which is easier for neural networks to learn. (See the marker sketch after this list.)
  • A conditional VAE with latent frequencies. Sampling from latent frequencies explicitly introduces high-frequency components, so the generated motion looks more realistic; a frequency-domain sketch follows this list. Combined with DLow as an advanced sampler in the latent space, MOJO produces highly diverse future motions from the same motion seed.
  • A recursive marker reprojection scheme. At test time, this scheme recovers body meshes from the generated markers: by reprojecting the markers onto the fitted body mesh at each time step, it keeps them in the valid body space and thereby eliminates the error accumulation of the recurrent network (see the rollout sketch after the abstract).
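
To make the marker representation of the first bullet concrete, here is a minimal sketch of markers as a fixed subset of mesh vertices. The vertex indices are placeholders for illustration, not the marker layout MOJO actually uses:

    import numpy as np

    # Hypothetical marker layout: each marker is pinned to one vertex of the
    # body mesh. These indices are placeholders, not MOJO's actual marker set.
    MARKER_VERTEX_IDS = np.array([411, 1536, 3042, 4500, 5678])

    def mesh_to_markers(vertices):
        # vertices: (num_vertices, 3) body mesh -> (num_markers, 3) markers.
        return vertices[MARKER_VERTEX_IDS]

    def motion_to_markers(vertex_sequence):
        # (T, num_vertices, 3) mesh sequence -> (T, num_markers, 3) markers.
        return vertex_sequence[:, MARKER_VERTEX_IDS]

Unlike joint rotations, these marker trajectories are plain Euclidean point clouds; unlike joints alone, a sufficiently dense marker set also pins down rotations about the limbs.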
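
The latent-frequency idea from the second bullet can be sketched as follows: encode the motion into a per-frame latent sequence, transform it along time with a DCT, sample in the frequency domain, and invert. The dimensions and noise scale below are made up for illustration; this is not MOJO's trained model:

    import numpy as np
    from scipy.fft import dct, idct

    rng = np.random.default_rng(0)
    T, D = 60, 128                    # frames and latent width (made up)
    z = rng.standard_normal((T, D))   # stand-in for per-frame encoder latents

    # Transform the latent sequence along the time axis into DCT coefficients,
    # one per frequency band; the full temporal resolution is preserved.
    z_freq = dct(z, type=2, norm='ortho', axis=0)

    # Sampling in frequency space perturbs *all* bands, so high-frequency
    # components are injected explicitly instead of being smoothed away.
    z_freq_sampled = z_freq + 0.1 * rng.standard_normal(z_freq.shape)

    # Invert back to a full-temporal-resolution latent sequence for decoding.
    z_sampled = idct(z_freq_sampled, type=2, norm='ortho', axis=0)

In MOJO, DLow then acts as the sampler that draws a diverse set of such latent codes from the same motion seed.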


Citation


Yan Zhang, Michael J. Black, Siyu Tang
We are More than Our Joints: Predicting how 3D Bodies Move
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2021

   @inproceedings{zhang2021mojo,
     title={We are More than Our Joints: Predicting how 3D Bodies Move},
     author={Zhang, Yan and Black, Michael J and Tang, Siyu},
     booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
     pages={3372--3382},
     year={2021}
   }

Team


Yan Zhang


Michael J. Black


Siyu Tang