We are More than Our Joints

Predicting How 3D Bodies Move

"We are more than our joints", or MOJO for short, is a method for stochastic motion prediction of expressive 3D bodies. Given a short sequence of past motion, MOJO generates diverse, plausible motions for the near future.

MOJO in a nutshell:

  • A body surface marker-based representation. Compared to joint locations, body surface markers carry richer information about body shape and constrain more of the body's degrees of freedom. Compared to joint rotations, markers lie in Euclidean space, which makes them easier for neural networks to learn.
  • A conditional VAE with latent frequencies. With latent frequencies, the generated motion contains more high-frequency components and hence looks more realistic. With DLow as an advanced sampler in the latent space, MOJO produces highly diverse future motions from the same motion seed.
  • A recursive marker reprojection scheme. This scheme recovers body meshes from the generated markers at test time. Reprojecting the markers onto the mesh template at each time step keeps them in the valid body space and hence can eliminate the error accumulation of the recurrent network.
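The recursive reprojection idea in the last point can be illustrated with a toy sketch. This is not MOJO's implementation: the "valid body space" is assumed here to be a linear subspace spanned by a hypothetical orthonormal `BASIS`, the orthogonal projection in `reproject` stands in for reprojecting markers onto the mesh template, and `toy_step` is a made-up predictor that adds a small off-space error at every step. All names are illustrative.

```python
from typing import Callable, List, Optional

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def reproject(markers: Vector, basis: List[Vector]) -> Vector:
    """Orthogonal projection onto span(basis); basis must be orthonormal.

    Assumption: a stand-in for reprojecting markers onto the mesh template.
    """
    out = [0.0] * len(markers)
    for b in basis:
        coeff = dot(markers, b)
        out = [o + coeff * bi for o, bi in zip(out, b)]
    return out


def off_space_error(markers: Vector, basis: List[Vector]) -> float:
    """Euclidean distance between the markers and the valid subspace."""
    projected = reproject(markers, basis)
    return sum((m - p) ** 2 for m, p in zip(markers, projected)) ** 0.5


def rollout(seed: Vector,
            step: Callable[[Vector], Vector],
            n_steps: int,
            basis: Optional[List[Vector]] = None) -> List[Vector]:
    """Run the predictor recursively; reproject after each step if a basis is given."""
    frames = []
    current = seed
    for _ in range(n_steps):
        current = step(current)
        if basis is not None:
            # Pull the prediction back into the valid space before feeding it on.
            current = reproject(current, basis)
        frames.append(current)
    return frames


def toy_step(markers: Vector) -> Vector:
    """Hypothetical predictor: valid motion along x plus a constant error along z."""
    return [markers[0] + 0.1, markers[1], markers[2] + 0.05]


# Assumed valid space: the x-y plane of a 3-D toy "marker" vector.
BASIS = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
SEED = [0.0, 1.0, 0.0]

free = rollout(SEED, toy_step, 10)             # errors accumulate
constrained = rollout(SEED, toy_step, 10, BASIS)  # errors removed every step

print("without reprojection:", off_space_error(free[-1], BASIS))
print("with reprojection:   ", off_space_error(constrained[-1], BASIS))
```

In this toy setting the unconstrained rollout drifts about 0.5 units away from the valid subspace after ten steps, while the reprojected rollout stays on it, because each step's error is removed before it can be fed back into the predictor.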

The marker-based representation has been leveraged in these related projects of motion capture and synthesis:

Video

Citation


Yan Zhang, Michael J. Black, Siyu Tang
We are More than Our Joints: Predicting how 3D Bodies Move
In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021

    @inproceedings{zhang2021mojo,
      title={We are More than Our Joints: Predicting how 3D Bodies Move},
      author={Zhang, Yan and Black, Michael J and Tang, Siyu},
      booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
      pages={3372--3382},
      year={2021}
    }

Team

Yan Zhang

Michael J. Black

Siyu Tang