We propose GAMMA, an automatic and scalable solution for populating 3D scenes with diverse digital humans. The digital humans have 1) varied body shapes, 2) realistic and perpetual motions to reach goals, and 3) plausible body-ground contact.
In collaboration with Gramazio Kohler Research, GAMMA is implemented in NVIDIA Omniverse and lets 200 virtual humans inhabit a 600-meter-high digital vertical city. As shown on the left, the entire system runs on the fly for about 8 hours per day. Virtual humans of various identities are continuously placed at random locations and move spontaneously in the scene. Curated by the Norman Foster Foundation and others, this work is currently exhibited at the Guggenheim Museum, Bilbao. Our research is also featured in the front-page news of ETH Zurich.
We implemented our method on Microsoft HoloLens and developed an app to populate a real environment. The example on the left is in the ETH main building. The app includes a frontend on the edge device that scans the environment and renders the human bodies, and a backend that generates human motions in the cloud. By placing waypoints on the floors, we can guide different human bodies to move spontaneously. As shown below, virtual humans wander across multiple floors of the ETH main building lobby. Code is released here!

Our goal is to populate digital environments, in which the digital humans have diverse body shapes, move perpetually, and have plausible body-scene contact. The core challenge is to generate realistic, controllable, and infinitely long motions for diverse 3D bodies. To this end, we propose generative motion primitives via body surface markers, shortened as GAMMA. In our solution, we decompose the long-term motion into a time sequence of motion primitives. We exploit body surface markers and a conditional variational autoencoder to model each motion primitive, and generate long-term motion by applying the generative model recursively. To control the motion to reach a goal, we apply a policy network to explore the model's latent space, and use a tree-based search to preserve the motion quality during testing. Experiments show that our method can produce more realistic and controllable motion than state-of-the-art data-driven methods. Combined with conventional path-finding algorithms, the generated human bodies can move realistically in the scene over long distances and long durations.
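
The recursive generation and tree-based search described above can be illustrated with a toy sketch. This is not the released GAMMA code: the body is reduced to a single 2D point instead of surface markers, and the names `decode_primitive`, `sample_latents`, and `generate_motion`, as well as all constants, are hypothetical stand-ins for the CVAE decoder, the policy network, and the search procedure. The cost here is only the distance to the goal, whereas the actual method also accounts for motion quality.

```python
import numpy as np

FRAMES = 10  # frames per motion primitive (assumed value for this toy)


def decode_primitive(state, z):
    """Hypothetical stand-in for the CVAE decoder.

    Maps the current state (a 2D position) and a latent code z to the
    next FRAMES positions by taking a bounded constant step per frame.
    """
    step = 0.05 * np.tanh(z)
    return state + np.outer(np.arange(1, FRAMES + 1), step)


def sample_latents(state, goal, rng, n):
    """Hypothetical stand-in for the policy network.

    Returns n latent codes drawn from a Gaussian whose mean points
    from the current state toward the goal.
    """
    direction = goal - state
    mean = direction / (np.linalg.norm(direction) + 1e-8)
    return mean + 0.5 * rng.standard_normal((n, 2))


def generate_motion(start, goal, num_primitives=20, branch=8, keep=4, seed=0):
    """Roll out primitives recursively and prune with a tree-style search.

    Every kept trajectory is expanded with `branch` sampled primitives;
    only the `keep` trajectories ending closest to the goal survive.
    """
    rng = np.random.default_rng(seed)
    kept = [start[None, :]]
    for _ in range(num_primitives):
        candidates = []
        for traj in kept:
            state = traj[-1]  # the last frame seeds the next primitive
            for z in sample_latents(state, goal, rng, branch):
                candidates.append(np.vstack([traj, decode_primitive(state, z)]))
        candidates.sort(key=lambda t: float(np.linalg.norm(t[-1] - goal)))
        kept = candidates[:keep]
    return kept[0]


start = np.array([0.0, 0.0])
goal = np.array([3.0, 4.0])
trajectory = generate_motion(start, goal)
```

Because each primitive is conditioned only on the last frame of the previous one, the loop can in principle run for any number of primitives, which is the property that makes perpetual motion possible in the setting described above.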



@inproceedings{zhang2022wanderings,
   title={The Wanderings of Odysseus in 3D Scenes},
   author={Zhang, Yan and Tang, Siyu},
   booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
   pages={20481--20491},
   year={2022}
   }