
AI researchers translate language into physical movement

Carnegie Mellon University AI researchers have created an agent that translates words into physical movement. Called Joint Language-to-Pose, or JL2P, the approach combines natural language with 3D pose models. The joint embedding for pose forecasting is learned end to end with curriculum learning, a training approach that begins with shorter, easier sequences before moving on to harder objectives.
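The article describes the approach only at a high level. As a rough illustration, a joint language-to-pose model of this kind might pair a sentence encoder with an autoregressive pose decoder that share a single embedding vector, along the lines of the sketch below. The module names, layer choices, and dimensions are assumptions for illustration, not the published JL2P architecture.

```python
import torch
import torch.nn as nn

class JointLanguageToPose(nn.Module):
    """Illustrative sketch of a joint language-to-pose model.

    A sentence is encoded into one fixed-size vector; a recurrent decoder
    then unrolls that vector into a sequence of 3D poses. Names and sizes
    here are assumptions, not the authors' implementation.
    """

    def __init__(self, vocab_size, embed_dim=300, hidden_dim=256, pose_dim=63):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)
        self.sentence_encoder = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.pose_decoder = nn.LSTMCell(pose_dim, hidden_dim)
        self.to_pose = nn.Linear(hidden_dim, pose_dim)  # joint angles / positions

    def forward(self, token_ids, num_frames, start_pose):
        # Encode the complete sentence into a single embedding vector.
        _, (h, _) = self.sentence_encoder(self.word_embed(token_ids))
        h = h[-1]                        # (batch, hidden_dim)
        c = torch.zeros_like(h)
        pose = start_pose                # (batch, pose_dim)
        poses = []
        # Autoregressively predict one pose per frame, conditioned on the sentence.
        for _ in range(num_frames):
            h, c = self.pose_decoder(pose, (h, c))
            pose = self.to_pose(h)
            poses.append(pose)
        return torch.stack(poses, dim=1)  # (batch, num_frames, pose_dim)
```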

JL2P animations are limited to stick figures today, but the ability to translate words into human-like movement could someday help humanoid robots perform physical tasks in the real world or assist creatives in animating virtual characters for things like video games or movies.

JL2P is in line with previous work that turns words into imagery, like Microsoft’s ObjGAN, which sketches images and storyboards from captions; Disney’s AI that uses words in a script to create storyboards; and Nvidia’s GauGAN, which lets users paint landscapes using paintbrushes labeled with words like “trees,” “mountain,” or “sky.”

JL2P can animate actions like walking or running, playing musical instruments (such as a guitar or violin), following directional instructions (left or right), or changing speed (fast or slow). The work, originally detailed in a paper published on arXiv on July 2, will be presented by coauthor and CMU Language Technologies Institute graduate research assistant Chaitanya Ahuja on September 19 at the International Conference on 3D Vision in Quebec City, Canada.

“We first optimize the model to predict 2 time steps conditioned on the complete sentence,” the paper reads. “This easy task helps the model learn very short pose sequences, like leg motions for walking, hand motions for waving, and torso motions for bending. Once the loss on the validation set starts increasing, we move on to the next stage in the curriculum. The model is now given twice the [number] of poses for prediction.”
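Read literally, that schedule starts the model on two-frame predictions and doubles the prediction horizon each time validation loss stops improving. A minimal sketch of that loop, assuming hypothetical train_epoch and validation_loss callables supplied by the caller (neither is part of any published JL2P code), might look like this:

```python
def curriculum_train(model, train_epoch, validation_loss, max_frames=32):
    """Sketch of the curriculum schedule described in the paper.

    `train_epoch(model, num_frames)` and `validation_loss(model, num_frames)`
    are hypothetical callables supplied by the caller; only the doubling
    schedule itself follows the quoted description.
    """
    horizon = 2                          # stage 1: predict only 2 time steps
    while horizon <= max_frames:
        best_val = float("inf")
        while True:
            train_epoch(model, num_frames=horizon)
            val = validation_loss(model, num_frames=horizon)
            if val >= best_val:          # validation loss started increasing,
                break                    # so move to the next curriculum stage
            best_val = val
        horizon *= 2                     # give the model twice as many poses
    return model
```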

The researchers report that JL2P achieves a 9% improvement in human motion modeling over the previous state of the art, a model proposed by SRI International researchers in 2018.

JL2P is trained on the KIT Motion-Language Dataset. Introduced in 2016 by the High Performance Humanoid Technologies lab in Germany, the data set pairs roughly 11 hours of recorded human movement with more than 6,200 English sentences that are approximately eight words long.
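In other words, each training example pairs one recorded motion clip with one short English description. A minimal, hypothetical way to represent such a pair in code (the field names and shapes below are assumptions, not the dataset's actual schema) could be:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class MotionLanguagePair:
    """One hypothetical motion-description pair, in the spirit of the
    KIT Motion-Language Dataset (field names are illustrative only)."""
    description: str    # e.g. "A person walks four steps forward."
    poses: np.ndarray   # (num_frames, pose_dim) array of joint values
    fps: float          # frames per second of the recording

sample = MotionLanguagePair(
    description="A person walks four steps forward.",
    poses=np.zeros((120, 63)),   # placeholder: 120 frames of a 63-dim pose
    fps=30.0,
)
```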
