A Customizable, Digital Avatar That Can Express Emotions
March 27, 2013

Lee Rannals for redOrbit.com — Your Universe Online

The Wizard of Oz's trickery has been unpacked by a team of researchers at the University of Cambridge in the UK.

Researchers built a virtual "talking head" known as Zoe, capable of expressing human emotions on demand with "unprecedented realism."

Zoe can express a full range of human emotions and could be used as a digital personal assistant, or to replace texting with "face messaging." The virtual face can convey emotions such as happiness, anger and fear, and Zoe's voice changes to match, adding the appropriate level of emotion to what she is saying.

Users can type in any message, specifying the desired emotion, and the face recites the text. The designers say it is the most expressive, controllable avatar ever created.

The face is of Zoe Lister, an actress known as Zoe Carpenter in the series Hollyoaks. In order to recreate her face, the team spent several days recording Zoe's speech and facial expressions. They were able to develop a system that is light enough to work in mobile technology, and could be used as a personal assistant in smartphones.

With the help of the real-life Zoe Lister, the team collected a dataset of thousands of sentences, which they used to train the speech model. They also tracked Lister's face with computer vision software while she spoke these sentences. Mathematical algorithms were then applied to produce the voice and image data the team needed to recreate expressions on a digital face.

The template behind Zoe could eventually let people upload their own faces and voices in a matter of seconds. With this template, future users would be able to customize and personalize their own digital assistants.

Once Zoe is available on smartphones, users would be able to send a message like "I'm going to be late" tagged with a "frustrated" emotion. Their friends would then receive a "face message" that looks like the sender and delivers the message in a frustrated tone.

The researchers are looking for applications for Zoe and are working with a school for autistic and deaf children, where the technology could be used to help pupils "read" emotions and lip-read.

“This technology could be the start of a whole new generation of interfaces which make interacting with a computer much more like talking to another human being,” Professor Roberto Cipolla, from the Department of Engineering, University of Cambridge, said. “It took us days to create Zoe, because we had to start from scratch and teach the system to understand language and expression. Now that it already understands those things, it shouldn't be too hard to transfer the same blueprint to a different voice and face.”

The program that runs Zoe is just tens of megabytes in size, so it could easily be incorporated into small devices such as tablets and smartphones. Zoe's voice has six basic settings: Happy, Sad, Tender, Angry, Afraid and Neutral. Each setting can be adjusted to different levels, as can the pitch, speed and depth of the voice. By combining these levels, users can pre-set or create an almost unlimited number of emotional combinations. The designers say, for example, that combining speed, anger and fear makes Zoe sound as if she is panicking.
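To make the idea of combining settings concrete, here is a minimal Python sketch of how such a control panel might be represented. It is not the Cambridge team's code: the `VoiceSetting` class, the `face_message` helper, the 0.0 to 1.0 scale and the specific numbers are all assumptions made up for illustration. Only the six emotion names and the pitch, speed and depth controls come from the description above.

```python
from dataclasses import dataclass, field

# The six basic settings named in the article.
EMOTIONS = ("happy", "sad", "tender", "angry", "afraid", "neutral")


def _check_level(name: str, level: float) -> None:
    """Reject levels outside the assumed 0.0 to 1.0 scale."""
    if not 0.0 <= level <= 1.0:
        raise ValueError(f"{name} must be between 0.0 and 1.0, got {level}")


@dataclass
class VoiceSetting:
    """Hypothetical bundle of emotion levels plus voice controls."""
    emotions: dict = field(default_factory=dict)  # emotion name -> level
    pitch: float = 0.5
    speed: float = 0.5
    depth: float = 0.5

    def __post_init__(self) -> None:
        for name, level in self.emotions.items():
            if name not in EMOTIONS:
                raise ValueError(f"unknown emotion: {name!r}")
            _check_level(name, level)
        for name in ("pitch", "speed", "depth"):
            _check_level(name, getattr(self, name))


def face_message(text: str, setting: VoiceSetting) -> dict:
    """Pair a typed message with the expression it should be spoken in."""
    return {
        "text": text,
        "emotions": dict(setting.emotions),
        "pitch": setting.pitch,
        "speed": setting.speed,
        "depth": setting.depth,
    }


# Illustrative "panic" preset: speed, anger and fear turned up together.
panic = VoiceSetting(emotions={"angry": 0.7, "afraid": 0.9}, speed=0.9)
message = face_message("I'm going to be late", panic)
```

Storing the levels as plain numbers is what would let a user save a preset such as `panic` and reuse it for any typed message.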

To test how effective the system was, the researchers gave volunteers a video or audio clip of a single sentence from the test set. Ten sentences were evaluated by 20 different people, who were asked to identify the emotion being conveyed. Volunteers given both audio and video identified the emotion correctly 77 percent of the time. The researchers used the real-life Zoe as a control, and her success rate under the same test was just 73 percent.
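A success rate of this kind is simply the share of judgments in which the emotion a volunteer picked matches the emotion that was intended. The short Python sketch below shows that calculation. The `recognition_rate` function and the four toy judgments are invented for illustration and are not the study's data or the researchers' scoring code.

```python
def recognition_rate(judgments):
    """Return the fraction of (intended, perceived) pairs that match."""
    if not judgments:
        raise ValueError("at least one judgment is required")
    correct = sum(1 for intended, perceived in judgments if intended == perceived)
    return correct / len(judgments)


# Made-up toy judgments: three of the four volunteers guess correctly.
toy_judgments = [
    ("happy", "happy"),
    ("sad", "sad"),
    ("angry", "afraid"),
    ("afraid", "afraid"),
]
rate = recognition_rate(toy_judgments)
print(f"{rate:.0%}")
```

The same function, applied separately to clips of the avatar and clips of the real actress, would yield the two percentages being compared.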

“Present-day human-computer interaction still revolves around typing at a keyboard or moving and pointing with a mouse,” Cipolla added. “For a lot of people, that makes computers difficult and frustrating to use. In the future, we will be able to open up computing to far more people if they can speak and gesture to machines in a more natural way. That is why we created Zoe - a more expressive, emotionally responsive face that human beings can actually have a conversation with.”

Avatar technologies have come a long way, even in just a year. In 2012, Japanese researchers announced the development of MH-2, a wearable miniature humanoid that acts as a virtual presence device. The robot allows friends or family to interact with the wearer through it, so that the wearer does not feel alone. However, unlike Zoe, the device is bulky and requires a backpack.