Reducing Video Chat Bandwidth by Studying the Human Gaze

Originally published November 19, 2013

By Nadia M. Whitehead

UTEP News Service

If you’re not hooked up to WiFi, video chatting on your smartphone can be a pretty choppy experience. Users often run into frustrating problems like highly pixelated or frozen frames, lagging video or audio, and connection losses.

Mobile video chat programs such as FaceTime, Skype and Google Hangouts use a lot of bandwidth, which is the amount of data you can send in a given period of time, usually limited by your smartphone coverage plan.

“Constant video transmission is very expensive,” said Nigel Ward, Ph.D., professor of computer science at The University of Texas at El Paso. “When you’re using something like Skype, it repeatedly takes pictures of you and sends them. It [takes and sends pictures] about thirty times per second.”

Nigel Ward, Ph.D., professor of computer science, video chats on his computer. The computer scientist is developing an idea to improve user experience and reduce bandwidth consumption when using mobile video chat. Photo by J.R. Hernandez / UTEP News Service

This constant transfer of large amounts of data over mobile networks, where bandwidth is limited and costly, can cause the annoying problems users experience in video chat.

To help prevent the glitches and save users money, Ward recently came up with an idea that may dramatically improve mobile video chat by reducing the amount of bandwidth the chats consume.

Countless studies have shown that people tend to look away while they’re talking, especially right before they start a sentence or when they pause to think about what to say next.

“During conversations, Americans look away every 10 to 20 seconds for about an average of two seconds,” Ward said. “This turns out to be a significant chunk of time that we’re looking away from each other.”
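To get a rough sense of how large that chunk is, the figures in the quote can be turned into a share of conversation time. The following is a back-of-the-envelope sketch only. It assumes that "every 10 to 20 seconds" means one look-away begins every 10 to 20 seconds, which is an interpretation of the quote, and the function name is made up for illustration.

```python
def look_away_fraction(period_s: float, duration_s: float = 2.0) -> float:
    """Fraction of conversation time spent looking away.

    Assumes one look-away lasting `duration_s` seconds begins every
    `period_s` seconds. This is a reading of the quoted figures,
    not a measured model of gaze behavior.
    """
    if period_s <= 0 or not 0 <= duration_s <= period_s:
        raise ValueError("need period_s > 0 and 0 <= duration_s <= period_s")
    return duration_s / period_s


# The two ends of the quoted range: a look-away every 10 s or every 20 s.
for period in (10.0, 20.0):
    share = look_away_fraction(period)
    print(f"one look-away every {period:.0f} s: {share:.0%} of the time")
```

Under this reading, a person would be looking away for roughly 10 to 20 percent of a conversation. That is a figure for gaze time only, not an estimate of bandwidth saved.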

The computer scientist plans to capitalize on these gaze aversions by pausing video transmission when they occur during video chat.

“There’s no point in sending your face across the network if I’m looking away at that time,” he said. “For that second or two we can suspend transmission because I’m not going to be looking at what’s on my screen anyway.”

For instance, if you’re using Skype with a friend, the program is constantly taking photos of the two of you and sending them back and forth. In this case, whenever you look away from the screen, Ward’s software would freeze your friend’s video transmission, leaving him or her frozen on your screen, while continuing to transmit audio. As soon as you look back at your screen, the video transmission would resume and your friend’s face would continue to move.
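The behavior described above can be sketched in a few lines of Python. This is a toy illustration, not Ward's software: the function names, the packet layout and the byte sizes are all assumptions, and a real system would have to predict the look-away in advance rather than be told about it.

```python
from typing import Iterable, List, Optional, Tuple

# A packet carries audio always, and video only while the viewer is looking.
Packet = Tuple[bytes, Optional[bytes]]


def gate_video(
    frames: Iterable[Tuple[bytes, bytes]],
    viewer_looking: Iterable[bool],
) -> List[Packet]:
    """Drop the video part of each frame while the viewer looks away.

    `frames` yields (audio, video) pairs; `viewer_looking` yields one
    flag per frame. Both inputs are hypothetical stand-ins for a real
    capture pipeline and a real gaze signal.
    """
    packets: List[Packet] = []
    for (audio, video), looking in zip(frames, viewer_looking):
        packets.append((audio, video if looking else None))
    return packets


def bytes_sent(packets: Iterable[Packet]) -> int:
    """Total payload size, counting paused video frames as zero bytes."""
    return sum(len(a) + (len(v) if v is not None else 0) for a, v in packets)


# Toy data: six frames, and the viewer looks away during frames 2 and 3.
frames = [(b"a" * 10, b"v" * 100)] * 6
looking = [True, True, False, False, True, True]

packets = gate_video(frames, looking)
ungated = gate_video(frames, [True] * 6)
print(bytes_sent(packets), "bytes sent with gating,",
      bytes_sent(ungated), "bytes without")
```

Audio is kept in every packet, matching the description that sound continues while the picture is frozen. The receiving side would simply keep showing the last video frame it got until new ones arrive.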

Pausing video transmission could keep a lot of unnecessary data from being transferred across the wire or through the air, saving bandwidth.

To make the software possible, the research team needs to develop a gaze prediction algorithm that will be able to determine 200 milliseconds in advance when a user will avert their gaze from the screen.

Using an eye gaze tracker, the team will study 200 people in regular five-minute conversations to begin training the software on gaze aversions.

Computer scientist David Novick, Ph.D., an expert in gaze behavior, will assist with the endeavor. He has completed multiple studies at UTEP that involve videotaping conversations to better comprehend the function of the human gaze.

“Speakers almost always look away in two-party conversations and look back to see if the listener is understanding,” he said. “The speaker looks away about 80 percent of the time, while the listener is almost always looking at the speaker – about 90 percent of the time.”

While Ward will focus on the software, Novick will help determine where people look and when, to improve the efficiency of the mathematical model.

“One of the problems that the project is going to face is if the gaze algorithm wrongly predicts looking away,” Novick said. “Then the viewer will think the system froze. So we need to get really good at predicting so that it can be transparent.”

If all goes according to plan, Ward believes that users won’t be able to notice the video pauses and that the new software will reduce mobile bandwidth consumption in video chat by 9 percent, leading to an all-around better experience for users.

The team hopes to eventually create different gaze prediction algorithms for different cultures. According to the researchers, gaze aversion in conversations varies across cultures: people in some cultures look at each other more, while people in others look less. The algorithms could then be applied to a person’s video software depending on their cultural background.

“Nigel’s idea is very clever and I’m happy to be working on this project with him,” Novick said. “He has an insight about using [gaze behavior] in a valuable way. It’s going to be a big contribution.”

Ward recently applied for a National Science Foundation grant to fund the research project. Awardees will be contacted in the next six months.