A view of the future

Lecturer: Prof. Sharon Gannot

The SPRING consortium, funded by the EU’s Horizon 2020 program, is developing a socially pertinent robot. Prof. Sharon Gannot, head of one of the project’s research teams, explains how his team is giving robots human-like listening capabilities.

The world of robotics is taking huge leaps. Robots in Japan play with children, and it isn’t rare to find robots in airports and supermarkets across the globe providing basic information to passersby. SPRING (Socially Pertinent Robots in Gerontological Healthcare), a new consortium established as part of the EU’s Horizon 2020 program, aspires to push robotic capabilities forward and create a socially pertinent robot. “The robot we’re working on is intended for geriatric hospitals visited by the elderly population for various tests,” explains Prof. Sharon Gannot, one of the consortium’s researchers. “The robot will welcome them, get a casual conversation going – about the weather or the morning news – walk them through the testing process, and answer technical questions. We wanted it to encourage interaction between patients in the waiting room, introduce them, and organize social games – things that would stimulate them intellectually – but we might have to put that on hold for now due to COVID-19 and social distancing measures. The robot is definitely going to converse with visitors and their escorts, though, which means it will have to not only convey relevant information but also act according to the situation: approach visitors at the reception area, make way for the doctors, and even recognize aversion and retreat as needed.”

The project took off in early 2020 and is budgeted for four years, until the end of 2023. Prof. Gannot heads one of the eight groups that make up the consortium and is in charge of audio – eliminating noise and separating speech sources. “When people are someplace loud, like a café, they can distinguish between the conversations around them, focus on one speaker, and ignore the others (to a point). For so-called intelligent machines this is far harder, and our goal is to separate the desired speech – that of the person addressed by the robot – from background noise and competing conversations. The robot will have to do this in a natural environment and under dynamic conditions, on the go, with people coming and going and numerous interactions, some of them simultaneous,” explains Prof. Gannot. “In addition, the robot will have to know where people are and be able to track them as they walk and talk. To improve its understanding, the robot will use its ‘eyes’ – its video cameras – just as we look at whoever is talking to us, read their lips, or interpret their facial expressions, even under extreme noise conditions. This combination of sight and hearing, which are often complementary, will provide an integrated response, just like the one found in humans.”
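How can a machine “focus on one speaker”? A common building block for this is beamforming: combining the signals from several microphones so that sound from one direction is reinforced while sound from other directions is attenuated. The sketch below is illustrative only – it is not the consortium’s algorithm, and the function name, array geometry, and parameters are assumptions – but it shows the core idea of a delay-and-sum beamformer in Python:

```python
import numpy as np

def delay_and_sum(mic_signals, mic_x, doa_deg, fs, c=343.0):
    """Steer a linear microphone array toward one talker by aligning
    and averaging the channels (delay-and-sum beamforming).

    mic_signals: (n_mics, n_samples) recorded audio, float
    mic_x:       (n_mics,) microphone x-positions in meters
    doa_deg:     direction of arrival of the desired talker, degrees
    fs:          sampling rate in Hz; c: speed of sound in m/s
    """
    n_mics, n_samples = mic_signals.shape
    # Relative arrival time at each mic for a plane wave from doa_deg
    # (sign convention: mics farther along the axis hear it later).
    delays = np.asarray(mic_x) * np.cos(np.deg2rad(doa_deg)) / c
    # Compensate the delays with a phase shift in the frequency domain.
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)
    spectra = np.fft.rfft(mic_signals, axis=1)
    aligned = np.fft.irfft(
        spectra * np.exp(2j * np.pi * freqs[None, :] * delays[:, None]),
        n=n_samples, axis=1)
    # Averaging reinforces the aligned talker; sources from other
    # directions stay misaligned and partially cancel out.
    return aligned.mean(axis=0)

# Hypothetical usage: 4 mics spaced 5 cm apart, talker at 60 degrees.
# enhanced = delay_and_sum(recordings, np.arange(4) * 0.05, 60.0, fs=16000)
```

Real systems add adaptive weighting and tracking of moving talkers, but the geometric intuition – align, then sum – is the same.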

Part of the challenge is that the project is designed for a specific population – more vulnerable, and less accustomed to technology. “But if the technology is well planned, it can amuse and challenge them,” Gannot clarifies. “Even today, there are robotic solutions for people who live alone, reminding them to do things like take their medicine or call their kids. Some models even incorporate a tablet (like the robot in our project), through which the conversation can be carried. But our robot will have advanced audio and video capabilities. These will allow it to understand when its presence is unwanted and back away – triggered either by an explicit statement or by ‘understanding’ emotions, such as stress or anger, detected in the person’s intonation. The technology for recognizing emotions in the human voice already exists: many hotlines are staffed by machines due to personnel shortages, and these machines can recognize a stressed, angry, or otherwise upset tone of voice and transfer the call to a human representative. Our project has to implement these capabilities in a very unusual, ‘acoustically hostile’ environment, which means the robot has to be able to analyze voices in a densely populated space. We use several microphones to manage and analyze the conversation, so that we can focus on different directions. Developing these capabilities is part of what makes this project so special – complex, dynamic, crowded, and noise-heavy environments.”
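The “stressed or angry tone” cue Prof. Gannot describes is typically derived from prosody: pitch, loudness, and speaking rate. The toy sketch below is not the project’s implementation – the feature choice, function names, and thresholds are invented for illustration, and a real system would feed such features to a trained classifier rather than a fixed rule:

```python
import numpy as np

def prosody_features(frame, fs):
    """Two crude per-frame prosody cues: short-time energy (loudness)
    and an autocorrelation-based pitch estimate. Expects float audio."""
    energy = float(np.mean(frame ** 2))
    frame = frame - frame.mean()
    # Autocorrelation peaks at the pitch period; search 60-400 Hz.
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(fs / 400), int(fs / 60)
    pitch_hz = fs / (lo + int(np.argmax(ac[lo:hi])))
    return energy, pitch_hz

def sounds_agitated(frames, fs, energy_thresh=0.01, pitch_thresh=220.0):
    """Flag speech as possibly stressed when both loudness and mean
    pitch run high - a toy stand-in for a trained classifier."""
    feats = np.array([prosody_features(f, fs) for f in frames])
    return bool(feats[:, 0].mean() > energy_thresh
                and feats[:, 1].mean() > pitch_thresh)
```

Raised pitch and energy correlate with vocal arousal, which is why even this crude rule points in the right direction; telling anger apart from, say, excitement requires richer features and learned models.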

This is a research-rich project. The consortium comprises eight groups from different European countries, including a robotics company from Spain that is currently finalizing the manufacturing of the robot. The pilot is expected to take place at a hospital in Paris, and Prof. Gannot is currently seeking doctoral students and post-doctoral researchers who specialize in signal processing and machine learning to join his team. “I find this project very special, in part because multiple strong research teams are collaborating, and I like working with them. I’ve been working with one group from Grenoble, France, for several years now, and I love that, but I’m also excited about expanding my collaborations with the other teams. The realization that this is a real issue and that the goal could truly help society is part of why I’m excited about the project,” he says. “Academically, too, I think there’s potential for many important studies, both theoretical and practical. Sadly, the project had an unlucky start. We began in January, kicked off in Paris in early February, and then COVID-19 arrived and prevented us from meeting. We work and research from home, and we hold regular joint lectures in which each group shares its expertise and progress – but it’s nothing compared to visits and face-to-face meetings. Technology can’t replace personal interaction, and that’s probably another reason to enhance it with social capabilities.”

Want to read more about the project? Visit https://spring-h2020.eu/

Want to join the research? Email Prof. Sharon Gannot at Sharon.Gannot@biu.ac.il
