Smart Speaker Determines Optimal Timing to Talk
A KAIST research team has developed a new context-awareness technology that enables AI assistants to determine when to talk to their users based on user circumstances. This technology can contribute to developing advanced AI assistants that can offer pre-emptive services such as reminding users to take medication on time or modifying schedules based on the actual progress of planned tasks.
Unlike conventional AI assistants that used to act passively upon users’ commands, today’s AI assistants are evolving to provide more proactive services through self-reasoning of user circumstances. This opens up new opportunities for AI assistants to better support users in their daily lives. However, if AI assistants do not talk at the right time, they could rather interrupt their users instead of helping them.
The right time for talking is more difficult for AI assistants to determine than it appears. This is because the context can differ depending on the state of the user or the surrounding environment.
A group of researchers led by Professor Uichin Lee from the KAIST School of Computing identified key contextual factors in user circumstances that determine when the AI assistant should start, stop, or resume engaging in voice services in smart home environments. Their findings were published in the Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT) in September.
The group conducted this study in collaboration with Professor Jae-Gil Lee’s group in the KAIST School of Computing, Professor Sangsu Lee’s group in the KAIST Department of Industrial Design, and Professor Auk Kim’s group at Kangwon National University.
After developing smart speakers equipped with AI assistant function for experimental use, the researchers installed them in the rooms of 40 students who live in double-occupancy campus dormitories and collected a total of 3,500 in-situ user response data records over a period of a week.
The smart speakers repeatedly asked the students a question, “Is now a good time to talk?” at random intervals or whenever a student’s movement was detected. Students answered with either “yes” or “no” and then explained why, describing what they had been doing before being questioned by the smart speakers.
Data analysis revealed that 47% of user responses were “no” indicating they did not want to be interrupted. The research team then created 19 home activity categories to cross-analyze the key contextual factors that determine opportune moments for AI assistants to talk, and classified these factors into ‘personal,’ ‘movement,’ and ‘social’ factors respectively.
Personal factors, for instance, include:
1. the degree of concentration on or engagement in activities,
2. the degree urgency and busyness,
3. the state of user’s mental or physical condition, and
4. the state of being able to talk or listen while multitasking.
While users were busy concentrating on studying, tired, or drying hair, they found it difficult to engage in conversational interactions with the smart speakers.
(Content Courtesy: https://news.kaist.ac.kr/newsen/html/news/?mode=V&mng_no=10771&skey=category&sval=research&list_s_date=&list_e_date=&GotoPage=1)