Socially Aware Robot Assistant
SARA is a Socially-Aware Robot Assistant that interacts with people in a whole new way, personalizing the interaction and improving task performance by relying on information about the relationship between the human user and the virtual assistant. Rather than taking the place of people, SARA is programmed to collaborate with her human users. Rather than ignoring the socio-emotional bonds that form the fabric of society, SARA depends on those bonds to improve her collaboration skills.
SARA at the World Economic Forum
SARA was presented at the World Economic Forum (WEF) Annual Meeting in Davos (January 17-20, 2017). The SARA booth was located right in the middle of the main corridor of the Davos Congress Center and was, in fact, the only demo in the Congress Center.
SARA had access to the WEF database of sessions being presented, participants attending, demos being shown in the Loft across the street, and places to get food in the Congress Center (she also knew about some private parties – information she was willing to share if asked nicely!). SARA was programmed to use this information to act as a virtual personal assistant. She assisted the global leaders attending Davos by finding out about their interests and goals in attending the WEF and then recommending sessions and people who were relevant to those interests and goals. In so doing, SARA showed what it means to have socially-aware Artificial Intelligence. That is, SARA used the conversation to build a relationship with the person talking to her, and then used that relationship to elicit better information about that person's interests and goals. In turn, that allowed her to do a better job recommending sessions and people.
SARA is designed to build interpersonal closeness, or rapport, over the course of a conversation by managing rapport through the understanding and generation of visual, vocal, and verbal behaviors. The ArticuLab always begins by studying human-human interaction, using that as the basis for our design of artificial intelligence systems. Leveraging our prior work on the dynamics of rapport in human-human conversation, the SARA system includes the following components:
- The computational model of rapport: The computational model is the first to explain how humans in dyadic interactions build, maintain, and destroy rapport through the use of specific conversational strategies that function to fulfill specific social goals, and that are instantiated in particular verbal and nonverbal behaviors.
- Conversational strategy classification: The conversational strategy classifier recognizes high-level language strategies closely associated with social goals, having been trained on linguistic features associated with those conversational strategies in an annotated training set.
- Rapport level estimation: The rapport estimator estimates the current rapport level between the user and the agent using temporal association rules.
- Social and task reasoning: The social reasoner outputs the conversational strategy that the system should adopt in the current turn. The reasoner is modeled as a spreading activation network.
- Natural language and nonverbal behavior generation: The natural language generation module expresses conversational strategies in specific language and associated nonverbal behaviors, which are then performed by a virtual human.
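The spreading activation mechanism used by the social reasoner can be illustrated with a toy example: context nodes describing the current interaction state feed activation into candidate strategy nodes over weighted links, and the most highly activated strategy wins. All node names, weights, and parameters below are invented for illustration; they are not SARA's actual network.

```python
# Toy spreading-activation network for conversational-strategy selection.
# Nodes, edges, and weights are hypothetical, not taken from the SARA system.

def spread_activation(edges, activation, rounds=2, decay=0.5):
    """Propagate activation along weighted directed edges for a few rounds."""
    for _ in range(rounds):
        incoming = {node: 0.0 for node in activation}
        for (src, dst), weight in edges.items():
            incoming[dst] += activation[src] * weight
        # New activation = decayed old activation + activation spread from neighbors.
        activation = {n: decay * activation[n] + incoming[n] for n in activation}
    return activation

# Context nodes (interaction state) feed strategy nodes.
edges = {
    ("low_rapport", "acknowledgement"): 0.9,
    ("low_rapport", "self_disclosure"): 0.2,
    ("high_rapport", "self_disclosure"): 0.8,
    ("high_rapport", "praise"): 0.6,
    ("user_disclosed", "self_disclosure"): 0.7,  # reciprocate a disclosure
}

nodes = {"low_rapport", "high_rapport", "user_disclosed",
         "acknowledgement", "self_disclosure", "praise"}
activation = {n: 0.0 for n in nodes}
activation["high_rapport"] = 1.0    # current estimated interaction state
activation["user_disclosed"] = 1.0

result = spread_activation(edges, activation)
strategies = ["acknowledgement", "self_disclosure", "praise"]
best = max(strategies, key=lambda s: result[s])
```

With high rapport and a user disclosure active, activation converges on reciprocating with a self-disclosure; flipping the active context nodes to `low_rapport` would instead favor a safer acknowledgement.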
The system’s architecture is organized around a task pipeline and a social pipeline [Matsuyama et al. 2016]. The task pipeline consists of a task-oriented Natural Language Understanding (NLU) module, which extracts the user’s intention from their speech, and a Task Reasoner, which selects SARA’s next intention based on the NLU’s output. The social pipeline consists of three modules: the Conversational Strategy Classifier detects the user’s conversational strategy from the user’s multimodal cues [Zhao et al. 2016a]; the Rapport Estimator relies on these conversational strategies, as well as visual and acoustic features, to predict the level of rapport during the interaction [Zhao et al. 2016b]; and the Social Reasoner selects SARA’s next conversational strategy based on the history of the interaction [Romero et al. 2017]. Given the task and social intentions decided by the Task and Social Reasoners, a Natural Language Generator (NLG) and a Nonverbal Behavior Generator translate these intentions into a sentence and a nonverbal behavior plan, which are rendered by SARA’s character animation realizer and Text-to-Speech (TTS) engine. The system also has access to the recommendation database, user authentication, and messenger applications of the online collaboration platform.
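The per-turn flow through the two pipelines can be sketched roughly as follows. This is a minimal, hypothetical illustration of the control flow only: every class, function, and rule below is a stub invented for the sketch, not the actual SARA codebase.

```python
# Hypothetical sketch of SARA's dual-pipeline turn cycle (illustration only).
from dataclasses import dataclass

@dataclass
class TurnOutput:
    task_intent: str   # what SARA should do (task pipeline)
    strategy: str      # how she should say it (social pipeline)
    utterance: str     # surface realization

def nlu(user_speech: str) -> str:
    # Task-oriented NLU: map speech to a user intention (keyword stub).
    return "request_recommendation" if "recommend" in user_speech else "chitchat"

def task_reasoner(user_intent: str) -> str:
    # Select SARA's next task intention from the user's intention.
    return {"request_recommendation": "give_recommendation"}.get(user_intent, "smalltalk")

def strategy_classifier(user_speech: str) -> str:
    # Detect the user's conversational strategy (text-only stub; the real
    # classifier also uses visual and vocal cues).
    return "self_disclosure" if "i " in user_speech.lower() else "neutral"

def rapport_estimator(strategy: str, history: list) -> float:
    # Estimate current rapport from observed strategies (toy heuristic,
    # standing in for the temporal-association-rule model).
    positive = history.count("self_disclosure") + (strategy == "self_disclosure")
    return min(7.0, 3.0 + positive)   # rapport on a 1-7 scale

def social_reasoner(rapport: float) -> str:
    # Pick SARA's next conversational strategy given estimated rapport.
    return "self_disclosure" if rapport >= 4.0 else "acknowledgement"

def generate(task_intent: str, strategy: str) -> str:
    # NLG stub: realize the task and social intentions as a sentence
    # (nonverbal behavior plan and TTS omitted).
    return f"[{strategy}] ({task_intent})"

def take_turn(user_speech: str, history: list) -> TurnOutput:
    task_intent = task_reasoner(nlu(user_speech))      # task pipeline
    user_strategy = strategy_classifier(user_speech)   # social pipeline
    rapport = rapport_estimator(user_strategy, history)
    strategy = social_reasoner(rapport)
    history.append(user_strategy)
    return TurnOutput(task_intent, strategy, generate(task_intent, strategy))
```

The key design point the sketch preserves is that the two pipelines run in parallel on the same user turn: the task pipeline decides *what* to say next, while the social pipeline, informed by the running rapport estimate, decides *how* to say it.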
Analysis of the Field Studies
Research Question: “How does the task performance of a personal assistant affect the dynamics of rapport over the course of an interaction?”
Participants interacted with SARA during the conference, receiving recommendations about sessions to attend and/or people to meet. After the attendees entered the booth, SARA first introduced herself and asked several questions about the attendees’ current feelings and mood. Then the attendees were asked about their occupation as well as their interests and goals for attending the conference. SARA would then cycle through several rounds of people and/or session recommendations, showing information about each recommendation on the virtual board behind her. Attendees were able to request as many recommendations as they desired, and could leave the booth at any time. Finally, SARA proposed taking a “selfie” with the attendees before saying farewell. During each interaction, the attendee’s video and audio were recorded using a camera and a microphone. SARA’s animations, for their part, were recorded separately in a log file. The audio recordings were used to obtain text transcriptions of both the attendees’ and SARA’s utterances through a third-party transcription service. These transcriptions contained turn-taking information such as speaker ID and starting and ending timestamps for each turn. Because rapport is a dyadic phenomenon, we reconstructed the interactions so that both attendee and SARA were present in the same video before annotating them. Our corpus contains data from 69 of these interactions, including both the attendees’ and SARA’s video, audio, and textual speech transcriptions, which combined account for more than 5 hours of interaction (total time = 21055 seconds, mean session duration = 305.15 seconds, SD = 65.00 seconds). Of these 69 attendees, 29 were women and 40 were men. We did not gather any information about the attendees’ age or nationality.
For the details of the data analysis, please refer to [Pecune et al. 2018].
References
- [Matsuyama et al. 2016] Matsuyama, Y., Bhardwaj, A., Zhao, R., Romero, O., Akoju, S., & Cassell, J. (2016). Socially-Aware Animated Intelligent Personal Assistant Agent. 17th Annual SIGDIAL Meeting on Discourse and Dialogue.
- [Goel et al. 2018] Goel, P., Matsuyama, Y., Madaio, M., & Cassell, J. (2018). “I think it might help if we multiply, and not add”: Detecting Indirectness in Conversation. International Workshop on Spoken Dialogue Systems Technology (IWSDS 2018).
- [Pecune et al. 2018] Pecune, F., Chen, J., Matsuyama, Y., & Cassell, J. (2018). Field Study Analysis of a Socially Aware Robot Assistant. Proceedings of the special track Socially Interactive Agents (SIA) at the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2018).
- [Zhao et al. 2016a] Zhao, R., Sinha, T., Black, A., & Cassell, J. (2016). Automatic Recognition of Conversational Strategies in the Service of a Socially-Aware Dialog System. 17th Annual SIGDIAL Meeting on Discourse and Dialogue.
- [Zhao et al. 2016b] Zhao, R., Sinha, T., Black, A., & Cassell, J. (2016). Socially-Aware Virtual Agents: Automatically Assessing Dyadic Rapport from Temporal Patterns of Behavior. 16th International Conference on Intelligent Virtual Agents (IVA). [Best Student Paper]
- [Romero et al. 2017] Romero, O., Zhao, R., & Cassell, J. (2017). Cognitive-inspired Conversational-Strategy Reasoner for Socially-Aware Agents. Proceedings of the 26th International Joint Conference on Artificial Intelligence (IJCAI), pp. 3807-3813.
This work was supported in part by generous funding from Microsoft, LivePerson, Google, and the IT R&D program of MSIP/IITP [2017-0-00255, Autonomous Digital Companion Development].
Press Coverage
- MIT Technology Review, Chatbots with Social Skills Will Convince You to Buy Something
- Google Blog (Google for Education), At Carnegie Mellon University, machine learning gets social
- BBC Business Daily, Talking to Robots
- Popular Science, S.A.R.A. Seeks To Give Artificial Intelligence People Skills
- Science Friday, Are Digital Assistants Smart Enough to Do Their Jobs?
- CNET, The Advent of Virtual Humans
- CNBC, Best of World Economic Forum in Tianjin
- Foreign Policy, “Is AI Sexist?”
- CNBC Africa, “Meet SARA, the socially-aware robot”
- Al Arabiya, “Meet Sara…the robot that knows the ins and outs of Davos”
- USA TODAY, “This robot assistant can understand facial expressions”
- CNBC, “On the ground at the Summer Davos”
- WASHINGTON POST, The big contradiction in how the world’s most powerful people think about its future
- BLOOMBERG QUINT, SARA: A socially aware robotic assistant that reads your mood
- PressTV, Socially Aware Robot Assistant Displayed at WEF
- INDIA TIMES, Now, a socially-aware robotic assistant that gets your mood!
- THE TRIBUNE, A robotic assistant that gets your mood!
- DAILY NEWS AND ANALYSIS (DNA), Fourth industrial revolution on full display at the WEF
- FINANCIAL TIMES, Tech leaders at Davos fret over effect of AI on jobs
- JLL (Sushell Koul), From macro to machines
- CMSWIRE, Step Up Your Personalization Game with AI
- TECH FACTS LIVE, Socially Aware Robotic Assistant Introduced at World Economic Forum
- ROBOTICS AND AUTOMATION NEWS, Job-stealing robots a growing concern for world leaders
- PTS NEWS NETWORK, Socially-Aware robot frees humans from repetitive work
- DIGITAL MARKETING BUREAU, Sara: The Socially Aware Robot Assistant
- HUMAN INSIDE, S.A.R.A., the sensitive robot who improves people’s performance
- D!GITALIST MAGAZINE, Empathy: The Killer App for Artificial Intelligence
- Zee News, Here’s a socially-aware robotic assistant that gets your mood!
- The Indian Express, Now, a socially-aware robotic assistant (SARA) that gets your mood
- DE TIJD, https://twitter.com/Dorien_Luyckx/status/821731750950871040
- DW – Business, https://twitter.com/dw_business/status/821671699695435776
- CCTV (China Central Television)
- Tartan (Carnegie Mellon’s Student Newspaper) The Frontier Conference exhibits featured new technology
- Radio Sputnik, These are robots that intend to make society stronger by focusing on social bonds
- Atelier.net, Chatbots that are increasingly human
- Radio Canada, Here are 4 robots at your service
- Point Park Globe, President Obama, White House host science, technology conference at CMU