S³CIX 2026

Keynote

Machine Learning for Human–Robot Interaction: Vision-Language Grounding, Personalization, and Failure Awareness

Mohamed Chetouani

Friday, 9:30 · Live in SAWB 422/423 · 60 min

Abstract: This talk explores recent advances in Machine Learning for Human–Robot Interaction (HRI), with a focus on Vision-Language Models (VLMs) for collaborative object manipulation. We present an integrated approach combining long-term multimodal personalization, ambiguity-aware interactive visual grounding, and semantic failure detection. The lecture will show how robots can maintain user-specific models, detect unclear instructions and ask for clarification, and recognize when their actions are semantically misaligned with human intent. We will also situate these contributions within the broader framework of interactive robot learning, discussing how humans teach embodied agents through feedback, demonstrations, and instructions, as well as the current limitations and open challenges for building robust, trustworthy, and adaptive human–robot collaboration systems.
