SMM26

Event	Time Slot (UTC+2)
Workshop Introduction	10:30 to 10:50
Keynote Speech: AI for Creators: Pushing Creative Abilities to the Next Level. Dr. Yuki Mitsufuji (Distinguished Engineer at Sony Group Corporation, Japan). This talk explores how cutting-edge generative AI is transforming creative workflows in music, cinema, and gaming. Led by Dr. Yuki Mitsufuji, the Music Foundation Model Team at Sony AI has developed multimodal frameworks such as MMAudio, which generate high-quality, synchronized audio from video and text inputs. Their research, recognized at top venues like NeurIPS, ICLR, and CVPR, has contributed to both content creation and protection, with practical demos integrated into commercial products. The session will highlight key innovations, including sound restoration projects and the future of AI-powered media production.	10:50 to 11:35
Coffee Break	11:35 to 11:50
Oral presentations (20 minutes each, including Q&A) Paper#1: Musitopia: Bridging Human Expertise and AI for Digital Music Therapy Paper#2: Parametric analysis of feature specific neural coding during music imagery and perception	11:50 to 12:30
Special Talk: Subhrojyoti Roy Chaudhuri (TCS Research)	12:30 to 13:00
Lunch Break	13:00 to 14:30
Keynote Speech: AI & The Sound of Mental Health: Remixed Feelings. Professor Björn Schuller (Professor of Artificial Intelligence and Head of GLAM at Imperial College London, UK). Mental health has a sound, and AI is beginning to hear and mix it. In voice and language, we find traces of affect, depression, and recovery that can open new pathways for scalable and personalised care. Starting there, we will explore how speech-based digital psychology is expanding from diagnosis toward intervention: from vocal and linguistic biomarkers of mental well-being, to large language models that support rather than simply assess, to generative music systems that enable closed-loop personalised emotional regulation. This convergence of speech AI, interventive language technology, and generative audio suggests a future in which intelligent systems can listen, understand, and respond in psychologically meaningful ways. Realising that future, however, requires more than technical performance. Beyond clinical relevance, it demands reliability, safety, explainability, and trust. I will discuss the promise, route, and responsibility of building AI that does not merely analyse the mind, but helps care and deejay for it.	14:30 to 15:15
Coffee Break	15:15 to 15:30
Oral presentations (20 minutes each, including Q&A) Paper#3: voice2mode: Phonation Mode Classification in Singing using Self-Supervised Speech Models Paper#4: Induced Mood Shift and Cognitive Adaptation During Free Play in a Changing Tonic Context	15:30 to 16:10
Pit Stop	16:10 to 16:20
Brainstorming Session/Panel Discussion	16:20 to 17:00
Workshop Conclusion and Takeaways	17:00 to 17:20