Bernd Huber, Research Scientist

Senior Research Scientist, Spotify

PhD Computer Science, Harvard University

I work on reward modeling, RLHF/DPO, and evaluation for agentic LLM systems. I developed Embedding-to-Prefix and contribute to the systems behind AI Playlist and AI DJ. My PhD at Harvard focused on dialogue systems.

Publications

[Image: Preference optimization flywheel]

Personalizing Agentic AI to Users' Musical Tastes with Scalable Preference Optimization

Bernd Huber, Sebastian Peleato, Mounia Lalmas-Roellke, Paul N. Bennett

I developed a hybrid approach combining reward models and Direct Preference Optimization (DPO) for tool orchestration in LLM-based agentic systems. The method achieved a 70% reduction in erroneous tool calls and a 4% lift in listening time.
[Spotify Research, 2025]

[Image: Embedding-to-Prefix visualization]

Embedding-to-Prefix: Parameter-Efficient Personalization for Pre-Trained Large Language Models

Bernd Huber, Ghazal Fazelnia, Andreas Damianou, Sebastian Peleato, Max Lefarov, Praveen Ravichandran, Marco De Nadai, Mounia Lalmas-Roellke, Paul N. Bennett

I developed an architecture that personalizes pre-trained large language models using pre-computed user embeddings. The method bridges representation learning and generative AI, allowing foundation models to be steered by user context without costly fine-tuning. The approach achieves strong personalization while remaining computationally efficient at scale.
[NeurIPS CCFM, 2025]

[Image: Multimodal dialogue system]

Emotional Dialogue Generation Using Image-Grounded Language Models

Bernd Huber, Daniel McDuff, Chris Brockett, Michel Galley, Bill Dolan

I built a multimodal dialogue system that generates contextually appropriate responses by jointly processing text and visual information. This work established foundational methods for incorporating visual sentiment and scene understanding into conversational AI, demonstrating how computational systems can respond to nuanced, multimodal human inputs.
[CHI, 2018]