Med-VLM: Prompt Adaptation of Vision-Language Models to Medical Domains

Abstract:

Medical imaging faces persistent domain shifts across scanners, protocols, pathologies, and populations. Recent vision-language models, such as CLIP (Contrastive Language-Image Pre-training), learn joint representations from image-text pairs, enabling zero-shot transfer to new visual concepts through natural language descriptions. Although recent work demonstrates that CLIP-based models offer strong zero-shot transfer, their performance degrades on unseen medical data.
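As a concrete illustration of this zero-shot mechanism, the sketch below classifies a single image against natural-language prompts using the Hugging Face `transformers` CLIP API. The checkpoint name is a real public model; the image path and the class prompts are purely illustrative, not from this thesis.

```python
# Minimal sketch of CLIP zero-shot classification via text prompts.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Illustrative class descriptions; a medical deployment would use
# carefully designed biomedical prompts instead.
class_names = ["normal chest X-ray", "chest X-ray showing pneumonia"]
prompts = [f"a photo of a {c}" for c in class_names]

image = Image.open("example_xray.png")  # hypothetical input image
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image: image-text cosine similarities scaled by CLIP's
# learned temperature; softmax turns them into class probabilities.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(class_names, probs[0].tolist())))
```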

Recent adaptation techniques aim to mitigate this degradation through prompt-based, parameter-efficient, and feature-level strategies. Complementing these, text-only prompt learning optimizes prompts in the text space using LLM-derived biomedical descriptions or ontology-based prototypes, improving transfer to unseen data. This Master's thesis proposes adapting CLIP-based models to unseen medical data through techniques such as prompt learning that preserve CLIP's contrastive image-text alignment, with an emphasis on practical clinical feasibility.
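For readers unfamiliar with prompt learning, the following minimal PyTorch sketch shows one common instantiation in the style of CoOp: learnable context vectors are prepended to frozen class-name embeddings and trained with a cross-entropy objective while the CLIP encoders stay frozen. Here `text_encoder` and `class_token_embeds` are assumed stand-ins for the corresponding frozen CLIP components, not the design adopted by this thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearner(nn.Module):
    """Learnable context tokens shared across classes (CoOp-style)."""

    def __init__(self, n_ctx: int, dim: int, class_token_embeds: torch.Tensor):
        super().__init__()
        # Shared learnable context, i.e. "X X ... X <class>".
        self.ctx = nn.Parameter(torch.randn(n_ctx, dim) * 0.02)
        # Frozen token embeddings of the class names: (n_classes, n_tok, dim).
        self.register_buffer("cls_embeds", class_token_embeds)

    def forward(self) -> torch.Tensor:
        n_classes = self.cls_embeds.shape[0]
        ctx = self.ctx.unsqueeze(0).expand(n_classes, -1, -1)
        # Prompt per class: [learned context tokens | class-name tokens].
        return torch.cat([ctx, self.cls_embeds], dim=1)

def prompt_learning_step(prompt_learner, text_encoder, image_feats, labels, optimizer):
    """One training step: only the context vectors receive gradients."""
    prompts = prompt_learner()                       # (n_classes, n_tok, dim)
    text_feats = F.normalize(text_encoder(prompts), dim=-1)
    image_feats = F.normalize(image_feats, dim=-1)
    logits = 100.0 * image_feats @ text_feats.t()    # CLIP-style scaled cosine
    loss = F.cross_entropy(logits, labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Because only the small context tensor is optimized, this kind of adaptation is parameter-efficient and leaves the pretrained alignment intact.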

This Master's thesis aims to analyze and propose: (i) an adaptation framework that incorporates medical concepts into prompts to improve robustness on unseen data; (ii) prompt-based adaptation procedures for non-independent and identically distributed (non-i.i.d.) shifts via lightweight mechanisms; and (iii) a comprehensive evaluation on multi-source medical benchmarks. Expected outcomes include, but are not limited to: a vision-and-language-guided framework for medical imaging that improves generalization across scanners and diverse patient populations, a lightweight test-time adaptation method built on biomedical prompts, and analyses on medical benchmarks with ablations under realistic scenarios. This Master's thesis aspires to publish its results in a relevant academic venue.
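As a hedged illustration of what "lightweight test-time adaptation" can mean in this setting, the sketch below follows TPT-style entropy minimization (an assumption about the mechanism, not the thesis's final method): for a single unlabeled test image, only the prompt context from the `PromptLearner` sketch above is updated, minimizing prediction entropy over the most confident augmented views.

```python
import torch
import torch.nn.functional as F

def test_time_adapt(prompt_learner, text_encoder, image_encoder, views,
                    steps=1, lr=5e-3, keep_frac=0.1):
    """Adapt the prompt context on one test image given augmented `views`.

    `views` is a batch of augmented crops of a single unlabeled image;
    the image and text encoders are frozen CLIP stand-ins (assumptions).
    """
    optimizer = torch.optim.AdamW(prompt_learner.parameters(), lr=lr)
    with torch.no_grad():  # image features never receive gradients
        img = F.normalize(image_encoder(views), dim=-1)   # (n_views, d)
    for _ in range(steps):
        txt = F.normalize(text_encoder(prompt_learner()), dim=-1)
        probs = (100.0 * img @ txt.t()).softmax(dim=-1)   # (n_views, n_classes)
        entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=-1)
        # Keep only the most confident (lowest-entropy) views, as in TPT.
        k = max(1, int(keep_frac * len(views)))
        keep = entropy.topk(k, largest=False).indices
        loss = entropy[keep].mean()
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():  # final prediction with the adapted prompt
        txt = F.normalize(text_encoder(prompt_learner()), dim=-1)
        return (100.0 * img @ txt.t()).softmax(dim=-1).mean(dim=0)
```

Such a procedure requires no labels and no updates to the backbone, which is what makes it attractive for non-i.i.d. clinical deployment.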

The ideal start date is Sept/Oct 2025. The thesis is offered by the Chair for Computational Imaging and AI in Medicine (Prof. Dr. Julia Schnabel) and supervised by Sameer Ambekar and Dr. Daniel Lang. If interested, please email your transcripts and a short motivation to sameer.ambekar@tum.de and lang@helmholtz-munich.de.

Sameer Ambekar
Doctoral Researcher

My research interests include Domain Generalization, Meta-Learning, and Variational Inference.

Daniel M. Lang
Research Scientist

My current research focuses on the development of deep generative models for dynamic settings in cancer imaging.