AutoMIR: Effective Zero-Shot Medical Information Retrieval Without Relevance Labels
January 2025
TL;DR: SL-HyDE improves zero-shot medical information retrieval accuracy without needing relevance-labeled data.
The study introduces SL-HyDE, a framework for zero-shot medical information retrieval (MIR). Given a query, a large language model generates a hypothetical document, and retrieval is driven by that document, which improves accuracy without any relevance-labeled data. A self-learning mechanism iteratively improves both document generation and retrieval. The research also presents the Chinese Medical Information Retrieval Benchmark (CMIRB), which evaluates MIR systems across five tasks and ten datasets. Experimental results show that SL-HyDE significantly outperforms existing methods such as HyDE in retrieval accuracy, generalization, and scalability.
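The hypothetical-document idea that SL-HyDE builds on can be illustrated with a minimal sketch. This is not the paper's implementation: the bag-of-words `embed` function stands in for a dense retriever, `generate_hypothetical_document` is a hypothetical placeholder for the LLM call that returns a canned passage, and the three-sentence corpus is invented for illustration. The self-learning step is not shown.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stands in for a dense encoder)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(text, corpus, top_k=1):
    """Rank corpus passages by similarity to the given text."""
    vec = embed(text)
    ranked = sorted(corpus, key=lambda doc: cosine(vec, embed(doc)), reverse=True)
    return ranked[:top_k]


def generate_hypothetical_document(query):
    """Hypothetical placeholder for an LLM call: returns a canned answer passage."""
    canned = {
        "what lowers blood sugar":
            "metformin lowers blood glucose in type 2 diabetes patients",
    }
    return canned.get(query, query)  # fall back to the raw query


def hyde_retrieve(query, corpus, top_k=1):
    """Retrieve using the generated hypothetical document instead of the raw query."""
    hypothetical = generate_hypothetical_document(query)
    return retrieve(hypothetical, corpus, top_k)


# Invented toy corpus, for illustration only.
corpus = [
    "metformin is a first-line drug that reduces blood glucose in type 2 diabetes",
    "ibuprofen relieves pain and reduces fever",
    "sugar cane is grown in tropical regions",
]

query = "what lowers blood sugar"
direct_hit = retrieve(query, corpus)[0]
hyde_hit = hyde_retrieve(query, corpus)[0]
print("direct:", direct_hit)
print("hyde:  ", hyde_hit)
```

In this toy corpus the raw query latches onto the sugar cane passage through the shared word "sugar", while the hypothetical document shares clinical vocabulary with the metformin passage and ranks it first. No relevance labels are involved at any point, which is the property the summary highlights.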