Towards Fairer Health Recommendations: Finding Informative Unbiased Samples via Word Sense Disambiguation

    September 2024 in arXiv (Cornell University)
    Gavin Butts, Pegah Emdad, Jeongin Lee, Shannon Song, Chiman Salavati, Willmar Sosa Diaz, Shiri Dori-Hacohen, Fabrício Murai
    TLDR: Fine-tuned BERT models outperform LLMs at detecting bias in medical data.
    The study addresses the problem of biased medical data in health-related applications by using AI to debias the data itself rather than correcting model biases after the fact. The researchers evaluated NLP models, including LLMs and fine-tuned BERT models, on a dataset of 4,105 excerpts annotated for bias, and proposed using Word Sense Disambiguation models to improve dataset quality by filtering out irrelevant sentences. They found that although LLMs are state-of-the-art for many NLP tasks, they are unsuitable for bias detection; in contrast, fine-tuned BERT models performed well across all evaluated metrics.