MSF-VMDNet for Multi-Class Segmentation of Skin Cancer Whole Slide Images Using a Multi-Frequency Dual Encoder Network
April 2026 in “Scientific Reports”
The study introduces MSF-VMDNet, a deep learning model for multi-class segmentation of skin cancer whole-slide images, a task that requires distinguishing 10 tissue classes. The model pairs a U-Net encoder with a Vision Mamba encoder to improve feature extraction and segmentation accuracy. The U-Net encoder uses an improved AFNO spectral decomposition module to capture high-resolution semantic information, while the Vision Mamba encoder models long-range dependencies. An SCConv module fuses features from different frequency domains and spatial levels. MSF-VMDNet outperforms existing methods, achieving a mean intersection over union (mIoU) of 95.37% and a Dice coefficient of 95.11%, and generalizes well across multiple datasets.
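The abstract reports mIoU and Dice but does not say how they are aggregated. As a minimal sketch, the code below assumes the common macro-averaged definition: per-class IoU and Dice are computed from a pixel-level confusion matrix and then averaged over the classes that occur. The function names `confusion_matrix` and `mean_iou_and_dice` are hypothetical, and the averaging rule is an assumption, not the authors' documented protocol.

```python
from typing import Dict, List, Sequence


def confusion_matrix(
    y_true: Sequence[int], y_pred: Sequence[int], num_classes: int
) -> List[List[int]]:
    """Count pixels: rows index the reference class, columns the predicted class."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou_and_dice(cm: List[List[int]]) -> Dict[str, float]:
    """Macro-average per-class IoU and Dice (assumed aggregation rule)."""
    n = len(cm)
    ious: List[float] = []
    dices: List[float] = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # reference c, predicted other
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, reference other
        denom = tp + fp + fn
        if denom == 0:
            continue  # class absent from both reference and prediction
        ious.append(tp / denom)
        dices.append(2 * tp / (2 * tp + fp + fn))
    if not ious:
        raise ValueError("no class is present in the confusion matrix")
    return {"miou": sum(ious) / len(ious), "dice": sum(dices) / len(dices)}


# Toy example with 3 classes and 6 flattened pixel labels (illustrative data only).
reference = [0, 0, 1, 1, 2, 2]
predicted = [0, 1, 1, 1, 2, 0]
scores = mean_iou_and_dice(confusion_matrix(reference, predicted, num_classes=3))
print(scores)
```

For a whole-slide image, the label sequences would be the flattened reference and predicted masks, with `num_classes=10` to match the 10 tissue classes described above.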