GC-ViT-E: Enhanced global context vision transformers for robust skin cancer classification under severe class imbalance
Accurate multi-class skin cancer classification is clinically valuable only when models perform reliably across both common and rare diagnostic categories. To address persistent long-tail failures, we systematically evaluate three enhanced vision-transformer architectures, GC-ViT Small, CoAt-Lite, and FocalNet, augmented with two novel attention modules: Transformer Blocks for global context integration and Separable Self-Attention for precision–recall calibration. Models were evaluated on held-out data.
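To make the Separable Self-Attention module named above concrete, the following is a minimal NumPy sketch of the general separable self-attention idea (a single latent score per token replaces the full token-to-token attention matrix, reducing cost from quadratic to linear in sequence length). The function and weight names (`w_i`, `w_k`, `w_v`) are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def separable_self_attention(x, w_i, w_k, w_v):
    """Linear-complexity attention sketch.

    x   : (n_tokens, d) input token features
    w_i : (d, 1) projection to one latent score per token (assumed name)
    w_k : (d, d) key projection (assumed name)
    w_v : (d, d) value projection (assumed name)
    """
    scores = softmax(x @ w_i, axis=0)      # (n, 1): one weight per token
    keys = x @ w_k                         # (n, d)
    context = (scores * keys).sum(axis=0)  # (d,): global context vector
    values = x @ w_v                       # (n, d)
    return values * context                # broadcast context over all tokens
```

Because the context vector is shared across tokens, the module injects global information at O(n·d) cost rather than the O(n²·d) of standard self-attention, which is what makes it attractive as a lightweight add-on for the backbones compared here.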
