Robust Image Watermarking via Clustered Visual State-Space Modeling

Most existing DNN-based image watermarking methods adopt an “encoder–noise–decoder” paradigm, where the watermark is typically replicated and expanded in a straightforward manner and then directly fused with image features, which limits robustness under complex distortions. Although Transformers improve fusion via attention mechanisms, their quadratic computational complexity makes high-resolution processing prohibitively expensive. To address these issues, we propose CCViM, a robust watermarkin