Disentangling multi-modal fusion for financial distress prediction with multi-level attention mechanisms