Lecture notes in computer science · 1/1/2026

Do We Still Need Text Features for Video Retrieval in the Era of Vision-Language Models?

Jiaqi Samantha Zhan · Jimmy Lin · Shengyao Zhuang · Xueguang Ma · Crystina Zhang

Tags

Multimodal Machine Learning Applications · Computer Vision and Pattern Recognition