AniTales: End-to-End Multimodal Story Generation Through Natural Language Prompting (Student Abstract)

We present AniTales, a system that generates multimodal visual novels from natural language prompts. The system integrates large language models for story generation, diffusion models for character art, and text-to-speech synthesis for voice acting. This paper describes the system's architecture and presents findings from a pilot user study. We evaluated the system with general users (n=10) and domain experts (n=5), focusing on usability, coherence, and visual consistency. General users reported hig