An Eye For AI
Text-to-Video
Diffusion Models
Dataset
Prompts
Real Prompt-Gallery Dataset for Text-to-Video Models

Text-to-video diffusion models have advanced rapidly, with Sora marking a major step forward in video generation. Because these models depend heavily on prompts, the paper VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models introduces VidProM, a dataset of 1.67 million unique text-to-video prompts.

  • Sora and other models depend on prompts to generate high-quality videos.
  • No previous dataset provided insight into user preferences and prompt types for text-to-video generation.
  • VidProM includes videos generated by four top-tier diffusion models for comprehensive analysis.
  • The dataset offers insight into real user preferences and motivates research on text-to-video prompt engineering.
  • It is available on GitHub and Hugging Face, which supports the community and encourages further research.

This dataset is a significant step toward understanding and optimizing text-to-video diffusion models. Researchers can use VidProM to develop better, safer, and more efficient models while learning what users intend and what the market needs.
