WebSight Dataset for UI Conversion

The AI NL

Web Development

Dataset

Vision-Language Models

The paper "Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset" introduces a dataset designed to help vision-language models (VLMs) convert UI screenshots into HTML.

WebSight comprises 2 million pairs of HTML code and matching screenshots, intended as a high-quality dataset for model fine-tuning.
A foundation VLM fine-tuned on WebSight can generate functional HTML from webpage screenshots.
To make no-code solutions more efficient and to support further research, the authors release the dataset as open source.
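
To make the idea of HTML and screenshot pairs concrete, here is a minimal Python sketch of how such pairs might be turned into fine-tuning examples. It is not the WebSight schema or the authors' pipeline: the class `ScreenshotHtmlPair`, the field names, the prompt wording, and the tag-count filter are all assumptions made for illustration.

```python
from dataclasses import dataclass
from html.parser import HTMLParser


@dataclass(frozen=True)
class ScreenshotHtmlPair:
    """Hypothetical record: a screenshot file and the HTML that produced it."""
    screenshot_path: str
    html: str


class _TagCounter(HTMLParser):
    """Counts opening tags so that targets with no markup can be filtered out."""

    def __init__(self):
        super().__init__()
        self.tag_count = 0

    def handle_starttag(self, tag, attrs):
        self.tag_count += 1


def count_tags(html: str) -> int:
    """Return the number of opening tags found in an HTML string."""
    parser = _TagCounter()
    parser.feed(html)
    parser.close()
    return parser.tag_count


# Assumed instruction text; a real fine-tuning setup may use a different prompt.
PROMPT = "Generate the HTML for this screenshot."


def to_finetuning_example(pair: ScreenshotHtmlPair) -> dict:
    """Map one pair to an (image, prompt, target) example."""
    return {
        "image": pair.screenshot_path,
        "prompt": PROMPT,
        "target": pair.html.strip(),
    }


def build_examples(pairs, min_tags: int = 1) -> list:
    """Keep only pairs whose HTML contains at least min_tags opening tags."""
    return [to_finetuning_example(p) for p in pairs if count_tags(p.html) >= min_tags]
```

In this sketch the screenshot is the model input and the HTML string is the training target; the filter simply drops records whose target contains no markup at all.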

If VLMs can interpret and recreate web page designs from screenshots, web development workflows could change substantially, especially in no-code environments. Open-sourcing the WebSight dataset is expected to encourage further work in this area.
