A subjective framework for evaluating AI design capabilities
Enter your design prompt or requirement. Each model receives the identical brief and a system prompt based on your chosen category.
View side-by-side comparisons of AI-generated designs. Models are anonymized during evaluation to eliminate bias. Toggle between the rendered preview and the source code, then pick the design you like more.
After five rounds (two opening matchups, a winners' bracket final, a losers' bracket match, and a 2nd-vs-3rd playoff), you unveil your picks for the first-, second-, third-, and fourth-place models. We conduct the additional final round to ensure that model speed doesn't skew the results.
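The five-round bracket can be sketched in code. This is a hypothetical illustration, not Design Arena's actual implementation: `pick_winner` stands in for the human vote in each matchup, and all names are placeholders.

```python
# Hypothetical sketch of a five-round bracket that ranks four anonymized models.
# `pick_winner(x, y)` represents the human vote and returns the preferred entry.
def rank_four(models, pick_winner):
    a, b, c, d = models

    # Rounds 1 and 2: two opening matchups.
    w1, l1 = (a, b) if pick_winner(a, b) == a else (b, a)
    w2, l2 = (c, d) if pick_winner(c, d) == c else (d, c)

    # Round 3: winners' bracket final decides first place.
    first, wf_loser = (w1, w2) if pick_winner(w1, w2) == w1 else (w2, w1)

    # Round 4: losers' bracket match decides fourth place.
    lb_winner, fourth = (l1, l2) if pick_winner(l1, l2) == l1 else (l2, l1)

    # Round 5: 2nd-vs-3rd playoff settles the middle of the ranking.
    second, third = (
        (wf_loser, lb_winner)
        if pick_winner(wf_loser, lb_winner) == wf_loser
        else (lb_winner, wf_loser)
    )
    return [first, second, third, fourth]
```

Note that the loser of the winners' final is not automatically awarded second place; the extra playoff against the losers' bracket winner is what keeps one fast early win from locking in the final ordering.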
Good design isn’t just functional — it reflects aesthetic values. Design Arena explores whether AI can exhibit taste, as measured by your human judgment.
We focus on what AI can actually do today. This is about grounded comparisons, not cherry-picked examples.
These live matchups hold up a mirror to current model performance, limitations, and stylistic tendencies.
Source code and visualizations reveal deeper insights into how state-of-the-art models “think” about UI.