AI Website Building Competition: Which AI Model Builds the Best Website?

The Experiment

We gave 12 of the most popular AI models on OpenRouter the exact same prompt: “Build a professional plumbing company website for Sydney Pro Plumbing.” Each model had to create a homepage and an emergency plumbing services page as completely self-contained HTML with inline CSS: no frameworks, and no external dependencies beyond Google Fonts.

Then we compared the results on design quality, content completeness, and cost.

The prompt was identical for all models. No advantages, no tweaking. Pure model capability.
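A setup like this can be scripted against OpenRouter's OpenAI-compatible chat-completions endpoint. The sketch below is a minimal illustration, not the competition's actual harness; the model slug and the `OPENROUTER_API_KEY` environment variable are assumptions, while the prompt, 32K token cap, and 0.7 temperature come from this article's methodology.

```python
import json
import os
import urllib.request

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
PROMPT = "Build a professional plumbing company website for Sydney Pro Plumbing."

def build_request(model: str, api_key: str, max_tokens: int = 32000) -> urllib.request.Request:
    """Build the identical chat-completion request for one model."""
    body = json.dumps({
        "model": model,
        "max_tokens": max_tokens,   # 32K cap, per the methodology
        "temperature": 0.7,         # default temperature used for all models
        "messages": [{"role": "user", "content": PROMPT}],
    }).encode()
    return urllib.request.Request(
        OPENROUTER_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# Hypothetical model slug; substitute the exact OpenRouter IDs you want to test.
req = build_request("deepseek/deepseek-chat", os.environ.get("OPENROUTER_API_KEY", "sk-demo"))
# html = json.loads(urllib.request.urlopen(req).read())["choices"][0]["message"]["content"]
```

Running the same `build_request` across a list of model slugs is what keeps the comparison fair: only the `model` field changes between calls.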


The Winners

Best Overall Design: Claude Sonnet 4.6 (8.8/10)

The most complete and polished website in the competition. Every section (hero, features, services grid, testimonials, service areas, CTA, footer) was fully designed with excellent visual hierarchy. Professional blue/orange color scheme that inspires trust. Cost: $0.48

Best Value: MiniMax M2.7 (8.6/10 for just $0.04)

Nearly tied for first place in design quality at a fraction of the cost. Stunning dark navy/teal/gold palette that stood apart from every other entry. A hidden gem.

Best Free Model: Step 3.5 Flash (7.35/10 for $0.00)

Delivered a solid, professional website for literally zero cost. All required sections present with clean design.


Full Leaderboard

Rank  Model                     Score     Cost     Tokens   Time
1     Claude Sonnet 4.6         8.80/10   $0.482   32,821   313s
2     MiniMax M2.7              8.60/10   $0.039   32,759   405s
3     Gemini 3 Flash            8.35/10   $0.028    9,887    47s
4     MiniMax M2.5              8.00/10   $0.026   23,161   283s
5     DeepSeek V3.2             7.90/10   $0.006   17,069   136s
6     MiMo V2 Pro               7.75/10   $0.070   23,892   262s
7     MiMo V2 Omni              7.70/10   $0.035   18,057   135s
8     Step 3.5 Flash (Free)     7.35/10   FREE     17,243   139s
9     Nemotron 3 Super (Free)   7.00/10   FREE     26,721   615s
10    Grok 4.1 Fast             6.40/10   $0.006   11,976    50s
11    Claude Opus 4.6           5.80/10   $0.804   32,821   320s
12    GLM 5 Turbo               5.40/10   $0.129   32,741   401s

Key Findings

1. Price Does NOT Equal Quality

Claude Opus 4.6 was the most expensive model at $0.80 per website, yet it scored near the bottom (5.8/10) due to rendering issues. Meanwhile, MiniMax M2.7 delivered a near-winning design for just $0.04 (20x cheaper).

2. The Best Value in AI Right Now

DeepSeek V3.2 produced a solid 7.9/10 website for just $0.006 (less than a penny). That’s a production-quality website for the cost of a rounding error.

3. Free Models Are Surprisingly Good

Step 3.5 Flash (free) scored 7.35/10: a perfectly usable professional website. Nemotron 3 Super (also free) scored 7.0/10. You can build legitimate websites with free AI models in 2026.

4. Speed King: Gemini 3 Flash

Finished in 47 seconds and scored 8.35/10 (3rd place). Best combination of speed and quality.

5. The Content Trap

Models that hit their token limit (32K tokens) didn’t always produce better results. Some created verbose CSS while others used those tokens for richer content. Token efficiency matters.

6. Flagship Models Can Disappoint

Both Claude Opus 4.6 and GLM 5 Turbo, premium-priced models, underperformed significantly. Their pages had rendering issues and empty sections. More expensive doesn’t mean better at every task.


Value Rankings (Score per Dollar)

Rank  Model               Score/$        Score   Cost
1     Step 3.5 Flash      FREE           7.35    $0.00
2     Nemotron 3 Super    FREE           7.00    $0.00
3     DeepSeek V3.2       1,317 pts/$    7.90    $0.006
4     Grok 4.1 Fast       1,067 pts/$    6.40    $0.006
5     MiniMax M2.5        308 pts/$      8.00    $0.026
6     Gemini 3 Flash      298 pts/$      8.35    $0.028
7     MiniMax M2.7        221 pts/$      8.60    $0.039
8     MiMo V2 Omni        220 pts/$      7.70    $0.035
9     MiMo V2 Pro         111 pts/$      7.75    $0.070
10    GLM 5 Turbo         42 pts/$       5.40    $0.129
11    Claude Sonnet 4.6   18 pts/$       8.80    $0.482
12    Claude Opus 4.6     7 pts/$        5.80    $0.804
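The pts/$ column is simply score divided by cost, with free models pinned to the top. A short sketch reproduces the ranking from the leaderboard data (every number below is taken directly from the tables in this article):

```python
# (model, score out of 10, cost in USD) from the full leaderboard above
results = [
    ("Claude Sonnet 4.6", 8.80, 0.482),
    ("MiniMax M2.7", 8.60, 0.039),
    ("Gemini 3 Flash", 8.35, 0.028),
    ("MiniMax M2.5", 8.00, 0.026),
    ("DeepSeek V3.2", 7.90, 0.006),
    ("MiMo V2 Pro", 7.75, 0.070),
    ("MiMo V2 Omni", 7.70, 0.035),
    ("Step 3.5 Flash", 7.35, 0.0),
    ("Nemotron 3 Super", 7.00, 0.0),
    ("Grok 4.1 Fast", 6.40, 0.006),
    ("Claude Opus 4.6", 5.80, 0.804),
    ("GLM 5 Turbo", 5.40, 0.129),
]

def score_per_dollar(score: float, cost: float):
    """Points per dollar; free models have no finite ratio."""
    return None if cost == 0 else score / cost

# Free models sort as infinitely good value; everything else by score/cost.
ranked = sorted(
    results,
    key=lambda r: float("inf") if r[2] == 0 else r[1] / r[2],
    reverse=True,
)
for model, score, cost in ranked:
    ratio = score_per_dollar(score, cost)
    print(f"{model:18} {'FREE' if ratio is None else f'{ratio:,.0f} pts/$'}")
```

Note how sensitive the metric is to tiny denominators: DeepSeek V3.2's sub-penny cost makes it the runaway value leader even though four models out-scored it.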

Methodology

  • Prompt: Identical for all 12 models (build a Sydney plumbing company homepage + emergency services page)
  • Requirements: Self-contained HTML, inline CSS, no frameworks, Google Fonts, responsive design
  • Scoring: Visual design (25%), layout (20%), typography (15%), content (15%), UX (10%), responsiveness (10%), code quality (5%)
  • API: All models accessed via OpenRouter API
  • Token limit: 32,000 max tokens per model
  • Total competition cost: ~$1.62 across all 12 models
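The rubric weights above combine into a single mark as a weighted average. A minimal sketch, using the weights from the methodology; the example subscores are made up for illustration, not real judging data:

```python
# Rubric weights from the methodology; they sum to 100%.
WEIGHTS = {
    "visual_design": 0.25,
    "layout": 0.20,
    "typography": 0.15,
    "content": 0.15,
    "ux": 0.10,
    "responsiveness": 0.10,
    "code_quality": 0.05,
}

def overall_score(subscores: dict) -> float:
    """Weighted average of per-criterion scores, each out of 10."""
    missing = set(WEIGHTS) - set(subscores)
    if missing:
        raise ValueError(f"unscored criteria: {missing}")
    return sum(WEIGHTS[k] * subscores[k] for k in WEIGHTS)

# Hypothetical subscores for one entry (illustration only).
example = {"visual_design": 9, "layout": 9, "typography": 8,
           "content": 9, "ux": 8, "responsiveness": 9, "code_quality": 8}
print(round(overall_score(example), 2))  # → 8.7
```

Because visual design and layout together carry 45% of the weight, a model can write flawless code (only 5%) and still land mid-table if the page looks plain.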

Competition run on March 30, 2026 using OpenRouter API. All models used with default temperature (0.7).