AI Website Building Competition: Which AI Model Builds the Best Website?
The Experiment
We gave 12 of the most popular AI models on OpenRouter the exact same prompt: “Build a professional plumbing company website for Sydney Pro Plumbing.” Each model had to create a homepage and an emergency plumbing services page � completely self-contained HTML with inline CSS, no frameworks, no external dependencies.
Then we compared the results on design quality, content completeness, and cost.
The prompt was identical for all models. No advantages, no tweaking. Pure model capability.
The Winners
Best Overall Design: Claude Sonnet 4.6 � 8.8/10
The most complete and polished website in the competition. Every section � hero, features, services grid, testimonials, service areas, CTA, footer � was fully designed with excellent visual hierarchy. Professional blue/orange color scheme that inspires trust. Cost: $0.48
Best Value: MiniMax M2.7 � 8.6/10 for just $0.04
Nearly tied for first place in design quality at a fraction of the cost. Stunning dark navy/teal/gold palette that stood apart from every other entry. A hidden gem.
Best Free Model: Step 3.5 Flash � 7.35/10 for $0.00
Delivered a solid, professional website for literally zero cost. All required sections present with clean design.
Full Leaderboard
| Rank | Model | Score | Cost | Tokens | Time |
|---|---|---|---|---|---|
| 1 | Claude Sonnet 4.6 | 8.80/10 | $0.482 | 32,821 | 313s |
| 2 | MiniMax M2.7 | 8.60/10 | $0.039 | 32,759 | 405s |
| 3 | Gemini 3 Flash | 8.35/10 | $0.028 | 9,887 | 47s |
| 4 | MiniMax M2.5 | 8.00/10 | $0.026 | 23,161 | 283s |
| 5 | DeepSeek V3.2 | 7.90/10 | $0.006 | 17,069 | 136s |
| 6 | MiMo V2 Pro | 7.75/10 | $0.070 | 23,892 | 262s |
| 7 | MiMo V2 Omni | 7.70/10 | $0.035 | 18,057 | 135s |
| 8 | Step 3.5 Flash (Free) | 7.35/10 | FREE | 17,243 | 139s |
| 9 | Nemotron 3 Super (Free) | 7.00/10 | FREE | 26,721 | 615s |
| 10 | Grok 4.1 Fast | 6.40/10 | $0.006 | 11,976 | 50s |
| 11 | Claude Opus 4.6 | 5.80/10 | $0.804 | 32,821 | 320s |
| 12 | GLM 5 Turbo | 5.40/10 | $0.129 | 32,741 | 401s |
Key Findings
1. Price Does NOT Equal Quality
Claude Opus 4.6 was the most expensive model at $0.80 per website � yet it scored near the bottom (5.8/10) due to rendering issues. Meanwhile, MiniMax M2.7 delivered a near-winning design for just $0.04 (50x cheaper).
2. The Best Value in AI Right Now
DeepSeek V3.2 produced a solid 7.9/10 website for just $0.006 (less than a penny). That’s a production-quality website for the cost of rounding error.
3. Free Models Are Surprisingly Good
Step 3.5 Flash (free) scored 7.35/10 � a perfectly usable professional website. Nemotron 3 Super (also free) scored 7.0/10. You can build legitimate websites with free AI models in 2026.
4. Speed King: Gemini 3 Flash
Finished in 47 seconds and scored 8.35/10 (3rd place). Best combination of speed and quality.
5. The Content Trap
Models that hit their token limit (32K tokens) didn’t always produce better results. Some created verbose CSS while others used those tokens for richer content. Token efficiency matters.
6. Flagship Models Can Disappoint
Both Claude Opus 4.6 and GLM 5 Turbo � premium-priced models � underperformed significantly. Their pages had rendering issues and empty sections. More expensive doesn’t mean better at every task.
Value Rankings (Score per Dollar)
| Rank | Model | Score/$ | Score | Cost |
|---|---|---|---|---|
| 1 | Step 3.5 Flash | FREE | 7.35 | $0.00 |
| 2 | Nemotron 3 Super | FREE | 7.00 | $0.00 |
| 3 | DeepSeek V3.2 | 1,317 pts/$ | 7.90 | $0.006 |
| 4 | Grok 4.1 Fast | 1,067 pts/$ | 6.40 | $0.006 |
| 5 | MiniMax M2.5 | 308 pts/$ | 8.00 | $0.026 |
| 6 | Gemini 3 Flash | 298 pts/$ | 8.35 | $0.028 |
| 7 | MiniMax M2.7 | 221 pts/$ | 8.60 | $0.039 |
| 8 | MiMo V2 Omni | 220 pts/$ | 7.70 | $0.035 |
| 9 | MiMo V2 Pro | 111 pts/$ | 7.75 | $0.070 |
| 10 | GLM 5 Turbo | 42 pts/$ | 5.40 | $0.129 |
| 11 | Claude Sonnet 4.6 | 18 pts/$ | 8.80 | $0.482 |
| 12 | Claude Opus 4.6 | 7 pts/$ | 5.80 | $0.804 |
Methodology
- Prompt: Identical for all 12 models � build a Sydney plumbing company homepage + emergency services page
- Requirements: Self-contained HTML, inline CSS, no frameworks, Google Fonts, responsive design
- Scoring: Visual design (25%), layout (20%), typography (15%), content (15%), UX (10%), responsiveness (10%), code quality (5%)
- API: All models accessed via OpenRouter API
- Token limit: 32,000 max tokens per model
- Total competition cost: ~$1.62 across all 12 models
Competition run on March 30, 2026 using OpenRouter API. All models used with default temperature (0.7).