# AI Website Building Competition: Which AI Model Builds the Best Website?

## The Experiment

We gave **12 of the most popular AI models on OpenRouter** the exact same prompt: *"Build a professional plumbing company website for Sydney Pro Plumbing."* Each model had to create a homepage and an emergency plumbing services page � completely self-contained HTML with inline CSS, no frameworks, no external dependencies.

Then we compared the results on design quality, content completeness, and cost.

**The prompt was identical for all models.** No advantages, no tweaking. Pure model capability.

---

## The Winners

### Best Overall Design: Claude Sonnet 4.6 � 8.8/10
The most complete and polished website in the competition. Every section � hero, features, services grid, testimonials, service areas, CTA, footer � was fully designed with excellent visual hierarchy. Professional blue/orange color scheme that inspires trust. Cost: **$0.48**

### Best Value: MiniMax M2.7 � 8.6/10 for just $0.04
Nearly tied for first place in design quality at a fraction of the cost. Stunning dark navy/teal/gold palette that stood apart from every other entry. A hidden gem.

### Best Free Model: Step 3.5 Flash � 7.35/10 for $0.00
Delivered a solid, professional website for literally zero cost. All required sections present with clean design.

---

## Full Leaderboard

| Rank | Model | Score | Cost | Tokens | Time |
|------|-------|-------|------|--------|------|
| 1 | **Claude Sonnet 4.6** | 8.80/10 | $0.482 | 32,821 | 313s |
| 2 | **MiniMax M2.7** | 8.60/10 | $0.039 | 32,759 | 405s |
| 3 | **Gemini 3 Flash** | 8.35/10 | $0.028 | 9,887 | 47s |
| 4 | MiniMax M2.5 | 8.00/10 | $0.026 | 23,161 | 283s |
| 5 | DeepSeek V3.2 | 7.90/10 | $0.006 | 17,069 | 136s |
| 6 | MiMo V2 Pro | 7.75/10 | $0.070 | 23,892 | 262s |
| 7 | MiMo V2 Omni | 7.70/10 | $0.035 | 18,057 | 135s |
| 8 | Step 3.5 Flash (Free) | 7.35/10 | FREE | 17,243 | 139s |
| 9 | Nemotron 3 Super (Free) | 7.00/10 | FREE | 26,721 | 615s |
| 10 | Grok 4.1 Fast | 6.40/10 | $0.006 | 11,976 | 50s |
| 11 | Claude Opus 4.6 | 5.80/10 | $0.804 | 32,821 | 320s |
| 12 | GLM 5 Turbo | 5.40/10 | $0.129 | 32,741 | 401s |

---

## Key Findings

### 1. Price Does NOT Equal Quality
**Claude Opus 4.6** was the most expensive model at **$0.80** per website � yet it scored near the bottom (5.8/10) due to rendering issues. Meanwhile, **MiniMax M2.7** delivered a near-winning design for just **$0.04** (50x cheaper).

### 2. The Best Value in AI Right Now
**DeepSeek V3.2** produced a solid 7.9/10 website for just **$0.006** (less than a penny). That's a production-quality website for the cost of rounding error.

### 3. Free Models Are Surprisingly Good
**Step 3.5 Flash** (free) scored 7.35/10 � a perfectly usable professional website. **Nemotron 3 Super** (also free) scored 7.0/10. You can build legitimate websites with free AI models in 2026.

### 4. Speed King: Gemini 3 Flash
Finished in **47 seconds** and scored 8.35/10 (3rd place). Best combination of speed and quality.

### 5. The Content Trap
Models that hit their token limit (32K tokens) didn't always produce better results. Some created verbose CSS while others used those tokens for richer content. **Token efficiency matters.**

### 6. Flagship Models Can Disappoint
Both Claude Opus 4.6 and GLM 5 Turbo � premium-priced models � underperformed significantly. Their pages had rendering issues and empty sections. More expensive doesn't mean better at every task.

---

## Value Rankings (Score per Dollar)

| Rank | Model | Score/$ | Score | Cost |
|------|-------|---------|-------|------|
| 1 | Step 3.5 Flash | FREE | 7.35 | $0.00 |
| 2 | Nemotron 3 Super | FREE | 7.00 | $0.00 |
| 3 | DeepSeek V3.2 | 1,317 pts/$ | 7.90 | $0.006 |
| 4 | Grok 4.1 Fast | 1,067 pts/$ | 6.40 | $0.006 |
| 5 | MiniMax M2.5 | 308 pts/$ | 8.00 | $0.026 |
| 6 | Gemini 3 Flash | 298 pts/$ | 8.35 | $0.028 |
| 7 | MiniMax M2.7 | 221 pts/$ | 8.60 | $0.039 |
| 8 | MiMo V2 Omni | 220 pts/$ | 7.70 | $0.035 |
| 9 | MiMo V2 Pro | 111 pts/$ | 7.75 | $0.070 |
| 10 | GLM 5 Turbo | 42 pts/$ | 5.40 | $0.129 |
| 11 | Claude Sonnet 4.6 | 18 pts/$ | 8.80 | $0.482 |
| 12 | Claude Opus 4.6 | 7 pts/$ | 5.80 | $0.804 |

---

## Methodology

- **Prompt**: Identical for all 12 models � build a Sydney plumbing company homepage + emergency services page
- **Requirements**: Self-contained HTML, inline CSS, no frameworks, Google Fonts, responsive design
- **Scoring**: Visual design (25%), layout (20%), typography (15%), content (15%), UX (10%), responsiveness (10%), code quality (5%)
- **API**: All models accessed via OpenRouter API
- **Token limit**: 32,000 max tokens per model
- **Total competition cost**: ~$1.62 across all 12 models

---

*Competition run on March 30, 2026 using OpenRouter API. All models used with default temperature (0.7).*
