AI website builders have changed how fast people can go from idea to live product. Lovable, Bolt, v0, Framer — a founder can have a working site in hours. That speed is genuinely impressive. But it comes with a catch: AI-generated code looks complete and often isn't.
I've tested dozens of sites built with these tools. Here's exactly what I look for and how I find what the AI missed.
Why AI-Generated Sites Need Extra QA
The AI doesn't know your users. It generates code that looks correct and renders well in a single viewport on a single browser. What it misses:
- Mobile layouts — the AI often generates desktop-first CSS that breaks on small screens
- Edge cases — empty states, long text, special characters in form inputs
- Real interactions — hover states that don't work on touch, focus states missing for keyboard users
- Cross-browser differences — what works in Chrome may break in Firefox or Safari on iOS
- Performance — AI tools often generate bloated components or load unnecessary resources
My Testing Process
1. Start with Mobile
I test on mobile first because AI builders tend to be weakest there. I use Playwright to emulate common devices:
```js
// Run inside an async test body — chromium and devices come from Playwright
const { chromium, devices } = require('playwright')

const iPhone = devices['iPhone 14']
const browser = await chromium.launch()
const context = await browser.newContext({ ...iPhone })
const page = await context.newPage()
```

I look for text overflow, buttons that are too small to tap, modals that don't fit the screen, and horizontal scrolling that shouldn't be there.
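The horizontal-scroll case is easy to detect programmatically. A minimal sketch, assuming a Playwright `page` object like the one above:

```js
// True when the page overflows its viewport horizontally —
// a common symptom of desktop-first CSS on a phone-sized screen
async function hasHorizontalScroll(page) {
  return page.evaluate(
    () => document.documentElement.scrollWidth > document.documentElement.clientWidth
  )
}
```

Run it once per emulated device; any `true` result is worth a screenshot.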
2. Cross-Browser Check
Every AI-built site gets tested in Chromium, Firefox, and WebKit. Framer sites in particular use CSS features that Safari handles differently. I run a quick visual sweep across all three before diving into functionality.
3. Form Testing
Forms are where AI-generated code fails most often. I test:
- Empty submission — does validation fire?
- Invalid formats — bad email, phone numbers, short passwords
- Long inputs — 500+ character names, long email addresses
- Special characters — apostrophes, quotes, `<script>` tags in input fields
- Success state — does something actually happen after submit?
- Network errors — what happens if the API call fails?
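That checklist translates into a reusable set of payloads I can feed into any form. A sketch — the field names (`name`, `email`) are hypothetical examples, not a fixed schema:

```js
// Edge-case payloads mirroring the checklist above.
const formEdgeCases = [
  { label: 'empty submission',    name: '',                          email: '' },
  { label: 'invalid email',       name: 'Test',                      email: 'not-an-email' },
  { label: 'long input',          name: 'A'.repeat(500),             email: 'a'.repeat(200) + '@example.com' },
  { label: 'special characters',  name: `O'Brien "quoted"`,          email: 'test@example.com' },
  { label: 'script injection',    name: '<script>alert(1)</script>', email: 'test@example.com' }
]
```

In a Playwright test, each payload gets filled in, submitted, and checked against the expected validation or success state.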
For Lovable and Bolt sites specifically, the backend integration is often the weakest point. The form looks great but the submit handler either does nothing or errors silently.
4. Navigation and Links
I use Playwright to crawl all links and check for 404s:
```js
// Collect hrefs first — navigating inside the loop would
// invalidate locators that still point at the previous page
const hrefs = await page.locator('a').evaluateAll(
  anchors => anchors.map(a => a.getAttribute('href'))
)

for (const href of hrefs) {
  if (href && href.startsWith('/')) {
    const response = await page.goto(href)
    console.log(href, response?.status())
  }
}
```

Broken internal links are common in AI-generated sites, especially when the AI creates navigation that points to sections that don't exist yet.
5. Performance Basics
I run Lighthouse and look at three numbers: LCP (Largest Contentful Paint), TBT (Total Blocking Time), and CLS (Cumulative Layout Shift). Framer sites usually score well here because Framer's export is optimised. Lovable and Bolt sites vary — I've seen some with 8MB of JavaScript on load.
6. Content Edge Cases
I test with real-world content, not placeholder text. I paste:
- Very long product names or titles
- User names with accents and special characters
- Empty arrays (what does the UI show with zero items?)
- Single-item lists (does "1 items" appear?)
These are the cases AI never thinks to test because it generates happy-path content.
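The "1 items" bug is trivial to fix once spotted, but AI output rarely includes the check. A minimal pluralisation helper, as a sketch — irregular nouns would need a dictionary or an i18n library:

```js
// Naive pluralisation — handles only the regular "+s" case
function pluralize(count, noun) {
  return `${count} ${noun}${count === 1 ? '' : 's'}`
}
```

So `pluralize(1, 'item')` gives "1 item" and `pluralize(0, 'item')` gives "0 items" — exactly the distinction generated UI code tends to skip.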
Platform-Specific Issues I Find Repeatedly
Lovable — Form submissions often fail silently. The success toast shows but nothing hits the database. Always test the full flow end to end.
Bolt — CSS specificity issues when customising the generated theme. Styles look fine in the editor, broken in production.
v0 — Component composition issues. Individual components look great but when combined in a page, spacing and layout break. Test pages as a whole, not components in isolation.
Framer — Animation performance on low-end Android devices. The animations that look silky on a MacBook Pro chug on a mid-range phone.
The Deliverable
After testing, I produce a structured QA report with:
- Bug severity (Critical / Major / Minor)
- Screenshot or screen recording of the issue
- Steps to reproduce
- Recommendation for the fix
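In practice each finding becomes a structured record. A hypothetical shape — the field names are illustrative, not a fixed schema:

```js
// Example bug-report entry for the QA report
const sampleFinding = {
  severity: 'Critical', // Critical / Major / Minor
  title: 'Contact form shows success toast but nothing reaches the backend',
  evidence: 'recordings/contact-form-silent-failure.webm',
  stepsToReproduce: [
    'Open /contact in an iPhone 14 viewport',
    'Fill all fields with valid data and submit',
    'Observe success toast; check the network tab — no POST request fires'
  ],
  recommendation: 'Wire the submit handler to the API endpoint and surface errors to the user'
}
```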
This gives the developer or the AI tool operator exactly what they need to fix issues without back-and-forth.
AI website builders are only getting better. But the gap between "AI-generated" and "production-ready" is still real — and closing that gap is what QA is for.