Browser Rendering for SPA Sites
Many modern documentation sites are JavaScript Single Page Applications (SPAs) whose initial response is an almost empty HTML shell; the actual content appears only after JavaScript runs. Skill Seekers v3.5.0 includes a Playwright-based browser renderer to handle these sites.
When to Use
Use browser rendering when:
- A site returns “No scraped data found” despite successful navigation
- The site is built with React, Vue, Angular, or similar SPA frameworks
- Content is loaded dynamically via JavaScript
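One way to check whether a page is an empty shell before switching to browser rendering is to count the visible text in the raw HTML. The sketch below is a hypothetical heuristic, not part of Skill Seekers: the function name `looks_like_spa_shell` and the 200-character threshold are assumptions chosen for illustration.

```python
from html.parser import HTMLParser


class _VisibleTextCounter(HTMLParser):
    """Counts text characters outside <script>, <style>, and <noscript>."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.visible_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that a reader would actually see.
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def looks_like_spa_shell(html: str, min_visible_chars: int = 200) -> bool:
    """Return True when the raw HTML carries very little visible text.

    The threshold is an arbitrary assumption; tune it for the site at hand.
    """
    parser = _VisibleTextCounter()
    parser.feed(html)
    parser.close()
    return parser.visible_chars < min_visible_chars
```

If this returns True for HTML fetched without a browser, the site is a likely candidate for the --browser flag.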
Quick Start
# Install browser dependencies
pip install "skill-seekers[browser]"
# Scrape with browser rendering
skill-seekers create https://spa-docs-site.com --browser
# Or in config:
# Set "browser": true in your JSON config
How It Works
- When the --browser flag is set, DocScraper.scrape_page() delegates to BrowserRenderer.render_page(url)
- Chromium is auto-installed on first use via Playwright
- Navigation uses wait_until='networkidle' to let JavaScript execute
- The fully-rendered HTML is returned to the normal pipeline
- BeautifulSoup extraction and content processing continue as normal
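The flow above can be sketched with Playwright's synchronous API. This is a minimal illustration under stated assumptions, not the actual DocScraper or BrowserRenderer code: the standalone `render_page` and `fetch_html` functions and their parameters are hypothetical, and the sketch assumes Chromium is already installed (for example via `playwright install chromium`) rather than reproducing the auto-install step. The wait condition is a parameter because this document mentions both networkidle and domcontentloaded.

```python
from typing import Callable


def render_page(url: str, wait_until: str = "networkidle", timeout_ms: int = 60_000) -> str:
    """Return the rendered HTML of `url` using headless Chromium."""
    # Imported lazily so the static path works without the browser extra installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            # Wait for the chosen load state so client-side JavaScript can run.
            page.goto(url, wait_until=wait_until, timeout=timeout_ms)
            return page.content()
        finally:
            browser.close()


def fetch_html(
    url: str,
    use_browser: bool,
    static_fetch: Callable[[str], str],
    browser_render: Callable[[str], str] = render_page,
) -> str:
    """Pick the rendering path; both paths return HTML for the same pipeline."""
    if use_browser:
        return browser_render(url)
    return static_fetch(url)
```

Because both paths return an HTML string, the downstream BeautifulSoup extraction does not need to know which one produced it.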
Smart SPA Discovery (v3.5.0)
The three-layer discovery engine finds pages even on SPAs:
- sitemap.xml — Standard sitemap discovery
- llms.txt — AI-optimized documentation format
- SPA nav rendering — Renders the navigation and discovers links from the DOM
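The third layer, discovering links from a rendered DOM, might look roughly like the following. This is a hypothetical sketch using only the standard library; `discover_links` and its same-host filtering rule are assumptions for illustration, not the discovery engine's actual behavior.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag, in document order."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def discover_links(rendered_html: str, base_url: str) -> list[str]:
    """Return unique same-host page URLs found in already-rendered HTML."""
    collector = _HrefCollector()
    collector.feed(rendered_html)
    collector.close()

    base_host = urlparse(base_url).netloc
    seen, pages = set(), []
    for href in collector.hrefs:
        # Resolve relative links and drop "#fragment" anchors.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https") or parsed.netloc != base_host:
            continue
        if absolute not in seen:
            seen.add(absolute)
            pages.append(absolute)
    return pages
```

Dropping fragments deduplicates in-page anchors, but a site that routes through the URL hash (for example `#/guide`) would need them kept.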
Config Support
{
"name": "my-spa-docs",
"browser": true,
"start_urls": ["https://spa-docs-site.com/docs"]
}
Browser defaults: a 60-second timeout and the domcontentloaded wait condition.