Tutorial: Scraping Documentation
Learn how to scrape any documentation website and create an AI skill in this hands-on tutorial.
Time: 15 minutes | Level: Beginner | Result: Working React docs skill
What You’ll Learn
- How to use preset configurations
- How to scrape documentation websites
- How to verify skill quality
- How to enhance and package skills
Prerequisites
- Skill Seekers installed (Installation Guide)
- Internet connection
- About 15 minutes
Step 1: Choose a Documentation Site
For this tutorial, we’ll scrape React documentation. Skill Seekers includes 24 preset configs for popular frameworks.
View available presets:
skill-seekers list-configs
Output:
Available configs:
- react.json (React documentation)
- vue.json (Vue.js documentation)
- django.json (Django framework)
- godot.json (Godot game engine)
- fastapi.json (FastAPI framework)
... and 19 more
Step 2: Estimate Page Count
Before scraping, estimate how many pages will be processed:
skill-seekers estimate --config configs/react.json
Output:
📊 Estimation Results:
Base URL: https://react.dev/learn
Estimated pages: ~180 pages
Estimated time: 3-5 minutes
Categories detected: 4
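The time estimate can be sanity-checked with simple arithmetic. The sketch below is not how the estimate command computes its figures; it assumes a hypothetical fixed delay between requests and a hypothetical per-page fetch time (both values are made up for illustration) and multiplies them by the page count.

```python
def estimate_minutes(pages: int, delay_s: float, fetch_s: float) -> float:
    """Rough crawl time in minutes: each page costs one delay plus one fetch."""
    if pages < 0 or delay_s < 0 or fetch_s < 0:
        raise ValueError("inputs must be non-negative")
    return pages * (delay_s + fetch_s) / 60.0


# Assumed values, chosen only for illustration: 180 pages, a 0.5 s delay,
# and between 0.5 s and 1.0 s to fetch and parse each page.
low = estimate_minutes(180, delay_s=0.5, fetch_s=0.5)
high = estimate_minutes(180, delay_s=0.5, fetch_s=1.0)
print(f"{low:.1f} to {high:.1f} minutes")
```

Under these assumed numbers the result falls inside the 3-5 minute range reported above; real timings depend on the site and the network.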
Step 3: Scrape the Documentation
Run the scraper with the React preset:
skill-seekers scrape --config configs/react.json --output output/react/
What happens:
- llms.txt check - Looks for AI-optimized docs (10x faster if available)
- BFS traversal - Crawls all documentation pages
- Smart categorization - Organizes content into sections
- Code detection - Identifies and formats code examples
- SKILL.md generation - Creates main skill file
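To make the BFS traversal step concrete, here is a minimal sketch of a breadth-first crawl. It is not Skill Seekers' implementation: the function name bfs_crawl, the get_links callback, and the in-memory SITE dictionary are all hypothetical, and no network requests are made.

```python
from collections import deque
from urllib.parse import urljoin, urlparse


def bfs_crawl(start_url, get_links, max_pages=500):
    """Visit pages breadth-first, staying under the start URL's path prefix.

    get_links(url) is a caller-supplied function returning the hrefs found on
    a page; in a real crawler it would fetch and parse HTML.
    """
    start = urlparse(start_url)
    queue = deque([start_url])
    seen = {start_url}
    order = []
    while queue and len(order) < max_pages:
        url = queue.popleft()
        order.append(url)
        for href in get_links(url):
            absolute = urljoin(url, href).split("#", 1)[0]  # drop fragments
            parsed = urlparse(absolute)
            in_scope = (parsed.netloc == start.netloc
                        and parsed.path.startswith(start.path))
            if in_scope and absolute not in seen:
                seen.add(absolute)
                queue.append(absolute)
    return order


# A tiny in-memory "site" standing in for real HTTP requests.
SITE = {
    "https://example.com/docs/": ["intro", "api/", "/blog/post"],
    "https://example.com/docs/intro": ["api/", "#top"],
    "https://example.com/docs/api/": ["hooks", "../intro"],
    "https://example.com/docs/api/hooks": [],
}

pages = bfs_crawl("https://example.com/docs/", lambda u: SITE.get(u, []))
print(pages)
```

The queue gives the breadth-first order, and the seen set keeps pages that link to each other from being visited twice. Links outside the starting path, such as /blog/post here, are skipped.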
Progress output:
🔍 Checking for llms.txt...
✅ Found llms-full.txt (2.3 MB)
📥 Downloading...
✅ Downloaded in 1.8 seconds
📝 Creating skill structure...
✅ Skill created: output/react/SKILL.md
⚡ Time saved: 4m 32s vs traditional scraping
Step 4: Review the Skill
Check what was created:
ls -lh output/react/
Expected layout:
output/react/
├── SKILL.md              # Main skill file (200-500 lines)
├── references/           # Detailed documentation
│   ├── hooks.md
│   ├── components.md
│   ├── state-management.md
│   └── ... (50-100 reference files)
└── examples/             # Code examples
    ├── useState-example.md
    ├── useEffect-example.md
    └── ...
Preview SKILL.md:
head -50 output/react/SKILL.md
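A quick programmatic check can complement the manual review. The sketch below assumes the layout described in this step (a SKILL.md file plus a references/ folder of .md files); the helper name summarize_skill is hypothetical, and the demo runs against a throwaway directory rather than a real scrape.

```python
import tempfile
from pathlib import Path


def summarize_skill(skill_dir):
    """Return basic facts about a generated skill directory."""
    root = Path(skill_dir)
    skill_md = root / "SKILL.md"
    references = root / "references"
    line_count = 0
    if skill_md.is_file():
        line_count = len(skill_md.read_text(encoding="utf-8").splitlines())
    reference_count = 0
    if references.is_dir():
        reference_count = len(list(references.glob("*.md")))
    return {
        "has_skill_md": skill_md.is_file(),
        "skill_md_lines": line_count,
        "reference_files": reference_count,
    }


# Demo on a temporary directory so the sketch runs without a real scrape.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "references").mkdir()
    (root / "SKILL.md").write_text("# React\n\nOverview\n", encoding="utf-8")
    (root / "references" / "hooks.md").write_text("hooks\n", encoding="utf-8")
    (root / "references" / "components.md").write_text("components\n", encoding="utf-8")
    demo = summarize_skill(root)

print(demo)
```

To check a real run, call summarize_skill("output/react") and compare the counts with the ranges shown in the layout.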
Step 5: Enhance with AI (Optional)
Transform the skill from basic (3/10) to comprehensive (9/10) using AI:
Option A: Local Enhancement (FREE with Claude Max)
skill-seekers enhance output/react/
This opens Claude Code in a new terminal and enhances the skill using your Claude Max subscription (no API costs!).
Time: 30-60 seconds
Option B: API Enhancement (Fast)
export ANTHROPIC_API_KEY="sk-ant-..."
skill-seekers enhance output/react/ --mode api
Cost: ~$0.15-$0.30
Step 6: Package the Skill
Package for your preferred platform:
For Claude AI:
skill-seekers package output/react/ --target claude
For Gemini:
skill-seekers package output/react/ --target gemini
For OpenAI:
skill-seekers package output/react/ --target openai
Output:
✅ Packaged: react-claude.zip (2.3 MB)
📦 Format: Claude AI (YAML frontmatter)
📄 Files: 1 SKILL.md + 87 references
🎯 Ready to upload!
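The file counts in the package summary can be cross-checked by reading the archive directly. This is a sketch under an assumption: that the .zip mirrors the skill directory, with SKILL.md and a references/ folder inside. The helper name summarize_package is hypothetical, and the demo builds its own small archive instead of reading react-claude.zip.

```python
import tempfile
import zipfile
from pathlib import Path


def summarize_package(zip_path):
    """Count SKILL.md files and reference files inside a packaged skill."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries, which end with a slash.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    skill_files = [n for n in names if n.split("/")[-1] == "SKILL.md"]
    reference_files = [n for n in names if "references/" in n]
    return {"skill_md": len(skill_files), "references": len(reference_files)}


# Demo archive with an assumed layout; a real package may differ.
with tempfile.TemporaryDirectory() as tmp:
    package = Path(tmp) / "demo-skill.zip"
    with zipfile.ZipFile(package, "w") as archive:
        archive.writestr("SKILL.md", "# Demo\n")
        archive.writestr("references/hooks.md", "hooks\n")
        archive.writestr("references/components.md", "components\n")
    demo = summarize_package(package)

print(demo)
```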
Step 7: Upload to AI Assistant
Automatic Upload (Recommended):
# Set API key first
export ANTHROPIC_API_KEY="sk-ant-..."
# Upload
skill-seekers upload react-claude.zip --target claude
Manual Upload:
- Open Claude.ai
- Click “Add Knowledge”
- Upload react-claude.zip
- Done!
Step 8: Test Your Skill
Try these prompts in Claude:
"Explain React hooks to me"
"Show me how to use useState with arrays"
"What's the difference between useEffect and useLayoutEffect?"
"Create a simple counter component using hooks"
Result: Claude responds with accurate, context-aware answers based on official React documentation!
Troubleshooting
Issue: “No pages found”
Solution: Check your config selectors:
skill-seekers scrape --config configs/react.json --interactive
Interactive mode shows extracted content and lets you test selectors.
Issue: “Scraping too slow”
Solutions:
- Use the --async flag for a 2-3x speedup
- Increase rate_limit in the config (try 1.0 or 2.0)
- Check if llms.txt is available (10x faster!)
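When experimenting with rate_limit, it can be safer to edit a copy of the preset than the preset itself. The sketch below assumes rate_limit is a top-level key in the JSON config; the helper name write_config_variant, the demo file names, and the demo "name" key are hypothetical. This tutorial does not state the units of rate_limit, so confirm them in the config documentation before changing the value.

```python
import json
import tempfile
from pathlib import Path


def write_config_variant(src, dst, **overrides):
    """Copy a JSON config to dst with top-level keys overridden."""
    config = json.loads(Path(src).read_text(encoding="utf-8"))
    if not isinstance(config, dict):
        raise ValueError("expected a JSON object at the top level")
    config.update(overrides)
    Path(dst).write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo with a made-up config; real presets contain more fields.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "demo.json"
    dst = Path(tmp) / "demo-tuned.json"
    src.write_text(json.dumps({"name": "demo", "rate_limit": 0.5}), encoding="utf-8")
    write_config_variant(src, dst, rate_limit=1.0)
    original = json.loads(src.read_text(encoding="utf-8"))
    tuned = json.loads(dst.read_text(encoding="utf-8"))

print(original, tuned)
```

The tuned copy can then be passed to the scraper with --config, leaving the original preset untouched.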
Issue: “Enhancement failed”
Solutions:
- Local mode: Install Claude Code
- API mode: Set the ANTHROPIC_API_KEY environment variable
- Timeout: Increase with --timeout 1200 (20 minutes)
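The first two checks can be automated before running the enhance step. This is a sketch, not part of Skill Seekers: the function name enhancement_preflight is hypothetical, and it assumes Claude Code is installed as an executable named claude on your PATH.

```python
import os
import shutil


def enhancement_preflight(mode, env=None, which=shutil.which):
    """Return a list of problems that would likely block enhancement."""
    env = os.environ if env is None else env
    problems = []
    if mode == "local":
        # Assumption: Claude Code is installed as the `claude` executable.
        if which("claude") is None:
            problems.append("Claude Code not found on PATH")
    elif mode == "api":
        if not env.get("ANTHROPIC_API_KEY"):
            problems.append("ANTHROPIC_API_KEY is not set")
    else:
        problems.append(f"unknown mode: {mode!r}")
    return problems


for mode in ("local", "api"):
    issues = enhancement_preflight(mode)
    print(mode, "ok" if not issues else issues)
```

The env and which parameters exist only so the check can be exercised without touching the real environment.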
Next Steps
You just created your first AI skill! 🎉
Try these next:
- Scrape another framework: Django Tutorial
- Create custom config: Config Creation Tutorial
- Combine sources: Multi-Source Tutorial
- Explore MCP: MCP Setup
Summary
What you learned:
- ✅ How to use preset configurations
- ✅ How to estimate and scrape documentation
- ✅ How to enhance skills with AI
- ✅ How to package and upload for any platform
- ✅ How to troubleshoot common issues
Time investment: 15 minutes
Result: Professional-quality AI skill ready to use!
See Also:
- Scraping Manual - Advanced scraping techniques
- Enhancement Guide - Deep dive into AI enhancement
- CLI Reference - Complete scrape command reference