Skill Seekers v3.0.0: The Universal Intelligence Platform
Transform docs, GitHub repos, PDFs, and codebases into structured knowledge for any AI system. 16 output formats. 1,852 tests. One tool for LangChain, LlamaIndex, Cursor, Claude, and more.
TL;DR
- 🚀 16 output formats (was 4 in v2.x)
- 🛠️ 26 MCP tools (was 9)
- ✅ 1,852 tests passing (was 700+)
- ☁️ Cloud storage support (S3, GCS, Azure)
- 🔄 CI/CD ready (GitHub Action + Docker)
- 🎮 Godot game engine support with signal flow analysis
- 🌐 27+ programming languages (7 new)
- 📚 4 input sources: Docs, GitHub repos, PDFs, Local codebases
pip install skill-seekers
skill-seekers scrape --config react.json
The Problem We’re Solving
Every AI project needs data preprocessing:
- RAG pipelines: “Scrape these docs/repos/PDFs, chunk them, embed them…”
- AI coding tools: “I wish Cursor knew this framework/API…”
- Claude skills: “Convert this codebase into a skill”
70% of RAG development time is spent on data preprocessing. Everyone rebuilds the same infrastructure. Stop rebuilding. Start using.
The Solution: Universal Preprocessor
Skill Seekers v3.0.0 transforms docs, GitHub repos, PDFs, and local codebases into structured knowledge for any AI system:
For RAG Pipelines
# From documentation
skill-seekers scrape --format langchain --config react.json
# From GitHub repository
skill-seekers scrape --format langchain --github https://github.com/user/repo
# From PDF files
skill-seekers scrape --format langchain --pdf ./manual.pdf
# From local codebase
skill-seekers analyze --directory ./my-project --format langchain
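Once an export exists, consuming it from a pipeline could look roughly like the sketch below. The post does not describe the on-disk layout of the `langchain` export, so this assumes a JSON array of records with `page_content` and `metadata` keys (the field names LangChain's `Document` uses). The `Document` dataclass, `parse_documents`, and `load_documents` are illustrative names, not part of Skill Seekers.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    """Minimal stand-in for a LangChain-style document (assumed shape)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def parse_documents(raw_json):
    """Turn a JSON array of {"page_content", "metadata"} records into Documents."""
    records = json.loads(raw_json)
    return [
        Document(record["page_content"], record.get("metadata", {}))
        for record in records
    ]


def load_documents(path):
    """Read an export file from disk; the file name and location are assumptions."""
    return parse_documents(Path(path).read_text(encoding="utf-8"))
```

If the real export uses a different schema, only `parse_documents` would need to change; the rest of a chunk-and-embed pipeline can stay the same.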
For AI Coding Assistants
# Works with any source - docs, repos, or codebases
skill-seekers scrape --target claude --config react.json
cp output/react-claude/.cursorrules ./
# Windsurf, Cline, Continue.dev - same process
For Claude AI
skill-seekers install --config react.json
# Auto-fetches, scrapes, enhances, packages, uploads
What’s New in v3.0.0
4 Input Sources
| Source | Command | Use Case |
|---|---|---|
| Documentation | scrape --config | Framework docs, APIs, guides |
| GitHub Repos | scrape --github | Open source libraries, tools |
| PDF Files | scrape --pdf | Manuals, research papers, books |
| Local Codebases | analyze --directory | Your own projects, game engines |
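For scripting, the table above maps neatly onto a small lookup. The sketch below only assembles an argument list from the subcommands and flags shown in the table; the keys `docs`, `github`, `pdf`, and `codebase` and the function name `build_command` are hypothetical labels chosen for this example.

```python
# Subcommand and flag per input source, copied from the table above.
# The dictionary keys are illustrative labels, not CLI values.
SOURCE_COMMANDS = {
    "docs": ["scrape", "--config"],
    "github": ["scrape", "--github"],
    "pdf": ["scrape", "--pdf"],
    "codebase": ["analyze", "--directory"],
}


def build_command(source_kind, location):
    """Return the argv list for one input source, without running it."""
    try:
        subcommand = SOURCE_COMMANDS[source_kind]
    except KeyError:
        raise ValueError(f"unknown source kind: {source_kind!r}") from None
    return ["skill-seekers", *subcommand, location]
```

The resulting list can be handed to `subprocess.run(..., check=True)` when the CLI is installed. Keeping it as a list rather than a shell string avoids quoting problems with paths and URLs.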
16 Platform Adaptors
| Category | Platforms | Command |
|---|---|---|
| RAG/Vectors | LangChain, LlamaIndex, Chroma, FAISS, Haystack, Qdrant, Weaviate | --format <name> |
| AI Platforms | Claude, Gemini, OpenAI | --target <name> |
| AI Coding | Cursor, Windsurf, Cline, Continue.dev | --target claude |
| Generic | Markdown | --target markdown |
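The split between `--format` and `--target` is easy to get wrong, so here is a minimal sketch of the table as a lookup. Only `langchain`, `claude`, and `markdown` appear as literal CLI values in this post; the other lowercase spellings are assumptions, and `output_flag` is a hypothetical helper.

```python
# Platform groups from the table above. Lowercase CLI spellings other than
# "langchain", "claude", and "markdown" are assumed, not confirmed.
FORMAT_PLATFORMS = {
    "langchain", "llamaindex", "chroma", "faiss",
    "haystack", "qdrant", "weaviate",
}
TARGET_PLATFORMS = {"claude", "gemini", "openai", "markdown"}
CLAUDE_BASED_TOOLS = {"cursor", "windsurf", "cline", "continue.dev"}


def output_flag(platform):
    """Return the flag and value to pass for a given platform name."""
    name = platform.lower()
    if name in FORMAT_PLATFORMS:
        return ["--format", name]
    if name in TARGET_PLATFORMS:
        return ["--target", name]
    if name in CLAUDE_BASED_TOOLS:
        # Per the table, AI coding assistants reuse the Claude target.
        return ["--target", "claude"]
    raise ValueError(f"unknown platform: {platform!r}")
```

Note the third branch: Cursor, Windsurf, Cline, and Continue.dev do not get their own target value in the table, so the helper falls back to `claude` for all of them.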
26 MCP Tools
Your AI agent can now prepare its own knowledge:
- Config tools (3): generate_config, list_configs, validate_config
- Scraping tools (8): estimate_pages, scrape_docs, scrape_github, scrape_pdf, scrape_codebase, detect_patterns, extract_test_examples, build_how_to_guides
- Packaging tools (4): package_skill, upload_skill, enhance_skill, install_skill
- Source tools (5): fetch_config, submit_config, add_config_source, remove_config_source, list_config_sources
- Splitting tools (2): split_config, generate_router
- Vector DB tools (4): export_to_weaviate, export_to_chroma, export_to_faiss, export_to_qdrant
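If you want to restrict an agent to a subset of these tools, the list above can be kept as a plain inventory. This is a sketch for illustration only: the category keys are made up, and `add_config_source` / `remove_config_source` are the assumed expansion of the shorthand used for the source tools.

```python
# Tool inventory transcribed from the list above. Category keys are
# illustrative; add_config_source and remove_config_source are an assumed
# expansion of the shorthand in the source-tools entry.
MCP_TOOLS = {
    "config": ["generate_config", "list_configs", "validate_config"],
    "scraping": [
        "estimate_pages", "scrape_docs", "scrape_github", "scrape_pdf",
        "scrape_codebase", "detect_patterns", "extract_test_examples",
        "build_how_to_guides",
    ],
    "packaging": ["package_skill", "upload_skill", "enhance_skill", "install_skill"],
    "source": [
        "fetch_config", "submit_config", "add_config_source",
        "remove_config_source", "list_config_sources",
    ],
    "splitting": ["split_config", "generate_router"],
    "vector_db": [
        "export_to_weaviate", "export_to_chroma",
        "export_to_faiss", "export_to_qdrant",
    ],
}


def category_of(tool_name):
    """Return the category a tool belongs to, or None if it is not listed."""
    for category, tools in MCP_TOOLS.items():
        if tool_name in tools:
            return category
    return None


TOTAL_TOOLS = sum(len(tools) for tools in MCP_TOOLS.values())
```

The per-category counts add up to the 26 tools advertised, which makes the inventory a quick sanity check when an allow-list is edited by hand.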
Cloud Storage
Upload skills directly to cloud storage:
# AWS S3
skill-seekers cloud upload output/react/ --provider s3 --bucket my-bucket
# Google Cloud Storage
skill-seekers cloud upload output/react/ --provider gcs --bucket my-bucket
# Azure Blob Storage
skill-seekers cloud upload output/react/ --provider azure --container my-container
CI/CD Ready
GitHub Action:
- uses: skill-seekers/action@v1
  with:
    config: configs/react.json
    format: langchain
Docker:
docker run -v "$(pwd)":/data skill-seekers:latest scrape --config /data/config.json
Godot Game Engine Support
Full Godot 4.x analysis with signal flow detection:
skill-seekers analyze --directory ./my-godot-game --comprehensive
Detects:
- Signal declarations and connections
- Event patterns (EventBus, Observer, Event Chains)
- GDScript test extraction (GUT, gdUnit4)
Extended Language Support
7 New Languages: Dart, Scala, SCSS, SASS, Elixir, Lua, Perl
Total: 27+ programming languages supported
Production Quality
- ✅ 1,852 tests across 100 test files
- ✅ 58,512 lines of Python code
- ✅ 80+ documentation files
- ✅ 12 example projects covering every integration
Quick Start
# Install
pip install skill-seekers
# Create a config
skill-seekers config --wizard
# Or use a preset
skill-seekers scrape --config configs/react.json
# Package for your platform
skill-seekers package output/react/ --target langchain
Migration from v2.x
v3.0.0 is fully backward compatible. All v2.x configs and commands work unchanged. New features are additive.
Ready to transform your data into AI knowledge?
pip install skill-seekers
The universal preprocessor for AI systems.