Tutorial: Extracting PDFs
Learn how to extract technical documentation from PDFs and create searchable AI skills.
Time: 10 minutes | Level: Beginner | Result: PDF-based skill
Basic PDF Extraction
skill-seekers pdf \
  --input /path/to/manual.pdf \
  --output output/manual/
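To process a whole folder of manuals, the basic command can be wrapped in a loop. The following is a minimal sketch, not part of skill-seekers: the helper names `output_dir_for` and `extract_all` are hypothetical, and only the `--input` and `--output` flags come from this tutorial.

```shell
#!/bin/sh
# Sketch: run the basic extraction once per PDF in a directory.
# output_dir_for and extract_all are hypothetical helpers for illustration.

# Map /path/to/manual.pdf to output/manual/
output_dir_for() {
  printf 'output/%s/\n' "$(basename "$1" .pdf)"
}

extract_all() {
  for pdf in "$1"/*.pdf; do
    # Skip the literal pattern when the directory holds no PDFs.
    [ -e "$pdf" ] || continue
    skill-seekers pdf \
      --input "$pdf" \
      --output "$(output_dir_for "$pdf")"
  done
}

# Usage: extract_all /path/to/pdfs
```

Each PDF gets its own output directory named after the file, mirroring the `output/manual/` layout used above.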
OCR for Scanned PDFs
# Install Tesseract first
# Ubuntu: sudo apt-get install tesseract-ocr
# macOS: brew install tesseract
skill-seekers pdf \
  --input /path/to/scanned.pdf \
  --output output/scanned/ \
  --ocr
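Since OCR depends on Tesseract being installed, a pre-flight check can fail early with a clear message. This is a sketch under the assumption that the tool locates the `tesseract` binary through `PATH`; the wrapper name `ocr_extract` is hypothetical and not part of skill-seekers.

```shell
#!/bin/sh
# Sketch: refuse to start an OCR run when tesseract is not on PATH.
# ocr_extract is a hypothetical wrapper; only --input, --output and --ocr
# are taken from this tutorial.
ocr_extract() {
  if ! command -v tesseract >/dev/null 2>&1; then
    echo "tesseract not found on PATH; install it before using --ocr" >&2
    return 1
  fi
  skill-seekers pdf \
    --input "$1" \
    --output "$2" \
    --ocr
}

# Usage: ocr_extract /path/to/scanned.pdf output/scanned/
```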
Password-Protected PDFs
skill-seekers pdf \
  --input /path/to/encrypted.pdf \
  --output output/encrypted/ \
  --password "your-password"
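Typing the password directly on the command line leaves it in shell history. One possible workaround is to read it from an environment variable instead. In this sketch, the variable name `PDF_PASSWORD` and the wrapper `extract_encrypted` are assumptions made for illustration; skill-seekers is not known to read that variable itself.

```shell
#!/bin/sh
# Sketch: pass the password from an environment variable.
# PDF_PASSWORD and extract_encrypted are hypothetical names; only the
# --password flag comes from this tutorial.
extract_encrypted() {
  if [ -z "${PDF_PASSWORD:-}" ]; then
    echo "Set PDF_PASSWORD before running" >&2
    return 1
  fi
  skill-seekers pdf \
    --input "$1" \
    --output "$2" \
    --password "$PDF_PASSWORD"
}

# Usage: PDF_PASSWORD='your-password' extract_encrypted /path/to/encrypted.pdf output/encrypted/
```

The password is still visible in the process argument list while the command runs, so this only keeps it out of scripts and saved command history.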
Extract Tables
skill-seekers pdf \
  --input /path/to/spec.pdf \
  --output output/spec/ \
  --extract-tables
Parallel Processing (up to 3x Faster)
skill-seekers pdf \
  --input /path/to/large.pdf \
  --output output/large/ \
  --parallel \
  --workers 8
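Rather than hard-coding the worker count, it can be derived from the number of available CPU cores. The sketch below assumes that matching workers to cores is a sensible default for this tool, which the tutorial does not confirm. The helper `cpu_count` and its fallback value of 4 are hypothetical.

```shell
#!/bin/sh
# Sketch: choose --workers from the number of online CPU cores.
# cpu_count is a hypothetical helper; the fallback of 4 is an arbitrary
# assumption used when the core count cannot be determined.
cpu_count() {
  n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=""
  case "$n" in
    ''|*[!0-9]*) n=4 ;;
  esac
  echo "$n"
}

# Usage:
# skill-seekers pdf \
#   --input /path/to/large.pdf \
#   --output output/large/ \
#   --parallel \
#   --workers "$(cpu_count)"
```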
See the PDF Scraping Manual for the complete guide.