personas/personas/_shared/community-skills/librarian/SKILL.md
salvacybersec 0183a1eb5f feat(community-skills): add opencode-cli + feynman-cli + migrate 12 skills
Two new tool-operation skills with deep references/ docs:
- opencode-cli: SKILL.md + 6 references covering rules, agents, models,
  commands, formatters, permissions, skills, MCP, plugins, custom tools,
  LSP, themes, keybinds, server API, SDK, GitHub Actions, IDE, network,
  troubleshooting (full opencode.ai/docs surface)
- feynman-cli: SKILL.md + 6 references covering install, setup, config,
  CLI, REPL slash commands, agents/tools/packages, and full pi-subagents
  custom-agent spec (verified against the working install)

Migrate 12 skills from ~/.claude/skills into _shared/community-skills/:
- clean copy: intel-briefing, vercel-react-best-practices, ui-ux-pro-max
- core-only: notebooklm (data/images stripped — 184M to 224K)
- light sanitize: anythingllm-manager (gitea URL), foia-tool (DB password),
  jira (atlassian instance + email), librarian (paths), obsidian-tasks
  (vault path + email-in-cred-path)
- branding sanitize: marketing-strategist + pentest-reporter (Proudsec
  variants normalized to <COMPANY>)
- secrets sanitize: waha-whatsapp (IP, API key, vault path placeholders)

Skipped per user: proudguard-api (kept locally only).

build.py:
- DEFAULT_SKILL_PERSONA_MAP: 14 new entries
- NAME_PATTERNS: opencode + jira to coding-tools; notebooklm + feynman-
  to ai-llm-dev; waha- to osint-intel

Community-skills index: 703 -> 716 (+13).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 12:19:53 +03:00


name: librarian
description: Use when organizing, categorizing, renaming, or moving PDF/epub books in a library. Triggers on book classification, library reorganization, fixing garbled filenames, sorting documents into topic folders, Komga library maintenance, or any "organize these books/PDFs" request.

Librarian - PDF Library Organizer

Overview

Systematically classify, rename, and organize PDF/epub files into a topic-based folder hierarchy by reading actual content (not just filenames). Every file lands in the right place with a clean name. Uses Claude Code tools (Read, Bash, Glob, Grep, Agent) natively.

Core Principles

  1. READ BEFORE CLASSIFY — Read minimum 20 pages of each PDF via Read tool with pages: "1-20". Filenames lie (OCR artifacts, download garbage, wrong metadata). Only the actual content is truth.
  2. NEVER DELETE — Move unwanted files to Arsiv/ subfolder. Deletion decisions belong to the user only.
  3. NO SPACES IN FOLDER NAMES — Use CamelCase: AskeriTarih/StratejiVeSavasSanati/
  4. CLEAN FILENAMES — Format: Author - Title (Year).pdf. Fix OCR garbage, encoding issues, truncated names.
  5. FLAT TOPICS — Each major topic is a top-level folder under library root. Subtopics one level below. Max 2 levels deep.
  6. ASK BEFORE DESTRUCTIVE OPS — Never rm, never force-overwrite. Always confirm with user.
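
Principles 3 and 5 can be checked mechanically before any move. The following is a minimal sketch, not part of the skill itself: `to_folder_name` and `is_valid_target` are hypothetical helper names, and the ASCII folding of Turkish letters is an assumption inferred from the example folder names (for instance `StratejiVeSavasSanati`).

```python
import re
from pathlib import PurePosixPath

# Assumption: folder names are ASCII-folded, as in the examples in this document.
_TR_TO_ASCII = str.maketrans("çğıöşüÇĞİÖŞÜ", "cgiosuCGIOSU")


def to_folder_name(label: str) -> str:
    """Fold Turkish letters to ASCII and join the words in CamelCase."""
    words = re.findall(r"[A-Za-z0-9]+", label.translate(_TR_TO_ASCII))
    # Upper-case only the first letter so acronyms such as "CIA" survive.
    return "".join(word[:1].upper() + word[1:] for word in words)


def is_valid_target(relative_dir: str) -> bool:
    """True if the folder path has no spaces and is at most 2 levels deep."""
    parts = PurePosixPath(relative_dir).parts
    return 0 < len(parts) <= 2 and not any(" " in part for part in parts)
```

A planned target such as `AskeriTarih/StratejiVeSavasSanati/` passes the check, while a three-level path or a name containing a space is rejected before any `mv` runs.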

Tool Usage Map

| Task | Tool | How |
|---|---|---|
| Find PDFs | Glob | `pattern: "**/*.pdf"` |
| Count files per folder | Bash | `find DIR -name "*.pdf" \| wc -l` |
| Read PDF content | Read | `file_path: "/path/to.pdf", pages: "1-20"` |
| Read PDF metadata | Bash | `pdfinfo "/path/to.pdf"` (title, author, pages, creator) |
| Search text in PDFs | Grep | `pattern: "keyword", path: "/library/"` |
| Move/rename files | Bash | `mkdir -p TARGET && mv "SRC" "TARGET/NewName.pdf"` |
| Parallel processing | Agent | Dispatch subagents for 30-file batches |
| Check duplicates | Bash | `ls -la FILE1 FILE2` (compare sizes) |

Workflow

digraph librarian {
  rankdir=TB;
  "Survey library" -> "Plan folder structure";
  "Plan folder structure" -> "User approves?";
  "User approves?" -> "Batch files by folder" [label="yes"];
  "User approves?" -> "Revise plan" [label="no"];
  "Revise plan" -> "User approves?";
  "Batch files by folder" -> "Dispatch parallel agents";
  "Dispatch parallel agents" -> "Each agent: read 20pg, classify, rename, move";
  "Each agent: read 20pg, classify, rename, move" -> "Collect reports";
  "Collect reports" -> "Verify & cleanup empty dirs";
}

Phase 1: Survey

# Use Glob to find all PDFs
Glob(pattern="**/*.pdf", path="/library/root/")

# Use Bash to count per folder
Bash("find /library -type d -exec sh -c 'echo \"$(find \"$1\" -maxdepth 1 -name \"*.pdf\" | wc -l) $1\"' _ {} \\;")
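
Where a script is more convenient than the shell one-liner, the same survey could be sketched in Python. This is an illustration only: `count_books_per_folder` is a hypothetical helper, and counting `.epub` alongside `.pdf` is an assumption based on the skill's stated scope.

```python
from collections import Counter
from pathlib import Path

BOOK_SUFFIXES = {".pdf", ".epub"}


def count_books_per_folder(root: str) -> Counter:
    """Map each folder (relative to root) to the number of books directly in it."""
    root_path = Path(root)
    counts = Counter()
    for path in root_path.rglob("*"):
        # Compare suffixes case-insensitively so "X.PDF" is not missed.
        if path.is_file() and path.suffix.lower() in BOOK_SUFFIXES:
            counts[str(path.parent.relative_to(root_path))] += 1
    return counts
```

Files sitting directly in the library root are reported under the key `"."`.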

Phase 2: Plan Structure

Present proposed folder tree to user. Get approval before any moves. Example:

Books/
├── AskeriTarih/          (military history)
│   ├── StratejiVeSavasSanati/
│   └── GenelAskeriTarih/
├── Istihbarat/           (intelligence)
│   ├── TeoriVeAnaliz/
│   ├── RusIstihbarati/
│   └── CIA/
├── SiberGuvenlik/        (cybersecurity)
└── Arsiv/                (user-decides-later)

Phase 3: Dispatch Parallel Agents

REQUIRED: Use superpowers:dispatching-parallel-agents pattern.

Split work into independent batches of ~30 files. Each agent works on one folder or file batch.
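
The batching itself is simple slicing. A minimal sketch, assuming the file list is already collected; `make_batches` is a hypothetical helper name:

```python
from typing import Iterator, List, Sequence


def make_batches(files: Sequence[str], batch_size: int = 30) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size files, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(files), batch_size):
        yield list(files[start:start + batch_size])
```

Each yielded list can be pasted into one agent prompt as its file list; the last batch is simply shorter when the total is not a multiple of 30.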

Agent prompt template:

You are a LIBRARIAN. Working directory: {LIBRARY_ROOT}

RULES (non-negotiable):
- Read FIRST 20 PAGES of each PDF: Read tool, file_path, pages "1-20"
- Also run: Bash("pdfinfo 'FILE'") for metadata
- Rename to "Author - Title (Year).pdf"
- NEVER delete any file. Unwanted → {LIBRARY_ROOT}/Arsiv/
- No spaces in folder names
- mkdir -p before every mv

Process these files from {SOURCE_DIR}:
{FILE_LIST}

Target folder mapping:
{FOLDER_MAP_TABLE}

For EACH file:
1. Bash: pdfinfo "FILE" → extract Title, Author, Pages, CreationDate
2. Read: file_path="FILE", pages="1-20" → verify author, title, topic
3. If ambiguous after 20 pages → read pages "21-40"
4. Classify → determine target folder
5. Bash: mkdir -p "TARGET" && mv "OLD" "TARGET/Author - Title (Year).pdf"

Report format per file:
[TYPE] old_name.pdf → Author - Title (Year).pdf → TargetFolder/
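
Because every agent reports in the same one-line format, the collected reports can be parsed back into structured records for verification. A sketch under that assumption; `ReportLine` and `parse_report_line` are hypothetical names, and lines that do not match the format are returned as `None` rather than guessed at.

```python
import re
from typing import NamedTuple, Optional


class ReportLine(NamedTuple):
    doc_type: str
    old_name: str
    new_name: str
    target_folder: str


# "[TYPE] old → new → Target/" with the arrows used in the report format.
_REPORT_RE = re.compile(
    r"^\[(?P<type>[^\]]+)\]\s+(?P<old>.+?)\s+→\s+(?P<new>.+?)\s+→\s+(?P<target>.+)$"
)


def parse_report_line(line: str) -> Optional[ReportLine]:
    """Parse one agent report line, or return None if it is malformed."""
    match = _REPORT_RE.match(line.strip())
    if match is None:
        return None
    return ReportLine(match["type"], match["old"], match["new"], match["target"])
```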

Launching agents:

# Send ALL agent calls in a SINGLE message for parallel execution
Agent(
  name="lib-batch1",
  prompt="...(files 1-30)...",
  mode="bypassPermissions",
  run_in_background=True
)
Agent(
  name="lib-batch2",
  prompt="...(files 31-60)...",
  mode="bypassPermissions",
  run_in_background=True
)
# ... up to 10 parallel agents

Phase 4: Read & Classify Each PDF

Step 1 - Metadata check (fast):

pdfinfo "file.pdf"
# Gives: Title, Author, Creator, Producer, CreationDate, Pages
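
If the metadata needs to be consumed programmatically, the `Key: value` lines that `pdfinfo` prints can be turned into a dictionary. This is a sketch, assuming `pdfinfo` (from poppler-utils) is on the PATH; `parse_pdfinfo` and `read_pdf_metadata` are hypothetical helper names.

```python
import subprocess
from typing import Dict


def parse_pdfinfo(output: str) -> Dict[str, str]:
    """Turn pdfinfo's 'Key:   value' lines into a dict."""
    info = {}
    for line in output.splitlines():
        # Split on the first colon only, since dates contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def read_pdf_metadata(pdf_path: str) -> Dict[str, str]:
    """Run pdfinfo on one file and return its parsed metadata."""
    result = subprocess.run(
        ["pdfinfo", pdf_path], capture_output=True, text=True, check=True
    )
    return parse_pdfinfo(result.stdout)
```

Embedded metadata is only a hint: a missing or implausible `Title` or `Author` is exactly the case where the content read in Step 2 decides.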

Step 2 - Content read (thorough):

Read(file_path="/path/to/file.pdf", pages="1-20")

Extract from content:

  • Real author (not editor, not advisor, not publisher, not translator)
  • Real title (from title page, not OCR garbage in filename)
  • Publication year (from copyright page, not filename date)
  • Type: Kitap (book) / Makale (article) / Rapor (report) / Tez (thesis) / El Kitabı (handbook) / Dergi (journal) / Teknik Doküman (technical document)
  • Topic: Specific subject for classification
  • Language: TR / EN / DE / RU / etc.

Step 3 - Ambiguity resolution:

  • If 20 pages are insufficient → Read(file_path, pages="21-40")
  • If still unclear → Grep(pattern="keywords", path="file.pdf")

Phase 5: Rename

Format: Author - Title (Year).pdf

Common OCR/download fixes:

| Problem | Example | Fix |
|---|---|---|
| OCR junk prefix | Dağıtım GAMEDA - Tanzimat | Read title page → real author |
| Download artifact | Book (PDFDrive).pdf | Remove (PDFDrive) |
| Hash/ID in name | [#131337]-112799.pdf | Remove, use real title |
| Underscores | Yusuf_Hikmet_Bayur | Replace with spaces |
| Wrong year | (1882) = birth year | Check copyright page |
| Truncated title | Sertan Kolat - Web Tabanli Uygulamalarda Otomatik Guvenlik Denetim Yazilimlarinin Iyilestirilmes | Complete from content |
| Advisor as author | Güngör ŞAHİN - MİLLÎ SAVUNMA ÜNİVERSİTESİ (1920) | Read → find student name |
| Typos | Terraki, Asleri | Fix: Terakki, Askeri |
| Encoding issues | Te__kilat, LLÎGÜVENLİ | Read → reconstruct |
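
The purely mechanical fixes (download tags, hash/ID prefixes, underscores) can be applied in code once the real author, title, and year are known from the content. A minimal sketch with hypothetical helper names; it cannot repair typos, truncation, or encoding damage, which still require reading the document. Omitting the year when it is unknown is an assumption, not a rule stated by the skill.

```python
import re
from typing import Optional

_ARTIFACTS = [
    re.compile(r"\s*\(PDFDrive\)", re.IGNORECASE),  # download-site tag
    re.compile(r"\[#\d+\]-?\d*"),                   # IDs like [#131337]-112799
]


def tidy_text(raw: str) -> str:
    """Replace underscores, drop known artifacts, and collapse whitespace."""
    text = raw.replace("_", " ")
    for pattern in _ARTIFACTS:
        text = pattern.sub("", text)
    return re.sub(r"\s+", " ", text).strip(" -")


def build_filename(author: str, title: str, year: Optional[int],
                   suffix: str = ".pdf") -> str:
    """Assemble 'Author - Title (Year).pdf' from cleaned parts."""
    base = f"{tidy_text(author)} - {tidy_text(title)}"
    if year is not None:
        base += f" ({year})"
    return base + suffix
```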

Phase 6: Move

# Always mkdir -p first
mkdir -p "<LIBRARY_ROOT>/TargetTopic/SubTopic/"
mv "/old/path/garbled_name.pdf" "<LIBRARY_ROOT>/TargetTopic/SubTopic/Author - Title (Year).pdf"
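
A plain `mv` silently replaces an existing file at the destination, which conflicts with the rule against force-overwriting. One way to guard against that is sketched below; `safe_move` is a hypothetical helper, and the existence check is not atomic, so it is a safeguard for a single-writer workflow rather than a guarantee under concurrent agents.

```python
import shutil
from pathlib import Path


def safe_move(source: str, target_dir: str, new_name: str) -> Path:
    """Move source into target_dir as new_name; create the folder, never overwrite."""
    destination = Path(target_dir) / new_name
    destination.parent.mkdir(parents=True, exist_ok=True)  # same effect as mkdir -p
    if destination.exists():
        # Leave both files untouched and let the caller ask the user.
        raise FileExistsError(f"refusing to overwrite {destination}")
    shutil.move(str(source), str(destination))
    return destination
```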

Phase 7: Report & Cleanup

# Remove empty old directories (safe - only removes if empty)
find /old/library/path -type d -empty -delete

# Final count
find "<LIBRARY_ROOT>" -name "*.pdf" | wc -l
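
For environments where `find ... -empty -delete` is unavailable, the cleanup can be approximated in Python. This is a sketch with a hypothetical helper name. Unlike the `find` command above, it deliberately keeps the starting directory even if it ends up empty, and it returns the removed paths so they can be included in the report.

```python
import os
from typing import List


def remove_empty_dirs(root: str) -> List[str]:
    """Remove empty subdirectories of root, deepest first; return what was removed."""
    removed = []
    for dirpath, _dirnames, _filenames in os.walk(root, topdown=False):
        if dirpath == root:
            continue  # keep the starting directory itself
        # Re-list the directory: children may have been removed already.
        if not os.listdir(dirpath):
            os.rmdir(dirpath)  # rmdir fails on non-empty dirs, so files are safe
            removed.append(dirpath)
    return removed
```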

Duplicate Handling

# Compare sizes
ls -la "file1.pdf" "file2.pdf"
# Keep LARGER (better scan quality)
# Move smaller to Arsiv/Duplikatlar/ (NEVER delete)
mkdir -p Arsiv/Duplikatlar
mv "smaller_file.pdf" Arsiv/Duplikatlar/
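
Size comparison alone does not show whether two files are the same document. As a complement, a checksum can confirm byte-identical copies before one is archived. A sketch with hypothetical helper names; it keeps the larger file as described above and only reports whether the pair is identical, leaving the decision to the user.

```python
import hashlib
from pathlib import Path
from typing import Tuple


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large scans do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def pick_keeper(first: Path, second: Path) -> Tuple[Path, Path, bool]:
    """Return (keep, archive, identical): keep the larger file, archive the other."""
    first_size, second_size = first.stat().st_size, second.stat().st_size
    identical = first_size == second_size and sha256_of(first) == sha256_of(second)
    if second_size > first_size:
        return second, first, identical
    return first, second, identical
```

The file returned as `archive` is the one to move to `Arsiv/Duplikatlar/`; nothing is deleted.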

Library Root

/mnt/storage/Common/Books/ — Komga-compatible hierarchy. Max 2 levels deep.

Classification Reference Table

| Content | Top Folder | Subtopic Examples |
|---|---|---|
| Military strategy, theory of war | AskeriTarih/ | StratejiVeSavasSanati/, GenelAskeriTarih/, DenizHarpTarihi/ |
| Field manuals, doctrine | AskeriDoktrin/ | ABD-FieldManual/, FOIA-Pentagon/, ElKitaplari/ |
| NATO documents | NATO/ | FOIA-NATO/, Teknik/, Tatbikat/, Idari/ |
| Intelligence (general) | Istihbarat/ | TeoriVeAnaliz/, CIA/, RusIstihbarati/, TurkIstihbarati/ |
| FOIA CIA general (reading room) | Istihbarat/CIA/ | (files directly, no subfolder) |
| FOIA CIA Turkey | Istihbarat/FOIA-CIA-Turkey/ | (files directly) |
| FOIA CIA Middle East | Istihbarat/FOIA-CIA-OrtaDogu/ | (files directly) |
| FOIA CIA Pakistan/Afghanistan | Istihbarat/FOIA-CIA-Pakistan/ | (files directly) |
| FOIA CIA China | Istihbarat/FOIA-CIA-Cin/ | (files directly) |
| FOIA CIA Africa | Istihbarat/FOIA-CIA-Afrika/ | (files directly) |
| FOIA CIA Israel/Palestine | Istihbarat/FOIA-CIA-Israil-Filistin/ | (files directly) |
| FOIA CIA India | Istihbarat/FOIA-CIA-Hindistan/ | (files directly) |
| FOIA FBI Vault | Istihbarat/FOIA-FBI-Vault/ | (files directly) |
| FOIA Russia/KGB/FSB | Istihbarat/RusIstihbarati/ | (files directly) |
| FOIA counterterrorism | GuvenlikStratejileri/FOIA-CIA-Teror/ | (files directly) |
| FOIA cyber warfare | SiberGuvenlik/FOIA-SiberSavas/ | (files directly) |
| Ottoman history | OsmanliTarihi/ | IttihatVeTerakki/, JonTurkHareketi/, AbdulhamidDonemi/ |
| History of the Turkish Republic | CumhuriyetTarihi/ | MilliMucadele/, SiyasiDusunce/, Donemler/ |
| World history | DunyaTarihi/ | AntikTarih/, AvrupaTarihi/, DersMateryalleri/ |
| Cybersecurity | SiberGuvenlik/ | AgGuvenligi/, WebGuvenligi/, SaldiriTeknikleri/ |
| Novels/fiction | FelsefeVeEdebiyat/ | Edebiyat/ |
| Journal issues | GuvenlikStratejileri/ | Dergi/ |
| Personal/administrative documents | Arsiv/ | Duplikatlar/ |
| Unclassifiable | Arsiv/ | Siniflandirilmamis/ |
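
When the mapping is handed to agents as data, a lookup with an explicit fallback keeps unclassifiable files out of the topic folders. The following is a sketch only: the topic keys (`"military-strategy"` and so on) are hypothetical labels invented for illustration, while the folder pairs are taken from the table above.

```python
from typing import Dict, Tuple

# Anything without a confident match goes to the archive, never to a guess.
FALLBACK: Tuple[str, str] = ("Arsiv", "Siniflandirilmamis")

# Hypothetical topic keys mapped to (top folder, subtopic) pairs from the table.
FOLDER_MAP: Dict[str, Tuple[str, str]] = {
    "military-strategy": ("AskeriTarih", "StratejiVeSavasSanati"),
    "intelligence-theory": ("Istihbarat", "TeoriVeAnaliz"),
    "web-security": ("SiberGuvenlik", "WebGuvenligi"),
    "fiction": ("FelsefeVeEdebiyat", "Edebiyat"),
}


def target_folder(topic_key: str, library_root: str = "<LIBRARY_ROOT>") -> str:
    """Resolve a topic key to its target folder under the library root."""
    top, sub = FOLDER_MAP.get(topic_key, FALLBACK)
    return f"{library_root.rstrip('/')}/{top}/{sub}/"
```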

Common Mistakes

| Mistake | Fix |
|---|---|
| Classifying by filename alone | ALWAYS run pdfinfo AND read 20+ pages first |
| Deleting "junk" files | Move to Arsiv/, let user decide |
| Spaces in folder names | CamelCase or hyphens only |
| Using advisor name as author (theses) | Read title page for student name |
| Using topic date as publication year | Check copyright/publishing page |
| Moving without mkdir -p | Always create target dir first |
| Too many files per agent | Max 30 files per subagent |
| Single agent for 100+ files | MUST use parallel agents |
| Not checking pdfinfo first | pdfinfo gives fast metadata; Read gives thorough content |
| Forgetting epub files | Glob for both *.pdf and *.epub |