docs(10-osint-code-hosting): create phase 10 plans (9 plans across 3 waves)

salvacybersec
2026-04-06 01:07:15 +03:00
parent cfe090a5c9
commit 191bdee3bc
10 changed files with 1611 additions and 1 deletions


@@ -0,0 +1,108 @@
---
phase: 10-osint-code-hosting
plan: 06
type: execute
wave: 2
depends_on: [10-01]
files_modified:
- pkg/recon/sources/huggingface.go
- pkg/recon/sources/huggingface_test.go
autonomous: true
requirements: [RECON-CODE-08]
must_haves:
truths:
- "HuggingFaceSource queries /api/spaces and /api/models search endpoints"
- "Token is optional — anonymous requests allowed at lower rate limit"
- "Findings have SourceType=\"recon:huggingface\" and Source = full HF URL"
artifacts:
- path: "pkg/recon/sources/huggingface.go"
provides: "HuggingFaceSource implementing recon.ReconSource"
key_links:
- from: "pkg/recon/sources/huggingface.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do"
pattern: "client\\.Do"
---
<objective>
Implement HuggingFaceSource scanning both Spaces and model repos via the HF Hub API.
The token is optional; unauthenticated requests work but are subject to a stricter rate limit.
Purpose: RECON-CODE-08.
Output: pkg/recon/sources/huggingface.go + tests.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
<interfaces>
HuggingFace Hub API:
GET https://huggingface.co/api/spaces?search=<q>&limit=50
GET https://huggingface.co/api/models?search=<q>&limit=50
Response (either): array of { "id": "owner/name", "modelId"|"spaceId": "owner/name" }
Optional auth: Authorization: Bearer <hf-token>
URL derivation: Source = "https://huggingface.co/spaces/<id>" for Spaces, "https://huggingface.co/<id>" for models.
Rate: authenticated, 1000 requests/hour → rate.Every(3600*time.Millisecond); unauthenticated → rate.Every(10*time.Second). Burst 1 in both cases.
</interfaces>
</context>
<tasks>
<task type="auto" tdd="true">
<name>Task 1: HuggingFaceSource + tests</name>
<files>pkg/recon/sources/huggingface.go, pkg/recon/sources/huggingface_test.go</files>
<behavior>
- Test A: Enabled always true (token optional)
- Test B: Sweep hits both /api/spaces and /api/models endpoints for each query
- Test C: Decodes array of {id} and emits Findings with Source prefixed by "https://huggingface.co/spaces/" or "https://huggingface.co/" for models, SourceType="recon:huggingface"
- Test D: Authorization header present when token set, absent when empty
- Test E: Ctx cancellation respected
- Test F: RateLimit returns slower rate when token empty
</behavior>
<action>
Create `pkg/recon/sources/huggingface.go`:
- Struct `HuggingFaceSource { Token, BaseURL string; Registry *providers.Registry; Limiters *recon.LimiterRegistry; client *Client }`
- Default BaseURL: `https://huggingface.co`
- Name "huggingface", RespectsRobots false, Burst 1
- RateLimit: token-dependent (see interfaces)
- Enabled always true
- Sweep: build keyword list, for each keyword iterate two endpoints
(`/api/spaces?search=<q>&limit=50`, `/api/models?search=<q>&limit=50`), emit
Findings. URL prefix differs per endpoint.
- Compile-time assert
Create `pkg/recon/sources/huggingface_test.go` with httptest server that routes
both paths. Assert the exact number of Findings (2 per keyword × number of keywords, with each stubbed endpoint returning one result)
and URL prefixes.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestHuggingFace -v -timeout 30s</automated>
</verify>
<done>
HuggingFaceSource passes tests covering both endpoints, both token modes, and context cancellation.
</done>
</task>
</tasks>
<verification>
- `go test ./pkg/recon/sources/ -run TestHuggingFace -v`
</verification>
<success_criteria>
RECON-CODE-08 satisfied.
</success_criteria>
<output>
After completion, create `.planning/phases/10-osint-code-hosting/10-06-SUMMARY.md`.
</output>