keyhunter/.planning/phases/10-osint-code-hosting/10-06-PLAN.md


---
phase: 10-osint-code-hosting
plan: 06
type: execute
wave: 2
depends_on:
  - 10-01
files_modified:
  - pkg/recon/sources/huggingface.go
  - pkg/recon/sources/huggingface_test.go
autonomous: true
requirements:
  - RECON-CODE-08
must_haves:
  truths:
    - "HuggingFaceSource queries /api/spaces and /api/models search endpoints"
    - "Token is optional; anonymous requests are allowed at a lower rate limit"
    - 'Findings have SourceType="recon:huggingface" and Source = full HF URL'
  artifacts:
    - path: pkg/recon/sources/huggingface.go
      provides: HuggingFaceSource implementing recon.ReconSource
  key_links:
    - from: pkg/recon/sources/huggingface.go
      to: pkg/recon/sources/httpclient.go
      via: Client.Do
      pattern: client.Do
---
Implement HuggingFaceSource, which scans both Spaces and model repos via the HF Hub API. The token is optional: unauthenticated requests work but are subject to a stricter rate limit.

Purpose: RECON-CODE-08. Output: pkg/recon/sources/huggingface.go + tests.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go

HuggingFace Hub API:
- GET https://huggingface.co/api/spaces?search=&limit=50
- GET https://huggingface.co/api/models?search=&limit=50

Response (either): array of { "id": "owner/name", "modelId"|"spaceId": "owner/name" }

Optional auth: Authorization: Bearer

URL derivation: Source = "https://huggingface.co/spaces/" or ".../" for models.
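The response shape and URL derivation above can be sketched in a few lines of Go. This is a minimal illustration, not the planned implementation: `hubEntry`, `decodeEntries`, `sourceURL`, and the `kind` discriminator are hypothetical names, and appending the entry's `id` after the prefix is an assumption based on the "owner/name" shape of the response.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hubEntry keeps only the "id" field ("owner/name") of a search result.
// The name is hypothetical; the plan does not prescribe a type.
type hubEntry struct {
	ID string `json:"id"`
}

// decodeEntries parses a search response body, which is a JSON array.
func decodeEntries(body []byte) ([]hubEntry, error) {
	var entries []hubEntry
	if err := json.Unmarshal(body, &entries); err != nil {
		return nil, fmt.Errorf("decode hub response: %w", err)
	}
	return entries, nil
}

// sourceURL derives the Finding's Source from an entry id.
// Assumption: kind is "spaces" or "models", and the id is appended
// directly after the per-kind prefix.
func sourceURL(baseURL, kind, id string) string {
	base := strings.TrimRight(baseURL, "/")
	if kind == "spaces" {
		return base + "/spaces/" + id
	}
	return base + "/" + id
}

func main() {
	body := []byte(`[{"id":"acme/demo","modelId":"acme/demo"}]`)
	entries, err := decodeEntries(body)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Println(sourceURL("https://huggingface.co", "spaces", e.ID))
		fmt.Println(sourceURL("https://huggingface.co", "models", e.ID))
	}
}
```

Decoding only `id` sidesteps the `modelId`/`spaceId` difference between the two endpoints, so one struct can serve both.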

Rate: 1000/hour authenticated → `rate.Every(3600*time.Millisecond)`; unauthenticated: `rate.Every(10*time.Second)`, burst 1.

Task 1: HuggingFaceSource + tests

Files: pkg/recon/sources/huggingface.go, pkg/recon/sources/huggingface_test.go

Behavior:
- Test A: Enabled always returns true (token optional)
- Test B: Sweep hits both the /api/spaces and /api/models endpoints for each query
- Test C: Decodes an array of {id} and emits Findings with Source prefixed by "https://huggingface.co/spaces/" for Spaces or "https://huggingface.co/" for models, and SourceType="recon:huggingface"
- Test D: Authorization header is present when the token is set and absent when it is empty
- Test E: Ctx cancellation is respected
- Test F: RateLimit returns the slower rate when the token is empty

Create `pkg/recon/sources/huggingface.go`:
- Struct `HuggingFaceSource { Token, BaseURL string; Registry *providers.Registry; Limiters *recon.LimiterRegistry; client *Client }`
- Default BaseURL: `https://huggingface.co`
- Name "huggingface", RespectsRobots false, Burst 1
- RateLimit: token-dependent (see interfaces)
- Enabled always true
- Sweep: build the keyword list; for each keyword, iterate the two endpoints (`/api/spaces?search=&limit=50`, `/api/models?search=&limit=50`) and emit Findings. The URL prefix differs per endpoint.
- Compile-time assert that HuggingFaceSource implements recon.ReconSource

Create `pkg/recon/sources/huggingface_test.go` with an httptest server that routes both paths. Assert the exact number of Findings (2 per keyword × number of keywords, which holds when the stub returns one entry per endpoint) and the URL prefixes.
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestHuggingFace -v -timeout 30s`

Done: HuggingFaceSource passes tests covering both endpoints, token modes, and cancellation.

Verification:
- `go test ./pkg/recon/sources/ -run TestHuggingFace -v`

<success_criteria> RECON-CODE-08 satisfied. </success_criteria>

After completion, create `.planning/phases/10-osint-code-hosting/10-06-SUMMARY.md`.