---
phase: 10-osint-code-hosting
plan: 06
type: execute
wave: 2
depends_on: [10-01]
files_modified:
  - pkg/recon/sources/huggingface.go
  - pkg/recon/sources/huggingface_test.go
autonomous: true
requirements: [RECON-CODE-08]
must_haves:
  truths:
    - "HuggingFaceSource queries /api/spaces and /api/models search endpoints"
    - "Token is optional; anonymous requests allowed at lower rate limit"
    - "Findings have SourceType=\"recon:huggingface\" and Source = full HF URL"
  artifacts:
    - path: "pkg/recon/sources/huggingface.go"
      provides: "HuggingFaceSource implementing recon.ReconSource"
  key_links:
    - from: "pkg/recon/sources/huggingface.go"
      to: "pkg/recon/sources/httpclient.go"
      via: "Client.Do"
      pattern: "client\\.Do"
---

<objective>
Implement HuggingFaceSource, which scans both Spaces and model repos via the HF Hub API.
The token is optional; unauthenticated requests work but are subject to a stricter rate limit.

Purpose: RECON-CODE-08.
Output: pkg/recon/sources/huggingface.go + tests.
</objective>

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go

<interfaces>
HuggingFace Hub API:
GET https://huggingface.co/api/spaces?search=<q>&limit=50
GET https://huggingface.co/api/models?search=<q>&limit=50
Response (either): array of { "id": "owner/name", "modelId"|"spaceId": "owner/name" }
Optional auth: Authorization: Bearer <hf-token>

URL derivation: Source = "https://huggingface.co/spaces/<id>" for Spaces, or "https://huggingface.co/<id>" for models.
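A minimal sketch of the decode and URL-derivation step described above, assuming only the `id` field of each array element is needed. `hubItem`, `sourceURLs`, and the two prefix constants are hypothetical names for illustration; the real source would emit recon Findings rather than plain strings.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// URL prefixes from the plan: Spaces live under /spaces/, models at the root.
const (
	spacesPrefix = "https://huggingface.co/spaces/"
	modelsPrefix = "https://huggingface.co/"
)

// hubItem models the one field this sketch relies on: the "owner/name" id.
// Other fields in the response (modelId, spaceId, ...) are ignored.
type hubItem struct {
	ID string `json:"id"`
}

// sourceURLs decodes a Hub search response body and returns one URL per item.
// Hypothetical helper: name and signature are not taken from the plan.
func sourceURLs(body []byte, prefix string) ([]string, error) {
	var items []hubItem
	if err := json.Unmarshal(body, &items); err != nil {
		return nil, fmt.Errorf("decode hub response: %w", err)
	}
	urls := make([]string, 0, len(items))
	for _, it := range items {
		if it.ID == "" {
			continue // skip malformed entries rather than emit a bare prefix
		}
		urls = append(urls, prefix+it.ID)
	}
	return urls, nil
}
```

Calling `sourceURLs(body, spacesPrefix)` on a `/api/spaces` response and `sourceURLs(body, modelsPrefix)` on a `/api/models` response yields the two Source forms listed above.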

Rate: 1000 requests/hour authenticated → rate.Every(3600*time.Millisecond), i.e. one request every 3.6 s; unauthenticated: rate.Every(10*time.Second), burst 1.
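The token-dependent rate can be sketched as below. This is an illustration under assumptions: `requestInterval` and `requestsPerHour` are hypothetical helpers, and the sketch stops at the `time.Duration` so it needs only the standard library. The real source would presumably pass the interval to `rate.Every` from `golang.org/x/time/rate`.

```go
package main

import "time"

// Intervals taken from the plan: 1000 requests/hour with a token
// (3600 s / 1000 = 3.6 s between requests) and one request per 10 s without.
const (
	authInterval   = 3600 * time.Millisecond
	unauthInterval = 10 * time.Second
)

// requestInterval returns the minimum spacing between requests for the
// given token. An empty token selects the slower anonymous interval.
func requestInterval(token string) time.Duration {
	if token == "" {
		return unauthInterval
	}
	return authInterval
}

// requestsPerHour converts an interval into an hourly budget, which is
// handy for checking the arithmetic behind the chosen durations.
func requestsPerHour(interval time.Duration) int {
	return int(time.Hour / interval)
}
```

With these values the authenticated budget works out to 1000 requests/hour and the anonymous one to 360 requests/hour, which is the ordering Test F checks.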
</interfaces>
</context>

<tasks>

<task type="auto" tdd="true">
<name>Task 1: HuggingFaceSource + tests</name>
<files>pkg/recon/sources/huggingface.go, pkg/recon/sources/huggingface_test.go</files>
<behavior>
- Test A: Enabled always returns true (token is optional)
- Test B: Sweep hits both the /api/spaces and /api/models endpoints for each query
- Test C: Decodes an array of {id} and emits Findings with SourceType="recon:huggingface" and Source prefixed by "https://huggingface.co/spaces/" for Spaces or "https://huggingface.co/" for models
- Test D: Authorization header is present when a token is set and absent when it is empty
- Test E: Context cancellation is respected
- Test F: RateLimit returns a slower rate when the token is empty
</behavior>
<action>
Create `pkg/recon/sources/huggingface.go`:
- Struct `HuggingFaceSource { Token, BaseURL string; Registry *providers.Registry; Limiters *recon.LimiterRegistry; client *Client }`
- Default BaseURL: `https://huggingface.co`
- Name "huggingface", RespectsRobots false, Burst 1
- RateLimit: token-dependent (see interfaces)
- Enabled: always true
- Sweep: build the keyword list; for each keyword, query both endpoints
  (`/api/spaces?search=<q>&limit=50`, `/api/models?search=<q>&limit=50`) and emit
  Findings. The URL prefix differs per endpoint.
- Compile-time assertion that HuggingFaceSource implements recon.ReconSource

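The Sweep step above can be sketched roughly as follows. This is an illustration, not the planned implementation: `finding` stands in for the repo's Finding type, `endpoint`, `searchURL`, and `sweep` are hypothetical names, a plain `*http.Client` replaces the shared `Client` from httpclient.go, and limiter waits are omitted. Returning on the first error is also an assumption; the real source may prefer to skip a failing endpoint.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/url"
)

// finding is a stand-in with only the two fields the plan names.
type finding struct {
	Source     string
	SourceType string
}

// endpoint pairs a search path with the URL prefix used for its results.
type endpoint struct {
	path   string
	prefix string
}

var endpoints = []endpoint{
	{path: "/api/spaces", prefix: "/spaces/"},
	{path: "/api/models", prefix: "/"},
}

// searchURL builds the search request URL for one keyword.
func searchURL(baseURL, path, keyword string) string {
	return fmt.Sprintf("%s%s?search=%s&limit=50", baseURL, path, url.QueryEscape(keyword))
}

// sweep queries both endpoints for every keyword and sends findings to out.
func sweep(ctx context.Context, hc *http.Client, baseURL, token string, keywords []string, out chan<- finding) error {
	for _, kw := range keywords {
		for _, ep := range endpoints {
			req, err := http.NewRequestWithContext(ctx, http.MethodGet, searchURL(baseURL, ep.path, kw), nil)
			if err != nil {
				return err
			}
			if token != "" {
				req.Header.Set("Authorization", "Bearer "+token)
			}
			resp, err := hc.Do(req) // returns an error once ctx is cancelled
			if err != nil {
				return err
			}
			var items []struct {
				ID string `json:"id"`
			}
			if resp.StatusCode == http.StatusOK {
				err = json.NewDecoder(resp.Body).Decode(&items)
			} else {
				err = fmt.Errorf("%s: unexpected status %d", ep.path, resp.StatusCode)
			}
			resp.Body.Close()
			if err != nil {
				return err
			}
			for _, it := range items {
				f := finding{Source: baseURL + ep.prefix + it.ID, SourceType: "recon:huggingface"}
				select {
				case out <- f:
				case <-ctx.Done():
					return ctx.Err()
				}
			}
		}
	}
	return nil
}
```

Keeping the path and prefix together in one table means adding a third endpoint later would be a one-line change, and the per-endpoint prefix difference stays in a single place.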
Create `pkg/recon/sources/huggingface_test.go` with an httptest server that routes
both paths. Assert the exact number of Findings (with one stubbed result per endpoint,
that is 2 per keyword × number of keywords) and the URL prefixes.
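One possible shape for the test double is sketched below, assuming each endpoint returns a single stubbed item. `fakeHub`, `newFakeHub`, and `probe` are hypothetical names, and the stub ids are made up for illustration. `probe` drives the handler through an `httptest.ResponseRecorder` so the sketch runs without opening a socket.

```go
package main

import (
	"encoding/json"
	"net/http"
	"net/http/httptest"
	"sync"
)

// fakeHub is a stub for the two Hub search routes. It records how often each
// path was hit and which Authorization header arrived with each request.
type fakeHub struct {
	mu    sync.Mutex
	hits  map[string]int
	auths []string
}

func newFakeHub() *fakeHub {
	return &fakeHub{hits: map[string]int{}}
}

func (f *fakeHub) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	f.mu.Lock()
	f.hits[r.URL.Path]++
	f.auths = append(f.auths, r.Header.Get("Authorization"))
	f.mu.Unlock()

	var id string
	switch r.URL.Path {
	case "/api/spaces":
		id = "acme/demo-space"
	case "/api/models":
		id = "acme/demo-model"
	default:
		http.NotFound(w, r)
		return
	}
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode([]map[string]string{{"id": id}})
}

// probe sends one search request to the handler and returns the status code
// and the decoded ids. An empty token leaves the Authorization header unset.
func probe(h http.Handler, path, token string) (int, []string, error) {
	req := httptest.NewRequest(http.MethodGet, path+"?search=x&limit=50", nil)
	if token != "" {
		req.Header.Set("Authorization", "Bearer "+token)
	}
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	if rec.Code != http.StatusOK {
		return rec.Code, nil, nil
	}
	var items []struct {
		ID string `json:"id"`
	}
	if err := json.NewDecoder(rec.Body).Decode(&items); err != nil {
		return rec.Code, nil, err
	}
	ids := make([]string, 0, len(items))
	for _, it := range items {
		ids = append(ids, it.ID)
	}
	return rec.Code, ids, nil
}
```

In the actual test file the same handler could be wrapped with `httptest.NewServer(hub)` and its `URL` assigned to `BaseURL`, after which `hub.hits` and `hub.auths` cover the assertions for Tests B and D.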
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestHuggingFace -v -timeout 30s</automated>
</verify>
<done>
HuggingFaceSource passes tests covering both endpoints, both token modes, and context cancellation.
</done>
</task>

</tasks>

<verification>
- `go test ./pkg/recon/sources/ -run TestHuggingFace -v`
</verification>

<success_criteria>
RECON-CODE-08 satisfied.
</success_criteria>

<output>
After completion, create `.planning/phases/10-osint-code-hosting/10-06-SUMMARY.md`.
</output>