keyhunter/.planning/phases/10-osint-code-hosting/10-03-PLAN.md

---
phase: 10-osint-code-hosting
plan: 03
type: execute
wave: 2
depends_on:
  - 10-01
files_modified:
  - pkg/recon/sources/gitlab.go
  - pkg/recon/sources/gitlab_test.go
autonomous: true
requirements:
  - RECON-CODE-02
must_haves:
  truths:
    - "GitLabSource.Sweep queries GitLab /api/v4/search?scope=blobs and emits Findings"
    - "Disabled when token empty; enabled otherwise"
    - 'Findings have SourceType="recon:gitlab" and Source = web_url of blob'
  artifacts:
    - path: pkg/recon/sources/gitlab.go
      provides: GitLabSource implementing recon.ReconSource
    - path: pkg/recon/sources/gitlab_test.go
      provides: httptest tests
  key_links:
    - from: pkg/recon/sources/gitlab.go
      to: pkg/recon/sources/httpclient.go
      via: "c.client.Do(ctx, req)"
      pattern: "client.Do"
---
Implement GitLabSource against GitLab's Search API (/api/v4/search?scope=blobs). The source authenticates with the PRIVATE-TOKEN header and stays within the 2000 req/min rate limit.

Purpose: RECON-CODE-02. Output: pkg/recon/sources/gitlab.go + tests.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/queries.go

GitLab Search API (docs: https://docs.gitlab.com/ee/api/search.html):

- Request: GET /api/v4/search?scope=blobs&search=&per_page=20
- Header: PRIVATE-TOKEN:
- Response (array of blob objects): [{ "basename": "...", "data": "matched snippet", "path": "...", "project_id": 123, "ref": "main", "startline": 42 }, ...]

Project web_url must be constructed from project_id → fetch /api/v4/projects/ (or just use basename+path with a placeholder Source. Keep it minimal: Source = "https://gitlab.com/projects//-/blob//").

Rate limit: 2000 req/min → rate.Every(30 * time.Millisecond) ≈ 2000/min, burst 5.

Task 1: GitLabSource implementation + tests

Files: pkg/recon/sources/gitlab.go, pkg/recon/sources/gitlab_test.go

Behavior:
- Test A: Enabled false when token empty
- Test B: Sweep queries /api/v4/search with scope=blobs, PRIVATE-TOKEN header set
- Test C: Decodes array response, emits one Finding per blob with Source containing project_id + path + ref
- Test D: 401 returns wrapped ErrUnauthorized
- Test E: Ctx cancellation respected
- Test F: Empty token → Sweep returns nil with no calls

Create `pkg/recon/sources/gitlab.go` with struct `GitLabSource { Token, BaseURL string; Registry *providers.Registry; Limiters *recon.LimiterRegistry; client *Client }`.
Default BaseURL: `https://gitlab.com`.
Name: "gitlab". RateLimit: `rate.Every(30 * time.Millisecond)`. Burst: 5. RespectsRobots: false.

Sweep loop:
- For each query from BuildQueries(reg, "gitlab"):
  - Build `base + /api/v4/search?scope=blobs&search=<url-escaped>&per_page=20`
  - Set header `PRIVATE-TOKEN: <token>`
  - limiters.Wait, then client.Do
  - Decode `[]glBlob` where glBlob has ProjectID int, Path, Ref, Data, Startline
  - Emit Finding with Source = fmt.Sprintf("%s/projects/%d/-/blob/%s/%s", base, b.ProjectID, b.Ref, b.Path), SourceType="recon:gitlab", Confidence="low", ProviderName derived via keywordIndex(reg)
  - Respect ctx.Done on send

Add compile-time assert: `var _ recon.ReconSource = (*GitLabSource)(nil)`.
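For readers unfamiliar with the idiom, here is a small self-contained sketch of such a compile-time assertion. The `reconSource` interface and the `gitLabSource` type are hypothetical stand-ins: the real `recon.ReconSource` is defined in pkg/recon/source.go and its method set and signatures may differ. The `Enabled` rule mirrors the plan's "disabled when token empty" behavior.

```go
package main

import "fmt"

// reconSource is a hypothetical stand-in for recon.ReconSource; the real
// interface may declare different methods and signatures.
type reconSource interface {
	Name() string
	Enabled() bool
}

// gitLabSource is a reduced stand-in for GitLabSource.
type gitLabSource struct {
	Token string
}

func (s *gitLabSource) Name() string { return "gitlab" }

// Enabled reports whether a token is configured.
func (s *gitLabSource) Enabled() bool { return s.Token != "" }

// Compile-time assertion: the build fails if *gitLabSource ever stops
// satisfying reconSource. No value is allocated at runtime.
var _ reconSource = (*gitLabSource)(nil)

func main() {
	fmt.Println((&gitLabSource{}).Enabled())
	fmt.Println((&gitLabSource{Token: "example-token"}).Enabled())
}
```

The assertion costs nothing at runtime and turns a missing or mistyped method into a build error rather than a failure at registration time.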

Create `pkg/recon/sources/gitlab_test.go` with httptest server returning a JSON
array of two blob objects. Assert both Findings received, Source URLs contain
project IDs, ctx cancellation test, 401 test, empty-token test. Use synthetic
registry with 2 providers.
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestGitLab -v -timeout 30s`

Done: GitLabSource compiles, implements ReconSource, all test behaviors covered.

Verification:
- `go build ./...`
- `go test ./pkg/recon/sources/ -run TestGitLab -v`

<success_criteria> RECON-CODE-02 satisfied. </success_criteria>

After completion, create `.planning/phases/10-osint-code-hosting/10-03-SUMMARY.md`.