---
phase: 10-osint-code-hosting
plan: 04
type: execute
wave: 2
depends_on:
  - 10-01
files_modified:
  - pkg/recon/sources/bitbucket.go
  - pkg/recon/sources/bitbucket_test.go
  - pkg/recon/sources/gist.go
  - pkg/recon/sources/gist_test.go
autonomous: true
requirements:
  - RECON-CODE-03
  - RECON-CODE-04
must_haves:
  truths:
    - "BitbucketSource queries Bitbucket 2.0 code search API and emits Findings"
    - "GistSource queries GitHub Gist search (re-uses GitHub token) and emits Findings"
    - "Both disabled when respective credentials are empty"
  artifacts:
    - path: pkg/recon/sources/bitbucket.go
      provides: BitbucketSource implementing recon.ReconSource
    - path: pkg/recon/sources/gist.go
      provides: GistSource implementing recon.ReconSource
  key_links:
    - from: pkg/recon/sources/gist.go
      to: pkg/recon/sources/httpclient.go
      via: "Client.Do with Bearer <github-token>"
      pattern: client.Do
    - from: pkg/recon/sources/bitbucket.go
      to: pkg/recon/sources/httpclient.go
      via: Client.Do
      pattern: client.Do
---
Implement BitbucketSource (RECON-CODE-03) and GistSource (RECON-CODE-04). Grouped because both are small API integrations with similar shapes (JSON array/values, per-item URL, token gating).

Purpose: RECON-CODE-03, RECON-CODE-04. Output: Two new ReconSource implementations + tests.

<execution_context> @$HOME/.claude/get-shit-done/workflows/execute-plan.md @$HOME/.claude/get-shit-done/templates/summary.md </execution_context>

@.planning/phases/10-osint-code-hosting/10-CONTEXT.md
@.planning/phases/10-osint-code-hosting/10-01-SUMMARY.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/queries.go

Bitbucket 2.0 search (docs: https://developer.atlassian.com/cloud/bitbucket/rest/api-group-search/):
- Endpoint: `GET /2.0/workspaces/{workspace}/search/code?search_query=`
- Auth: Bearer token (app password or OAuth)
- Response: `{ "values": [{ "content_match_count": N, "file": {"path":"","commit":{...}}, "page_url": "..." }] }`
- Note: Requires a workspace param. Make it configurable via `SourcesConfig.BitbucketWorkspace`; if unset, the source is disabled.
- Rate: 1000/hour → `rate.Every(3600 * time.Millisecond)`, burst 1.
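The request URL and response shape described above could be exercised with a standard-library sketch like the following. The names `bitbucketSearchResponse`, `searchURL`, and `pageURLs`, as well as the `example.invalid` URL, are assumptions made for illustration and are not part of the plan; only the fields listed in the response shape are modeled.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/url"
)

// bitbucketSearchResponse mirrors only the fields named in the plan
// (hypothetical type name, not taken from the codebase).
type bitbucketSearchResponse struct {
	Values []struct {
		ContentMatchCount int    `json:"content_match_count"`
		PageURL           string `json:"page_url"`
		File              struct {
			Path string `json:"path"`
		} `json:"file"`
	} `json:"values"`
}

// searchURL builds the code search URL for one workspace and one query.
func searchURL(baseURL, workspace, query string) string {
	return fmt.Sprintf("%s/2.0/workspaces/%s/search/code?search_query=%s",
		baseURL, url.PathEscape(workspace), url.QueryEscape(query))
}

// pageURLs decodes a response body and returns the page_url of every hit.
func pageURLs(body []byte) ([]string, error) {
	var resp bitbucketSearchResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	urls := make([]string, 0, len(resp.Values))
	for _, v := range resp.Values {
		urls = append(urls, v.PageURL)
	}
	return urls, nil
}

func main() {
	fmt.Println(searchURL("https://api.bitbucket.org", "testws", "sk-proj-"))

	// Synthetic payload in the documented shape; the URL is a placeholder.
	body := []byte(`{"values":[{"content_match_count":1,"file":{"path":"a.env"},"page_url":"https://example.invalid/hit"}]}`)
	urls, err := pageURLs(body)
	fmt.Println(urls, err)
}
```

Escaping the workspace and query separately keeps a keyword containing spaces or `&` from corrupting the query string.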

GitHub Gist search: GitHub does not expose a dedicated /search/gists endpoint that searches gist contents. As a fallback, use the /gists/public endpoint with client-side filtering:
- `GET /gists/public?per_page=100` returns public gists.
- For each gist, fetch `/gists/{id}` and scan file contents for keyword matches.
- Keep the implementation minimal: enumerate only the first page, match against the keyword list, and emit Findings with Source = gist.html_url.
- Auth: Bearer `<github-token>`.
- Rate: 30/min → `rate.Every(2 * time.Second)`.
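The two rate figures above translate into request spacing by dividing the window by the budget. The sketch below checks that arithmetic with the standard library only; the helper names `perHour` and `perMinute` are hypothetical. The resulting durations are what would be handed to `rate.Every` from `golang.org/x/time/rate`, which is not imported here so the example stays self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// perHour returns the spacing between requests for a budget of n requests
// per hour. n must be positive.
func perHour(n int) time.Duration { return time.Hour / time.Duration(n) }

// perMinute returns the spacing between requests for a budget of n requests
// per minute. n must be positive.
func perMinute(n int) time.Duration { return time.Minute / time.Duration(n) }

func main() {
	bitbucket := perHour(1000) // Bitbucket budget quoted above
	gist := perMinute(30)      // GitHub budget quoted above

	// The Bitbucket interval cannot be written as 3.6 * time.Second: the
	// untyped constant 3.6 is not representable as an integer time.Duration,
	// so the expression does not compile. A whole number of milliseconds
	// expresses the same interval.
	fmt.Println(bitbucket == 3600*time.Millisecond)
	fmt.Println(gist == 2*time.Second)
	fmt.Println(bitbucket, gist)
}
```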

Task 1: BitbucketSource + tests

Files: pkg/recon/sources/bitbucket.go, pkg/recon/sources/bitbucket_test.go

- Test A: Enabled false when token OR workspace empty
- Test B: Enabled true when both set
- Test C: Sweep queries /2.0/workspaces/{ws}/search/code with Bearer header
- Test D: Decodes `{values:[{file:{path,commit:{...}},page_url:"..."}]}` and emits Finding with Source=page_url, SourceType="recon:bitbucket"
- Test E: 401 → ErrUnauthorized
- Test F: Ctx cancellation

Create `pkg/recon/sources/bitbucket.go`:
- Struct `BitbucketSource { Token, Workspace, BaseURL string; Registry *providers.Registry; Limiters *recon.LimiterRegistry; client *Client }`
- Default BaseURL: `https://api.bitbucket.org`
- Name "bitbucket", RateLimit rate.Every(3600*time.Millisecond), Burst 1, RespectsRobots false
- Enabled = s.Token != "" && s.Workspace != ""
- Sweep: for each query in BuildQueries(reg, "bitbucket"), limiters.Wait, issue GET request, decode into struct with ``Values []struct{ PageURL string `json:"page_url"`; File struct{ Path string } `json:"file"` }``, emit Findings
- Compile-time assert `var _ recon.ReconSource = (*BitbucketSource)(nil)`

Create `pkg/recon/sources/bitbucket_test.go` with httptest server, synthetic registry, assertions on URL path `/2.0/workspaces/testws/search/code`, Bearer header, and emitted Findings.
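One possible shape for the httptest-based checks in Tests C and E is sketched below, using only the standard library. It is not the project's `Client` or `BitbucketSource`: `search`, `newFakeBitbucket`, `probe`, and the local `errUnauthorized` are hypothetical stand-ins, and the tokens "good" and "bad" are made up for the example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// errUnauthorized is a local stand-in for the package's ErrUnauthorized.
var errUnauthorized = errors.New("unauthorized")

// search issues one code search request with a Bearer token and maps a 401
// response to errUnauthorized.
func search(ctx context.Context, c *http.Client, baseURL, workspace, token, query string) error {
	u := baseURL + "/2.0/workspaces/" + url.PathEscape(workspace) +
		"/search/code?search_query=" + url.QueryEscape(query)
	req, err := http.NewRequestWithContext(ctx, http.MethodGet, u, nil)
	if err != nil {
		return err
	}
	req.Header.Set("Authorization", "Bearer "+token)
	resp, err := c.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	switch resp.StatusCode {
	case http.StatusOK:
		return nil
	case http.StatusUnauthorized:
		return errUnauthorized
	default:
		return fmt.Errorf("unexpected status %d", resp.StatusCode)
	}
}

// newFakeBitbucket returns a test server that answers only on the expected
// path and only for the expected Bearer token.
func newFakeBitbucket(wantToken string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.URL.Path != "/2.0/workspaces/testws/search/code" {
			http.NotFound(w, r)
			return
		}
		if r.Header.Get("Authorization") != "Bearer "+wantToken {
			w.WriteHeader(http.StatusUnauthorized)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"values":[]}`)
	}))
}

// probe runs one request against a fresh fake server that accepts only the
// token "good".
func probe(token string) error {
	srv := newFakeBitbucket("good")
	defer srv.Close()
	return search(context.Background(), srv.Client(), srv.URL, "testws", token, "sk-proj-")
}

func main() {
	fmt.Println(probe("good"))
	fmt.Println(probe("bad") == errUnauthorized)
}
```

Because the fake server rejects any other path with 404, a wrong workspace segment surfaces as an unexpected-status error rather than a silent pass.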
Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestBitbucket -v -timeout 30s`

Done: BitbucketSource passes all tests, implements ReconSource.

Task 2: GistSource + tests

Files: pkg/recon/sources/gist.go, pkg/recon/sources/gist_test.go

- Test A: Enabled false when GitHub token empty
- Test B: Sweep fetches /gists/public?per_page=100 with Bearer auth
- Test C: For each gist, iterates files map; if any file.content contains a provider keyword, emits one Finding with Source=gist.html_url
- Test D: Ctx cancellation
- Test E: 401 → ErrUnauthorized
- Test F: Gist without matching keyword → no Finding emitted

Create `pkg/recon/sources/gist.go`:
- Struct `GistSource { Token, BaseURL string; Registry *providers.Registry; Limiters *recon.LimiterRegistry; client *Client }`
- BaseURL default `https://api.github.com`
- Name "gist", RateLimit rate.Every(2*time.Second), Burst 1, RespectsRobots false
- Enabled = s.Token != ""
- Sweep flow:
  1. Build keyword list from registry (flat set)
  2. GET /gists/public?per_page=100 with Bearer header
  3. Decode ``[]struct{ HTMLURL string `json:"html_url"`; Files map[string]struct{ Filename string `json:"filename"`; RawURL string `json:"raw_url"` } `json:"files"` }``
  4. For each gist and each file: if a match can be made without fetching raw content, skip the raw fetch (keep Phase 10 minimal). Otherwise fetch file.RawURL and scan the content for any keyword from the set; on a hit, emit one Finding per gist (not per file) with ProviderName from the matched keyword.
  5. Respect limiters.Wait before each outbound request (gist list + each raw fetch)
- Compile-time assert `var _ recon.ReconSource = (*GistSource)(nil)`

Create `pkg/recon/sources/gist_test.go`:
- httptest server with two routes: `/gists/public` returns 2 gists each with 1 file, raw_url pointing to same server `/raw/<id>`; `/raw/<id>` returns content containing "sk-proj-" for one and an unrelated string for the other
- Assert exactly 1 Finding emitted, Source matches the gist's html_url
- 401 test, ctx cancellation test, empty-token test

Verify: `cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run TestGist -v -timeout 30s`

Done: GistSource emits Findings only when a known provider keyword is present in a gist file body; all tests green.

Verification:
- `go build ./...`
- `go test ./pkg/recon/sources/ -run "TestBitbucket|TestGist" -v`
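The Task 2 sweep flow (decode the gist list, scan each file body, emit at most one Finding per gist) might look roughly like the sketch below. It is a stand-alone illustration under stated assumptions: `gist`, `keyword`, `finding`, `firstMatch`, and `scanGists` are hypothetical names, the `fetch` callback stands in for the rate-limited raw_url request, and "example-provider" is a placeholder rather than a real registry entry.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// gist models only the fields the plan decodes from /gists/public.
type gist struct {
	HTMLURL string `json:"html_url"`
	Files   map[string]struct {
		Filename string `json:"filename"`
		RawURL   string `json:"raw_url"`
	} `json:"files"`
}

// keyword pairs a search string with the provider it identifies
// (hypothetical shape for the flat keyword set).
type keyword struct{ Text, Provider string }

// finding is a local stand-in for the project's Finding type.
type finding struct{ Source, Provider string }

// firstMatch returns the provider of the first keyword contained in content.
func firstMatch(content string, kws []keyword) (string, bool) {
	for _, k := range kws {
		if strings.Contains(content, k.Text) {
			return k.Provider, true
		}
	}
	return "", false
}

// scanGists decodes a gist list and returns at most one finding per gist.
func scanGists(listJSON []byte, kws []keyword, fetch func(rawURL string) (string, error)) ([]finding, error) {
	var gists []gist
	if err := json.Unmarshal(listJSON, &gists); err != nil {
		return nil, err
	}
	var out []finding
	for _, g := range gists {
		// Visit files in name order so the result is deterministic.
		names := make([]string, 0, len(g.Files))
		for name := range g.Files {
			names = append(names, name)
		}
		sort.Strings(names)
		for _, name := range names {
			content, err := fetch(g.Files[name].RawURL)
			if err != nil {
				return nil, err
			}
			if provider, ok := firstMatch(content, kws); ok {
				out = append(out, finding{Source: g.HTMLURL, Provider: provider})
				break // one finding per gist, not per file
			}
		}
	}
	return out, nil
}

func main() {
	list := []byte(`[
	  {"html_url":"https://gist.example.invalid/1","files":{"a.env":{"filename":"a.env","raw_url":"/raw/1"}}},
	  {"html_url":"https://gist.example.invalid/2","files":{"b.txt":{"filename":"b.txt","raw_url":"/raw/2"}}}
	]`)
	bodies := map[string]string{"/raw/1": "API_KEY=sk-proj-abc", "/raw/2": "hello world"}
	fetch := func(rawURL string) (string, error) { return bodies[rawURL], nil }

	got, err := scanGists(list, []keyword{{"sk-proj-", "example-provider"}}, fetch)
	fmt.Println(got, err)
}
```

Passing the fetch step in as a function keeps the matching logic testable without a server; in the real source it would wrap limiters.Wait and the HTTP call.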

<success_criteria> RECON-CODE-03 and RECON-CODE-04 satisfied. </success_criteria>

After completion, create `.planning/phases/10-osint-code-hosting/10-04-SUMMARY.md`.