Merge branch 'worktree-agent-a6700ee2'

This commit is contained in:
salvacybersec
2026-04-06 12:21:07 +03:00
6 changed files with 1144 additions and 559 deletions

View File

@@ -252,7 +252,13 @@ Plans:
3. `keyhunter recon --sources=s3` enumerates publicly accessible S3 buckets and scans readable objects for API key patterns
4. `keyhunter recon --sources=gcs,azureblob,spaces` scans GCS, Azure Blob, and DigitalOcean Spaces; `--sources=minio` discovers MinIO instances via Shodan integration
5. `keyhunter recon --sources=grayhatwarfare` queries the GrayHatWarfare bucket search engine for matching bucket names
**Plans**: 4 plans
Plans:
- [ ] 12-01-PLAN.md — ShodanSource + CensysSource + ZoomEyeSource (RECON-IOT-01, RECON-IOT-02, RECON-IOT-03)
- [ ] 12-02-PLAN.md — FOFASource + NetlasSource + BinaryEdgeSource (RECON-IOT-04, RECON-IOT-05, RECON-IOT-06)
- [ ] 12-03-PLAN.md — S3Scanner + GCSScanner + AzureBlobScanner + DOSpacesScanner (RECON-CLOUD-01, RECON-CLOUD-02, RECON-CLOUD-03, RECON-CLOUD-04)
- [ ] 12-04-PLAN.md — RegisterAll wiring + cmd/recon.go credentials + integration test (all Phase 12 reqs)
### Phase 13: OSINT Package Registries & Container/IaC
**Goal**: Users can scan npm, PyPI, and 6 other package registries for packages containing leaked keys, and scan Docker Hub image layers, Kubernetes configs, Terraform state files, Helm charts, and Ansible Galaxy for secrets in infrastructure code

View File

@@ -0,0 +1,193 @@
---
phase: 12-osint_iot_cloud_storage
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/shodan.go
- pkg/recon/sources/shodan_test.go
- pkg/recon/sources/censys.go
- pkg/recon/sources/censys_test.go
- pkg/recon/sources/zoomeye.go
- pkg/recon/sources/zoomeye_test.go
autonomous: true
requirements: [RECON-IOT-01, RECON-IOT-02, RECON-IOT-03]
must_haves:
truths:
- "ShodanSource searches Shodan /shodan/host/search for exposed LLM endpoints and emits findings"
- "CensysSource searches Censys v2 /hosts/search for exposed services and emits findings"
- "ZoomEyeSource searches ZoomEye /host/search for device/service key exposure and emits findings"
- "Each source is disabled (Enabled==false) when its API key is empty"
artifacts:
- path: "pkg/recon/sources/shodan.go"
provides: "ShodanSource implementing recon.ReconSource"
exports: ["ShodanSource"]
- path: "pkg/recon/sources/censys.go"
provides: "CensysSource implementing recon.ReconSource"
exports: ["CensysSource"]
- path: "pkg/recon/sources/zoomeye.go"
provides: "ZoomEyeSource implementing recon.ReconSource"
exports: ["ZoomEyeSource"]
key_links:
- from: "pkg/recon/sources/shodan.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/censys.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/zoomeye.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
---
<objective>
Implement three IoT scanner recon sources: Shodan, Censys, and ZoomEye.
Purpose: Enable discovery of exposed LLM endpoints (vLLM, Ollama, LiteLLM proxies) via internet-wide device scanners.
Output: Three source files + tests following the established Phase 10 pattern.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/github.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
type Client struct { HTTP *http.Client; MaxRetries int; UserAgent string }
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
var ErrUnauthorized = errors.New("sources: unauthorized (check credentials)")
```
From pkg/recon/sources/queries.go:
```go
func BuildQueries(reg *providers.Registry, source string) []string
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement ShodanSource, CensysSource, ZoomEyeSource</name>
<files>pkg/recon/sources/shodan.go, pkg/recon/sources/censys.go, pkg/recon/sources/zoomeye.go</files>
<action>
Create three source files following the BingDorkSource pattern exactly:
**ShodanSource** (shodan.go):
- Struct: `ShodanSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*ShodanSource)(nil)`
- Name(): "shodan"
- RateLimit(): rate.Every(1 * time.Second) — Shodan allows ~1 req/s on most plans
- Burst(): 1
- RespectsRobots(): false (authenticated REST API)
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://api.shodan.io"
- Sweep(): For each query from BuildQueries(s.Registry, "shodan"), call GET `{base}/shodan/host/search?key={apikey}&query={url.QueryEscape(q)}`. Parse JSON response `{"matches":[{"ip_str":"...","port":N,"data":"..."},...]}`. Emit a Finding per match with Source=`fmt.Sprintf("shodan://%s:%d", match.IPStr, match.Port)`, SourceType="recon:shodan", Confidence="low", ProviderName from keyword index.
- Add `shodanKeywordIndex` helper (same pattern as bingKeywordIndex).
- Error handling: ErrUnauthorized aborts, context cancellation aborts, transient errors continue.
**CensysSource** (censys.go):
- Struct: `CensysSource` with fields `APIId string`, `APISecret string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "censys"
- RateLimit(): rate.Every(2500 * time.Millisecond) — Censys free tier is 0.4 req/s
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIId != "" && s.APISecret != ""`
- BaseURL default: "https://search.censys.io/api"
- Sweep(): For each query, GET `{base}/v2/hosts/search?q={url.QueryEscape(q)}&per_page=25`. Set Basic Auth header using APIId:APISecret. Parse JSON response `{"result":{"hits":[{"ip":"...","services":[{"port":N,"service_name":"..."}]}]}}`. Emit Finding per hit with Source=`fmt.Sprintf("censys://%s", hit.IP)`.
- Add `censysKeywordIndex` helper.
**ZoomEyeSource** (zoomeye.go):
- Struct: `ZoomEyeSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "zoomeye"
- RateLimit(): rate.Every(2 * time.Second)
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://api.zoomeye.org" (ZoomEye uses v1-style API key in header)
- Sweep(): For each query, GET `{base}/host/search?query={url.QueryEscape(q)}&page=1`. Set header `API-KEY: {apikey}`. Parse JSON response `{"matches":[{"ip":"...","portinfo":{"port":N},"banner":"..."}]}`. Emit Finding per match with Source=`fmt.Sprintf("zoomeye://%s:%d", match.IP, match.PortInfo.Port)`.
- Add `zoomeyeKeywordIndex` helper.
Update `formatQuery` in queries.go to add cases for "shodan", "censys", "zoomeye" — all use bare keyword (same as default).
All sources must use `sources.NewClient()` for HTTP, `s.Limiters.Wait(ctx, s.Name(), ...)` before each request, and follow the same error handling pattern as BingDorkSource.Sweep.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/</automated>
</verify>
<done>Three source files compile, each implements recon.ReconSource interface</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Unit tests for Shodan, Censys, ZoomEye sources</name>
<files>pkg/recon/sources/shodan_test.go, pkg/recon/sources/censys_test.go, pkg/recon/sources/zoomeye_test.go</files>
<behavior>
- Shodan: httptest server returns mock JSON with 2 matches; Sweep emits 2 findings with "recon:shodan" source type
- Shodan: empty API key => Enabled()==false, Sweep returns nil with 0 findings
- Censys: httptest server returns mock JSON with 2 hits; Sweep emits 2 findings with "recon:censys" source type
- Censys: empty APIId => Enabled()==false
- ZoomEye: httptest server returns mock JSON with 2 matches; Sweep emits 2 findings with "recon:zoomeye" source type
- ZoomEye: empty API key => Enabled()==false
- All: cancelled context returns context error
</behavior>
<action>
Create test files following the pattern in github_test.go / bing_test.go:
- Use httptest.NewServer to mock API responses
- Set BaseURL to test server URL
- Create a minimal providers.Registry with 1-2 test providers containing keywords
- Verify Finding count, SourceType, and Source URL format
- Test disabled state (empty credentials)
- Test context cancellation
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestShodan|TestCensys|TestZoomEye" -v -count=1</automated>
</verify>
<done>All Shodan, Censys, ZoomEye tests pass; each source emits correct findings from mock API responses</done>
</task>
</tasks>
<verification>
- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestShodan|TestCensys|TestZoomEye" -v` all pass
- Each source file has compile-time assertion `var _ recon.ReconSource = (*XxxSource)(nil)`
</verification>
<success_criteria>
Three IoT scanner sources (Shodan, Censys, ZoomEye) implement recon.ReconSource, use shared Client for HTTP, respect rate limiting via LimiterRegistry, and pass unit tests with mock API responses.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-01-SUMMARY.md`
</output>

View File

@@ -0,0 +1,187 @@
---
phase: 12-osint_iot_cloud_storage
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/fofa.go
- pkg/recon/sources/fofa_test.go
- pkg/recon/sources/netlas.go
- pkg/recon/sources/netlas_test.go
- pkg/recon/sources/binaryedge.go
- pkg/recon/sources/binaryedge_test.go
autonomous: true
requirements: [RECON-IOT-04, RECON-IOT-05, RECON-IOT-06]
must_haves:
truths:
- "FOFASource searches FOFA API for exposed endpoints and emits findings"
- "NetlasSource searches Netlas API for internet-wide scan results and emits findings"
- "BinaryEdgeSource searches BinaryEdge API for exposed services and emits findings"
- "Each source is disabled when its API key/credentials are empty"
artifacts:
- path: "pkg/recon/sources/fofa.go"
provides: "FOFASource implementing recon.ReconSource"
exports: ["FOFASource"]
- path: "pkg/recon/sources/netlas.go"
provides: "NetlasSource implementing recon.ReconSource"
exports: ["NetlasSource"]
- path: "pkg/recon/sources/binaryedge.go"
provides: "BinaryEdgeSource implementing recon.ReconSource"
exports: ["BinaryEdgeSource"]
key_links:
- from: "pkg/recon/sources/fofa.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/netlas.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/binaryedge.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
---
<objective>
Implement three IoT scanner recon sources: FOFA, Netlas, and BinaryEdge.
Purpose: Complete the IoT/device scanner coverage with Chinese (FOFA) and alternative (Netlas, BinaryEdge) internet search engines.
Output: Three source files + tests following the established Phase 10 pattern.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
type Client struct { HTTP *http.Client; MaxRetries int; UserAgent string }
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
var ErrUnauthorized = errors.New("sources: unauthorized (check credentials)")
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement FOFASource, NetlasSource, BinaryEdgeSource</name>
<files>pkg/recon/sources/fofa.go, pkg/recon/sources/netlas.go, pkg/recon/sources/binaryedge.go</files>
<action>
Create three source files following the BingDorkSource pattern:
**FOFASource** (fofa.go):
- Struct: `FOFASource` with fields `Email string`, `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*FOFASource)(nil)`
- Name(): "fofa"
- RateLimit(): rate.Every(1 * time.Second) — FOFA allows ~1 req/s
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.Email != "" && s.APIKey != ""`
- BaseURL default: "https://fofa.info"
- Sweep(): For each query from BuildQueries, base64-encode the query, then GET `{base}/api/v1/search/all?email={email}&key={apikey}&qbase64={base64query}&size=100&fields=host,ip,port`. Parse JSON response `{"results":[["host","ip","port"],...],"size":N}`. Emit Finding per result with Source=`fmt.Sprintf("fofa://%s:%s", result[1], result[2])`, SourceType="recon:fofa".
- Note: FOFA results array contains string arrays, not objects. With `fields=host,ip,port`, each inner array is [host, ip, port], so ip is index 1 and port is index 2.
- Add `fofaKeywordIndex` helper.
**NetlasSource** (netlas.go):
- Struct: `NetlasSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "netlas"
- RateLimit(): rate.Every(1 * time.Second)
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://app.netlas.io"
- Sweep(): For each query, GET `{base}/api/responses/?q={url.QueryEscape(q)}&start=0&indices=`. Set header `X-API-Key: {apikey}`. Parse JSON response `{"items":[{"data":{"ip":"...","port":N}},...]}`. Emit Finding per item with Source=`fmt.Sprintf("netlas://%s:%d", item.Data.IP, item.Data.Port)`.
- Add `netlasKeywordIndex` helper.
**BinaryEdgeSource** (binaryedge.go):
- Struct: `BinaryEdgeSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "binaryedge"
- RateLimit(): rate.Every(2 * time.Second) — BinaryEdge free tier is conservative
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://api.binaryedge.io"
- Sweep(): For each query, GET `{base}/v2/query/search?query={url.QueryEscape(q)}&page=1`. Set header `X-Key: {apikey}`. Parse JSON response `{"events":[{"target":{"ip":"...","port":N}},...]}`. Emit Finding per event with Source=`fmt.Sprintf("binaryedge://%s:%d", event.Target.IP, event.Target.Port)`.
- Add `binaryedgeKeywordIndex` helper.
Update `formatQuery` in queries.go to add cases for "fofa", "netlas", "binaryedge" — all use bare keyword (same as default).
Same patterns as Plan 12-01: use sources.NewClient(), s.Limiters.Wait before requests, standard error handling.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/</automated>
</verify>
<done>Three source files compile, each implements recon.ReconSource interface</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Unit tests for FOFA, Netlas, BinaryEdge sources</name>
<files>pkg/recon/sources/fofa_test.go, pkg/recon/sources/netlas_test.go, pkg/recon/sources/binaryedge_test.go</files>
<behavior>
- FOFA: httptest server returns mock JSON with 2 results; Sweep emits 2 findings with "recon:fofa" source type
- FOFA: empty Email or APIKey => Enabled()==false
- Netlas: httptest server returns mock JSON with 2 items; Sweep emits 2 findings with "recon:netlas" source type
- Netlas: empty APIKey => Enabled()==false
- BinaryEdge: httptest server returns mock JSON with 2 events; Sweep emits 2 findings with "recon:binaryedge" source type
- BinaryEdge: empty APIKey => Enabled()==false
- All: cancelled context returns context error
</behavior>
<action>
Create test files following the same httptest pattern used in Plan 12-01:
- Use httptest.NewServer to mock API responses matching each source's expected JSON shape
- Set BaseURL to test server URL
- Create a minimal providers.Registry with 1-2 test providers
- Verify Finding count, SourceType, and Source URL format
- Test disabled state (empty credentials)
- Test context cancellation
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestFOFA|TestNetlas|TestBinaryEdge" -v -count=1</automated>
</verify>
<done>All FOFA, Netlas, BinaryEdge tests pass; each source emits correct findings from mock API responses</done>
</task>
</tasks>
<verification>
- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestFOFA|TestNetlas|TestBinaryEdge" -v` all pass
- Each source file has compile-time assertion `var _ recon.ReconSource = (*XxxSource)(nil)`
</verification>
<success_criteria>
Three IoT scanner sources (FOFA, Netlas, BinaryEdge) implement recon.ReconSource, use shared Client for HTTP, respect rate limiting via LimiterRegistry, and pass unit tests with mock API responses.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-02-SUMMARY.md`
</output>

View File

@@ -0,0 +1,183 @@
---
phase: 12-osint_iot_cloud_storage
plan: 03
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/s3scanner.go
- pkg/recon/sources/s3scanner_test.go
- pkg/recon/sources/gcsscanner.go
- pkg/recon/sources/gcsscanner_test.go
- pkg/recon/sources/azureblob.go
- pkg/recon/sources/azureblob_test.go
- pkg/recon/sources/dospaces.go
- pkg/recon/sources/dospaces_test.go
autonomous: true
requirements: [RECON-CLOUD-01, RECON-CLOUD-02, RECON-CLOUD-03, RECON-CLOUD-04]
must_haves:
truths:
- "S3Scanner enumerates publicly accessible S3 buckets by name pattern and scans readable objects for API key exposure"
- "GCSScanner scans publicly accessible Google Cloud Storage buckets"
- "AzureBlobScanner scans publicly accessible Azure Blob containers"
- "DOSpacesScanner scans publicly accessible DigitalOcean Spaces"
- "Each cloud scanner is credentialless (uses anonymous HTTP to probe public buckets) and always Enabled"
artifacts:
- path: "pkg/recon/sources/s3scanner.go"
provides: "S3Scanner implementing recon.ReconSource"
exports: ["S3Scanner"]
- path: "pkg/recon/sources/gcsscanner.go"
provides: "GCSScanner implementing recon.ReconSource"
exports: ["GCSScanner"]
- path: "pkg/recon/sources/azureblob.go"
provides: "AzureBlobScanner implementing recon.ReconSource"
exports: ["AzureBlobScanner"]
- path: "pkg/recon/sources/dospaces.go"
provides: "DOSpacesScanner implementing recon.ReconSource"
exports: ["DOSpacesScanner"]
key_links:
- from: "pkg/recon/sources/s3scanner.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
---
<objective>
Implement four cloud storage scanner recon sources: S3Scanner, GCSScanner, AzureBlobScanner, and DOSpacesScanner.
Purpose: Enable discovery of API keys leaked in publicly accessible cloud storage buckets across AWS, GCP, Azure, and DigitalOcean.
Output: Four source files + tests following the established Phase 10 pattern.
Note on RECON-CLOUD-03 (MinIO via Shodan) and RECON-CLOUD-04 (GrayHatWarfare): MinIO exposes an S3-compatible API, so MinIO discovery is handled by the "minio" query in Plan 12-01's ShodanSource rather than by a scanner in this plan. GrayHatWarfare is covered by a dedicated scanner that queries the buckets.grayhatwarfare.com bucket search API.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
type Client struct { HTTP *http.Client; MaxRetries int; UserAgent string }
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement S3Scanner and GCSScanner</name>
<files>pkg/recon/sources/s3scanner.go, pkg/recon/sources/gcsscanner.go</files>
<action>
**S3Scanner** (s3scanner.go) — RECON-CLOUD-01 + RECON-CLOUD-03:
- Struct: `S3Scanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*S3Scanner)(nil)`
- Name(): "s3"
- RateLimit(): rate.Every(500 * time.Millisecond) — S3 public reads are generous
- Burst(): 3
- RespectsRobots(): false (direct API calls)
- Enabled(): always true (credentialless — probes public buckets)
- Sweep(): Generates candidate bucket names from provider keywords (e.g., "openai-keys", "anthropic-config", "llm-keys", etc.) using a helper `bucketNames(registry)` that combines provider keywords with common suffixes like "-keys", "-config", "-backup", "-data", "-secrets", "-env". For each candidate bucket:
1. HEAD `https://{bucket}.s3.amazonaws.com/` — if 200/403, bucket exists
2. If 200 (public listing), GET the ListBucket XML, parse `<Key>` elements
3. For keys matching common config file patterns (.env, config.*, *.json, *.yaml, *.yml, *.toml, *.conf), emit a Finding with Source=`s3://{bucket}/{key}`, SourceType="recon:s3", Confidence="medium"
4. Do NOT download object contents (too heavy) — just flag the presence of suspicious files
- Use BaseURL override for tests (default: "https://%s.s3.amazonaws.com")
- Note: MinIO instances (RECON-CLOUD-03) are discovered via Shodan queries in Plan 12-01's ShodanSource using the query "minio" — this source focuses on AWS S3 bucket enumeration.
**GCSScanner** (gcsscanner.go) — RECON-CLOUD-02:
- Struct: `GCSScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "gcs"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Same bucket enumeration pattern as S3Scanner, but probing `https://storage.googleapis.com/{bucket}` with HEAD and listing objects via the GCS JSON API at `https://storage.googleapis.com/storage/v1/b/{bucket}/o`, which returns `{"items":[{"name":"..."}]}`. Emit findings for config-pattern files with Source=`gs://{bucket}/{name}`, SourceType="recon:gcs".
Both sources share a common `bucketNames` helper — define it once in s3scanner.go; both scanners live in package sources, so GCSScanner can call it directly.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/</automated>
</verify>
<done>S3Scanner and GCSScanner compile and implement recon.ReconSource</done>
</task>
<task type="auto">
<name>Task 2: Implement AzureBlobScanner, DOSpacesScanner, and all cloud scanner tests</name>
<files>pkg/recon/sources/azureblob.go, pkg/recon/sources/dospaces.go, pkg/recon/sources/s3scanner_test.go, pkg/recon/sources/gcsscanner_test.go, pkg/recon/sources/azureblob_test.go, pkg/recon/sources/dospaces_test.go</files>
<action>
**AzureBlobScanner** (azureblob.go) — RECON-CLOUD-02:
- Struct: `AzureBlobScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "azureblob"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Uses the bucket enumeration pattern with the Azure Blob URL format `https://{account}.blob.core.windows.net/{container}?restype=container&comp=list`. Generate account names from provider keywords with common suffixes. Parse XML `<EnumerationResults><Blobs><Blob><Name>...</Name></Blob></Blobs></EnumerationResults>`. Emit findings for config-pattern files with Source=`azure://{account}/{container}/{name}`, SourceType="recon:azureblob".
**DOSpacesScanner** (dospaces.go) — RECON-CLOUD-02:
- Struct: `DOSpacesScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "spaces"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Uses bucket enumeration with DO Spaces URL format `https://{bucket}.{region}.digitaloceanspaces.com/`. Iterate regions: nyc3, sfo3, ams3, sgp1, fra1. Same XML ListBucket format as S3 (DO Spaces is S3-compatible). Emit findings with Source=`do://{bucket}/{key}`, SourceType="recon:spaces".
**Tests** (all four test files):
Each test file follows the httptest pattern:
- Mock server returns appropriate XML/JSON for bucket listing
- Verify Sweep emits correct number of findings with correct SourceType and Source URL format
- Verify Enabled() returns true (credentialless sources)
- Test with empty registry (no keywords => no bucket names => no findings)
- Test context cancellation
Use a minimal providers.Registry with 1 test provider having keyword "testprov" so bucket names like "testprov-keys" are generated.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestS3Scanner|TestGCSScanner|TestAzureBlob|TestDOSpaces" -v -count=1</automated>
</verify>
<done>All four cloud scanner sources compile and pass tests; each emits findings with correct source type and URL format</done>
</task>
</tasks>
<verification>
- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestS3Scanner|TestGCSScanner|TestAzureBlob|TestDOSpaces" -v` all pass
- Each source file has compile-time assertion
</verification>
<success_criteria>
Four cloud storage scanners (S3, GCS, Azure Blob, DO Spaces) implement recon.ReconSource with credentialless public bucket enumeration, use shared Client for HTTP, and pass unit tests.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-03-SUMMARY.md`
</output>

View File

@@ -0,0 +1,217 @@
---
phase: 12-osint_iot_cloud_storage
plan: 04
type: execute
wave: 2
depends_on: [12-01, 12-02, 12-03]
files_modified:
- pkg/recon/sources/register.go
- cmd/recon.go
- pkg/recon/sources/integration_test.go
autonomous: true
requirements: [RECON-IOT-01, RECON-IOT-02, RECON-IOT-03, RECON-IOT-04, RECON-IOT-05, RECON-IOT-06, RECON-CLOUD-01, RECON-CLOUD-02, RECON-CLOUD-03, RECON-CLOUD-04]
must_haves:
truths:
- "RegisterAll registers all 28 sources (18 Phase 10-11 + 10 Phase 12)"
- "cmd/recon.go populates SourcesConfig with all Phase 12 credential fields from env/viper"
- "Integration test proves all 10 new sources are registered and discoverable by name"
artifacts:
- path: "pkg/recon/sources/register.go"
provides: "RegisterAll with all Phase 12 sources added"
contains: "Phase 12"
- path: "cmd/recon.go"
provides: "buildReconEngine with Phase 12 credential wiring"
contains: "ShodanAPIKey"
- path: "pkg/recon/sources/integration_test.go"
provides: "Integration test covering all 28 registered sources"
contains: "28"
key_links:
- from: "pkg/recon/sources/register.go"
to: "pkg/recon/sources/shodan.go"
via: "engine.Register(&ShodanSource{...})"
pattern: "ShodanSource"
- from: "cmd/recon.go"
to: "pkg/recon/sources/register.go"
via: "sources.RegisterAll(e, cfg)"
pattern: "RegisterAll"
---
<objective>
Wire all 10 Phase 12 sources into RegisterAll and cmd/recon.go, plus integration test.
Purpose: Make all IoT and cloud storage sources available via `keyhunter recon list` and `keyhunter recon full`.
Output: Updated RegisterAll (28 sources total), updated cmd/recon.go with credential wiring, integration test.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/sources/register.go
@cmd/recon.go
@pkg/recon/sources/integration_test.go
<interfaces>
From pkg/recon/sources/register.go:
```go
type SourcesConfig struct {
GitHubToken string
// ... existing Phase 10-11 fields ...
Registry *providers.Registry
Limiters *recon.LimiterRegistry
}
func RegisterAll(engine *recon.Engine, cfg SourcesConfig)
```
From cmd/recon.go:
```go
func buildReconEngine() *recon.Engine // constructs engine with all sources
func firstNonEmpty(a, b string) string // env -> viper precedence
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Extend SourcesConfig, RegisterAll, and cmd/recon.go</name>
<files>pkg/recon/sources/register.go, cmd/recon.go</files>
<action>
**SourcesConfig** (register.go) — add these fields after the existing Phase 11 fields:
```go
// Phase 12: IoT scanner API keys.
ShodanAPIKey string
CensysAPIId string
CensysAPISecret string
ZoomEyeAPIKey string
FOFAEmail string
FOFAAPIKey string
NetlasAPIKey string
BinaryEdgeAPIKey string
```
**RegisterAll** (register.go) — add after the Phase 11 paste site registrations:
```go
// Phase 12: IoT scanner sources.
engine.Register(&ShodanSource{
APIKey: cfg.ShodanAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&CensysSource{
APIId: cfg.CensysAPIId,
APISecret: cfg.CensysAPISecret,
Registry: reg,
Limiters: lim,
})
engine.Register(&ZoomEyeSource{
APIKey: cfg.ZoomEyeAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&FOFASource{
Email: cfg.FOFAEmail,
APIKey: cfg.FOFAAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&NetlasSource{
APIKey: cfg.NetlasAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&BinaryEdgeSource{
APIKey: cfg.BinaryEdgeAPIKey,
Registry: reg,
Limiters: lim,
})
// Phase 12: Cloud storage sources (credentialless).
engine.Register(&S3Scanner{
Registry: reg,
Limiters: lim,
})
engine.Register(&GCSScanner{
Registry: reg,
Limiters: lim,
})
engine.Register(&AzureBlobScanner{
Registry: reg,
Limiters: lim,
})
engine.Register(&DOSpacesScanner{
Registry: reg,
Limiters: lim,
})
```
Update the RegisterAll doc comment to say "28 sources total" (18 Phase 10-11 + 10 Phase 12).
**cmd/recon.go** — in buildReconEngine(), add to the SourcesConfig literal:
```go
ShodanAPIKey: firstNonEmpty(os.Getenv("SHODAN_API_KEY"), viper.GetString("recon.shodan.api_key")),
CensysAPIId: firstNonEmpty(os.Getenv("CENSYS_API_ID"), viper.GetString("recon.censys.api_id")),
CensysAPISecret: firstNonEmpty(os.Getenv("CENSYS_API_SECRET"), viper.GetString("recon.censys.api_secret")),
ZoomEyeAPIKey: firstNonEmpty(os.Getenv("ZOOMEYE_API_KEY"), viper.GetString("recon.zoomeye.api_key")),
FOFAEmail: firstNonEmpty(os.Getenv("FOFA_EMAIL"), viper.GetString("recon.fofa.email")),
FOFAAPIKey: firstNonEmpty(os.Getenv("FOFA_API_KEY"), viper.GetString("recon.fofa.api_key")),
NetlasAPIKey: firstNonEmpty(os.Getenv("NETLAS_API_KEY"), viper.GetString("recon.netlas.api_key")),
BinaryEdgeAPIKey: firstNonEmpty(os.Getenv("BINARYEDGE_API_KEY"), viper.GetString("recon.binaryedge.api_key")),
```
Update the reconCmd Long description to mention Phase 12 sources.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./cmd/...</automated>
</verify>
<done>RegisterAll registers 28 sources; cmd/recon.go wires all Phase 12 credentials from env/viper</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Integration test for all 28 registered sources</name>
<files>pkg/recon/sources/integration_test.go</files>
<behavior>
- TestRegisterAll_Phase12 registers all sources, asserts 28 total
- All 10 new source names are present: shodan, censys, zoomeye, fofa, netlas, binaryedge, s3, gcs, azureblob, spaces
- IoT sources with empty credentials report Enabled()==false
- Cloud storage sources (credentialless) report Enabled()==true
- SweepAll with short context timeout completes without panic
</behavior>
<action>
Extend the existing integration_test.go (which currently tests 18 Phase 10-11 sources):
- Update the expected source count from 18 to 28
- Add all 10 new source names to the expected names list
- Add assertions that IoT sources (shodan, censys, zoomeye, fofa, netlas, binaryedge) are Enabled()==false when credentials are empty
- Add assertions that cloud sources (s3, gcs, azureblob, spaces) are Enabled()==true (credentialless)
- Keep the existing SweepAll test with short context timeout, verify no panics
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestRegisterAll" -v -count=1</automated>
</verify>
<done>Integration test passes with 28 registered sources; all Phase 12 source names are discoverable</done>
</task>
</tasks>
<verification>
- `go build ./cmd/...` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestRegisterAll" -v` passes with 28 sources
- `go test ./pkg/recon/sources/ -v -count=1` all tests pass (existing + new)
</verification>
<success_criteria>
All 10 Phase 12 sources are wired into RegisterAll and discoverable via the recon engine. cmd/recon.go reads credentials from env vars and viper config. Integration test confirms 28 total sources registered.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-04-SUMMARY.md`
</output>

README.md

File diff suppressed because it is too large