docs(12): create phase plan — IoT scanners + cloud storage sources

salvacybersec
2026-04-06 12:14:06 +03:00
parent 90d188fe9e
commit e12b4bd2b5
5 changed files with 787 additions and 1 deletions

@@ -0,0 +1,193 @@
---
phase: 12-osint_iot_cloud_storage
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/shodan.go
- pkg/recon/sources/shodan_test.go
- pkg/recon/sources/censys.go
- pkg/recon/sources/censys_test.go
- pkg/recon/sources/zoomeye.go
- pkg/recon/sources/zoomeye_test.go
- pkg/recon/sources/queries.go
autonomous: true
requirements: [RECON-IOT-01, RECON-IOT-02, RECON-IOT-03]
must_haves:
truths:
- "ShodanSource searches Shodan /shodan/host/search for exposed LLM endpoints and emits findings"
- "CensysSource searches Censys v2 /hosts/search for exposed services and emits findings"
- "ZoomEyeSource searches ZoomEye /host/search for device/service key exposure and emits findings"
- "Each source is disabled (Enabled==false) when its API key is empty"
artifacts:
- path: "pkg/recon/sources/shodan.go"
provides: "ShodanSource implementing recon.ReconSource"
exports: ["ShodanSource"]
- path: "pkg/recon/sources/censys.go"
provides: "CensysSource implementing recon.ReconSource"
exports: ["CensysSource"]
- path: "pkg/recon/sources/zoomeye.go"
provides: "ZoomEyeSource implementing recon.ReconSource"
exports: ["ZoomEyeSource"]
key_links:
- from: "pkg/recon/sources/shodan.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/censys.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/zoomeye.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
---
<objective>
Implement three IoT scanner recon sources: Shodan, Censys, and ZoomEye.
Purpose: Enable discovery of exposed LLM endpoints (vLLM, Ollama, LiteLLM proxies) via internet-wide device scanners.
Output: Three source files + tests following the established Phase 10 pattern.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/github.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
type Client struct { HTTP *http.Client; MaxRetries int; UserAgent string }
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
var ErrUnauthorized = errors.New("sources: unauthorized (check credentials)")
```
From pkg/recon/sources/queries.go:
```go
func BuildQueries(reg *providers.Registry, source string) []string
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement ShodanSource, CensysSource, ZoomEyeSource</name>
<files>pkg/recon/sources/shodan.go, pkg/recon/sources/censys.go, pkg/recon/sources/zoomeye.go</files>
<action>
Create three source files following the BingDorkSource pattern exactly:
**ShodanSource** (shodan.go):
- Struct: `ShodanSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*ShodanSource)(nil)`
- Name(): "shodan"
- RateLimit(): rate.Every(1 * time.Second) — Shodan allows ~1 req/s on most plans
- Burst(): 1
- RespectsRobots(): false (authenticated REST API)
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://api.shodan.io"
- Sweep(): For each query from BuildQueries(s.Registry, "shodan"), call GET `{base}/shodan/host/search?key={apikey}&query={url.QueryEscape(q)}`. Parse JSON response `{"matches":[{"ip_str":"...","port":N,"data":"..."},...]}`. Emit a Finding per match with Source=`fmt.Sprintf("shodan://%s:%d", match.IPStr, match.Port)`, SourceType="recon:shodan", Confidence="low", ProviderName from keyword index.
- Add `shodanKeywordIndex` helper (same pattern as bingKeywordIndex).
- Error handling: ErrUnauthorized aborts, context cancellation aborts, transient errors continue.
**CensysSource** (censys.go):
- Struct: `CensysSource` with fields `APIId string`, `APISecret string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "censys"
- RateLimit(): rate.Every(2500 * time.Millisecond) — Censys free tier is 0.4 req/s
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIId != "" && s.APISecret != ""`
- BaseURL default: "https://search.censys.io/api"
- Sweep(): For each query, POST `{base}/v2/hosts/search` with JSON body `{"q":q,"per_page":25}`. Set Basic Auth header using APIId:APISecret. Parse JSON response `{"result":{"hits":[{"ip":"...","services":[{"port":N,"service_name":"..."}]}]}}`. Emit Finding per hit with Source=`fmt.Sprintf("censys://%s", hit.IP)`.
- Add `censysKeywordIndex` helper.
**ZoomEyeSource** (zoomeye.go):
- Struct: `ZoomEyeSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "zoomeye"
- RateLimit(): rate.Every(2 * time.Second)
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://api.zoomeye.org" (ZoomEye uses v1-style API key in header)
- Sweep(): For each query, GET `{base}/host/search?query={url.QueryEscape(q)}&page=1`. Set header `API-KEY: {apikey}`. Parse JSON response `{"matches":[{"ip":"...","portinfo":{"port":N},"banner":"..."}]}`. Emit Finding per match with Source=`fmt.Sprintf("zoomeye://%s:%d", match.IP, match.PortInfo.Port)`.
- Add `zoomeyeKeywordIndex` helper.
Update `formatQuery` in queries.go to add cases for "shodan", "censys", "zoomeye" — all use bare keyword (same as default).
All sources must use `sources.NewClient()` for HTTP, `s.Limiters.Wait(ctx, s.Name(), ...)` before each request, and follow the same error handling pattern as BingDorkSource.Sweep.
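The response handling described for ShodanSource can be sketched as below. This is a minimal illustration under stated assumptions, not the final Sweep: `shodanSourceURIs` is a hypothetical helper name, and it returns the Source strings directly instead of emitting `Finding` values on a channel, since the `Finding` type is not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// shodanMatch mirrors only the fields the plan reads from each match.
type shodanMatch struct {
	IPStr string `json:"ip_str"`
	Port  int    `json:"port"`
	Data  string `json:"data"`
}

// shodanSearchResponse is the top-level {"matches":[...]} envelope.
type shodanSearchResponse struct {
	Matches []shodanMatch `json:"matches"`
}

// shodanSourceURIs is a hypothetical helper: it decodes a search response
// body and builds one "shodan://ip:port" Source string per match.
func shodanSourceURIs(body []byte) ([]string, error) {
	var resp shodanSearchResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode shodan response: %w", err)
	}
	uris := make([]string, 0, len(resp.Matches))
	for _, m := range resp.Matches {
		uris = append(uris, fmt.Sprintf("shodan://%s:%d", m.IPStr, m.Port))
	}
	return uris, nil
}

func main() {
	// Example body using a documentation-range IP address.
	body := []byte(`{"matches":[{"ip_str":"192.0.2.10","port":11434,"data":"Ollama is running"}]}`)
	uris, err := shodanSourceURIs(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(uris)
}
```

The ZoomEye and Censys decoders would follow the same shape with their own struct tags (`ip`, `portinfo.port`, `result.hits`).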
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/</automated>
</verify>
<done>Three source files compile, each implements recon.ReconSource interface</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Unit tests for Shodan, Censys, ZoomEye sources</name>
<files>pkg/recon/sources/shodan_test.go, pkg/recon/sources/censys_test.go, pkg/recon/sources/zoomeye_test.go</files>
<behavior>
- Shodan: httptest server returns mock JSON with 2 matches; Sweep emits 2 findings with "recon:shodan" source type
- Shodan: empty API key => Enabled()==false, Sweep returns nil with 0 findings
- Censys: httptest server returns mock JSON with 2 hits; Sweep emits 2 findings with "recon:censys" source type
- Censys: empty APIId => Enabled()==false
- ZoomEye: httptest server returns mock JSON with 2 matches; Sweep emits 2 findings with "recon:zoomeye" source type
- ZoomEye: empty API key => Enabled()==false
- All: cancelled context returns context error
</behavior>
<action>
Create test files following the pattern in github_test.go / bing_test.go:
- Use httptest.NewServer to mock API responses
- Set BaseURL to test server URL
- Create a minimal providers.Registry with 1-2 test providers containing keywords
- Verify Finding count, SourceType, and Source URL format
- Test disabled state (empty credentials)
- Test context cancellation
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestShodan|TestCensys|TestZoomEye" -v -count=1</automated>
</verify>
<done>All Shodan, Censys, ZoomEye tests pass; each source emits correct findings from mock API responses</done>
</task>
</tasks>
<verification>
- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestShodan|TestCensys|TestZoomEye" -v` all pass
- Each source file has compile-time assertion `var _ recon.ReconSource = (*XxxSource)(nil)`
</verification>
<success_criteria>
Three IoT scanner sources (Shodan, Censys, ZoomEye) implement recon.ReconSource, use shared Client for HTTP, respect rate limiting via LimiterRegistry, and pass unit tests with mock API responses.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-01-SUMMARY.md`
</output>

@@ -0,0 +1,187 @@
---
phase: 12-osint_iot_cloud_storage
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/fofa.go
- pkg/recon/sources/fofa_test.go
- pkg/recon/sources/netlas.go
- pkg/recon/sources/netlas_test.go
- pkg/recon/sources/binaryedge.go
- pkg/recon/sources/binaryedge_test.go
- pkg/recon/sources/queries.go
autonomous: true
requirements: [RECON-IOT-04, RECON-IOT-05, RECON-IOT-06]
must_haves:
truths:
- "FOFASource searches FOFA API for exposed endpoints and emits findings"
- "NetlasSource searches Netlas API for internet-wide scan results and emits findings"
- "BinaryEdgeSource searches BinaryEdge API for exposed services and emits findings"
- "Each source is disabled when its API key/credentials are empty"
artifacts:
- path: "pkg/recon/sources/fofa.go"
provides: "FOFASource implementing recon.ReconSource"
exports: ["FOFASource"]
- path: "pkg/recon/sources/netlas.go"
provides: "NetlasSource implementing recon.ReconSource"
exports: ["NetlasSource"]
- path: "pkg/recon/sources/binaryedge.go"
provides: "BinaryEdgeSource implementing recon.ReconSource"
exports: ["BinaryEdgeSource"]
key_links:
- from: "pkg/recon/sources/fofa.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/netlas.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
- from: "pkg/recon/sources/binaryedge.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
---
<objective>
Implement three IoT scanner recon sources: FOFA, Netlas, and BinaryEdge.
Purpose: Complete the IoT/device scanner coverage with Chinese (FOFA) and alternative (Netlas, BinaryEdge) internet search engines.
Output: Three source files + tests following the established Phase 10 pattern.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
type Client struct { HTTP *http.Client; MaxRetries int; UserAgent string }
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
var ErrUnauthorized = errors.New("sources: unauthorized (check credentials)")
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement FOFASource, NetlasSource, BinaryEdgeSource</name>
<files>pkg/recon/sources/fofa.go, pkg/recon/sources/netlas.go, pkg/recon/sources/binaryedge.go</files>
<action>
Create three source files following the BingDorkSource pattern:
**FOFASource** (fofa.go):
- Struct: `FOFASource` with fields `Email string`, `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*FOFASource)(nil)`
- Name(): "fofa"
- RateLimit(): rate.Every(1 * time.Second) — FOFA allows ~1 req/s
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.Email != "" && s.APIKey != ""`
- BaseURL default: "https://fofa.info"
- Sweep(): For each query from BuildQueries, base64-encode the query, then GET `{base}/api/v1/search/all?email={email}&key={apikey}&qbase64={base64query}&size=100`. Parse JSON response `{"results":[["ip","port","protocol","host"],...],"size":N}`. Emit Finding per result with Source=`fmt.Sprintf("fofa://%s:%s", result[0], result[1])`, SourceType="recon:fofa".
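- Note: FOFA results are arrays of strings, not objects, and the column order follows the requested `fields`. FOFA's default order is [host, ip, port], so either request `fields=ip,port,protocol,host` explicitly to match the indexing above (result[0]=ip, result[1]=port) or adjust the indices to the default order.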
- Add `fofaKeywordIndex` helper.
**NetlasSource** (netlas.go):
- Struct: `NetlasSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "netlas"
- RateLimit(): rate.Every(1 * time.Second)
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://app.netlas.io"
- Sweep(): For each query, GET `{base}/api/responses/?q={url.QueryEscape(q)}&start=0&indices=`. Set header `X-API-Key: {apikey}`. Parse JSON response `{"items":[{"data":{"ip":"...","port":N}},...]}`. Emit Finding per item with Source=`fmt.Sprintf("netlas://%s:%d", item.Data.IP, item.Data.Port)`.
- Add `netlasKeywordIndex` helper.
**BinaryEdgeSource** (binaryedge.go):
- Struct: `BinaryEdgeSource` with fields `APIKey string`, `BaseURL string`, `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `client *Client`
- Name(): "binaryedge"
- RateLimit(): rate.Every(2 * time.Second) — BinaryEdge free tier is conservative
- Burst(): 1
- RespectsRobots(): false
- Enabled(): returns `s.APIKey != ""`
- BaseURL default: "https://api.binaryedge.io"
- Sweep(): For each query, GET `{base}/v2/query/search?query={url.QueryEscape(q)}&page=1`. Set header `X-Key: {apikey}`. Parse JSON response `{"events":[{"target":{"ip":"...","port":N}},...]}`. Emit Finding per event with Source=`fmt.Sprintf("binaryedge://%s:%d", event.Target.IP, event.Target.Port)`.
- Add `binaryedgeKeywordIndex` helper.
Update `formatQuery` in queries.go to add cases for "fofa", "netlas", "binaryedge" — all use bare keyword (same as default).
Same patterns as Plan 12-01: use sources.NewClient(), s.Limiters.Wait before requests, standard error handling.
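The FOFA request and response handling can be sketched as follows. The helper names (`fofaSearchURL`, `fofaSourceURIs`) and the index constants are assumptions for illustration; in particular, the constants assume each row arrives as ip, port, ..., which has to match the field order the API actually returns.

```go
package main

import (
	"encoding/base64"
	"encoding/json"
	"fmt"
	"net/url"
)

// Assumed column positions within each result row; adjust to the real order.
const (
	fofaIPIndex   = 0
	fofaPortIndex = 1
)

// fofaSearchURL builds the search URL with the query base64-encoded.
func fofaSearchURL(base, email, apiKey, query string) string {
	v := url.Values{}
	v.Set("email", email)
	v.Set("key", apiKey)
	v.Set("qbase64", base64.StdEncoding.EncodeToString([]byte(query)))
	v.Set("size", "100")
	return base + "/api/v1/search/all?" + v.Encode()
}

// fofaResponse models results as rows of strings rather than objects.
type fofaResponse struct {
	Results [][]string `json:"results"`
	Size    int        `json:"size"`
}

// fofaSourceURIs decodes a response body and builds one Source string per
// row, skipping rows too short to contain both columns.
func fofaSourceURIs(body []byte) ([]string, error) {
	var resp fofaResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, fmt.Errorf("decode fofa response: %w", err)
	}
	uris := make([]string, 0, len(resp.Results))
	for _, row := range resp.Results {
		if len(row) <= fofaPortIndex {
			continue
		}
		uris = append(uris, fmt.Sprintf("fofa://%s:%s", row[fofaIPIndex], row[fofaPortIndex]))
	}
	return uris, nil
}

func main() {
	fmt.Println(fofaSearchURL("https://fofa.info", "u@example.com", "k", "ollama"))
	uris, err := fofaSourceURIs([]byte(`{"results":[["192.0.2.20","8000","http","example.test"]],"size":1}`))
	fmt.Println(uris, err)
}
```

Guarding on row length keeps a malformed or truncated row from panicking the sweep.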
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/</automated>
</verify>
<done>Three source files compile, each implements recon.ReconSource interface</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Unit tests for FOFA, Netlas, BinaryEdge sources</name>
<files>pkg/recon/sources/fofa_test.go, pkg/recon/sources/netlas_test.go, pkg/recon/sources/binaryedge_test.go</files>
<behavior>
- FOFA: httptest server returns mock JSON with 2 results; Sweep emits 2 findings with "recon:fofa" source type
- FOFA: empty Email or APIKey => Enabled()==false
- Netlas: httptest server returns mock JSON with 2 items; Sweep emits 2 findings with "recon:netlas" source type
- Netlas: empty APIKey => Enabled()==false
- BinaryEdge: httptest server returns mock JSON with 2 events; Sweep emits 2 findings with "recon:binaryedge" source type
- BinaryEdge: empty APIKey => Enabled()==false
- All: cancelled context returns context error
</behavior>
<action>
Create test files following the same httptest pattern used in Plan 12-01:
- Use httptest.NewServer to mock API responses matching each source's expected JSON shape
- Set BaseURL to test server URL
- Create a minimal providers.Registry with 1-2 test providers
- Verify Finding count, SourceType, and Source URL format
- Test disabled state (empty credentials)
- Test context cancellation
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestFOFA|TestNetlas|TestBinaryEdge" -v -count=1</automated>
</verify>
<done>All FOFA, Netlas, BinaryEdge tests pass; each source emits correct findings from mock API responses</done>
</task>
</tasks>
<verification>
- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestFOFA|TestNetlas|TestBinaryEdge" -v` all pass
- Each source file has compile-time assertion `var _ recon.ReconSource = (*XxxSource)(nil)`
</verification>
<success_criteria>
Three IoT scanner sources (FOFA, Netlas, BinaryEdge) implement recon.ReconSource, use shared Client for HTTP, respect rate limiting via LimiterRegistry, and pass unit tests with mock API responses.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-02-SUMMARY.md`
</output>

@@ -0,0 +1,183 @@
---
phase: 12-osint_iot_cloud_storage
plan: 03
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/s3scanner.go
- pkg/recon/sources/s3scanner_test.go
- pkg/recon/sources/gcsscanner.go
- pkg/recon/sources/gcsscanner_test.go
- pkg/recon/sources/azureblob.go
- pkg/recon/sources/azureblob_test.go
- pkg/recon/sources/dospaces.go
- pkg/recon/sources/dospaces_test.go
autonomous: true
requirements: [RECON-CLOUD-01, RECON-CLOUD-02, RECON-CLOUD-03, RECON-CLOUD-04]
must_haves:
truths:
- "S3Scanner enumerates publicly accessible S3 buckets by name pattern and flags listed object keys that match config-file patterns (object contents are not downloaded)"
- "GCSScanner scans publicly accessible Google Cloud Storage buckets"
- "AzureBlobScanner scans publicly accessible Azure Blob containers"
- "DOSpacesScanner scans publicly accessible DigitalOcean Spaces"
- "Each cloud scanner is credentialless (uses anonymous HTTP to probe public buckets) and always Enabled"
artifacts:
- path: "pkg/recon/sources/s3scanner.go"
provides: "S3Scanner implementing recon.ReconSource"
exports: ["S3Scanner"]
- path: "pkg/recon/sources/gcsscanner.go"
provides: "GCSScanner implementing recon.ReconSource"
exports: ["GCSScanner"]
- path: "pkg/recon/sources/azureblob.go"
provides: "AzureBlobScanner implementing recon.ReconSource"
exports: ["AzureBlobScanner"]
- path: "pkg/recon/sources/dospaces.go"
provides: "DOSpacesScanner implementing recon.ReconSource"
exports: ["DOSpacesScanner"]
key_links:
- from: "pkg/recon/sources/s3scanner.go"
to: "pkg/recon/sources/httpclient.go"
via: "sources.Client for retry/backoff HTTP"
pattern: "s\\.client\\.Do"
---
<objective>
Implement four cloud storage scanner recon sources: S3Scanner, GCSScanner, AzureBlobScanner, and DOSpacesScanner.
Purpose: Enable discovery of API keys leaked in publicly accessible cloud storage buckets across AWS, GCP, Azure, and DigitalOcean.
Output: Four source files + tests following the established Phase 10 pattern.
Note on RECON-CLOUD-03 (MinIO via Shodan) and RECON-CLOUD-04 (GrayHatWarfare): MinIO exposes an S3-compatible API, but its discovery is handled by the "minio" Shodan query in Plan 12-01's ShodanSource, not by code inside S3Scanner. GrayHatWarfare is intended as a dedicated scanner that queries the buckets.grayhatwarfare.com API; no GrayHatWarfare source file appears in files_modified or in the tasks below.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
type Client struct { HTTP *http.Client; MaxRetries int; UserAgent string }
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Implement S3Scanner and GCSScanner</name>
<files>pkg/recon/sources/s3scanner.go, pkg/recon/sources/gcsscanner.go</files>
<action>
**S3Scanner** (s3scanner.go) — RECON-CLOUD-01 + RECON-CLOUD-03:
- Struct: `S3Scanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*S3Scanner)(nil)`
- Name(): "s3"
- RateLimit(): rate.Every(500 * time.Millisecond) — S3 public reads are generous
- Burst(): 3
- RespectsRobots(): false (direct API calls)
- Enabled(): always true (credentialless — probes public buckets)
- Sweep(): Generates candidate bucket names from provider keywords (e.g., "openai-keys", "anthropic-config", "llm-keys", etc.) using a helper `bucketNames(registry)` that combines provider keywords with common suffixes like "-keys", "-config", "-backup", "-data", "-secrets", "-env". For each candidate bucket:
1. HEAD `https://{bucket}.s3.amazonaws.com/` — if 200/403, bucket exists
2. If 200 (public listing), GET the ListBucket XML, parse `<Key>` elements
3. For keys matching common config file patterns (.env, config.*, *.json, *.yaml, *.yml, *.toml, *.conf), emit a Finding with Source=`s3://{bucket}/{key}`, SourceType="recon:s3", Confidence="medium"
4. Do NOT download object contents (too heavy) — just flag the presence of suspicious files
- Use BaseURL override for tests (default: "https://%s.s3.amazonaws.com")
- Note: MinIO instances (RECON-CLOUD-03) are discovered via Shodan queries in Plan 12-01's ShodanSource using the query "minio" — this source focuses on AWS S3 bucket enumeration.
**GCSScanner** (gcsscanner.go) — RECON-CLOUD-02:
- Struct: `GCSScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "gcs"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
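- Sweep(): Same bucket enumeration pattern as S3Scanner. Use `https://storage.googleapis.com/{bucket}` for the HEAD existence check. For listing, use the JSON API endpoint `https://storage.googleapis.com/storage/v1/b/{bucket}/o`, which returns `{"items":[{"name":"..."}]}` for publicly listable buckets (the bare `/{bucket}` URL is the XML API and returns a ListBucketResult document). Emit findings for config-pattern files with Source=`gs://{bucket}/{name}`, SourceType="recon:gcs".
Both sources share a common `bucketNames` helper function. Define it once in s3scanner.go; because both files belong to package sources, the lowercase (unexported) name is visible to GCSScanner and does not need to be exported.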
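The candidate-name generation, config-pattern filter, and ListBucket parsing described for S3Scanner can be sketched as below. This is an illustration under assumptions: `bucketNamesFromKeywords` takes a plain keyword slice because `providers.Registry` is not reproduced here, and `isConfigKey` and `suspiciousKeys` are hypothetical helper names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"path"
	"strings"
)

// Suffixes and patterns are taken from the task description above.
var bucketSuffixes = []string{"-keys", "-config", "-backup", "-data", "-secrets", "-env"}
var configPatterns = []string{".env", "config.*", "*.json", "*.yaml", "*.yml", "*.toml", "*.conf"}

// bucketNamesFromKeywords combines each keyword with every suffix.
func bucketNamesFromKeywords(keywords []string) []string {
	names := make([]string, 0, len(keywords)*len(bucketSuffixes))
	for _, kw := range keywords {
		kw = strings.ToLower(strings.TrimSpace(kw))
		if kw == "" {
			continue
		}
		for _, suffix := range bucketSuffixes {
			names = append(names, kw+suffix)
		}
	}
	return names
}

// isConfigKey reports whether the final path element of an object key
// matches one of the config-file patterns.
func isConfigKey(key string) bool {
	base := path.Base(key)
	for _, pattern := range configPatterns {
		if ok, _ := path.Match(pattern, base); ok {
			return true
		}
	}
	return false
}

// listBucketResult collects every <Contents><Key> value in a listing.
type listBucketResult struct {
	Keys []string `xml:"Contents>Key"`
}

// suspiciousKeys parses a ListBucket XML body and returns s3:// Source
// strings for keys that look like config files. No object is downloaded.
func suspiciousKeys(bucket string, listing []byte) ([]string, error) {
	var res listBucketResult
	if err := xml.Unmarshal(listing, &res); err != nil {
		return nil, fmt.Errorf("decode bucket listing: %w", err)
	}
	var out []string
	for _, key := range res.Keys {
		if isConfigKey(key) {
			out = append(out, fmt.Sprintf("s3://%s/%s", bucket, key))
		}
	}
	return out, nil
}

func main() {
	fmt.Println(bucketNamesFromKeywords([]string{"testprov"}))
	listing := []byte(`<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">` +
		`<Contents><Key>.env</Key></Contents>` +
		`<Contents><Key>img/logo.png</Key></Contents>` +
		`</ListBucketResult>`)
	fmt.Println(suspiciousKeys("testprov-keys", listing))
}
```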
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/</automated>
</verify>
<done>S3Scanner and GCSScanner compile and implement recon.ReconSource</done>
</task>
<task type="auto">
<name>Task 2: Implement AzureBlobScanner, DOSpacesScanner, and all cloud scanner tests</name>
<files>pkg/recon/sources/azureblob.go, pkg/recon/sources/dospaces.go, pkg/recon/sources/s3scanner_test.go, pkg/recon/sources/gcsscanner_test.go, pkg/recon/sources/azureblob_test.go, pkg/recon/sources/dospaces_test.go</files>
<action>
**AzureBlobScanner** (azureblob.go) — RECON-CLOUD-02:
- Struct: `AzureBlobScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "azureblob"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Uses the bucket enumeration pattern with the Azure Blob URL format `https://{account}.blob.core.windows.net/{container}?restype=container&comp=list`. Generate account names from provider keywords with common suffixes; Azure storage account names allow only lowercase letters and digits (3-24 characters), so drop the hyphen when joining keyword and suffix. Parse XML `<EnumerationResults><Blobs><Blob><Name>...</Name></Blob></Blobs></EnumerationResults>`. Emit findings for config-pattern files with Source=`azure://{account}/{container}/{name}`, SourceType="recon:azureblob".
**DOSpacesScanner** (dospaces.go) — RECON-CLOUD-02:
- Struct: `DOSpacesScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "spaces"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Uses bucket enumeration with DO Spaces URL format `https://{bucket}.{region}.digitaloceanspaces.com/`. Iterate regions: nyc3, sfo3, ams3, sgp1, fra1. Same XML ListBucket format as S3 (DO Spaces is S3-compatible). Emit findings with Source=`do://{bucket}/{key}`, SourceType="recon:spaces".
**Tests** (all four test files):
Each test file follows the httptest pattern:
- Mock server returns appropriate XML/JSON for bucket listing
- Verify Sweep emits correct number of findings with correct SourceType and Source URL format
- Verify Enabled() returns true (credentialless sources)
- Test with empty registry (no keywords => no bucket names => no findings)
- Test context cancellation
Use a minimal providers.Registry with 1 test provider having keyword "testprov" so bucket names like "testprov-keys" are generated.
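The Azure listing parse and the DO Spaces region fan-out described above can be sketched as below. Helper names (`azureListURL`, `azureBlobNames`, `spacesListURLs`) are hypothetical, and the sketch only builds URLs and decodes a listing body; it performs no HTTP calls.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// azureBlobListing collects every <Blobs><Blob><Name> value in a listing.
type azureBlobListing struct {
	Names []string `xml:"Blobs>Blob>Name"`
}

// azureListURL builds the anonymous container-listing URL.
func azureListURL(account, container string) string {
	return fmt.Sprintf("https://%s.blob.core.windows.net/%s?restype=container&comp=list", account, container)
}

// azureBlobNames decodes a container listing into blob names.
func azureBlobNames(listing []byte) ([]string, error) {
	var res azureBlobListing
	if err := xml.Unmarshal(listing, &res); err != nil {
		return nil, fmt.Errorf("decode blob listing: %w", err)
	}
	return res.Names, nil
}

// Regions are the ones named in the DOSpacesScanner description.
var spacesRegions = []string{"nyc3", "sfo3", "ams3", "sgp1", "fra1"}

// spacesListURLs returns one candidate listing URL per region for a bucket.
func spacesListURLs(bucket string) []string {
	urls := make([]string, 0, len(spacesRegions))
	for _, region := range spacesRegions {
		urls = append(urls, fmt.Sprintf("https://%s.%s.digitaloceanspaces.com/", bucket, region))
	}
	return urls
}

func main() {
	fmt.Println(azureListURL("testprovkeys", "config"))
	names, err := azureBlobNames([]byte(`<EnumerationResults><Blobs><Blob><Name>.env</Name></Blob></Blobs></EnumerationResults>`))
	fmt.Println(names, err)
	fmt.Println(spacesListURLs("testprov-keys"))
}
```

Each bucket name costs five requests against DO Spaces, one per region, which is worth keeping in mind when sizing the rate limit.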
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestS3Scanner|TestGCSScanner|TestAzureBlob|TestDOSpaces" -v -count=1</automated>
</verify>
<done>All four cloud scanner sources compile and pass tests; each emits findings with correct source type and URL format</done>
</task>
</tasks>
<verification>
- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestS3Scanner|TestGCSScanner|TestAzureBlob|TestDOSpaces" -v` all pass
- Each source file has compile-time assertion
</verification>
<success_criteria>
Four cloud storage scanners (S3, GCS, Azure Blob, DO Spaces) implement recon.ReconSource with credentialless public bucket enumeration, use shared Client for HTTP, and pass unit tests.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-03-SUMMARY.md`
</output>

@@ -0,0 +1,217 @@
---
phase: 12-osint_iot_cloud_storage
plan: 04
type: execute
wave: 2
depends_on: [12-01, 12-02, 12-03]
files_modified:
- pkg/recon/sources/register.go
- cmd/recon.go
- pkg/recon/sources/integration_test.go
autonomous: true
requirements: [RECON-IOT-01, RECON-IOT-02, RECON-IOT-03, RECON-IOT-04, RECON-IOT-05, RECON-IOT-06, RECON-CLOUD-01, RECON-CLOUD-02, RECON-CLOUD-03, RECON-CLOUD-04]
must_haves:
truths:
- "RegisterAll registers all 28 sources (18 Phase 10-11 + 10 Phase 12)"
- "cmd/recon.go populates SourcesConfig with all Phase 12 credential fields from env/viper"
- "Integration test proves all 10 new sources are registered and discoverable by name"
artifacts:
- path: "pkg/recon/sources/register.go"
provides: "RegisterAll with all Phase 12 sources added"
contains: "Phase 12"
- path: "cmd/recon.go"
provides: "buildReconEngine with Phase 12 credential wiring"
contains: "ShodanAPIKey"
- path: "pkg/recon/sources/integration_test.go"
provides: "Integration test covering all 28 registered sources"
contains: "28"
key_links:
- from: "pkg/recon/sources/register.go"
to: "pkg/recon/sources/shodan.go"
via: "engine.Register(&ShodanSource{...})"
pattern: "ShodanSource"
- from: "cmd/recon.go"
to: "pkg/recon/sources/register.go"
via: "sources.RegisterAll(e, cfg)"
pattern: "RegisterAll"
---
<objective>
Wire all 10 Phase 12 sources into RegisterAll and cmd/recon.go, plus integration test.
Purpose: Make all IoT and cloud storage sources available via `keyhunter recon list` and `keyhunter recon full`.
Output: Updated RegisterAll (28 sources total), updated cmd/recon.go with credential wiring, integration test.
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/sources/register.go
@cmd/recon.go
@pkg/recon/sources/integration_test.go
<interfaces>
From pkg/recon/sources/register.go:
```go
type SourcesConfig struct {
GitHubToken string
// ... existing Phase 10-11 fields ...
Registry *providers.Registry
Limiters *recon.LimiterRegistry
}
func RegisterAll(engine *recon.Engine, cfg SourcesConfig)
```
From cmd/recon.go:
```go
func buildReconEngine() *recon.Engine // constructs engine with all sources
func firstNonEmpty(a, b string) string // env -> viper precedence
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Extend SourcesConfig, RegisterAll, and cmd/recon.go</name>
<files>pkg/recon/sources/register.go, cmd/recon.go</files>
<action>
**SourcesConfig** (register.go) — add these fields after the existing Phase 11 fields:
```go
// Phase 12: IoT scanner API keys.
ShodanAPIKey string
CensysAPIId string
CensysAPISecret string
ZoomEyeAPIKey string
FOFAEmail string
FOFAAPIKey string
NetlasAPIKey string
BinaryEdgeAPIKey string
```
**RegisterAll** (register.go) — add after the Phase 11 paste site registrations:
```go
// Phase 12: IoT scanner sources.
engine.Register(&ShodanSource{
APIKey: cfg.ShodanAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&CensysSource{
APIId: cfg.CensysAPIId,
APISecret: cfg.CensysAPISecret,
Registry: reg,
Limiters: lim,
})
engine.Register(&ZoomEyeSource{
APIKey: cfg.ZoomEyeAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&FOFASource{
Email: cfg.FOFAEmail,
APIKey: cfg.FOFAAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&NetlasSource{
APIKey: cfg.NetlasAPIKey,
Registry: reg,
Limiters: lim,
})
engine.Register(&BinaryEdgeSource{
APIKey: cfg.BinaryEdgeAPIKey,
Registry: reg,
Limiters: lim,
})
// Phase 12: Cloud storage sources (credentialless).
engine.Register(&S3Scanner{
Registry: reg,
Limiters: lim,
})
engine.Register(&GCSScanner{
Registry: reg,
Limiters: lim,
})
engine.Register(&AzureBlobScanner{
Registry: reg,
Limiters: lim,
})
engine.Register(&DOSpacesScanner{
Registry: reg,
Limiters: lim,
})
```
Update the RegisterAll doc comment to say "28 sources total" (18 Phase 10-11 + 10 Phase 12).
**cmd/recon.go** — in buildReconEngine(), add to the SourcesConfig literal:
```go
ShodanAPIKey: firstNonEmpty(os.Getenv("SHODAN_API_KEY"), viper.GetString("recon.shodan.api_key")),
CensysAPIId: firstNonEmpty(os.Getenv("CENSYS_API_ID"), viper.GetString("recon.censys.api_id")),
CensysAPISecret: firstNonEmpty(os.Getenv("CENSYS_API_SECRET"), viper.GetString("recon.censys.api_secret")),
ZoomEyeAPIKey: firstNonEmpty(os.Getenv("ZOOMEYE_API_KEY"), viper.GetString("recon.zoomeye.api_key")),
FOFAEmail: firstNonEmpty(os.Getenv("FOFA_EMAIL"), viper.GetString("recon.fofa.email")),
FOFAAPIKey: firstNonEmpty(os.Getenv("FOFA_API_KEY"), viper.GetString("recon.fofa.api_key")),
NetlasAPIKey: firstNonEmpty(os.Getenv("NETLAS_API_KEY"), viper.GetString("recon.netlas.api_key")),
BinaryEdgeAPIKey: firstNonEmpty(os.Getenv("BINARYEDGE_API_KEY"), viper.GetString("recon.binaryedge.api_key")),
```
Update the reconCmd Long description to mention Phase 12 sources.
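The env-then-config precedence used in the SourcesConfig literal above depends on `firstNonEmpty`. Only its signature and the "env -> viper precedence" comment are given in the interfaces, so the body below is an assumed implementation, and the viper lookup is replaced by a literal string to keep the sketch self-contained.

```go
package main

import (
	"fmt"
	"os"
)

// firstNonEmpty returns a when it is non-empty and b otherwise.
// Assumed body: the plan only shows the signature.
func firstNonEmpty(a, b string) string {
	if a != "" {
		return a
	}
	return b
}

func main() {
	// With the env var set, the environment value wins.
	os.Setenv("SHODAN_API_KEY", "from-env")
	fmt.Println(firstNonEmpty(os.Getenv("SHODAN_API_KEY"), "from-config"))

	// With it unset, the config value is used instead.
	os.Unsetenv("SHODAN_API_KEY")
	fmt.Println(firstNonEmpty(os.Getenv("SHODAN_API_KEY"), "from-config"))
}
```

When both values are empty the result is the empty string, which is what leaves the corresponding IoT source disabled.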
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./cmd/...</automated>
</verify>
<done>RegisterAll registers 28 sources; cmd/recon.go wires all Phase 12 credentials from env/viper</done>
</task>
<task type="auto" tdd="true">
<name>Task 2: Integration test for all 28 registered sources</name>
<files>pkg/recon/sources/integration_test.go</files>
<behavior>
- TestRegisterAll_Phase12 registers all sources, asserts 28 total
- All 10 new source names are present: shodan, censys, zoomeye, fofa, netlas, binaryedge, s3, gcs, azureblob, spaces
- IoT sources with empty credentials report Enabled()==false
- Cloud storage sources (credentialless) report Enabled()==true
- SweepAll with short context timeout completes without panic
</behavior>
<action>
Extend the existing integration_test.go (which currently tests 18 Phase 10-11 sources):
- Update the expected source count from 18 to 28
- Add all 10 new source names to the expected names list
- Add assertions that IoT sources (shodan, censys, zoomeye, fofa, netlas, binaryedge) are Enabled()==false when credentials are empty
- Add assertions that cloud sources (s3, gcs, azureblob, spaces) are Enabled()==true (credentialless)
- Keep the existing SweepAll test with short context timeout, verify no panics
</action>
<verify>
<automated>cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestRegisterAll" -v -count=1</automated>
</verify>
<done>Integration test passes with 28 registered sources; all Phase 12 source names are discoverable</done>
</task>
</tasks>
<verification>
- `go build ./cmd/...` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestRegisterAll" -v` passes with 28 sources
- `go test ./pkg/recon/sources/ -v -count=1` all tests pass (existing + new)
</verification>
<success_criteria>
All 10 Phase 12 sources are wired into RegisterAll and discoverable via the recon engine. cmd/recon.go reads credentials from env vars and viper config. Integration test confirms 28 total sources registered.
</success_criteria>
<output>
After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-04-SUMMARY.md`
</output>