---
phase: 12-osint_iot_cloud_storage
plan: 03
type: execute
wave: 1
depends_on: []
files_modified:
  - pkg/recon/sources/s3scanner.go
  - pkg/recon/sources/s3scanner_test.go
  - pkg/recon/sources/gcsscanner.go
  - pkg/recon/sources/gcsscanner_test.go
  - pkg/recon/sources/azureblob.go
  - pkg/recon/sources/azureblob_test.go
  - pkg/recon/sources/dospaces.go
  - pkg/recon/sources/dospaces_test.go
autonomous: true
requirements: [RECON-CLOUD-01, RECON-CLOUD-02, RECON-CLOUD-03, RECON-CLOUD-04]
must_haves:
  truths:
    - "S3Scanner enumerates publicly accessible S3 buckets by name pattern and scans readable objects for API key exposure"
    - "GCSScanner scans publicly accessible Google Cloud Storage buckets"
    - "AzureBlobScanner scans publicly accessible Azure Blob containers"
    - "DOSpacesScanner scans publicly accessible DigitalOcean Spaces"
    - "Each cloud scanner is credentialless (uses anonymous HTTP to probe public buckets) and always Enabled"
  artifacts:
    - path: "pkg/recon/sources/s3scanner.go"
      provides: "S3Scanner implementing recon.ReconSource"
      exports: ["S3Scanner"]
    - path: "pkg/recon/sources/gcsscanner.go"
      provides: "GCSScanner implementing recon.ReconSource"
      exports: ["GCSScanner"]
    - path: "pkg/recon/sources/azureblob.go"
      provides: "AzureBlobScanner implementing recon.ReconSource"
      exports: ["AzureBlobScanner"]
    - path: "pkg/recon/sources/dospaces.go"
      provides: "DOSpacesScanner implementing recon.ReconSource"
      exports: ["DOSpacesScanner"]
  key_links:
    - from: "pkg/recon/sources/s3scanner.go"
      to: "pkg/recon/sources/httpclient.go"
      via: "sources.Client for retry/backoff HTTP"
      pattern: "s\\.client\\.Do"
---

Implement four cloud storage scanner recon sources: S3Scanner, GCSScanner, AzureBlobScanner, and DOSpacesScanner.

Purpose: Enable discovery of API keys leaked in publicly accessible cloud storage buckets across AWS, GCP, Azure, and DigitalOcean.

Output: Four source files + tests following the established Phase 10 pattern.
Note on RECON-CLOUD-03 (MinIO via Shodan) and RECON-CLOUD-04 (GrayHatWarfare): both are addressed here. MinIO exposes an S3-compatible API, so MinIO discovery is handled through a Shodan query (see the note under S3Scanner's Sweep below). GrayHatWarfare is implemented as a dedicated scanner that queries the buckets.grayhatwarfare.com API.

@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/bing.go
@pkg/recon/sources/queries.go
@pkg/recon/sources/register.go

From pkg/recon/source.go:

```go
type ReconSource interface {
	Name() string
	RateLimit() rate.Limit
	Burst() int
	RespectsRobots() bool
	Enabled(cfg Config) bool
	Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```

From pkg/recon/sources/httpclient.go:

```go
type Client struct {
	HTTP       *http.Client
	MaxRetries int
	UserAgent  string
}

func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```

Task 1: Implement S3Scanner and GCSScanner

Files: pkg/recon/sources/s3scanner.go, pkg/recon/sources/gcsscanner.go

**S3Scanner** (s3scanner.go) — RECON-CLOUD-01 + RECON-CLOUD-03:

- Struct: `S3Scanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Compile-time assertion: `var _ recon.ReconSource = (*S3Scanner)(nil)`
- Name(): "s3"
- RateLimit(): rate.Every(500 * time.Millisecond) — S3 public reads are generous
- Burst(): 3
- RespectsRobots(): false (direct API calls)
- Enabled(): always true (credentialless — probes public buckets)
- Sweep(): Generates candidate bucket names from provider keywords (e.g., "openai-keys", "anthropic-config", "llm-keys") using a helper `bucketNames(registry)` that combines provider keywords with common suffixes such as "-keys", "-config", "-backup", "-data", "-secrets", "-env".
  For each candidate bucket:

  1. HEAD `https://{bucket}.s3.amazonaws.com/` — if 200/403, the bucket exists
  2. If 200 (public listing), GET the ListBucket XML and parse the `<Contents><Key>` elements
  3. For keys matching common config-file patterns (.env, config.*, *.json, *.yaml, *.yml, *.toml, *.conf), emit a Finding with Source=`s3://{bucket}/{key}`, SourceType="recon:s3", Confidence="medium"
  4. Do NOT download object contents (too heavy) — just flag the presence of suspicious files
- Use a BaseURL override for tests (default: "https://%s.s3.amazonaws.com")
- Note: MinIO instances (RECON-CLOUD-03) are discovered via Shodan queries in Plan 12-01's ShodanSource using the query "minio" — this source focuses on AWS S3 bucket enumeration.

**GCSScanner** (gcsscanner.go) — RECON-CLOUD-02:

- Struct: `GCSScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "gcs"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Same bucket enumeration pattern as S3Scanner, but using `https://storage.googleapis.com/{bucket}` for the HEAD and listing requests. A public GCS bucket listing returns JSON when `Accept: application/json` is set; parse `{"items":[{"name":"..."}]}`. Emit findings for config-pattern files with Source=`gs://{bucket}/{name}`, SourceType="recon:gcs".

Both sources share a common `bucketNames` helper — define it once in s3scanner.go and reuse it from GCSScanner (both live in the same package, so it does not need to be exported).
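The Sweep building blocks above — candidate-name generation, ListBucket parsing, and config-key filtering — can be sketched as follows. This is a minimal sketch, not the required implementation: the real helper takes `*providers.Registry` (a project-internal type), so a plain keyword slice stands in here, and `parseListBucket`/`configKey` are illustrative names.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"path"
	"strings"
)

// Suffixes named in the plan; the real helper's exact list may differ.
var bucketSuffixes = []string{"-keys", "-config", "-backup", "-data", "-secrets", "-env"}

// bucketNames combines provider keywords with the common suffixes to
// build candidate bucket names (e.g. "testprov" -> "testprov-keys").
func bucketNames(keywords []string) []string {
	names := make([]string, 0, len(keywords)*len(bucketSuffixes))
	for _, kw := range keywords {
		for _, suf := range bucketSuffixes {
			names = append(names, kw+suf)
		}
	}
	return names
}

// listBucketResult models only the part of the S3 ListBucket XML needed here:
// <ListBucketResult><Contents><Key>...</Key></Contents>...</ListBucketResult>
type listBucketResult struct {
	Keys []string `xml:"Contents>Key"`
}

// parseListBucket extracts object keys from a ListBucket XML body.
func parseListBucket(body []byte) ([]string, error) {
	var lb listBucketResult
	if err := xml.Unmarshal(body, &lb); err != nil {
		return nil, err
	}
	return lb.Keys, nil
}

// configKey reports whether a key matches the config-file patterns in
// step 3 (.env, config.*, *.json, *.yaml, *.yml, *.toml, *.conf).
func configKey(key string) bool {
	base := path.Base(key)
	if base == ".env" || strings.HasPrefix(base, "config.") {
		return true
	}
	switch path.Ext(base) {
	case ".json", ".yaml", ".yml", ".toml", ".conf":
		return true
	}
	return false
}

func main() {
	fmt.Println(bucketNames([]string{"testprov"}))
	keys, _ := parseListBucket([]byte(
		`<ListBucketResult><Contents><Key>prod/.env</Key></Contents><Contents><Key>logo.png</Key></Contents></ListBucketResult>`))
	for _, k := range keys {
		if configKey(k) {
			fmt.Printf("s3://testprov-keys/%s\n", k) // would become a Finding
		}
	}
}
```

The `xml:"Contents>Key"` tag lets `encoding/xml` collect every repeated `<Contents><Key>` into one slice, which keeps the listing step to a single decode.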
```bash
cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go build ./pkg/recon/sources/
```

S3Scanner and GCSScanner compile and implement recon.ReconSource.

Task 2: Implement AzureBlobScanner, DOSpacesScanner, and all cloud scanner tests

Files: pkg/recon/sources/azureblob.go, pkg/recon/sources/dospaces.go, pkg/recon/sources/s3scanner_test.go, pkg/recon/sources/gcsscanner_test.go, pkg/recon/sources/azureblob_test.go, pkg/recon/sources/dospaces_test.go

**AzureBlobScanner** (azureblob.go) — RECON-CLOUD-02:

- Struct: `AzureBlobScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "azureblob"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Uses the bucket enumeration pattern with the Azure Blob URL format `https://{account}.blob.core.windows.net/{container}?restype=container&comp=list`. Generate account names from provider keywords with common suffixes. Parse the listing XML `<EnumerationResults><Blobs><Blob><Name>...</Name></Blob></Blobs></EnumerationResults>`. Emit findings for config-pattern files with Source=`azure://{account}/{container}/{name}`, SourceType="recon:azureblob".

**DOSpacesScanner** (dospaces.go) — RECON-CLOUD-02:

- Struct: `DOSpacesScanner` with fields `Registry *providers.Registry`, `Limiters *recon.LimiterRegistry`, `BaseURL string`, `client *Client`
- Name(): "spaces"
- RateLimit(): rate.Every(500 * time.Millisecond)
- Burst(): 3
- RespectsRobots(): false
- Enabled(): always true (credentialless)
- Sweep(): Uses bucket enumeration with the DO Spaces URL format `https://{bucket}.{region}.digitaloceanspaces.com/`. Iterate the regions nyc3, sfo3, ams3, sgp1, and fra1. The listing uses the same ListBucket XML format as S3 (DO Spaces is S3-compatible). Emit findings with Source=`do://{bucket}/{key}`, SourceType="recon:spaces".
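A minimal sketch of the two pieces specific to these scanners — the Azure container-listing parse and the DO Spaces per-region URL expansion. `parseAzureListing` and `spacesURLs` are illustrative names, not the plan's required API; the XML struct assumes the standard `<EnumerationResults><Blobs><Blob><Name>` layout named above.

```go
package main

import (
	"encoding/xml"
	"fmt"
)

// enumerationResults models the Azure Blob container listing XML
// (<EnumerationResults><Blobs><Blob><Name>...</Name></Blob></Blobs>);
// only the blob names are needed for config-pattern matching.
type enumerationResults struct {
	Names []string `xml:"Blobs>Blob>Name"`
}

// parseAzureListing extracts blob names from a container listing body.
func parseAzureListing(body []byte) ([]string, error) {
	var er enumerationResults
	if err := xml.Unmarshal(body, &er); err != nil {
		return nil, err
	}
	return er.Names, nil
}

// doRegions are the DigitalOcean Spaces regions the plan iterates.
var doRegions = []string{"nyc3", "sfo3", "ams3", "sgp1", "fra1"}

// spacesURLs expands one candidate bucket name into the per-region probe
// URLs; the response body is the same S3 ListBucket XML.
func spacesURLs(bucket string) []string {
	urls := make([]string, 0, len(doRegions))
	for _, region := range doRegions {
		urls = append(urls, fmt.Sprintf("https://%s.%s.digitaloceanspaces.com/", bucket, region))
	}
	return urls
}

func main() {
	names, _ := parseAzureListing([]byte(
		`<EnumerationResults><Blobs><Blob><Name>config.yaml</Name></Blob></Blobs></EnumerationResults>`))
	fmt.Println(names)
	fmt.Println(spacesURLs("testprov-keys")[0])
}
```

Because DO Spaces is S3-compatible, the listing decoder from S3Scanner can be reused there; only the URL expansion differs.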
**Tests** (all four test files):

Each test file follows the httptest pattern:

- Mock server returns the appropriate XML/JSON for a bucket listing
- Verify Sweep emits the correct number of findings with the correct SourceType and Source URL format
- Verify Enabled() returns true (credentialless sources)
- Test with an empty registry (no keywords => no bucket names => no findings)
- Test context cancellation

Use a minimal providers.Registry with one test provider having the keyword "testprov" so bucket names like "testprov-keys" are generated.

```bash
cd /home/salva/Documents/apikey/.claude/worktrees/agent-a6700ee2 && go test ./pkg/recon/sources/ -run "TestS3Scanner|TestGCSScanner|TestAzureBlob|TestDOSpaces" -v -count=1
```

All four cloud scanner sources compile and pass tests; each emits findings with the correct source type and URL format.

- `go build ./pkg/recon/sources/` compiles without errors
- `go test ./pkg/recon/sources/ -run "TestS3Scanner|TestGCSScanner|TestAzureBlob|TestDOSpaces" -v` all pass
- Each source file has a compile-time assertion

Four cloud storage scanners (S3, GCS, Azure Blob, DO Spaces) implement recon.ReconSource with credentialless public bucket enumeration, use the shared Client for HTTP, and pass unit tests.

After completion, create `.planning/phases/12-osint_iot_cloud_storage/12-03-SUMMARY.md`
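The httptest pattern described under **Tests** reduces to the skeleton below: a mock server answers HEAD with 200 and GET with a canned listing, and the test points the scanner's BaseURL at it. `newFakeBucketServer` and `fetchListing` are illustrative names, and the scanner wiring in the comment is hypothetical since the scanner types are project-internal.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// newFakeBucketServer stands in for the mock each test file spins up:
// HEAD gets an empty 200 (bucket exists), GET gets the canned listing.
func newFakeBucketServer(listing string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method == http.MethodGet {
			fmt.Fprint(w, listing)
		}
	}))
}

// fetchListing mimics the scanner's GET-the-listing step against the mock.
func fetchListing(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	b, err := io.ReadAll(resp.Body)
	return string(b), err
}

func main() {
	srv := newFakeBucketServer(`<ListBucketResult><Contents><Key>.env</Key></Contents></ListBucketResult>`)
	defer srv.Close()

	// A real test would instead wire the mock into the scanner, e.g.
	//   s := &S3Scanner{BaseURL: srv.URL + "/%s"} // hypothetical wiring
	// then call s.Sweep(ctx, "", out) and assert on the emitted findings.
	body, err := fetchListing(srv.URL + "/testprov-keys")
	fmt.Println(err == nil, body != "")
}
```

Closing the server (`srv.Close()` / `defer`) in every test keeps the suite free of leaked listeners when run with `-count=1`.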