docs(15): create phase plan — forums, collaboration, log aggregators

salvacybersec
2026-04-06 13:47:43 +03:00
parent 554e93435f
commit 1affb0d864
5 changed files with 846 additions and 1 deletion


@@ -304,7 +304,13 @@ Plans:
2. `keyhunter recon --sources=devto,medium,telegram,discord` scans publicly accessible posts, articles, and indexed channel content
3. `keyhunter recon --sources=notion,confluence,trello,googledocs` scans publicly accessible pages via dorking and direct API access where available
4. `keyhunter recon --sources=elasticsearch,grafana,sentry` discovers exposed instances and scans accessible log data and dashboards
**Plans**: 4 plans
Plans:
- [ ] 15-01-PLAN.md — StackOverflow, Reddit, HackerNews, Discord, Slack, DevTo forum sources (RECON-FORUM-01..06)
- [ ] 15-02-PLAN.md — Trello, Notion, Confluence, GoogleDocs collaboration sources (RECON-COLLAB-01..04)
- [ ] 15-03-PLAN.md — Elasticsearch, Grafana, Sentry, Kibana, Splunk log aggregator sources (RECON-LOG-01..03)
- [ ] 15-04-PLAN.md — RegisterAll wiring + integration test (all Phase 15 reqs)
### Phase 16: OSINT Threat Intel, Mobile, DNS & API Marketplaces
**Goal**: Users can search threat intelligence platforms, scan decompiled Android APKs, perform DNS/subdomain discovery for config endpoint probing, and scan Postman/SwaggerHub API collections for leaked LLM keys


@@ -0,0 +1,226 @@
---
phase: 15-osint_forums_collaboration_log_aggregators
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/stackoverflow.go
- pkg/recon/sources/stackoverflow_test.go
- pkg/recon/sources/reddit.go
- pkg/recon/sources/reddit_test.go
- pkg/recon/sources/hackernews.go
- pkg/recon/sources/hackernews_test.go
- pkg/recon/sources/discord.go
- pkg/recon/sources/discord_test.go
- pkg/recon/sources/slack.go
- pkg/recon/sources/slack_test.go
- pkg/recon/sources/devto.go
- pkg/recon/sources/devto_test.go
autonomous: true
requirements:
- RECON-FORUM-01
- RECON-FORUM-02
- RECON-FORUM-03
- RECON-FORUM-04
- RECON-FORUM-05
- RECON-FORUM-06
must_haves:
truths:
- "StackOverflow source searches SE API for LLM keyword matches and scans content"
- "Reddit source searches Reddit for LLM keyword matches and scans content"
- "HackerNews source searches Algolia HN API for keyword matches and scans content"
- "Discord source searches indexed Discord content for keyword matches"
- "Slack source searches indexed Slack content for keyword matches"
- "DevTo source searches dev.to API for keyword matches and scans articles"
artifacts:
- path: "pkg/recon/sources/stackoverflow.go"
provides: "StackOverflowSource implementing ReconSource"
contains: "func (s *StackOverflowSource) Sweep"
- path: "pkg/recon/sources/reddit.go"
provides: "RedditSource implementing ReconSource"
contains: "func (s *RedditSource) Sweep"
- path: "pkg/recon/sources/hackernews.go"
provides: "HackerNewsSource implementing ReconSource"
contains: "func (s *HackerNewsSource) Sweep"
- path: "pkg/recon/sources/discord.go"
provides: "DiscordSource implementing ReconSource"
contains: "func (s *DiscordSource) Sweep"
- path: "pkg/recon/sources/slack.go"
provides: "SlackSource implementing ReconSource"
contains: "func (s *SlackSource) Sweep"
- path: "pkg/recon/sources/devto.go"
provides: "DevToSource implementing ReconSource"
contains: "func (s *DevToSource) Sweep"
key_links:
- from: "pkg/recon/sources/stackoverflow.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do for HTTP requests"
pattern: "client\\.Do"
- from: "pkg/recon/sources/hackernews.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do for Algolia API"
pattern: "client\\.Do"
---
<objective>
Implement six forum/discussion ReconSource implementations: StackOverflow, Reddit, HackerNews, Discord, Slack, and DevTo.
Purpose: Enable scanning developer forums and discussion platforms where API keys are commonly shared in code examples, questions, and discussions.
Output: 6 source files + 6 test files in pkg/recon/sources/
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/travisci.go
@pkg/recon/sources/travisci_test.go
<interfaces>
<!-- Executor must implement recon.ReconSource for each source -->
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```
From pkg/recon/sources/register.go:
```go
func BuildQueries(reg *providers.Registry, sourceName string) []string
```
</interfaces>
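The plans below reference the TravisCISource pattern's compile-time interface check. A minimal standalone sketch of that idiom, using stand-in types in place of the real pkg/recon definitions (the actual interface is shown above):

```go
package main

import "fmt"

// Stand-in interface, reduced to two methods so the sketch compiles
// on its own; the real recon.ReconSource is defined in pkg/recon.
type ReconSource interface {
	Name() string
	Enabled() bool
}

// StackOverflowSource sketch: BaseURL is a field so tests can point
// it at an httptest server instead of the live API.
type StackOverflowSource struct {
	BaseURL string
}

// Compile-time proof that the struct satisfies the interface;
// a missing method becomes a build error, not a runtime surprise.
var _ ReconSource = (*StackOverflowSource)(nil)

func (s *StackOverflowSource) Name() string  { return "stackoverflow" }
func (s *StackOverflowSource) Enabled() bool { return true } // credentialless source

func main() {
	var src ReconSource = &StackOverflowSource{BaseURL: "https://api.stackexchange.com"}
	fmt.Println(src.Name(), src.Enabled())
}
```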
</context>
<tasks>
<task type="auto">
<name>Task 1: StackOverflow, Reddit, HackerNews sources</name>
<files>
pkg/recon/sources/stackoverflow.go
pkg/recon/sources/stackoverflow_test.go
pkg/recon/sources/reddit.go
pkg/recon/sources/reddit_test.go
pkg/recon/sources/hackernews.go
pkg/recon/sources/hackernews_test.go
</files>
<action>
Create three ReconSource implementations following the exact TravisCISource pattern (struct with BaseURL, Registry, Limiters, Client fields; interface compliance var check; BuildQueries for keywords).
**StackOverflowSource** (stackoverflow.go):
- Name: "stackoverflow"
- RateLimit: rate.Every(2*time.Second), Burst: 3
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless, uses public API)
- Sweep: For each BuildQueries keyword, GET `{base}/2.3/search/excerpts?order=desc&sort=relevance&q={keyword}&site=stackoverflow` (Stack Exchange API v2.3). Parse JSON response with `items[].body` or `items[].excerpt`. Run ciLogKeyPattern regex against each item body. Emit Finding with SourceType "recon:stackoverflow", Source set to the question/answer URL.
- BaseURL default: "https://api.stackexchange.com"
- Limit response reading to 256KB per response.
**RedditSource** (reddit.go):
- Name: "reddit"
- RateLimit: rate.Every(2*time.Second), Burst: 2
- RespectsRobots: false (API/JSON endpoint)
- Enabled: always true (credentialless, uses public JSON endpoints)
- Sweep: For each BuildQueries keyword, GET `{base}/search.json?q={keyword}&sort=new&limit=25&restrict_sr=false` (Reddit JSON API, no OAuth needed for public search). Parse JSON `data.children[].data.selftext`. Run ciLogKeyPattern regex. Emit Finding with SourceType "recon:reddit".
- BaseURL default: "https://www.reddit.com"
- Set User-Agent to a descriptive string (Reddit blocks default UA).
**HackerNewsSource** (hackernews.go):
- Name: "hackernews"
- RateLimit: rate.Every(1*time.Second), Burst: 5
- RespectsRobots: false (Algolia API)
- Enabled: always true (credentialless)
- Sweep: For each BuildQueries keyword, GET `{base}/api/v1/search?query={keyword}&tags=comment&hitsPerPage=20` (Algolia HN Search API). Parse JSON `hits[].comment_text`. Run ciLogKeyPattern regex. Emit Finding with SourceType "recon:hackernews".
- BaseURL default: "https://hn.algolia.com"
Each test file follows the travisci_test.go pattern: TestXxx_Name, TestXxx_Enabled, and TestXxx_Sweep with an httptest server returning mock JSON containing an API key pattern, asserting at least one finding with the correct SourceType.
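The response-parsing step for the StackOverflow source can be sketched as follows, under two assumptions: the struct mirrors only the payload subset named above, and an illustrative OpenAI-style `sk-` regex stands in for ciLogKeyPattern:

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// seSearchResponse mirrors the subset of the Stack Exchange
// /2.3/search/excerpts payload the plan consumes.
type seSearchResponse struct {
	Items []struct {
		Body    string `json:"body"`
		Excerpt string `json:"excerpt"`
		Link    string `json:"link"`
	} `json:"items"`
}

// keyPattern is a stand-in for ciLogKeyPattern; the real regex
// lives in pkg/recon/sources and covers many providers.
var keyPattern = regexp.MustCompile(`sk-[A-Za-z0-9]{20,}`)

// extractKeys pulls candidate keys out of one API response body,
// preferring the full body and falling back to the excerpt.
func extractKeys(raw []byte) ([]string, error) {
	var resp seSearchResponse
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	var keys []string
	for _, it := range resp.Items {
		text := it.Body
		if text == "" {
			text = it.Excerpt
		}
		keys = append(keys, keyPattern.FindAllString(text, -1)...)
	}
	return keys, nil
}

func main() {
	sample := []byte(`{"items":[{"excerpt":"use sk-abcdefghij0123456789ABCD here","link":"https://stackoverflow.com/q/1"}]}`)
	keys, err := extractKeys(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(keys)
}
```

The Reddit and HackerNews Sweeps follow the same shape with their own response structs.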
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestStackOverflow|TestReddit|TestHackerNews" -count=1 -v</automated>
</verify>
<done>Three forum sources compile, pass interface checks, and tests confirm Sweep emits findings from mock API responses</done>
</task>
<task type="auto">
<name>Task 2: Discord, Slack, DevTo sources</name>
<files>
pkg/recon/sources/discord.go
pkg/recon/sources/discord_test.go
pkg/recon/sources/slack.go
pkg/recon/sources/slack_test.go
pkg/recon/sources/devto.go
pkg/recon/sources/devto_test.go
</files>
<action>
Create three more ReconSource implementations following the same pattern.
**DiscordSource** (discord.go):
- Name: "discord"
- RateLimit: rate.Every(3*time.Second), Burst: 2
- RespectsRobots: false
- Enabled: always true (credentialless, uses search engine dorking approach)
- Sweep: Discord does not have a public content search API. Use a Google-style dorking approach: for each BuildQueries keyword, GET `{base}/search?q=site:discord.com+{keyword}&format=json` against a configurable search endpoint. In practice this source surfaces Discord content indexed by search engines. Parse the response for URLs and content, run ciLogKeyPattern. Emit Finding with SourceType "recon:discord".
- BaseURL default: "https://search.discobot.dev" (placeholder, overridden in tests via BaseURL)
- This is a best-effort scraping source since Discord has no public API for message search.
**SlackSource** (slack.go):
- Name: "slack"
- RateLimit: rate.Every(3*time.Second), Burst: 2
- RespectsRobots: false
- Enabled: always true (credentialless, uses search engine dorking approach)
- Sweep: As with Discord, Slack messages are not publicly searchable via API without workspace auth. Use a dorking approach: for each keyword, GET `{base}/search?q=site:slack-archive.org+OR+site:slack-files.com+{keyword}&format=json`. Parse results, run ciLogKeyPattern. Emit Finding with SourceType "recon:slack".
- BaseURL default: "https://search.slackarchive.dev" (placeholder, overridden in tests)
**DevToSource** (devto.go):
- Name: "devto"
- RateLimit: rate.Every(1*time.Second), Burst: 5
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless, public API)
- Sweep: For each BuildQueries keyword, GET `{base}/api/articles?tag={keyword}&per_page=10&state=rising` (dev.to public API). Parse JSON array of articles, for each article fetch `{base}/api/articles/{id}` to get `body_markdown`. Run ciLogKeyPattern. Emit Finding with SourceType "recon:devto".
- BaseURL default: "https://dev.to"
- Limit to first 5 articles to stay within rate limits.
Each test file: TestXxx_Name, TestXxx_Enabled, TestXxx_Sweep with httptest mock server. Discord and Slack tests mock the search endpoint returning results with API key content. DevTo test mocks /api/articles list and /api/articles/{id} detail endpoint.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestDiscord|TestSlack|TestDevTo" -count=1 -v</automated>
</verify>
<done>Three more forum/messaging sources compile, pass interface checks, and tests confirm Sweep emits findings from mock responses</done>
</task>
</tasks>
<verification>
cd /home/salva/Documents/apikey && go build ./... && go vet ./pkg/recon/sources/
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestStackOverflow|TestReddit|TestHackerNews|TestDiscord|TestSlack|TestDevTo" -count=1
</verification>
<success_criteria>
- All 6 forum sources implement recon.ReconSource interface
- All 6 test files pass with httptest-based mocks
- Each source uses BuildQueries + Client.Do + ciLogKeyPattern (or similar) pattern
- go vet and go build pass cleanly
</success_criteria>
<output>
After completion, create `.planning/phases/15-osint_forums_collaboration_log_aggregators/15-01-SUMMARY.md`
</output>


@@ -0,0 +1,191 @@
---
phase: 15-osint_forums_collaboration_log_aggregators
plan: 02
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/trello.go
- pkg/recon/sources/trello_test.go
- pkg/recon/sources/notion.go
- pkg/recon/sources/notion_test.go
- pkg/recon/sources/confluence.go
- pkg/recon/sources/confluence_test.go
- pkg/recon/sources/googledocs.go
- pkg/recon/sources/googledocs_test.go
autonomous: true
requirements:
- RECON-COLLAB-01
- RECON-COLLAB-02
- RECON-COLLAB-03
- RECON-COLLAB-04
must_haves:
truths:
- "Trello source searches public Trello boards for leaked API keys"
- "Notion source searches publicly shared Notion pages for keys"
- "Confluence source searches exposed Confluence instances for keys"
- "Google Docs source searches public documents for keys"
artifacts:
- path: "pkg/recon/sources/trello.go"
provides: "TrelloSource implementing ReconSource"
contains: "func (s *TrelloSource) Sweep"
- path: "pkg/recon/sources/notion.go"
provides: "NotionSource implementing ReconSource"
contains: "func (s *NotionSource) Sweep"
- path: "pkg/recon/sources/confluence.go"
provides: "ConfluenceSource implementing ReconSource"
contains: "func (s *ConfluenceSource) Sweep"
- path: "pkg/recon/sources/googledocs.go"
provides: "GoogleDocsSource implementing ReconSource"
contains: "func (s *GoogleDocsSource) Sweep"
key_links:
- from: "pkg/recon/sources/trello.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do for Trello API"
pattern: "client\\.Do"
- from: "pkg/recon/sources/confluence.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do for Confluence REST API"
pattern: "client\\.Do"
---
<objective>
Implement four collaboration tool ReconSource implementations: Trello, Notion, Confluence, and Google Docs.
Purpose: Enable scanning publicly accessible collaboration tool pages and documents where API keys are inadvertently shared in team documentation, project boards, and shared docs.
Output: 4 source files + 4 test files in pkg/recon/sources/
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/travisci.go
@pkg/recon/sources/travisci_test.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```
From pkg/recon/sources/register.go:
```go
func BuildQueries(reg *providers.Registry, sourceName string) []string
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Trello and Notion sources</name>
<files>
pkg/recon/sources/trello.go
pkg/recon/sources/trello_test.go
pkg/recon/sources/notion.go
pkg/recon/sources/notion_test.go
</files>
<action>
Create two ReconSource implementations following the TravisCISource pattern.
**TrelloSource** (trello.go):
- Name: "trello"
- RateLimit: rate.Every(2*time.Second), Burst: 3
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless — Trello public boards are accessible without auth)
- Sweep: Trello has a public search API for public boards. For each BuildQueries keyword, GET `{base}/1/search?query={keyword}&modelTypes=cards&card_fields=name,desc&cards_limit=10` (Trello REST API, public boards are searchable without API key). Parse JSON `cards[].desc` (card descriptions often contain pasted credentials). Run ciLogKeyPattern regex. Emit Finding with SourceType "recon:trello", Source set to card URL `https://trello.com/c/{id}`.
- BaseURL default: "https://api.trello.com"
- Read up to 256KB per response.
**NotionSource** (notion.go):
- Name: "notion"
- RateLimit: rate.Every(3*time.Second), Burst: 2
- RespectsRobots: true (scrapes public pages found via dorking)
- Enabled: always true (credentialless — uses dorking to find public Notion pages)
- Sweep: Notion has no public search API. Use a dorking approach: for each BuildQueries keyword, GET `{base}/search?q=site:notion.site+OR+site:notion.so+{keyword}&format=json`. Parse search results for Notion page URLs. For each URL, fetch the page HTML and run ciLogKeyPattern against text content. Emit Finding with SourceType "recon:notion".
- BaseURL default: "https://search.notion.dev" (placeholder, overridden in tests via BaseURL)
- This is a best-effort source since Notion public pages require dorking to discover.
Test files: TestXxx_Name, TestXxx_Enabled, TestXxx_Sweep with httptest mock. Trello test mocks /1/search endpoint returning card JSON with API key in desc field. Notion test mocks search + page fetch endpoints.
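A sketch of the Trello parsing step: the response struct is an assumed subset of the /1/search payload, and an illustrative OpenAI-style regex stands in for ciLogKeyPattern:

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// trelloSearch mirrors the subset of the /1/search response the
// plan reads: card descriptions plus IDs for the short link.
type trelloSearch struct {
	Cards []struct {
		ID   string `json:"id"`
		Desc string `json:"desc"`
	} `json:"cards"`
}

// keyPattern stands in for ciLogKeyPattern.
var keyPattern = regexp.MustCompile(`sk-[A-Za-z0-9]{20,}`)

// findingsFromTrello maps each matching card's public URL to the
// leaked key found in its description.
func findingsFromTrello(raw []byte) (map[string]string, error) {
	var resp trelloSearch
	if err := json.Unmarshal(raw, &resp); err != nil {
		return nil, err
	}
	out := map[string]string{}
	for _, c := range resp.Cards {
		for _, k := range keyPattern.FindAllString(c.Desc, -1) {
			out["https://trello.com/c/"+c.ID] = k
		}
	}
	return out, nil
}

func main() {
	sample := []byte(`{"cards":[{"id":"abc123","desc":"token: sk-abcdefghij0123456789ABCD"}]}`)
	found, err := findingsFromTrello(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(found)
}
```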
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestTrello|TestNotion" -count=1 -v</automated>
</verify>
<done>Trello and Notion sources compile, pass interface checks, tests confirm Sweep emits findings from mock responses</done>
</task>
<task type="auto">
<name>Task 2: Confluence and Google Docs sources</name>
<files>
pkg/recon/sources/confluence.go
pkg/recon/sources/confluence_test.go
pkg/recon/sources/googledocs.go
pkg/recon/sources/googledocs_test.go
</files>
<action>
Create two more ReconSource implementations.
**ConfluenceSource** (confluence.go):
- Name: "confluence"
- RateLimit: rate.Every(3*time.Second), Burst: 2
- RespectsRobots: true (scrapes publicly exposed Confluence wikis)
- Enabled: always true (credentialless — targets exposed instances)
- Sweep: Exposed Confluence instances have a REST API at `/rest/api/content/search`. For each BuildQueries keyword, GET `{base}/rest/api/content/search?cql=text~"{keyword}"&limit=10&expand=body.storage`. Parse JSON `results[].body.storage.value` (HTML content). Strip HTML tags (simple regex or strings approach), run ciLogKeyPattern. Emit Finding with SourceType "recon:confluence", Source as page URL.
- BaseURL default: "https://confluence.example.com" (always overridden — no single default instance)
- In practice the query string from `keyhunter recon --sources=confluence --query="target.atlassian.net"` would provide the target. If no target can be determined from the query, return nil early.
**GoogleDocsSource** (googledocs.go):
- Name: "googledocs"
- RateLimit: rate.Every(3*time.Second), Burst: 2
- RespectsRobots: true (scrapes public Google Docs)
- Enabled: always true (credentialless)
- Sweep: Google Docs shared publicly are accessible via their export URL. Use dorking approach: for each BuildQueries keyword, GET `{base}/search?q=site:docs.google.com+{keyword}&format=json`. For each discovered doc URL, fetch `{docURL}/export?format=txt` to get plain text. Run ciLogKeyPattern. Emit Finding with SourceType "recon:googledocs".
- BaseURL default: "https://search.googledocs.dev" (placeholder, overridden in tests)
- Best-effort source relying on search engine indexing of public docs.
Test files: TestXxx_Name, TestXxx_Enabled, TestXxx_Sweep with httptest mock. Confluence test mocks /rest/api/content/search returning CQL results with key in body.storage.value. GoogleDocs test mocks search + export endpoints.
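The "simple regex or strings approach" for stripping Confluence storage-format HTML before key scanning can be sketched as:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagRe is deliberately naive: it is not a general HTML parser,
// just enough to flatten storage-format markup so ciLogKeyPattern
// can run over plain text.
var tagRe = regexp.MustCompile(`<[^>]*>`)

// stripTags replaces tags with spaces, then collapses whitespace so
// keys split across adjacent elements stay separated, not fused.
func stripTags(html string) string {
	text := tagRe.ReplaceAllString(html, " ")
	return strings.Join(strings.Fields(text), " ")
}

func main() {
	body := `<p>prod key: <code>sk-abcdefghij0123456789ABCD</code></p>`
	fmt.Println(stripTags(body)) // prints: prod key: sk-abcdefghij0123456789ABCD
}
```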
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestConfluence|TestGoogleDocs" -count=1 -v</automated>
</verify>
<done>Confluence and Google Docs sources compile, pass interface checks, tests confirm Sweep emits findings from mock responses</done>
</task>
</tasks>
<verification>
cd /home/salva/Documents/apikey && go build ./... && go vet ./pkg/recon/sources/
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestTrello|TestNotion|TestConfluence|TestGoogleDocs" -count=1
</verification>
<success_criteria>
- All 4 collaboration sources implement recon.ReconSource interface
- All 4 test files pass with httptest-based mocks
- Each source follows the established pattern (BuildQueries + Client.Do + ciLogKeyPattern)
- go vet and go build pass cleanly
</success_criteria>
<output>
After completion, create `.planning/phases/15-osint_forums_collaboration_log_aggregators/15-02-SUMMARY.md`
</output>


@@ -0,0 +1,215 @@
---
phase: 15-osint_forums_collaboration_log_aggregators
plan: 03
type: execute
wave: 1
depends_on: []
files_modified:
- pkg/recon/sources/elasticsearch.go
- pkg/recon/sources/elasticsearch_test.go
- pkg/recon/sources/grafana.go
- pkg/recon/sources/grafana_test.go
- pkg/recon/sources/sentry.go
- pkg/recon/sources/sentry_test.go
- pkg/recon/sources/kibana.go
- pkg/recon/sources/kibana_test.go
- pkg/recon/sources/splunk.go
- pkg/recon/sources/splunk_test.go
autonomous: true
requirements:
- RECON-LOG-01
- RECON-LOG-02
- RECON-LOG-03
must_haves:
truths:
- "Elasticsearch source searches exposed ES instances for documents containing API keys"
- "Grafana source searches exposed Grafana dashboards for API keys in queries and annotations"
- "Sentry source searches exposed Sentry instances for API keys in error reports"
- "Kibana source searches exposed Kibana instances for API keys in saved objects"
- "Splunk source searches exposed Splunk instances for API keys in log data"
artifacts:
- path: "pkg/recon/sources/elasticsearch.go"
provides: "ElasticsearchSource implementing ReconSource"
contains: "func (s *ElasticsearchSource) Sweep"
- path: "pkg/recon/sources/grafana.go"
provides: "GrafanaSource implementing ReconSource"
contains: "func (s *GrafanaSource) Sweep"
- path: "pkg/recon/sources/sentry.go"
provides: "SentrySource implementing ReconSource"
contains: "func (s *SentrySource) Sweep"
- path: "pkg/recon/sources/kibana.go"
provides: "KibanaSource implementing ReconSource"
contains: "func (s *KibanaSource) Sweep"
- path: "pkg/recon/sources/splunk.go"
provides: "SplunkSource implementing ReconSource"
contains: "func (s *SplunkSource) Sweep"
key_links:
- from: "pkg/recon/sources/elasticsearch.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do for ES _search API"
pattern: "client\\.Do"
- from: "pkg/recon/sources/grafana.go"
to: "pkg/recon/sources/httpclient.go"
via: "Client.Do for Grafana API"
pattern: "client\\.Do"
---
<objective>
Implement five log aggregator ReconSource implementations: Elasticsearch, Grafana, Sentry, Kibana, and Splunk.
Purpose: Enable scanning exposed logging/monitoring dashboards where API keys frequently appear in log entries, error reports, and dashboard configurations. RECON-LOG-01 covers Elasticsearch+Kibana together, RECON-LOG-02 covers Grafana, RECON-LOG-03 covers Sentry. Splunk is an additional log aggregator that fits naturally in this category.
Output: 5 source files + 5 test files in pkg/recon/sources/
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/source.go
@pkg/recon/sources/httpclient.go
@pkg/recon/sources/travisci.go
@pkg/recon/sources/travisci_test.go
<interfaces>
From pkg/recon/source.go:
```go
type ReconSource interface {
Name() string
RateLimit() rate.Limit
Burst() int
RespectsRobots() bool
Enabled(cfg Config) bool
Sweep(ctx context.Context, query string, out chan<- Finding) error
}
```
From pkg/recon/sources/httpclient.go:
```go
func NewClient() *Client
func (c *Client) Do(ctx context.Context, req *http.Request) (*http.Response, error)
```
From pkg/recon/sources/register.go:
```go
func BuildQueries(reg *providers.Registry, sourceName string) []string
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Elasticsearch, Kibana, Splunk sources</name>
<files>
pkg/recon/sources/elasticsearch.go
pkg/recon/sources/elasticsearch_test.go
pkg/recon/sources/kibana.go
pkg/recon/sources/kibana_test.go
pkg/recon/sources/splunk.go
pkg/recon/sources/splunk_test.go
</files>
<action>
Create three ReconSource implementations following the TravisCISource pattern. These target exposed instances discovered via the query parameter (e.g. `keyhunter recon --sources=elasticsearch --query="target-es.example.com"`).
**ElasticsearchSource** (elasticsearch.go):
- Name: "elasticsearch"
- RateLimit: rate.Every(2*time.Second), Burst: 3
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless — targets exposed instances without auth)
- Sweep: Exposed Elasticsearch instances allow unauthenticated queries. For each BuildQueries keyword, POST `{base}/_search` with JSON body `{"query":{"query_string":{"query":"{keyword}"}},"size":20}`. Parse JSON `hits.hits[]._source` (stringify the _source object). Run ciLogKeyPattern against stringified source. Emit Finding with SourceType "recon:elasticsearch", Source as `{base}/{index}/{id}`.
- BaseURL default: "http://localhost:9200" (always overridden by query target)
- If BaseURL is the default and query does not look like a URL, return nil early (no target to scan).
- Read up to 512KB per response (ES responses can be large).
**KibanaSource** (kibana.go):
- Name: "kibana"
- RateLimit: rate.Every(2*time.Second), Burst: 3
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless)
- Sweep: Exposed Kibana instances have a saved objects API. GET `{base}/api/saved_objects/_find?type=visualization&type=dashboard&search={keyword}&per_page=20` with header `kbn-xsrf: true`. Parse JSON `saved_objects[].attributes` (stringify). Run ciLogKeyPattern. Also try GET `{base}/api/saved_objects/_find?type=index-pattern&per_page=10` to discover index patterns, then query ES via Kibana proxy: GET `{base}/api/console/proxy?path=/{index}/_search&method=GET` with keyword query. Emit Finding with SourceType "recon:kibana".
- BaseURL default: "http://localhost:5601" (always overridden)
**SplunkSource** (splunk.go):
- Name: "splunk"
- RateLimit: rate.Every(3*time.Second), Burst: 2
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless — targets exposed Splunk Web)
- Sweep: Exposed Splunk instances may allow unauthenticated search via REST API. For each BuildQueries keyword, GET `{base}/services/search/jobs/export?search=search+{keyword}&output_mode=json&count=20`. Parse JSON results, run ciLogKeyPattern. Emit Finding with SourceType "recon:splunk".
- BaseURL default: "https://localhost:8089" (always overridden)
- If no target, return nil early.
Tests: httptest mock servers. ES test mocks POST /_search returning hits with API key in _source. Kibana test mocks /api/saved_objects/_find. Splunk test mocks /services/search/jobs/export.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestElasticsearch|TestKibana|TestSplunk" -count=1 -v</automated>
</verify>
<done>Three log aggregator sources compile, pass interface checks, tests confirm Sweep emits findings from mock API responses</done>
</task>
<task type="auto">
<name>Task 2: Grafana and Sentry sources</name>
<files>
pkg/recon/sources/grafana.go
pkg/recon/sources/grafana_test.go
pkg/recon/sources/sentry.go
pkg/recon/sources/sentry_test.go
</files>
<action>
Create two more ReconSource implementations.
**GrafanaSource** (grafana.go):
- Name: "grafana"
- RateLimit: rate.Every(2*time.Second), Burst: 3
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless — targets exposed Grafana instances)
- Sweep: Exposed Grafana instances allow unauthenticated dashboard browsing when anonymous access is enabled. For each BuildQueries keyword:
1. GET `{base}/api/search?query={keyword}&type=dash-db&limit=10` to find dashboards.
2. For each dashboard, GET `{base}/api/dashboards/uid/{uid}` to get dashboard JSON.
3. Stringify the dashboard JSON panels and targets, run ciLogKeyPattern.
4. Also check `{base}/api/datasources` for data source configs that may contain credentials.
Emit Finding with SourceType "recon:grafana", Source as dashboard URL.
- BaseURL default: "http://localhost:3000" (always overridden)
**SentrySource** (sentry.go):
- Name: "sentry"
- RateLimit: rate.Every(2*time.Second), Burst: 3
- RespectsRobots: false (API-based)
- Enabled: always true (credentialless — targets exposed Sentry instances)
- Sweep: Exposed Sentry instances (self-hosted) may have the API accessible. For each BuildQueries keyword:
1. GET `{base}/api/0/issues/?query={keyword}&limit=10` to search issues.
2. For each issue, GET `{base}/api/0/issues/{id}/events/?limit=5` to get events.
3. Stringify event data (tags, breadcrumbs, exception values), run ciLogKeyPattern.
Emit Finding with SourceType "recon:sentry".
- BaseURL default: "https://sentry.example.com" (always overridden)
- Error reports commonly contain API keys in request headers, environment variables, and stack traces.
Tests: httptest mock servers. Grafana test mocks /api/search + /api/dashboards/uid/{uid} returning dashboard JSON with API key. Sentry test mocks /api/0/issues/ + /api/0/issues/{id}/events/ returning event data with API key.
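Steps 1 and 2 of the Grafana Sweep, turning /api/search hits into dashboard detail URLs, can be sketched as below; `grafanaHit` and `dashboardURLs` are hypothetical names for illustration:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// grafanaHit mirrors the subset of /api/search results the plan
// uses: dashboard UIDs feed the follow-up detail fetch.
type grafanaHit struct {
	UID  string `json:"uid"`
	Type string `json:"type"`
}

// dashboardURLs converts search hits into /api/dashboards/uid/{uid}
// fetch targets, skipping folders and other non-dashboard entries.
func dashboardURLs(base string, raw []byte) ([]string, error) {
	var hits []grafanaHit
	if err := json.Unmarshal(raw, &hits); err != nil {
		return nil, err
	}
	var urls []string
	for _, h := range hits {
		if h.Type != "dash-db" {
			continue
		}
		urls = append(urls, base+"/api/dashboards/uid/"+h.UID)
	}
	return urls, nil
}

func main() {
	raw := []byte(`[{"uid":"k3y5","type":"dash-db"},{"uid":"f0ld","type":"dash-folder"}]`)
	urls, err := dashboardURLs("http://localhost:3000", raw)
	if err != nil {
		panic(err)
	}
	fmt.Println(urls)
}
```

Each returned URL is then fetched, the dashboard JSON stringified, and ciLogKeyPattern run over it, per step 3.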
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestGrafana|TestSentry" -count=1 -v</automated>
</verify>
<done>Grafana and Sentry sources compile, pass interface checks, tests confirm Sweep emits findings from mock API responses</done>
</task>
</tasks>
<verification>
cd /home/salva/Documents/apikey && go build ./... && go vet ./pkg/recon/sources/
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestElasticsearch|TestKibana|TestSplunk|TestGrafana|TestSentry" -count=1
</verification>
<success_criteria>
- All 5 log aggregator sources implement recon.ReconSource interface
- All 5 test files pass with httptest-based mocks
- Each source follows the established pattern (BuildQueries + Client.Do + ciLogKeyPattern)
- go vet and go build pass cleanly
</success_criteria>
<output>
After completion, create `.planning/phases/15-osint_forums_collaboration_log_aggregators/15-03-SUMMARY.md`
</output>


@@ -0,0 +1,207 @@
---
phase: 15-osint_forums_collaboration_log_aggregators
plan: 04
type: execute
wave: 2
depends_on:
- 15-01
- 15-02
- 15-03
files_modified:
- pkg/recon/sources/register.go
- pkg/recon/sources/register_test.go
- pkg/recon/sources/integration_test.go
- cmd/recon.go
autonomous: true
requirements:
- RECON-FORUM-01
- RECON-FORUM-02
- RECON-FORUM-03
- RECON-FORUM-04
- RECON-FORUM-05
- RECON-FORUM-06
- RECON-COLLAB-01
- RECON-COLLAB-02
- RECON-COLLAB-03
- RECON-COLLAB-04
- RECON-LOG-01
- RECON-LOG-02
- RECON-LOG-03
must_haves:
truths:
- "RegisterAll wires all 15 new Phase 15 sources onto the engine (67 total)"
- "cmd/recon.go reads any new Phase 15 credentials from viper/env and passes to SourcesConfig"
- "Integration test confirms all 67 sources are registered and forum/collab/log sources produce findings"
artifacts:
- path: "pkg/recon/sources/register.go"
provides: "RegisterAll extended with 15 Phase 15 sources"
contains: "Phase 15"
- path: "pkg/recon/sources/register_test.go"
provides: "Updated test expecting 67 sources"
contains: "67"
key_links:
- from: "pkg/recon/sources/register.go"
to: "pkg/recon/sources/stackoverflow.go"
via: "engine.Register(&StackOverflowSource{})"
pattern: "StackOverflowSource"
- from: "pkg/recon/sources/register.go"
to: "pkg/recon/sources/elasticsearch.go"
via: "engine.Register(&ElasticsearchSource{})"
pattern: "ElasticsearchSource"
- from: "cmd/recon.go"
to: "pkg/recon/sources/register.go"
via: "sources.RegisterAll(engine, cfg)"
pattern: "RegisterAll"
---
<objective>
Wire all 15 Phase 15 sources into RegisterAll, update cmd/recon.go for any new credentials, update register_test.go to expect 67 sources, and add integration test coverage.
Purpose: Complete Phase 15 by connecting all new sources to the engine and verifying end-to-end registration.
Output: Updated register.go, register_test.go, integration_test.go, cmd/recon.go
</objective>
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/PROJECT.md
@.planning/ROADMAP.md
@.planning/STATE.md
@pkg/recon/sources/register.go
@pkg/recon/sources/register_test.go
@cmd/recon.go
<interfaces>
From pkg/recon/sources/register.go (current state):
```go
type SourcesConfig struct {
// ... existing fields for Phase 10-14 ...
Registry *providers.Registry
Limiters *recon.LimiterRegistry
}
func RegisterAll(engine *recon.Engine, cfg SourcesConfig) { ... }
```
New Phase 15 source types to register (all credentialless — no new SourcesConfig fields needed):
```go
// Forum sources (Plan 15-01):
&StackOverflowSource{Registry: reg, Limiters: lim}
&RedditSource{Registry: reg, Limiters: lim}
&HackerNewsSource{Registry: reg, Limiters: lim}
&DiscordSource{Registry: reg, Limiters: lim}
&SlackSource{Registry: reg, Limiters: lim}
&DevToSource{Registry: reg, Limiters: lim}
// Collaboration sources (Plan 15-02):
&TrelloSource{Registry: reg, Limiters: lim}
&NotionSource{Registry: reg, Limiters: lim}
&ConfluenceSource{Registry: reg, Limiters: lim}
&GoogleDocsSource{Registry: reg, Limiters: lim}
// Log aggregator sources (Plan 15-03):
&ElasticsearchSource{Registry: reg, Limiters: lim}
&GrafanaSource{Registry: reg, Limiters: lim}
&SentrySource{Registry: reg, Limiters: lim}
&KibanaSource{Registry: reg, Limiters: lim}
&SplunkSource{Registry: reg, Limiters: lim}
```
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: Wire RegisterAll + update register_test.go</name>
<files>
pkg/recon/sources/register.go
pkg/recon/sources/register_test.go
</files>
<action>
Extend RegisterAll in register.go to register all 15 Phase 15 sources. Add a comment block:
```go
// Phase 15: Forum sources (credentialless).
engine.Register(&StackOverflowSource{Registry: reg, Limiters: lim})
engine.Register(&RedditSource{Registry: reg, Limiters: lim})
engine.Register(&HackerNewsSource{Registry: reg, Limiters: lim})
engine.Register(&DiscordSource{Registry: reg, Limiters: lim})
engine.Register(&SlackSource{Registry: reg, Limiters: lim})
engine.Register(&DevToSource{Registry: reg, Limiters: lim})
// Phase 15: Collaboration sources (credentialless).
engine.Register(&TrelloSource{Registry: reg, Limiters: lim})
engine.Register(&NotionSource{Registry: reg, Limiters: lim})
engine.Register(&ConfluenceSource{Registry: reg, Limiters: lim})
engine.Register(&GoogleDocsSource{Registry: reg, Limiters: lim})
// Phase 15: Log aggregator sources (credentialless).
engine.Register(&ElasticsearchSource{Registry: reg, Limiters: lim})
engine.Register(&GrafanaSource{Registry: reg, Limiters: lim})
engine.Register(&SentrySource{Registry: reg, Limiters: lim})
engine.Register(&KibanaSource{Registry: reg, Limiters: lim})
engine.Register(&SplunkSource{Registry: reg, Limiters: lim})
```
Update the RegisterAll doc comment to say "67 sources total" (52 + 15).
All Phase 15 sources are credentialless, so NO new SourcesConfig fields are needed. Do NOT modify SourcesConfig.
Update register_test.go:
- Rename test to TestRegisterAll_WiresAllSixtySevenSources
- Add all 15 new source names to the `want` slice in alphabetical order: "confluence", "devto", "discord", "elasticsearch", "googledocs", "grafana", "hackernews", "kibana", "notion", "reddit", "sentry", "slack", "splunk", "stackoverflow", "trello"
- Update count test to expect 67: `if n := len(eng.List()); n != 67`
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestRegisterAll" -count=1 -v</automated>
</verify>
<done>RegisterAll registers 67 sources, register_test.go passes with full alphabetical name list</done>
</task>
<task type="auto">
<name>Task 2: Integration test + cmd/recon.go update</name>
<files>
pkg/recon/sources/integration_test.go
cmd/recon.go
</files>
<action>
**cmd/recon.go**: No new SourcesConfig fields are needed (all Phase 15 sources are credentialless). However, if any comments in cmd/recon.go reference "52 sources", update them to "67 sources".
**integration_test.go**: Add a test function TestPhase15_ForumCollabLogSources that:
1. Creates httptest servers for at least 3 representative sources (stackoverflow, trello, elasticsearch).
2. Registers those sources with BaseURL pointed at the test servers.
3. Calls Sweep on each, collects findings from the channel.
4. Asserts at least one finding per source with correct SourceType.
The test servers should return mock JSON responses that contain API key patterns (e.g., `sk-proj-ABCDEF1234567890` in a Stack Overflow answer body, a Trello card description, and an Elasticsearch document _source).
Follow the existing integration_test.go patterns for httptest setup and assertion style.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestPhase15" -count=1 -v</automated>
</verify>
<done>Integration test passes confirming Phase 15 sources produce findings from mock servers; cmd/recon.go updated</done>
</task>
</tasks>
<verification>
cd /home/salva/Documents/apikey && go build ./... && go vet ./...
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -run "TestRegisterAll|TestPhase15" -count=1
cd /home/salva/Documents/apikey && go test ./pkg/recon/sources/ -count=1
</verification>
<success_criteria>
- RegisterAll registers exactly 67 sources (52 existing + 15 new)
- All source names appear in alphabetical order in register_test.go
- Integration test confirms representative Phase 15 sources produce findings
- Full test suite passes: go test ./pkg/recon/sources/ -count=1
- go build ./... compiles cleanly
</success_criteria>
<output>
After completion, create `.planning/phases/15-osint_forums_collaboration_log_aggregators/15-04-SUMMARY.md`
</output>