salvacybersec
91becd961f
Merge branch 'worktree-agent-a7f84823'
2026-04-06 01:20:25 +03:00
salvacybersec
6928ca4e70
Merge branch 'worktree-agent-a2fe7ff3'
2026-04-06 01:20:25 +03:00
salvacybersec
12c402ab67
docs(10-07): complete sandbox/IDE scraping sources plan
2026-04-06 01:19:57 +03:00
salvacybersec
21d5551aa4
docs(10-04): complete Bitbucket + Gist sources plan
2026-04-06 01:18:53 +03:00
salvacybersec
ecebffd27d
feat(10-07): add SandboxesSource aggregator (codepen/jsfiddle/stackblitz/glitch/observable)
- Single ReconSource umbrella iterating per-platform HTML or JSON search endpoints
- Per-platform failures logged and skipped (log-and-continue); ctx cancel aborts fast
- Sub-platform identifier encoded in Finding.KeyMasked as 'platform=<name>' (pragmatic slot)
- Gitpod intentionally omitted (no public search)
- 5 httptest-backed tests covering HTML+JSON extraction, platform-failure tolerance, ctx cancel
2026-04-06 01:18:15 +03:00
salvacybersec
3715a75be7
docs(10-02): complete GitHubSource plan
2026-04-06 01:17:21 +03:00
salvacybersec
0e16e8ea4c
feat(10-04): add GistSource for public gist keyword recon
- GistSource implements recon.ReconSource (RECON-CODE-04)
- Lists /gists/public?per_page=100, fetches each file's raw content,
scans against provider keyword set, emits one Finding per matching gist
- Disabled when GitHub token empty
- Rate: rate.Every(2s), burst 1 (30 req/min GitHub limit)
- 256KB read cap per file; skips gists without keyword matches
- httptest coverage: enable gating, sweep match, no-match, 401, ctx cancel
2026-04-06 01:17:07 +03:00
salvacybersec
62a347f476
feat(10-07): add Replit and CodeSandbox scraping sources
- ReplitSource scrapes /search HTML extracting /@user/repl anchors
- CodeSandboxSource scrapes /search HTML extracting /s/slug anchors
- Both use golang.org/x/net/html parser, 10 req/min rate, RespectsRobots=true
- 10 httptest-backed tests covering extraction, ctx cancel, rate/name assertions
2026-04-06 01:16:39 +03:00
salvacybersec
223c23e672
docs(10-03): complete GitLabSource plan summary
2026-04-06 01:16:34 +03:00
salvacybersec
ab636dc5e1
fix(10-02): stabilize GitHubSource provider-name test
2026-04-06 01:15:51 +03:00
salvacybersec
0137dc57b1
feat(10-03): add GitLabSource for /api/v4/search blobs
- Implements recon.ReconSource against GitLab Search API
- PRIVATE-TOKEN header auth; rate.Every(30ms) burst 5 (~2000/min)
- Disabled when token empty; Sweep returns nil without calls
- Emits Finding per blob with Source=/projects/<id>/-/blob/<ref>/<path>
- 401 wrapped as ErrUnauthorized; ctx cancellation honored
- httptest coverage: enabled gating, happy path, 401, ctx cancel, iface assert
2026-04-06 01:15:49 +03:00
salvacybersec
d279abf449
feat(10-04): add BitbucketSource for code search recon
- BitbucketSource implements recon.ReconSource (RECON-CODE-03)
- Queries /2.0/workspaces/{ws}/search/code with Bearer auth
- Disabled when token OR workspace empty
- Rate: rate.Every(3.6s), burst 1 (Bitbucket 1000/hr limit)
- httptest coverage: enable gating, sweep, 401, ctx cancel
2026-04-06 01:15:42 +03:00
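The limiter intervals quoted in the GistSource, GitLabSource, and BitbucketSource commits can be checked against the request budgets they cite with a little arithmetic. The sketch below is only that check; the constant and function names are hypothetical and burst is ignored.

```go
package main

import (
	"fmt"
	"time"
)

// Intervals as quoted in the commit messages (names are illustrative only).
const (
	gistEvery      = 2 * time.Second         // rate.Every(2s)
	gitlabEvery    = 30 * time.Millisecond   // rate.Every(30ms)
	bitbucketEvery = 3600 * time.Millisecond // rate.Every(3.6s)
)

// requestsPer returns how many requests fit into window when one request
// is allowed every `every`. Burst capacity is deliberately ignored.
func requestsPer(window, every time.Duration) int {
	return int(window / every)
}

func perMinute(every time.Duration) int { return requestsPer(time.Minute, every) }
func perHour(every time.Duration) int   { return requestsPer(time.Hour, every) }

func main() {
	fmt.Println("gist req/min:     ", perMinute(gistEvery))
	fmt.Println("gitlab req/min:   ", perMinute(gitlabEvery))
	fmt.Println("bitbucket req/hr: ", perHour(bitbucketEvery))
}
```

The three results match the 30 req/min, ~2000/min, and 1000/hr figures stated in the commit bodies.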
salvacybersec
fb6cb53975
feat(10-02): implement GitHubSource recon.ReconSource
2026-04-06 01:14:52 +03:00
salvacybersec
03deb603b3
test(10-02): add failing tests for GitHubSource
2026-04-06 01:12:56 +03:00
salvacybersec
9b1aaae28d
docs(10-01): complete recon sources foundation plan
2026-04-06 01:10:57 +03:00
salvacybersec
9273f356e6
feat(10-01): add provider-driven query generator and RegisterAll skeleton
- BuildQueries(reg, source) dedups keywords and formats per-source syntax
- github/gist use 'keyword' in:file; others use bare keyword
- SourcesConfig placeholder struct for Wave 2 plans to depend on
- RegisterAll no-op stub (Plan 10-09 will fill)
2026-04-06 01:09:57 +03:00
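The query generation described in commit 9273f356e6 (dedup keywords, then apply per-source syntax) might look roughly like the sketch below. It is not the repository's code: `buildQueries` is a hypothetical stand-in that takes a plain keyword slice instead of a provider registry, and the double-quote style for the `in:file` form is an assumption.

```go
package main

import "fmt"

// buildQueries is a hypothetical stand-in for BuildQueries(reg, source).
// It drops repeated keywords (keeping first-seen order) and formats each
// remaining keyword for the given source.
func buildQueries(keywords []string, source string) []string {
	seen := make(map[string]struct{}, len(keywords))
	out := make([]string, 0, len(keywords))
	for _, kw := range keywords {
		if _, dup := seen[kw]; dup {
			continue
		}
		seen[kw] = struct{}{}
		switch source {
		case "github", "gist":
			// Assumed quoting style for the "'keyword' in:file" form.
			out = append(out, fmt.Sprintf("%q in:file", kw))
		default:
			// Other sources get the bare keyword.
			out = append(out, kw)
		}
	}
	return out
}

func main() {
	kws := []string{"sk-proj-", "AKIA", "sk-proj-"}
	fmt.Println(buildQueries(kws, "github"))
	fmt.Println(buildQueries(kws, "gitlab"))
}
```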
salvacybersec
75024e4701
feat(10-01): add shared retry HTTP client for recon sources
- Client.Do retries 429/403/5xx honoring Retry-After
- 401 returns ErrUnauthorized immediately (no retry)
- Context cancellation honored during retry sleeps
- Default UA keyhunter-recon/1.0, 30s timeout, 2 retries
2026-04-06 01:09:02 +03:00
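The retry policy listed in commit 75024e4701 can be sketched with the standard library alone. This is an illustration under assumptions, not the project's `Client.Do`: `doWithRetry`, `retryable`, and `retryDelay` are hypothetical names, the 500ms fallback delay is invented, only the seconds form of `Retry-After` is parsed, and only body-less requests (such as GET) are safe to resend this way.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync/atomic"
	"time"
)

// ErrUnauthorized mirrors the sentinel named in the commit; this definition is assumed.
var ErrUnauthorized = errors.New("recon: unauthorized")

// retryable reports whether a status should be retried: 429, 403, or any 5xx.
func retryable(status int) bool {
	return status == http.StatusTooManyRequests ||
		status == http.StatusForbidden || status >= 500
}

// retryDelay reads Retry-After as whole seconds, falling back to def.
func retryDelay(h http.Header, def time.Duration) time.Duration {
	if s := h.Get("Retry-After"); s != "" {
		if n, err := strconv.Atoi(s); err == nil && n >= 0 {
			return time.Duration(n) * time.Second
		}
	}
	return def
}

// doWithRetry sends req up to maxRetries+1 times. A 401 fails immediately,
// and the wait between attempts is abandoned if ctx is cancelled.
func doWithRetry(ctx context.Context, c *http.Client, req *http.Request, maxRetries int) (*http.Response, error) {
	for attempt := 0; ; attempt++ {
		resp, err := c.Do(req.WithContext(ctx))
		if err != nil {
			return nil, err
		}
		if resp.StatusCode == http.StatusUnauthorized {
			resp.Body.Close()
			return nil, ErrUnauthorized
		}
		if !retryable(resp.StatusCode) {
			return resp, nil
		}
		delay := retryDelay(resp.Header, 500*time.Millisecond)
		resp.Body.Close()
		if attempt >= maxRetries {
			return nil, fmt.Errorf("giving up after %d attempts: status %d", attempt+1, resp.StatusCode)
		}
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(delay):
		}
	}
}

func main() {
	// Test server: first request is rate limited, later ones succeed.
	var hits atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if hits.Add(1) == 1 {
			w.Header().Set("Retry-After", "0")
			w.WriteHeader(http.StatusTooManyRequests)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()

	req, err := http.NewRequest(http.MethodGet, srv.URL, nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	resp, err := doWithRetry(context.Background(), srv.Client(), req, 2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status", resp.StatusCode, "after", hits.Load(), "requests")
}
```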
salvacybersec
191bdee3bc
docs(10-osint-code-hosting): create phase 10 plans (9 plans across 3 waves)
2026-04-06 01:07:15 +03:00
salvacybersec
cfe090a5c9
docs(10): OSINT code hosting context
2026-04-06 00:59:18 +03:00
salvacybersec
226274ca9e
docs(phase-09): complete phase execution
2026-04-06 00:56:36 +03:00
salvacybersec
4b8599d959
docs(09-06): complete phase 9 OSINT infrastructure
- Add 09-06-SUMMARY.md (integration test + phase summary plan)
- Update STATE.md progress and metrics
- Update ROADMAP.md phase 09 status
- Mark RECON-INFRA-05/06/07/08 complete in REQUIREMENTS.md
2026-04-06 00:53:35 +03:00
salvacybersec
d29a7d30b2
docs(09-06): add phase 09 completion summary
Documents all 4 RECON-INFRA requirement IDs as complete, summarizes
decisions (per-source limiters, default-allow robots, SHA256 dedup,
UA pool of 10), lists handoff contract for Phases 10-16.
2026-04-06 00:52:20 +03:00
salvacybersec
a754ff7546
test(09-06): add recon pipeline integration test
- Exercises Engine + LimiterRegistry + Stealth + Dedup end-to-end
- testSource emits 5 findings with one duplicate pair (Dedup -> 4)
- TestRobotsOnlyWhenRespectsRobots asserts robots gating via httptest
- Covers RECON-INFRA-05/06/07/08
2026-04-06 00:51:08 +03:00
salvacybersec
0ff9edc6c1
docs(09-05): complete recon CLI command tree plan
2026-04-06 00:48:42 +03:00
salvacybersec
86a6bb864b
feat(09-05): add recon full/list commands and remove stub
- cmd/recon.go owns reconCmd with full and list subcommands
- Wires pkg/recon.Engine.SweepAll + Dedup with ExampleSource registered
- Adds --stealth, --respect-robots (default true), --query flags
- Removes reconCmd stub from cmd/stubs.go
2026-04-06 00:47:32 +03:00
salvacybersec
c2137edc41
merge: plan 09-03 stealth+dedup
2026-04-06 00:45:13 +03:00
salvacybersec
1eb86ca308
docs(09-03): complete stealth UA pool and dedup plan
- Stealth UA pool (10 browsers) + RandomUserAgent/StealthHeaders
- Stable cross-source Dedup keyed by sha256(provider|masked|source)
- Mark RECON-INFRA-06 complete
2026-04-06 00:44:37 +03:00
salvacybersec
fb1e7f8bf5
docs(09-01): complete recon framework foundation plan
2026-04-06 00:44:04 +03:00
salvacybersec
4dbc38dcc5
docs(09-04): complete robots.txt cache plan
Adds SUMMARY, marks RECON-INFRA-07 complete, updates phase 9 roadmap.
2026-04-06 00:43:49 +03:00
salvacybersec
2988fdf9b3
feat(09-03): implement stable cross-source finding Dedup
- Dedup drops duplicates keyed by sha256(ProviderName|KeyMasked|Source)
- Preserves input order and first-seen metadata (stable dedup)
- Same provider+masked with different Source URLs are kept separate
- Uses engine.Finding directly to avoid alias collision with Plan 09-01
2026-04-06 00:43:07 +03:00
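The stable dedup described in commit 2988fdf9b3 can be illustrated as follows. The `Finding` struct here is a trimmed, hypothetical stand-in for `engine.Finding` carrying only the three fields the commit names; the `|` separator follows the key format given in the commit message.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Finding is a reduced stand-in for engine.Finding (assumed shape).
type Finding struct {
	ProviderName string
	KeyMasked    string
	Source       string
}

// dedupKey hashes ProviderName|KeyMasked|Source with SHA-256.
func dedupKey(f Finding) [sha256.Size]byte {
	return sha256.Sum256([]byte(f.ProviderName + "|" + f.KeyMasked + "|" + f.Source))
}

// Dedup keeps the first occurrence of each key and preserves input order.
func Dedup(in []Finding) []Finding {
	seen := make(map[[sha256.Size]byte]struct{}, len(in))
	out := make([]Finding, 0, len(in))
	for _, f := range in {
		k := dedupKey(f)
		if _, dup := seen[k]; dup {
			continue
		}
		seen[k] = struct{}{}
		out = append(out, f)
	}
	return out
}

func main() {
	in := []Finding{
		{ProviderName: "openai", KeyMasked: "sk-a...1", Source: "https://example.com/a"},
		{ProviderName: "openai", KeyMasked: "sk-a...1", Source: "https://example.com/b"}, // same key, different Source: kept
		{ProviderName: "openai", KeyMasked: "sk-a...1", Source: "https://example.com/a"}, // exact duplicate: dropped
	}
	for _, f := range Dedup(in) {
		fmt.Println(f.ProviderName, f.KeyMasked, f.Source)
	}
}
```

Because the Source URL is part of the key, the same masked key found in two places yields two findings, which matches the "kept separate" bullet in the commit.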
salvacybersec
851b2432b8
feat(09-01): add Engine with parallel fanout and ExampleSource
- Engine.Register/List/SweepAll with ants pool fanout
- ExampleSource emits two deterministic findings (SourceType=recon:example)
- Tests cover Register/List idempotency, SweepAll aggregation, empty-registry,
and Enabled() filtering
2026-04-06 00:42:51 +03:00
salvacybersec
ecfa2bff28
test(09-03): add failing test for cross-source Dedup
2026-04-06 00:42:45 +03:00
salvacybersec
0373931490
feat(09-04): implement RobotsCache with 1h per-host TTL
- Parses robots.txt via temoto/robotstxt
- Caches per host for 1 hour; second call within TTL skips HTTP fetch
- Default-allow on network/parse/4xx/5xx errors
- Matches 'keyhunter' user-agent against disallowed paths
- Client field allows httptest injection
Satisfies RECON-INFRA-07.
2026-04-06 00:42:33 +03:00
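The caching and default-allow behavior described in commit 0373931490 can be sketched without the robots.txt parser. Everything below is a simplified assumption: the injected `fetchFunc` stands in for fetching and parsing via temoto/robotstxt, matching is reduced to plain path prefixes, and fetch errors are not cached.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"sync"
	"time"
)

// fetchFunc returns the Disallow prefixes that apply to our user-agent on host.
// It is a hypothetical seam replacing the real fetch-and-parse step.
type fetchFunc func(host string) ([]string, error)

var errFetch = errors.New("robots fetch failed")

type robotsEntry struct {
	disallow []string
	fetched  time.Time
}

// RobotsCache caches per-host rules for ttl (1 hour, as in the commit).
type RobotsCache struct {
	mu    sync.Mutex
	ttl   time.Duration
	fetch fetchFunc
	hosts map[string]robotsEntry
}

func NewRobotsCache(fetch fetchFunc) *RobotsCache {
	return &RobotsCache{ttl: time.Hour, fetch: fetch, hosts: make(map[string]robotsEntry)}
}

// Allowed reports whether path may be fetched on host. Any fetch error
// results in default-allow. The lock is held across the fetch for simplicity.
func (c *RobotsCache) Allowed(host, path string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	e, ok := c.hosts[host]
	if !ok || time.Since(e.fetched) >= c.ttl {
		rules, err := c.fetch(host)
		if err != nil {
			return true // default-allow; the error is not cached (assumption)
		}
		e = robotsEntry{disallow: rules, fetched: time.Now()}
		c.hosts[host] = e
	}
	for _, prefix := range e.disallow {
		if prefix != "" && strings.HasPrefix(path, prefix) {
			return false
		}
	}
	return true
}

func main() {
	calls := 0
	c := NewRobotsCache(func(host string) ([]string, error) {
		calls++
		return []string{"/private"}, nil
	})
	fmt.Println(c.Allowed("example.com", "/private/keys"))
	fmt.Println(c.Allowed("example.com", "/public"))
	fmt.Println("fetches:", calls)

	down := NewRobotsCache(func(string) ([]string, error) { return nil, errFetch })
	fmt.Println(down.Allowed("example.org", "/anything"))
}
```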
salvacybersec
2c140e9661
feat(09-03): implement stealth UA pool and StealthHeaders
- Pool of 10 realistic browser User-Agents (Chrome/Firefox/Safari/Edge)
- Covers Windows, macOS, Linux, iOS, Android
- RandomUserAgent returns a random pool entry
- StealthHeaders returns UA + Accept-Language header map
2026-04-06 00:42:22 +03:00
salvacybersec
1d5d12740c
docs(09-02): complete LimiterRegistry plan
2026-04-06 00:42:15 +03:00
salvacybersec
4bd6c6b05f
test(09-04): add failing tests for RobotsCache
- Allowed/Disallowed path matching
- Cache hit counter assertion
- Default-allow on 5xx/network error
- keyhunter UA matching precedence
2026-04-06 00:42:03 +03:00
salvacybersec
bbbc05fa46
test(09-03): add failing test for stealth UA pool
2026-04-06 00:41:55 +03:00
salvacybersec
590fc33955
feat(09-02): add LimiterRegistry with per-source rate limiters and jitter
- NewLimiterRegistry + For(name, rate, burst) idempotent lookup
- Wait blocks on token then applies 100ms-1s jitter when stealth
- Per-source isolation (RECON-INFRA-05), ctx cancellation honored
- Tests: isolation, idempotency, ctx cancel, jitter range, no-jitter
2026-04-06 00:41:33 +03:00
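Two ideas from commit 590fc33955, the idempotent per-source lookup and the stealth jitter range, are sketched below using only the standard library. The `limiter` type is a placeholder for a real token-bucket limiter, `For` takes an interval rather than whatever rate type the project uses, and treating the jitter range as the half-open interval [100ms, 1s) is an assumption.

```go
package main

import (
	"fmt"
	"math/rand"
	"sync"
	"time"
)

// limiter is a placeholder for a real token-bucket limiter (assumed shape).
type limiter struct {
	every time.Duration
	burst int
}

// LimiterRegistry hands out one limiter per source name.
type LimiterRegistry struct {
	mu       sync.Mutex
	limiters map[string]*limiter
}

func NewLimiterRegistry() *LimiterRegistry {
	return &LimiterRegistry{limiters: make(map[string]*limiter)}
}

// For returns the limiter registered under name, creating it on first use.
// Later calls with the same name return the same limiter and ignore the
// every/burst arguments, which is what makes the lookup idempotent.
func (r *LimiterRegistry) For(name string, every time.Duration, burst int) *limiter {
	r.mu.Lock()
	defer r.mu.Unlock()
	if l, ok := r.limiters[name]; ok {
		return l
	}
	l := &limiter{every: every, burst: burst}
	r.limiters[name] = l
	return l
}

// jitter returns a random extra delay in [100ms, 1s), to be slept after
// the token wait when stealth mode is on.
func jitter() time.Duration {
	const lo, hi = 100 * time.Millisecond, time.Second
	return lo + time.Duration(rand.Int63n(int64(hi-lo)))
}

func main() {
	reg := NewLimiterRegistry()
	a := reg.For("github", 2*time.Second, 1)
	b := reg.For("github", time.Second, 5) // same name: same limiter comes back
	c := reg.For("gitlab", 30*time.Millisecond, 5)
	fmt.Println("same limiter for github:", a == b)
	fmt.Println("gitlab isolated from github:", a != c)
	fmt.Println("sample jitter:", jitter())
}
```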
salvacybersec
10af12d358
feat(09-01): add ReconSource interface and Config
- Define ReconSource interface: Name/RateLimit/Burst/RespectsRobots/Enabled/Sweep
- Alias recon.Finding = engine.Finding for shared storage path
- Config struct carries Stealth, RespectRobots, EnabledSources, Query
2026-04-06 00:40:46 +03:00
salvacybersec
c3b9fb4043
chore(09-04): add github.com/temoto/robotstxt dependency
- Added temoto/robotstxt v1.1.2 for robots.txt parsing in recon sources
2026-04-06 00:40:39 +03:00
salvacybersec
ff128c8063
docs(09): create phase plan
2026-04-06 00:39:27 +03:00
salvacybersec
72414e090a
docs(09): OSINT infrastructure context
2026-04-06 00:33:44 +03:00
salvacybersec
ed25d9806d
docs(phase-08): complete phase execution
2026-04-06 00:32:47 +03:00
salvacybersec
84cfa17c39
docs(08-06): complete dorks CLI command tree plan
2026-04-06 00:28:56 +03:00
salvacybersec
c281c96040
feat(08-06): add dorks run/add/delete with injectable executor
- Add run subcommand dispatching via dorks.Runner (github live,
other sources wrapped into friendly ErrSourceNotImplemented)
- Add add subcommand with source/category validation and embedded
ID collision guard
- Add delete subcommand that refuses embedded dork ids
- Expose newGitHubExecutor as package var for test injection
- cmd/dorks_test.go covers list filtering, add persistence + list
merge marker, invalid source rejection, embedded collision,
embedded delete refusal, custom delete, shodan not-implemented
path, GitHub missing-token auth hint, fake executor run, yaml
export merge, and info for both origins
Completes DORK-03 (list/run/add/export/info/delete) and DORK-04
(--source/--category filtering).
2026-04-06 00:27:41 +03:00
salvacybersec
b7934ce169
feat(08-06): add dorks list/info/export commands
- Replace cmd/stubs.go dorksCmd stub with full command tree
- Add cmd/dorks.go with list, info, export subcommands
- Wire Registry + custom_dorks merge for list/export
- Bind GITHUB_TOKEN env var via viper for downstream run
Satisfies part of DORK-03 (list/info/export) and DORK-04 (source/category
filtering). run/add/delete land in Task 2.
2026-04-06 00:26:36 +03:00
salvacybersec
f9e3ad99f8
docs(08-07): complete dork guardrail test plan
2026-04-06 00:25:55 +03:00
salvacybersec
2c554b9c9c
test(08-07): add dork count + uniqueness guardrail
- TestDorkCountGuardrail: enforces DORK-02 >=150 floor
- TestDorkCountPerSource: per-source minimums (github>=50, google>=30, shodan>=20, censys>=15, zoomeye/fofa/gitlab>=10, bing>=5)
- TestDorkCategoriesPresent: all 5 DORK-01 categories present
- TestDorkIDsUnique: no collisions across source files
2026-04-06 00:24:51 +03:00
salvacybersec
3a1ee18198
docs(08-05): complete GitHub Code Search live executor plan
- GitHubExecutor implements Executor interface against api.github.com/search/code
- Retry-After honored once for 403/429; ctx cancel respected during sleep
- ErrMissingAuth wrapped for empty token AND 401 server response
- 8 httptest-backed subtests cover success/limit-cap/retry/rate-limit/401/422/source
- Zero new dependencies (stdlib net/http + net/url only)
2026-04-06 00:23:16 +03:00
salvacybersec
2617b22753
docs(08-03): complete Google + Shodan dorks plan
- 30 Google + 20 Shodan dorks delivered
- Requirements DORK-01, DORK-02, DORK-04 marked complete
- SUMMARY.md records list-format YAML + dual-location mirror pattern
2026-04-06 00:22:38 +03:00