salvacybersec/keyhunter

salvacybersec f9e3ad99f8 docs(08-07): complete dork guardrail test plan

2026-04-06 00:25:55 +03:00

phase: 08-dork-engine
plan:
subsystem: dorks
tags: test, guardrail, regression-prevention
one_liner: Guardrail test locks DORK-02 floor at >=150 embedded dorks with per-source minimums and ID uniqueness
requires:
  - pkg/dorks.NewRegistry
  - pkg/dorks.Registry.Stats
  - pkg/dorks.Registry.List
provides:
  - CI enforcement of DORK-02 150+ floor
  - per-source minimum enforcement
  - cross-source dork ID uniqueness guarantee
affects:
  - pkg/dorks/
tech_stack:
  added: (none)
  patterns:
    - table-driven per-source minimums
    - guardrail test against real embedded FS (no mocks)
key_files:
  created:
    - pkg/dorks/count_test.go
  modified: (none)
decisions:
  - Test hits the real embedded filesystem via NewRegistry() rather than a synthetic slice; a synthetic slice would not catch YAML regressions.
  - Per-source minimums are hardcoded to the planned distribution (50/30/20/15/10/10/10/5), so removing a file from any source fails CI even if the total still clears 150.
  - Stats.BySource / Stats.ByCategory field names matched the plan exactly; no adjustments needed.
metrics:
  duration: ~3m
  completed: 2026-04-05
  tasks_total: 1
  tasks_completed: 1
  files_created: 1
  files_modified: 0

Phase 08 Plan 07: Dork Count Guardrail Test Summary

Guardrail test suite (pkg/dorks/count_test.go) that enforces the DORK-02 "150+ built-in dorks" requirement against the real embedded filesystem via NewRegistry(). Four tests catch total regressions, per-source drops, missing categories, and ID collisions: the four failure modes a future contributor could introduce without noticing.

What Was Built

Single test file with four TestDork* functions exercising the live embedded corpus:

TestDorkCountGuardrail — asserts len(NewRegistry().List()) >= 150. Error message cites DORK-02 so future maintainers know the threshold is a requirement, not a suggestion.
TestDorkCountPerSource — table-driven check against Stats().BySource. Minimums: github>=50, google>=30, shodan>=20, censys>=15, zoomeye>=10, fofa>=10, gitlab>=10, bing>=5.
TestDorkCategoriesPresent — confirms all five DORK-01 categories (frontier, specialized, infrastructure, emerging, enterprise) appear at least once in Stats().ByCategory.
TestDorkIDsUnique — walks Registry.List() building a seen-map; any duplicate ID across source files fails the test and reports both source files involved.
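
The total and per-source checks can be sketched outside the real package. This is a minimal, self-contained illustration and not the contents of count_test.go: the `sourceFloor` type, the `checkFloors` helper, and the message wording are hypothetical, and the total is summed from the map here, whereas the real test is described as using len(NewRegistry().List()).

```go
package main

import "fmt"

// sourceFloor pairs a source name with its minimum dork count.
// Hypothetical type, introduced only for this sketch.
type sourceFloor struct {
	source string
	min    int
}

// floors mirrors the planned 50/30/20/15/10/10/10/5 distribution.
var floors = []sourceFloor{
	{"github", 50}, {"google", 30}, {"shodan", 20}, {"censys", 15},
	{"zoomeye", 10}, {"fofa", 10}, {"gitlab", 10}, {"bing", 5},
}

// totalFloor is the DORK-02 requirement of at least 150 built-in dorks.
const totalFloor = 150

// checkFloors returns one message per violated floor: the total first,
// then each source in table order. An empty result means every floor holds.
func checkFloors(bySource map[string]int) []string {
	var failures []string
	total := 0
	for _, n := range bySource {
		total += n
	}
	if total < totalFloor {
		failures = append(failures,
			fmt.Sprintf("DORK-02: have %d dorks, want >= %d", total, totalFloor))
	}
	for _, f := range floors {
		// A missing map key reads as 0, so a deleted source fails here too.
		if got := bySource[f.source]; got < f.min {
			failures = append(failures,
				fmt.Sprintf("source %s: have %d dorks, want >= %d", f.source, got, f.min))
		}
	}
	return failures
}

func main() {
	// github grew by 10 while bing vanished: the total is 155, which still
	// clears 150, but the per-source table catches the missing source.
	counts := map[string]int{
		"github": 60, "google": 30, "shodan": 20, "censys": 15,
		"zoomeye": 10, "fofa": 10, "gitlab": 10,
	}
	for _, msg := range checkFloors(counts) {
		fmt.Println(msg) // prints "source bing: have 0 dorks, want >= 5"
	}
}
```

The example in `main` shows why the per-source table exists alongside the total check: a corpus can clear 150 overall while one source has disappeared entirely.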

Verification Results

=== RUN   TestDorkCountGuardrail
--- PASS: TestDorkCountGuardrail (0.00s)
=== RUN   TestDorkCountPerSource
--- PASS: TestDorkCountPerSource (0.00s)
=== RUN   TestDorkCategoriesPresent
--- PASS: TestDorkCategoriesPresent (0.00s)
=== RUN   TestDorkIDsUnique
--- PASS: TestDorkIDsUnique (0.00s)
PASS
ok  github.com/salvacybersec/keyhunter/pkg/dorks  0.017s

Full go test ./pkg/dorks/... also passes (2.024s).

Current embedded corpus state (captured during verification):

Source	Count	Min	Status
github	50	50	at floor
google	30	30	at floor
shodan	20	20	at floor
censys	15	15	at floor
zoomeye	10	10	at floor
fofa	10	10	at floor
gitlab	10	10	at floor
bing	5	5	at floor
total	150	150	at floor

Category	Count
infrastructure	63
frontier	45
specialized	24
emerging	13
enterprise	5

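Every source, and therefore the total, sits exactly at its floor, so the guardrail is biting immediately, which is the whole point. Dropping a single dork from any source would fail both the per-source check and the total check. The categories have no numeric floor; all five are present, which is what TestDorkCategoriesPresent requires.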
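
The category check is a presence test rather than a count floor. A minimal sketch, assuming a ByCategory-style map[string]int as described in this summary; the `missingCategories` helper and `requiredCategories` slice are hypothetical names used only for illustration:

```go
package main

import "fmt"

// requiredCategories lists the five DORK-01 categories named in this summary.
var requiredCategories = []string{
	"frontier", "specialized", "infrastructure", "emerging", "enterprise",
}

// missingCategories returns every required category that has no dorks,
// in the order of requiredCategories. Hypothetical helper for this sketch.
func missingCategories(byCategory map[string]int) []string {
	var missing []string
	for _, c := range requiredCategories {
		if byCategory[c] < 1 {
			missing = append(missing, c)
		}
	}
	return missing
}

func main() {
	// Counts copied from the category table captured during verification.
	byCategory := map[string]int{
		"infrastructure": 63, "frontier": 45, "specialized": 24,
		"emerging": 13, "enterprise": 5,
	}
	fmt.Println(len(missingCategories(byCategory))) // prints 0

	// Losing the smallest category entirely is what this check would catch.
	delete(byCategory, "enterprise")
	fmt.Println(missingCategories(byCategory)) // prints [enterprise]
}
```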

Deviations from Plan

None - plan executed exactly as written. The Stats struct field names (BySource, ByCategory as map[string]int) matched the plan's notes, so no test adjustments were needed.

Key Decisions

Real FS over synthetic — Tests call NewRegistry() directly rather than building a NewRegistryFromDorks(slice) fixture. Synthetic fixtures would not catch the most likely regression (someone deleting a YAML file).
Hardcoded per-source minimums — The 50/30/20/15/10/10/10/5 distribution is written into the test, not derived. If a future plan wants to raise a floor, it must also update the test, which is the correct coupling.
Duplicate-ID test reports both sources — Error message includes both the first and second source of a collision so a reviewer can resolve the conflict without grep.
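
The seen-map approach behind the duplicate-ID decision can be sketched as follows. The `dork` struct with `ID` and `Source` fields, the `findDuplicateIDs` helper, and the message format are assumptions made for illustration; the real type in pkg/dorks may be shaped differently.

```go
package main

import "fmt"

// dork is a hypothetical stand-in for an embedded dork entry.
type dork struct {
	ID     string
	Source string
}

// findDuplicateIDs walks the list once, remembering where each ID was first
// seen, and returns one message per collision naming both sources.
func findDuplicateIDs(dorks []dork) []string {
	seen := make(map[string]string, len(dorks)) // ID -> source of first occurrence
	var collisions []string
	for _, d := range dorks {
		if first, ok := seen[d.ID]; ok {
			collisions = append(collisions,
				fmt.Sprintf("duplicate dork ID %q: first seen in %s, again in %s",
					d.ID, first, d.Source))
			continue
		}
		seen[d.ID] = d.Source
	}
	return collisions
}

func main() {
	// Illustrative IDs only; they are not taken from the real corpus.
	corpus := []dork{
		{ID: "example-key-a", Source: "github"},
		{ID: "example-key-b", Source: "google"},
		{ID: "example-key-a", Source: "gitlab"},
	}
	for _, msg := range findDuplicateIDs(corpus) {
		fmt.Println(msg)
	}
}
```

Storing the first source as the map value, instead of a bare boolean, is what lets the failure message name both sides of the collision.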

Files Created

pkg/dorks/count_test.go (78 lines) — four guardrail tests

Commits

2c554b9 test(08-07): add dork count + uniqueness guardrail

Self-Check: PASSED

pkg/dorks/count_test.go: FOUND
commit 2c554b9: FOUND
all four guardrail tests: PASSED against real embedded FS
full go test ./pkg/dorks/... suite: PASSED
