keyhunter/.planning/phases/08-dork-engine/08-05-PLAN.md
2026-04-06 00:13:13 +03:00

---
phase: 08-dork-engine
plan: 05
type: execute
wave: 2
depends_on:
  - 08-01
files_modified:
  - pkg/dorks/github.go
  - pkg/dorks/github_test.go
autonomous: true
requirements:
  - DORK-02
must_haves:
  truths:
    - 'GitHubExecutor.Source() returns "github"'
    - 'GitHubExecutor.Execute runs GitHub Code Search against api.github.com and returns []Match'
    - 'Missing token returns ErrMissingAuth with setup instructions'
    - 'Retry-After header is honored (sleep + retry once) for 403/429'
    - 'Response items mapped to Match with URL, Path, Snippet (text_matches)'
  artifacts:
    - path: pkg/dorks/github.go
      provides: 'GitHubExecutor implementing Executor interface'
      contains: 'type GitHubExecutor struct'
    - path: pkg/dorks/github_test.go
      provides: 'httptest server exercising success/auth/rate-limit paths'
      contains: 'httptest.NewServer'
  key_links:
    - from: pkg/dorks/github.go
      to: 'https://api.github.com/search/code'
      via: 'net/http client'
      pattern: 'api.github.com/search/code'
    - from: pkg/dorks/github.go
      to: 'pkg/dorks/executor.go Executor interface'
      via: 'interface satisfaction'
      pattern: 'Execute(ctx'
---

Implement the live GitHub Code Search executor, the only source that actually runs in Phase 8 (all other executors stay stubbed with ErrSourceNotImplemented). It issues `GET https://api.github.com/search/code?q={query}`, authenticated with a token from the GITHUB_TOKEN env var or viper config. It honors rate-limit responses (403/429) by sleeping for the Retry-After interval and retrying once, and it maps response items to pkg/dorks.Match entries consumable by the engine pipeline in downstream phases.

Purpose: Satisfies the "GitHub live" slice of DORK-02 and unblocks `keyhunter dorks run --source=github` in Plan 08-06.
Output: Working pkg/dorks.GitHubExecutor + httptest-backed test suite.
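
To make the request shape in the objective concrete, here is a minimal sketch of assembling the Code Search URL with the standard library. The helper name `buildSearchURL` and the sample query are assumptions for illustration only; the plan itself formats the URL inline inside `Execute`.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// buildSearchURL is a hypothetical helper (not part of the plan). It builds
// the Code Search endpoint URL and lets net/url handle the escaping.
func buildSearchURL(baseURL, query string, perPage int) string {
	params := url.Values{}
	params.Set("q", query)
	params.Set("per_page", strconv.Itoa(perPage))
	// Encode sorts keys alphabetically, so the output is deterministic.
	return baseURL + "/search/code?" + params.Encode()
}

func main() {
	// The query below is an invented example dork.
	fmt.Println(buildSearchURL("https://api.github.com", "filename:.env OPENAI_API_KEY", 30))
}
```

Using `url.Values` keeps the escaping rules in one place and makes the URL easy to compare in tests, since the parameter order is stable.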

<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>

<context>
@.planning/phases/08-dork-engine/08-CONTEXT.md
@.planning/phases/08-dork-engine/08-01-PLAN.md
@pkg/dorks/executor.go

<interfaces>
```go
type Executor interface {
    Source() string
    Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

type Match struct {
    DorkID  string
    Source  string
    URL     string
    Snippet string
    Path    string
}

var ErrMissingAuth = errors.New("dork source requires auth credentials")
```
</interfaces>
</context>
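
The interface contract above can be exercised in isolation. The following is a sketch under stated assumptions: `Dork` is reduced to the two fields the plan uses (`ID`, `Query`), and `tokenExecutor`, `runWithToken`, and `isMissingAuth` are hypothetical names invented for the example. It shows a compile-time interface check and how a wrapped `ErrMissingAuth` remains detectable with `errors.Is`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Dork is a stand-in; the real type lives in pkg/dorks and may have more fields.
type Dork struct {
	ID    string
	Query string
}

type Match struct {
	DorkID, Source, URL, Snippet, Path string
}

type Executor interface {
	Source() string
	Execute(ctx context.Context, d Dork, limit int) ([]Match, error)
}

var ErrMissingAuth = errors.New("dork source requires auth credentials")

// tokenExecutor is a hypothetical executor used only for this illustration.
type tokenExecutor struct{ token string }

// Compile-time check: the build fails if *tokenExecutor stops satisfying Executor.
var _ Executor = (*tokenExecutor)(nil)

func (e *tokenExecutor) Source() string { return "example" }

func (e *tokenExecutor) Execute(ctx context.Context, d Dork, limit int) ([]Match, error) {
	if e.token == "" {
		// %w wraps the sentinel so callers can still match it with errors.Is.
		return nil, fmt.Errorf("%w: set a token first", ErrMissingAuth)
	}
	return []Match{{DorkID: d.ID, Source: e.Source()}}, nil
}

// runWithToken runs the demo executor once with the given token.
func runWithToken(token string) ([]Match, error) {
	e := &tokenExecutor{token: token}
	return e.Execute(context.Background(), Dork{ID: "demo"}, 10)
}

// isMissingAuth reports whether err wraps ErrMissingAuth.
func isMissingAuth(err error) bool { return errors.Is(err, ErrMissingAuth) }

func main() {
	_, err := runWithToken("")
	fmt.Println(isMissingAuth(err), err)
}
```

The same `var _ Executor = (*GitHubExecutor)(nil)` pattern could be placed in pkg/dorks/github.go so that interface drift is caught at build time rather than in tests.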

<tasks>

<task type="auto" tdd="true">
  <name>Task 1: GitHubExecutor with net/http + Retry-After handling</name>
  <files>pkg/dorks/github.go, pkg/dorks/github_test.go</files>
  <behavior>
    - Test: Execute with empty token returns ErrMissingAuth (wrapped) without hitting HTTP
    - Test: Execute with httptest server returning 200 + items parses response into []Match with URL/Path/Snippet
    - Test: limit=5 caps returned Match count at 5 even if API returns 10
    - Test: 403 with X-RateLimit-Remaining=0 and Retry-After=1 sleeps and retries once, then succeeds
    - Test: 401 returns ErrMissingAuth (token rejected)
    - Test: 422 (invalid query) returns a descriptive error containing the status code
    - Test: Source() returns "github"
  </behavior>
  <action>
    Create pkg/dorks/github.go:

    ```go
    package dorks

    import (
        "context"
        "encoding/json"
        "fmt"
        "io"
        "net/http"
        "net/url"
        "strconv"
        "time"
    )

    type GitHubExecutor struct {
        Token      string
        BaseURL    string // default "https://api.github.com", overridable for tests
        HTTPClient *http.Client
        MaxRetries int    // default 1
    }

    func NewGitHubExecutor(token string) *GitHubExecutor {
        return &GitHubExecutor{
            Token:      token,
            BaseURL:    "https://api.github.com",
            HTTPClient: &http.Client{Timeout: 30 * time.Second},
            MaxRetries: 1,
        }
    }

    func (g *GitHubExecutor) Source() string { return "github" }

    type ghSearchResponse struct {
        TotalCount int `json:"total_count"`
        Items      []struct {
            Name       string `json:"name"`
            Path       string `json:"path"`
            HTMLURL    string `json:"html_url"`
            Repository struct {
                FullName string `json:"full_name"`
            } `json:"repository"`
            TextMatches []struct {
                Fragment string `json:"fragment"`
            } `json:"text_matches"`
        } `json:"items"`
    }

    func (g *GitHubExecutor) Execute(ctx context.Context, d Dork, limit int) ([]Match, error) {
        if g.Token == "" {
            return nil, fmt.Errorf("%w: set GITHUB_TOKEN env var or `keyhunter config set dorks.github.token <pat>` (needs public_repo scope)", ErrMissingAuth)
        }
        if limit <= 0 || limit > 100 {
            limit = 30
        }

        url := fmt.Sprintf("%s/search/code?q=%s&per_page=%d", g.BaseURL, urlQueryEscape(d.Query), limit)

        var resp *http.Response
        for attempt := 0; attempt <= g.MaxRetries; attempt++ {
            req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
            if err != nil { return nil, err }
            req.Header.Set("Accept", "application/vnd.github.v3.text-match+json")
            req.Header.Set("Authorization", "Bearer "+g.Token)
            req.Header.Set("User-Agent", "keyhunter-dork-engine")

            r, err := g.HTTPClient.Do(req)
            if err != nil { return nil, fmt.Errorf("github search: %w", err) }

            if r.StatusCode == http.StatusOK {
                resp = r
                break
            }

            body, _ := io.ReadAll(r.Body)
            r.Body.Close()

            switch r.StatusCode {
            case http.StatusUnauthorized:
                return nil, fmt.Errorf("%w: github token rejected (401)", ErrMissingAuth)
            case http.StatusForbidden, http.StatusTooManyRequests:
                if attempt < g.MaxRetries {
                    sleep := parseRetryAfter(r.Header.Get("Retry-After"))
                    select {
                    case <-time.After(sleep):
                        continue
                    case <-ctx.Done():
                        return nil, ctx.Err()
                    }
                }
                return nil, fmt.Errorf("github rate limit: %d %s", r.StatusCode, string(body))
            default:
                return nil, fmt.Errorf("github search failed: %d %s", r.StatusCode, string(body))
            }
        }
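        if resp == nil {
            // Only reachable when MaxRetries < 0, so the loop body never ran.
            return nil, fmt.Errorf("github search: no request attempted (MaxRetries=%d)", g.MaxRetries)
        }
        defer resp.Body.Close()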

        var parsed ghSearchResponse
        if err := json.NewDecoder(resp.Body).Decode(&parsed); err != nil {
            return nil, fmt.Errorf("decoding github response: %w", err)
        }

        out := make([]Match, 0, len(parsed.Items))
        for _, it := range parsed.Items {
            snippet := ""
            if len(it.TextMatches) > 0 {
                snippet = it.TextMatches[0].Fragment
            }
            out = append(out, Match{
                DorkID:  d.ID,
                Source:  "github",
                URL:     it.HTMLURL,
                Path:    it.Repository.FullName + "/" + it.Path,
                Snippet: snippet,
            })
            if len(out) >= limit { break }
        }
        return out, nil
    }

    func parseRetryAfter(v string) time.Duration {
        if v == "" { return time.Second }
        if secs, err := strconv.Atoi(v); err == nil {
            return time.Duration(secs) * time.Second
        }
        return time.Second
    }

    // urlQueryEscape escapes s for use as a query-string value.
    func urlQueryEscape(s string) string {
        return url.QueryEscape(s)
    }
    ```

    The helper relies on `url.QueryEscape` from "net/url" (make sure it is imported); do NOT hand-roll the escaping.
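
The `parseRetryAfter` helper above only understands the delay-seconds form. As an optional sketch, the variant below also accepts the HTTP-date form that the Retry-After header permits, and clamps the result. The names `retryDelay` and `demoDelay`, the fixed clock, and the 60-second cap are assumptions for illustration, not requirements of this plan.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
	"time"
)

// retryDelay is a hypothetical variant of parseRetryAfter. It accepts both
// Retry-After forms (delay-seconds and HTTP-date) and clamps the result to
// [0, max]. Empty or unparseable values fall back to one second.
func retryDelay(header string, now time.Time, max time.Duration) time.Duration {
	d := time.Second
	if secs, err := strconv.Atoi(header); err == nil {
		d = time.Duration(secs) * time.Second
	} else if t, err := http.ParseTime(header); err == nil {
		d = t.Sub(now)
	}
	if d < 0 {
		d = 0
	}
	if d > max {
		d = max
	}
	return d
}

// demoDelay pins the clock and the cap so the results are reproducible.
func demoDelay(header string) time.Duration {
	now := time.Date(2026, time.April, 6, 0, 0, 0, 0, time.UTC)
	return retryDelay(header, now, 60*time.Second)
}

func main() {
	for _, h := range []string{"3", "", "Mon, 06 Apr 2026 00:00:10 GMT", "9999", "-5"} {
		fmt.Printf("%q -> %s\n", h, demoDelay(h))
	}
}
```

Passing `now` as a parameter keeps the function deterministic under test; the cap guards against a server asking for an unreasonably long sleep.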

    Create pkg/dorks/github_test.go using httptest.NewServer. Override
    executor.BaseURL to the test server URL. One subtest per behavior case.
    For Retry-After test: server returns 403 with Retry-After: 1 on first
    request, 200 with fake items on second.
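
The Retry-After fixture described above could look roughly like the following sketch. The helper names `newFlakyServer` and `probe` and the fake JSON payload are assumptions; in the real test the executor's BaseURL would point at `srv.URL` instead of issuing plain GETs.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
)

// newFlakyServer returns a test server that answers the first request with
// 403 + Retry-After: 1 and every later request with a 200 JSON body.
// The payload is a made-up minimal search response.
func newFlakyServer(hits *int32) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if atomic.AddInt32(hits, 1) == 1 {
			w.Header().Set("Retry-After", "1")
			w.Header().Set("X-RateLimit-Remaining", "0")
			w.WriteHeader(http.StatusForbidden)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprint(w, `{"total_count":1,"items":[{"path":"a.env","html_url":"https://example.invalid/a.env"}]}`)
	}))
}

// probe sends n sequential GETs and records each status code plus the
// Retry-After value seen on the first response.
func probe(n int) (codes []int, retryAfter string, err error) {
	var hits int32
	srv := newFlakyServer(&hits)
	defer srv.Close()
	for i := 0; i < n; i++ {
		resp, err := srv.Client().Get(srv.URL + "/search/code?q=x")
		if err != nil {
			return nil, "", err
		}
		if i == 0 {
			retryAfter = resp.Header.Get("Retry-After")
		}
		codes = append(codes, resp.StatusCode)
		resp.Body.Close()
	}
	return codes, retryAfter, nil
}

func main() {
	codes, retryAfter, err := probe(2)
	fmt.Println(codes, retryAfter, err)
}
```

Counting hits with `sync/atomic` lets the test also assert that exactly two requests reached the server, which confirms the executor retried once and no more.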

    Do NOT register GitHubExecutor into a global Runner here — Plan 08-06 does
    the wiring inside cmd/dorks.go via NewGitHubExecutor(viper.GetString(...)).
  </action>
  <verify>
    <automated>cd /home/salva/Documents/apikey && go test ./pkg/dorks/... -run GitHub -v</automated>
  </verify>
  <done>
    All GitHub executor test cases pass; Execute honors token, rate limit, and
    limit cap; Match fields populated from real response shape.
  </done>
</task>

</tasks>

<verification>
`go test ./pkg/dorks/...` passes including all new GitHub cases.
</verification>

<success_criteria>
- pkg/dorks.GitHubExecutor implements Executor interface
- Live GitHub Code Search calls are testable via httptest (BaseURL override)
- ErrMissingAuth surfaces with actionable setup instructions
- Retry-After respected once before giving up
</success_criteria>

<output>
After completion, create `.planning/phases/08-dork-engine/08-05-SUMMARY.md`
</output>