keyhunter/.planning/phases/03-tier-3-9-providers/03-01-PLAN.md
2026-04-05 14:39:54 +03:00

---
phase: 03-tier-3-9-providers
plan: 01
type: execute
wave: 1
depends_on: []
files_modified:
- providers/deepseek.yaml
- providers/zhipu.yaml
- providers/moonshot.yaml
- providers/qwen.yaml
- providers/baidu.yaml
- providers/bytedance.yaml
- providers/01ai.yaml
- providers/minimax.yaml
- providers/baichuan.yaml
- providers/stepfun.yaml
- providers/sensetime.yaml
- providers/iflytek.yaml
- providers/tencent.yaml
- providers/siliconflow.yaml
- providers/360ai.yaml
- providers/kuaishou.yaml
- pkg/providers/definitions/deepseek.yaml
- pkg/providers/definitions/zhipu.yaml
- pkg/providers/definitions/moonshot.yaml
- pkg/providers/definitions/qwen.yaml
- pkg/providers/definitions/baidu.yaml
- pkg/providers/definitions/bytedance.yaml
- pkg/providers/definitions/01ai.yaml
- pkg/providers/definitions/minimax.yaml
- pkg/providers/definitions/baichuan.yaml
- pkg/providers/definitions/stepfun.yaml
- pkg/providers/definitions/sensetime.yaml
- pkg/providers/definitions/iflytek.yaml
- pkg/providers/definitions/tencent.yaml
- pkg/providers/definitions/siliconflow.yaml
- pkg/providers/definitions/360ai.yaml
- pkg/providers/definitions/kuaishou.yaml
autonomous: true
requirements: [PROV-04]
must_haves:
  truths:
    - "Registry loads 16 Tier 4 Chinese/regional provider YAMLs"
    - "Providers without documented key formats use keyword-only detection (no patterns entry)"
    - "DeepSeek uses documented sk- prefix pattern"
    - "All YAMLs are dual-located and diff-clean"
  artifacts:
    - path: "providers/deepseek.yaml"
      provides: "DeepSeek sk- prefix pattern + keywords"
      contains: "deepseek"
    - path: "providers/qwen.yaml"
      provides: "Alibaba Qwen/DashScope sk- prefix pattern + keywords"
      contains: "dashscope"
    - path: "providers/baidu.yaml"
      provides: "Baidu ERNIE/Wenxin keyword detection"
      contains: "wenxin"
  key_links:
    - from: "provider keywords[]"
      to: "Registry Aho-Corasick automaton"
      via: "NewRegistry()"
      pattern: "keywords"
---
<objective>
Create 16 Tier 4 Chinese/regional LLM provider YAML definitions. Most of these providers lack documented key formats, so detection relies on strong keyword lists anchored to their SDK environment variables, domains, and API hostnames rather than on generic regex, which caused false positives in Phase 2.
Purpose: Satisfy PROV-04 (16 Tier 4 providers). Chinese/regional providers are high-value targets but low-signal for regex detection, so keyword-only matching is the chosen mitigation wherever no distinctive key prefix is documented.
Output: 32 YAML files (16 providers × 2 locations).
Addresses PROV-04.
</objective>
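As an illustration of what keyword-only detection means in practice, here is a minimal Go sketch. It uses a naive case-insensitive substring scan as a stand-in for the registry's Aho-Corasick automaton; the function name `matchKeywords`, the map layout, and the case-insensitive behaviour are assumptions for illustration, not the project's actual API.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// matchKeywords returns the (sorted) names of providers that have at least
// one keyword occurring in text. This is a naive substring scan standing in
// for the Aho-Corasick pre-filter; case-insensitivity is an assumption.
func matchKeywords(text string, keywords map[string][]string) []string {
	lower := strings.ToLower(text)
	var hits []string
	for name, kws := range keywords {
		for _, kw := range kws {
			if strings.Contains(lower, strings.ToLower(kw)) {
				hits = append(hits, name)
				break // one keyword is enough to flag the provider
			}
		}
	}
	sort.Strings(hits)
	return hits
}

func main() {
	// Keyword lists abbreviated from the provider definitions in this plan.
	keywords := map[string][]string{
		"zhipu": {"zhipu", "ZHIPU_API_KEY", "open.bigmodel.cn"},
		"baidu": {"wenxin", "QIANFAN_AK", "aip.baidubce.com"},
	}
	fmt.Println(matchKeywords(`export ZHIPU_API_KEY="..."`, keywords))
}
```

A provider with no `patterns` entry would stop at this stage: the keyword hit is the finding, and no regex is consulted.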
<execution_context>
@$HOME/.claude/get-shit-done/workflows/execute-plan.md
@$HOME/.claude/get-shit-done/templates/summary.md
</execution_context>
<context>
@.planning/ROADMAP.md
@.planning/phases/03-tier-3-9-providers/03-CONTEXT.md
@pkg/providers/schema.go
@providers/mistral.yaml
@providers/cohere.yaml
<interfaces>
Provider schema (pkg/providers/schema.go): FormatVersion, Name, DisplayName, Tier, LastVerified, Keywords, Patterns, Verify. NO `category` field. Confidence values: high|medium|low.
Patterns array MAY be empty/omitted — registry allows keyword-only providers. Keywords list MUST have ≥3 entries (Aho-Corasick pre-filter requirement).
Files must be dual-located: providers/X.yaml AND pkg/providers/definitions/X.yaml (Go embed cannot use '..' paths).
Phase 2 lesson: generic regex like `[A-Za-z0-9]{32,64}` causes false positives. For providers without distinctive prefixes, OMIT the patterns field entirely (keyword-only detection).
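The constraints above can be sketched as a small validation routine. The struct and field names below are hypothetical mirrors of the schema fields listed here (the real definitions live in pkg/providers/schema.go and may differ); only the three rules themselves come from this plan: at least 3 keywords, optional patterns, and confidence restricted to high|medium|low.

```go
package main

import "fmt"

// Pattern and Provider are hypothetical, trimmed-down stand-ins for the
// types in pkg/providers/schema.go.
type Pattern struct {
	Regex      string
	EntropyMin float64
	Confidence string
}

type Provider struct {
	Name     string
	Keywords []string
	Patterns []Pattern // may be empty: keyword-only provider
}

// validate enforces the rules stated in this plan.
func validate(p Provider) error {
	if len(p.Keywords) < 3 {
		return fmt.Errorf("%s: need at least 3 keywords, got %d", p.Name, len(p.Keywords))
	}
	for _, pat := range p.Patterns {
		switch pat.Confidence {
		case "high", "medium", "low":
			// allowed values
		default:
			return fmt.Errorf("%s: invalid confidence %q", p.Name, pat.Confidence)
		}
	}
	return nil
}

func main() {
	keywordOnly := Provider{
		Name:     "baidu",
		Keywords: []string{"wenxin", "ernie", "qianfan"},
	}
	fmt.Println(validate(keywordOnly)) // no patterns is acceptable
}
```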
</interfaces>
</context>
<tasks>
<task type="auto">
<name>Task 1: DeepSeek, Zhipu, Moonshot, Qwen, Baidu, ByteDance, 01.AI, MiniMax YAMLs</name>
<files>providers/deepseek.yaml, providers/zhipu.yaml, providers/moonshot.yaml, providers/qwen.yaml, providers/baidu.yaml, providers/bytedance.yaml, providers/01ai.yaml, providers/minimax.yaml, pkg/providers/definitions/deepseek.yaml, pkg/providers/definitions/zhipu.yaml, pkg/providers/definitions/moonshot.yaml, pkg/providers/definitions/qwen.yaml, pkg/providers/definitions/baidu.yaml, pkg/providers/definitions/bytedance.yaml, pkg/providers/definitions/01ai.yaml, pkg/providers/definitions/minimax.yaml</files>
<read_first>
- pkg/providers/schema.go
- providers/mistral.yaml (reference: keyword-anchored generic pattern)
- .planning/phases/03-tier-3-9-providers/03-CONTEXT.md (lessons from Phase 2)
</read_first>
<action>
Create each file in providers/ AND pkg/providers/definitions/ (verbatim copy). DeepSeek, Moonshot, and Qwen use an sk- prefix, so each gets a pattern; all other providers in this task are keyword-only (omit patterns entirely).
providers/deepseek.yaml:
```yaml
format_version: 1
name: deepseek
display_name: DeepSeek
tier: 4
last_verified: "2026-04-05"
keywords:
  - "deepseek"
  - "DEEPSEEK_API_KEY"
  - "api.deepseek.com"
  - "deepseek-chat"
  - "deepseek-coder"
patterns:
  - regex: 'sk-[a-f0-9]{32}'
    entropy_min: 3.5
    confidence: medium
verify:
  method: GET
  url: https://api.deepseek.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
providers/zhipu.yaml:
```yaml
format_version: 1
name: zhipu
display_name: Zhipu AI (GLM)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "zhipu"
  - "ZHIPU_API_KEY"
  - "ZHIPUAI_API_KEY"
  - "bigmodel.cn"
  - "open.bigmodel.cn"
  - "glm-4"
  - "chatglm"
verify:
  method: GET
  url: https://open.bigmodel.cn/api/paas/v4/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
providers/moonshot.yaml:
```yaml
format_version: 1
name: moonshot
display_name: Moonshot AI (Kimi)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "moonshot"
  - "MOONSHOT_API_KEY"
  - "api.moonshot.cn"
  - "kimi"
  - "moonshot-v1"
patterns:
  - regex: 'sk-[A-Za-z0-9]{48}'
    entropy_min: 4.0
    confidence: medium
verify:
  method: GET
  url: https://api.moonshot.cn/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
providers/qwen.yaml:
```yaml
format_version: 1
name: qwen
display_name: Alibaba Qwen (DashScope)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "dashscope"
  - "DASHSCOPE_API_KEY"
  - "qwen"
  - "qwen-turbo"
  - "qwen-max"
  - "dashscope.aliyuncs.com"
patterns:
  - regex: 'sk-[a-f0-9]{32}'
    entropy_min: 3.5
    confidence: medium
verify:
  method: GET
  url: https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200, 400]
  invalid_status: [401, 403]
```
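The `verify` blocks in these definitions imply a simple decision rule: substitute the candidate key into the `{KEY}` placeholder, send the request, and classify the response status. The following is a minimal sketch of that rule under stated assumptions: the `Verify` struct, `renderHeaders`, `classify`, and the "unverifiable"/"unknown" outcomes are hypothetical names, and treating an empty `url` as "no verification available" is a guess about how the stubbed blocks (such as Baidu's) are meant to behave.

```go
package main

import (
	"fmt"
	"strings"
)

// Verify is a hypothetical mirror of the verify block in a provider YAML.
type Verify struct {
	Method        string
	URL           string
	Headers       map[string]string
	ValidStatus   []int
	InvalidStatus []int
}

// renderHeaders replaces the {KEY} placeholder in every header value.
func renderHeaders(v Verify, key string) map[string]string {
	out := make(map[string]string, len(v.Headers))
	for name, val := range v.Headers {
		out[name] = strings.ReplaceAll(val, "{KEY}", key)
	}
	return out
}

// classify maps an HTTP status code to a verification outcome.
func classify(v Verify, status int) string {
	if v.URL == "" {
		return "unverifiable" // assumption: empty URL means no live check
	}
	for _, s := range v.ValidStatus {
		if s == status {
			return "valid"
		}
	}
	for _, s := range v.InvalidStatus {
		if s == status {
			return "invalid"
		}
	}
	return "unknown"
}

func main() {
	qwen := Verify{
		Method:        "GET",
		URL:           "https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation",
		Headers:       map[string]string{"Authorization": "Bearer {KEY}"},
		ValidStatus:   []int{200, 400},
		InvalidStatus: []int{401, 403},
	}
	fmt.Println(renderHeaders(qwen, "sk-example")["Authorization"])
	fmt.Println(classify(qwen, 400), classify(qwen, 401), classify(qwen, 500))
}
```

Listing 400 under `valid_status` fits endpoints that reject a body-less request only after authentication has succeeded, so a 400 still indicates a live key.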
providers/baidu.yaml:
```yaml
format_version: 1
name: baidu
display_name: Baidu ERNIE (Wenxin)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "wenxin"
  - "ernie"
  - "BAIDU_API_KEY"
  - "QIANFAN_AK"
  - "QIANFAN_SK"
  - "aip.baidubce.com"
  - "qianfan"
verify:
  method: POST
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
providers/bytedance.yaml:
```yaml
format_version: 1
name: bytedance
display_name: ByteDance Doubao (Volcengine)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "doubao"
  - "volcengine"
  - "VOLC_ACCESSKEY"
  - "VOLC_SECRETKEY"
  - "ARK_API_KEY"
  - "ark.cn-beijing.volces.com"
verify:
  method: GET
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
providers/01ai.yaml:
```yaml
format_version: 1
name: 01ai
display_name: 01.AI (Yi)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "01.ai"
  - "yi-large"
  - "yi-34b"
  - "YI_API_KEY"
  - "api.lingyiwanwu.com"
  - "lingyiwanwu"
verify:
  method: GET
  url: https://api.lingyiwanwu.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
providers/minimax.yaml:
```yaml
format_version: 1
name: minimax
display_name: MiniMax
tier: 4
last_verified: "2026-04-05"
keywords:
  - "minimax"
  - "MINIMAX_API_KEY"
  - "MINIMAX_GROUP_ID"
  - "api.minimax.chat"
  - "abab6"
verify:
  method: GET
  url: https://api.minimax.chat/v1/text/chatcompletion_v2
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200, 400]
  invalid_status: [401, 403]
```
Copy each file VERBATIM to pkg/providers/definitions/ with the same basename.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && for f in deepseek zhipu moonshot qwen baidu bytedance 01ai minimax; do diff providers/$f.yaml pkg/providers/definitions/$f.yaml || exit 1; done && go test ./pkg/providers/... -count=1 && go test ./pkg/engine/... -count=1</automated>
</verify>
<acceptance_criteria>
- All 16 files exist (8 providers × 2 locations)
- `grep -q 'DEEPSEEK_API_KEY' providers/deepseek.yaml`
- `grep -q 'dashscope' providers/qwen.yaml`
- `grep -q 'wenxin' providers/baidu.yaml`
- `grep -q 'lingyiwanwu' providers/01ai.yaml`
- `grep -L 'patterns:' providers/zhipu.yaml providers/baidu.yaml providers/bytedance.yaml` returns all three (keyword-only, no patterns field)
- `diff providers/deepseek.yaml pkg/providers/definitions/deepseek.yaml` returns no diff
- `go test ./pkg/providers/... -count=1` passes
- `go test ./pkg/engine/... -count=1` passes (no regression from new providers)
</acceptance_criteria>
<done>8 Chinese/regional providers dual-located, registry loads them, engine tests pass.</done>
</task>
<task type="auto">
<name>Task 2: Baichuan, StepFun, SenseTime, iFlytek, Tencent, SiliconFlow, 360 AI, Kuaishou YAMLs</name>
<files>providers/baichuan.yaml, providers/stepfun.yaml, providers/sensetime.yaml, providers/iflytek.yaml, providers/tencent.yaml, providers/siliconflow.yaml, providers/360ai.yaml, providers/kuaishou.yaml, pkg/providers/definitions/baichuan.yaml, pkg/providers/definitions/stepfun.yaml, pkg/providers/definitions/sensetime.yaml, pkg/providers/definitions/iflytek.yaml, pkg/providers/definitions/tencent.yaml, pkg/providers/definitions/siliconflow.yaml, pkg/providers/definitions/360ai.yaml, pkg/providers/definitions/kuaishou.yaml</files>
<read_first>
- pkg/providers/schema.go
- providers/deepseek.yaml (created in Task 1, reference for format)
</read_first>
<action>
Seven of these 8 providers lack publicly documented key formats: use KEYWORD-ONLY detection for them and omit the patterns field entirely. SiliconFlow is the exception and gets a low-confidence sk- prefix pattern.
providers/baichuan.yaml:
```yaml
format_version: 1
name: baichuan
display_name: Baichuan AI
tier: 4
last_verified: "2026-04-05"
keywords:
  - "baichuan"
  - "BAICHUAN_API_KEY"
  - "api.baichuan-ai.com"
  - "baichuan2"
  - "baichuan-turbo"
verify:
  method: GET
  url: https://api.baichuan-ai.com/v1/chat/completions
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200, 400]
  invalid_status: [401, 403]
```
providers/stepfun.yaml:
```yaml
format_version: 1
name: stepfun
display_name: StepFun (阶跃星辰)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "stepfun"
  - "STEP_API_KEY"
  - "STEPFUN_API_KEY"
  - "api.stepfun.com"
  - "step-1v"
verify:
  method: GET
  url: https://api.stepfun.com/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
providers/sensetime.yaml:
```yaml
format_version: 1
name: sensetime
display_name: SenseTime SenseNova
tier: 4
last_verified: "2026-04-05"
keywords:
  - "sensetime"
  - "sensenova"
  - "SENSETIME_API_KEY"
  - "SENSENOVA_API_KEY"
  - "api.sensenova.cn"
verify:
  method: GET
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
providers/iflytek.yaml:
```yaml
format_version: 1
name: iflytek
display_name: iFlytek Spark (讯飞星火)
tier: 4
last_verified: "2026-04-05"
keywords:
  - "iflytek"
  - "xf_yun"
  - "spark_desk"
  - "XFYUN_API_KEY"
  - "XFYUN_API_SECRET"
  - "XFYUN_APPID"
  - "spark-api.xf-yun.com"
verify:
  method: GET
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
providers/tencent.yaml:
```yaml
format_version: 1
name: tencent
display_name: Tencent Hunyuan
tier: 4
last_verified: "2026-04-05"
keywords:
  - "hunyuan"
  - "TENCENTCLOUD_SECRET_ID"
  - "TENCENTCLOUD_SECRET_KEY"
  - "hunyuan.tencentcloudapi.com"
  - "tencent-hunyuan"
verify:
  method: GET
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
providers/siliconflow.yaml:
```yaml
format_version: 1
name: siliconflow
display_name: SiliconFlow
tier: 4
last_verified: "2026-04-05"
keywords:
  - "siliconflow"
  - "SILICONFLOW_API_KEY"
  - "api.siliconflow.cn"
  - "siliconflow.cn"
patterns:
  - regex: 'sk-[a-z]{20,}'
    entropy_min: 3.5
    confidence: low
verify:
  method: GET
  url: https://api.siliconflow.cn/v1/models
  headers:
    Authorization: "Bearer {KEY}"
  valid_status: [200]
  invalid_status: [401, 403]
```
providers/360ai.yaml:
```yaml
format_version: 1
name: 360ai
display_name: 360 AI Brain
tier: 4
last_verified: "2026-04-05"
keywords:
  - "360gpt"
  - "QIHOO_API_KEY"
  - "api.360.cn"
  - "ai.360.com"
  - "360-ai"
verify:
  method: GET
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
providers/kuaishou.yaml:
```yaml
format_version: 1
name: kuaishou
display_name: Kuaishou KwaiYii
tier: 4
last_verified: "2026-04-05"
keywords:
  - "kuaishou"
  - "kwaiyii"
  - "KUAISHOU_API_KEY"
  - "KWAI_API_KEY"
  - "kwai-ai"
verify:
  method: GET
  url: ""
  headers: {}
  valid_status: []
  invalid_status: []
```
Copy each file VERBATIM to pkg/providers/definitions/.
</action>
<verify>
<automated>cd /home/salva/Documents/apikey && for f in baichuan stepfun sensetime iflytek tencent siliconflow 360ai kuaishou; do diff providers/$f.yaml pkg/providers/definitions/$f.yaml || exit 1; done && go test ./pkg/providers/... -count=1 && go test ./pkg/engine/... -count=1</automated>
</verify>
<acceptance_criteria>
- All 16 files exist (8 providers × 2 locations)
- `grep -q 'baichuan-ai.com' providers/baichuan.yaml`
- `grep -q 'hunyuan' providers/tencent.yaml`
- `grep -q 'spark_desk' providers/iflytek.yaml`
- 7 of 8 files have NO patterns field: `grep -L 'patterns:' providers/{baichuan,stepfun,sensetime,iflytek,tencent,360ai,kuaishou}.yaml` returns all 7
- `go test ./pkg/providers/... -count=1` passes
- `go test ./pkg/engine/... -count=1` passes
- Total Tier 4 provider count = 16: `grep -l 'tier: 4' providers/*.yaml | wc -l` → 16
</acceptance_criteria>
<done>All 16 Tier 4 Chinese/regional providers exist dual-located. PROV-04 satisfied.</done>
</task>
</tasks>
<verification>
`grep -l 'tier: 4' providers/*.yaml | wc -l` returns 16.
`go test ./pkg/providers/... ./pkg/engine/... -count=1` passes.
</verification>
<success_criteria>
- 16 Tier 4 providers dual-located
- 12 use keyword-only detection (no patterns field)
- 4 use sk- prefix patterns (deepseek, moonshot, qwen, siliconflow)
- Registry loads all without validation errors
- No engine test regressions
</success_criteria>
<output>
After completion, create `.planning/phases/03-tier-3-9-providers/03-01-SUMMARY.md`
</output>