docs(03-07): complete emerging/niche + vector DB providers plan

This commit is contained in:
salvacybersec
2026-04-05 14:43:29 +03:00
parent f1e6c8e0ac
commit 592e5ca325

View File

@@ -0,0 +1,99 @@
---
phase: 03-tier-3-9-providers
plan: 07
subsystem: providers
tags: [providers, tier-6, emerging, vector-db, langsmith, pinecone]
requires: []
provides:
- "15 Tier 6 emerging/niche provider YAMLs"
- "LangSmith lsv2_ high-confidence regex"
- "Pinecone pcsk_ high-confidence regex"
- "Vector DB keyword anchors (Weaviate, Qdrant, Chroma, Milvus, Neon)"
affects: [pkg/providers/registry]
tech-stack:
added: []
patterns: [dual-location YAML, keyword-only detection, high-confidence prefixed regex]
key-files:
created:
- providers/reka.yaml
- providers/aleph-alpha.yaml
- providers/lamini.yaml
- providers/writer.yaml
- providers/jasper.yaml
- providers/typeface.yaml
- providers/comet.yaml
- providers/wandb.yaml
- providers/langsmith.yaml
- providers/pinecone.yaml
- providers/weaviate.yaml
- providers/qdrant.yaml
- providers/chroma.yaml
- providers/milvus.yaml
- providers/neon.yaml
- pkg/providers/definitions/reka.yaml
- pkg/providers/definitions/aleph-alpha.yaml
- pkg/providers/definitions/lamini.yaml
- pkg/providers/definitions/writer.yaml
- pkg/providers/definitions/jasper.yaml
- pkg/providers/definitions/typeface.yaml
- pkg/providers/definitions/comet.yaml
- pkg/providers/definitions/wandb.yaml
- pkg/providers/definitions/langsmith.yaml
- pkg/providers/definitions/pinecone.yaml
- pkg/providers/definitions/weaviate.yaml
- pkg/providers/definitions/qdrant.yaml
- pkg/providers/definitions/chroma.yaml
- pkg/providers/definitions/milvus.yaml
- pkg/providers/definitions/neon.yaml
modified: []
decisions:
- "Used high-confidence regex only for documented prefixes (lsv2_, pcsk_); all other Tier 6 providers rely on keyword-only detection to avoid false positives at scale"
- "W&B pattern uses low-confidence 40-hex regex gated by keyword pre-filter (no documented prefix)"
- "Jasper, Typeface, and all vector DBs use empty verify URL (keyword-only, verification deferred)"
metrics:
duration: ~6m
completed: 2026-04-05
---
# Phase 03 Plan 07: Emerging/Niche Providers + Vector DBs Summary
One-liner: Added 15 Tier 6 emerging/niche provider definitions covering emerging LLM labs, writing tools, observability, and vector databases — completing PROV-06.
## What Was Built
Created 30 YAML files (15 providers x 2 locations) satisfying requirement PROV-06:
**Emerging LLM labs (3):** Reka AI, Aleph Alpha, Lamini
**Writing tools (3):** Writer (palmyra), Jasper AI, Typeface
**Observability (3):** Comet ML/Opik, Weights & Biases, LangSmith
**Vector DBs (6):** Pinecone, Weaviate, Qdrant, Chroma, Milvus/Zilliz, Neon
High-confidence regex was applied only where documented prefixes exist:
- LangSmith: `lsv2_(pt|sk)_[a-f0-9]{32}_[a-f0-9]{10}`
- Pinecone: `pcsk_[A-Za-z0-9]{40,}`
- W&B: 40-hex (confidence: low) gated by keyword pre-filter
All other providers rely on keyword-only detection (env var anchors, domain hostnames) to avoid false positives at scale — consistent with Phase 2 lessons documented in 03-CONTEXT.md.
## Commits
- `fbe9e8b` feat(03-07): add 8 emerging labs, writing tools, observability providers
- `a73cea3` feat(03-07): add LangSmith and 6 vector DB providers
## Verification
- `go test ./pkg/providers/... -count=1` — PASS
- `go test ./pkg/engine/... -count=1` — PASS
- `grep -l 'tier: 6' providers/*.yaml | wc -l` — 15
- All 15 provider pairs pass `diff providers/X.yaml pkg/providers/definitions/X.yaml`
## Deviations from Plan
None - plan executed exactly as written.
## Self-Check: PASSED
- All 30 YAML files exist (verified via diff)
- Both commits present in git log (fbbb54b, a73cea3)
- Tier 6 count = 15 (matches PROV-06 requirement)
- Provider and engine test suites green