Harden DeepSeek agent: LiteLLM adapter, DSML/reasoning/embeddings/error fixes

- LiteLLMAdapter (subclasses OpenAIAdapter via _acreate hook): routes DeepSeek
  through LiteLLM. Opt-in via AGENTIC_DEFAULT_MODEL_PROVIDER=litellm. In an A/B
  run it beat the hand-rolled adapter (0 DSML leaks, 0 parse failures). Reads
  chunk.usage defensively via getattr, falls back to a token estimate for
  billing when usage is missing, and quiets litellm logs.
- DSML parser: tolerate single/multi fullwidth pipes, honor string="true/false"
  typed args (openai_adapter fallback when DeepSeek leaks tool calls as text).
- Thinking mode: capture and round-trip reasoning_content across turns.
- Embeddings: dedicated AGENTIC_EMBEDDINGS_API_KEY (DeepSeek has no embeddings);
  disable cleanly when unset to avoid per-turn 401.
- claude_format: send friendly generic error messages to the chat; raw errors
  go only to logs.
- acai agent max_tokens 4096->16384 (whole-file writes no longer truncate);
  system.md size-based edit policy; strict tools opt-in (off by default).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
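
The defensive usage handling mentioned in the first bullet might look roughly like the sketch below. It is not the code from this commit: `usage_from_chunk`, `estimate_tokens`, and the 4-characters-per-token heuristic are hypothetical names and assumptions chosen for illustration.

```python
from types import SimpleNamespace
from typing import Any


def estimate_tokens(text: str) -> int:
    """Very rough estimate, assuming about 4 characters per token."""
    return max(1, len(text) // 4) if text else 0


def usage_from_chunk(chunk: Any, prompt_text: str, completion_text: str) -> dict:
    """Prefer provider-reported usage; fall back to an estimate for billing."""
    # getattr with a default never raises, even if the chunk has no `usage`
    # attribute or the attribute is None.
    usage = getattr(chunk, "usage", None)
    prompt = getattr(usage, "prompt_tokens", None)
    completion = getattr(usage, "completion_tokens", None)
    if prompt is not None and completion is not None:
        return {"prompt_tokens": prompt, "completion_tokens": completion, "estimated": False}
    # Usage missing or incomplete: estimate from the text so billing still works.
    return {
        "prompt_tokens": estimate_tokens(prompt_text),
        "completion_tokens": estimate_tokens(completion_text),
        "estimated": True,
    }


reported = usage_from_chunk(
    SimpleNamespace(usage=SimpleNamespace(prompt_tokens=10, completion_tokens=5)), "", ""
)
fallback = usage_from_chunk(SimpleNamespace(), "a" * 40, "b" * 8)
```

Marking the result with an `estimated` flag is one way to keep estimated and provider-reported figures distinguishable downstream.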
Jordan Diaz
2026-06-07 14:49:48 +00:00
parent e34a39e3bf
commit 6a03fdf284
12 changed files with 396 additions and 58 deletions
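
As an illustration of the DSML bullet in the commit message, here is a minimal sketch of a parser that tolerates one or more fullwidth pipes and honors `string="true"`/`string="false"`. The tag layout (`invoke`, `parameter`, the `name` attribute) is an assumption about the DSML markup, and `parse_dsml_tool_calls` is a hypothetical helper, not the parser changed in this commit.

```python
import json
import re

# One or more fullwidth vertical bars (U+FF5C); leaked markup may repeat them.
P = "\uff5c+"

# ASSUMED tag layout: <｜DSML｜invoke name="..."> wrapping <｜DSML｜parameter ...> tags.
INVOKE_RE = re.compile(
    rf'<{P}DSML{P}invoke\s+name="([^"]+)"\s*>(.*?)</{P}DSML{P}invoke>', re.DOTALL
)
PARAM_RE = re.compile(
    rf'<{P}DSML{P}parameter\s+name="([^"]+)"(?:\s+string="(true|false)")?\s*>'
    rf"(.*?)</{P}DSML{P}parameter>",
    re.DOTALL,
)


def parse_dsml_tool_calls(text: str) -> list[dict]:
    """Extract tool calls that leaked into plain text as DSML markup."""
    calls = []
    for name, body in INVOKE_RE.findall(text):
        args = {}
        for key, is_string, raw in PARAM_RE.findall(body):
            if is_string == "false":
                # string="false" marks a typed value: decode it as JSON,
                # keeping the raw text if it is not valid JSON.
                try:
                    args[key] = json.loads(raw)
                except json.JSONDecodeError:
                    args[key] = raw
            else:
                # string="true" (or attribute absent): keep the literal text.
                args[key] = raw
        calls.append({"name": name, "arguments": args})
    return calls
```

Treating a missing `string` attribute as a plain string is a conservative default assumed here; the actual adapter may choose differently.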

@@ -32,6 +32,33 @@ class Settings(BaseSettings):
    anthropic_base_url: str = ""  # Custom base URL (for MiniMax Anthropic-compatible, etc.)
    openai_api_key: str = ""
    openai_base_url: str = ""  # Custom base URL (for MiniMax, DeepInfra, etc.)
    # --- Embeddings (semantic search) ---
    # DEDICATED credentials for embeddings. Needed because chat uses
    # `openai_api_key` pointed at a compatible endpoint (e.g. DeepSeek, which has
    # NO embeddings API). If empty, falls back to `openai_api_key` for
    # compatibility (only when `openai_base_url` is unset). An empty base_url
    # means real OpenAI (api.openai.com); it does NOT inherit `openai_base_url`.
    embeddings_api_key: str = ""
    embeddings_base_url: str = ""
    embeddings_model: str = "text-embedding-3-small"
    # LiteLLM spike: if default_model_provider=litellm, the model to use (litellm
    # format, e.g. "deepseek/deepseek-v4-pro"). Empty → derived from default_model_id.
    litellm_model: str = ""

    @property
    def effective_embeddings_key(self) -> str:
        """Key to use for embeddings. Prefers the dedicated one; reuses the chat
        key ONLY if chat is real OpenAI (no custom `openai_base_url`). If chat
        points at DeepSeek or another provider, that key does not work for
        embeddings."""
        if self.embeddings_api_key:
            return self.embeddings_api_key
        if not self.openai_base_url:
            return self.openai_api_key
        return ""

    @property
    def embeddings_enabled(self) -> bool:
        return bool(self.effective_embeddings_key or self.embeddings_base_url)

    default_model_provider: str = "claude"
    default_model_id: str = "claude-sonnet-4-20250514"
    # Model override ONLY for the planner sub-loop (acai_plan). If empty,
@@ -43,6 +70,11 @@ class Settings(BaseSettings):
    planner_max_tokens: int = 16000
    max_tokens: int = 4096
    temperature: float = 0.3
    # DeepSeek strict function calling (beta). OPT-IN (default False): requires
    # OpenAI-style schemas (additionalProperties:false, all fields required, etc.)
    # that the current MCP tools do NOT satisfy → returns 400. To enable: compatible
    # schemas + base_url https://api.deepseek.com/beta + AGENTIC_DEEPSEEK_STRICT_TOOLS=true.
    deepseek_strict_tools: bool = False

    # --- Context engine ---
    model_context_window: int = 0  # 0 = use legacy fixed budget / explicit override
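
To make the embeddings key resolution in the first hunk concrete, the sketch below mirrors the two properties on a plain dataclass. `EmbeddingSettings` is a hypothetical stand-in for the real `Settings(BaseSettings)` class, used only so the example runs without pydantic; the key values are placeholders.

```python
from dataclasses import dataclass


@dataclass
class EmbeddingSettings:
    """Hypothetical stand-in that copies only the fields the properties read."""

    openai_api_key: str = ""
    openai_base_url: str = ""
    embeddings_api_key: str = ""
    embeddings_base_url: str = ""

    @property
    def effective_embeddings_key(self) -> str:
        if self.embeddings_api_key:
            return self.embeddings_api_key  # dedicated key always wins
        if not self.openai_base_url:
            return self.openai_api_key  # chat talks to real OpenAI: safe to reuse
        return ""  # chat uses a custom endpoint: its key is not valid for embeddings

    @property
    def embeddings_enabled(self) -> bool:
        return bool(self.effective_embeddings_key or self.embeddings_base_url)


# Chat on DeepSeek, no dedicated key: embeddings are disabled instead of failing per turn.
deepseek_only = EmbeddingSettings(
    openai_api_key="sk-chat", openai_base_url="https://api.deepseek.com"
)
# Chat on real OpenAI: the chat key is reused.
openai_chat = EmbeddingSettings(openai_api_key="sk-openai")
# Dedicated key set: it is used regardless of the chat endpoint.
dedicated = EmbeddingSettings(
    openai_api_key="sk-chat",
    openai_base_url="https://api.deepseek.com",
    embeddings_api_key="sk-embed",
)
```

Callers can then check `embeddings_enabled` once at startup and skip semantic search entirely when it is false, which is the behavior the commit message describes as avoiding a per-turn 401.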