Harden DeepSeek agent: LiteLLM adapter, DSML/reasoning/embeddings/error fixes

- LiteLLMAdapter (subclasses OpenAIAdapter via _acreate hook): routes DeepSeek
  through LiteLLM. Opt-in AGENTIC_DEFAULT_MODEL_PROVIDER=litellm. A/B beat the
  hand-rolled adapter (0 DSML, 0 parse-fails). Defensive chunk.usage getattr,
  token-estimate usage fallback for billing, quiet litellm logs.
- DSML parser: tolerate single/multi fullwidth pipes, honor string="true/false"
  typed args (openai_adapter fallback when DeepSeek leaks tool calls as text).
- Thinking mode: capture and round-trip reasoning_content across turns.
- Embeddings: dedicated AGENTIC_EMBEDDINGS_API_KEY (DeepSeek has no embeddings);
  disable cleanly when unset to avoid per-turn 401.
- claude_format: friendly generic error messages to the chat, raw only in logs.
- acai agent max_tokens 4096->16384 (whole-file writes no longer truncate);
  system.md size-based edit policy; strict tools opt-in (off).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
Jordan Diaz
2026-06-07 14:49:48 +00:00
parent e34a39e3bf
commit 6a03fdf284
12 changed files with 396 additions and 58 deletions

View File

@@ -4,7 +4,11 @@ description: "Agente genérico de Acai CMS: crea módulos, edita contenido, gest
icon: "code"
category: "development"
temperature: 0.2
max_tokens: 4096
# 16K de salida: cubre escribir un fichero entero (acai_write) + el razonamiento
# (thinking) en un solo turno. Con 4096 el JSON del tool_use se truncaba a mitad
# en ficheros medianos y el agente caia en micro-ediciones lentas. v4-pro soporta
# hasta 384K de salida, asi que 16K es conservador.
max_tokens: 16384
context_sections:
- immutable_rules
- project_profile