fix(adapter): execute tool calls that DeepSeek emits as DSML text

Third failure mode of the OpenAI connector (distinct from followup_mode and from finish_reason=stop): DeepSeek sometimes emits tool calls in its internal DSML format (<｜｜DSML｜｜tool_calls>…, with U+FF5C) as TEXT in the content instead of as native tool_calls. The OpenAI endpoint does not convert it, so the adapter treated it as plain text and the agent "stalled", showing inert DSML (0 tools).

Fix in OpenAIAdapter.stream: reuse the claude_adapter parser (_parse_xml_tool_calls / _TOOL_CALL_OPEN_RE). The adapter accumulates the content; if it detects the start of a tool call in the text, it stops emitting text to the user (DSML must not be visible); when the turn closes, if there were no native tool_calls, it parses the accumulated content and emits the tool calls it finds as tool_use so the engine executes them.

Validated: the real DSML from the session (2x acai_grep) is parsed correctly.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
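The suppression step described above can be sketched in isolation. This is a minimal illustration, not the adapter's actual code: `MARKER`, `OPEN_RE`, and `visible_text` are hypothetical names, and the pattern only matches the DSML opening tag quoted in this message, whereas the real `_TOOL_CALL_OPEN_RE` from claude_adapter is not shown here and may match more forms.

```python
import re
from typing import Iterable, Iterator

# Hypothetical stand-in for _TOOL_CALL_OPEN_RE: matches only the DSML opening
# tag quoted in the commit message (the bars are U+FF5C, not ASCII "|").
MARKER = "<\uFF5C\uFF5CDSML\uFF5C\uFF5Ctool_calls>"
OPEN_RE = re.compile(re.escape(MARKER))


def visible_text(deltas: Iterable[str]) -> Iterator[str]:
    """Yield only the text that should reach the user.

    Everything is accumulated, but once a tool-call opening tag is seen,
    nothing further is yielded.
    """
    full_content = ""      # everything received so far
    emitted_chars = 0      # prefix of full_content already handled
    suppress_text = False  # set once a tool call in text is detected

    for delta in deltas:
        full_content += delta
        if suppress_text:
            continue
        match = OPEN_RE.search(full_content, emitted_chars)
        if match:
            suppress_text = True
            # Emit any text that precedes the tag, then go silent.
            if match.start() > emitted_chars:
                yield full_content[emitted_chars:match.start()]
        else:
            yield full_content[emitted_chars:]
        emitted_chars = len(full_content)
```

For example, feeding `["Searching. ", MARKER, "payload"]` yields only `"Searching. "`. Because this sketch searches from the already-emitted offset, it would not catch an opening tag split across two deltas; handling that case would need a small hold-back buffer.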
2026-06-05 20:15:49 +00:00
parent d6b04e4122
commit e34a39e3bf
1 changed file with 44 additions and 7 deletions
--- a/src/adapters/openai_adapter.py
+++ b/src/adapters/openai_adapter.py
@@ -55,9 +55,19 @@ class OpenAIAdapter(ModelAdapter):

        stream = await self._client.chat.completions.create(**kwargs)

+        # Tool-calls-as-text fallback: DeepSeek sometimes emits tool calls in
+        # its internal DSML format as TEXT (in the content) instead of as
+        # native tool_calls. The OpenAI endpoint does not convert it, so
+        # without this the agent "stalls", showing inert DSML. We reuse the
+        # parser from claude_adapter.
+        from .claude_adapter import _parse_xml_tool_calls, _TOOL_CALL_OPEN_RE
+
        tool_calls_acc: dict[int, dict[str, str]] = {}

        final_usage: dict[str, int] = {}
+        full_content = ""       # accumulated content (for the DSML fallback)
+        emitted_chars = 0       # how much of full_content was already emitted as deltas
+        suppress_text = False   # once a tool call in text is detected, emit no more

        async for chunk in stream:
            # With include_usage, the last chunk has usage but no choices
@@ -79,7 +89,19 @@ class OpenAIAdapter(ModelAdapter):

            # Text content
            if delta and delta.content:
-                yield StreamChunk(delta=delta.content)
+                full_content += delta.content
+                if not suppress_text:
+                    # If a tool call starts in the text (DSML/XML), emit what
+                    # precedes it and stop emitting the rest (DSML must not be shown).
+                    m = _TOOL_CALL_OPEN_RE.search(full_content, emitted_chars)
+                    if m:
+                        suppress_text = True
+                        if m.start() > emitted_chars:
+                            yield StreamChunk(delta=full_content[emitted_chars:m.start()])
+                        emitted_chars = len(full_content)
+                    else:
+                        yield StreamChunk(delta=full_content[emitted_chars:])
+                        emitted_chars = len(full_content)

            # Tool calls
            if delta and delta.tool_calls:
@@ -126,6 +148,21 @@ class OpenAIAdapter(ModelAdapter):
                    # Emit usage after tool_use chunks
                    if final_usage:
                        yield StreamChunk(usage=final_usage)
+                else:
+                    # Fallback: DeepSeek may have emitted the tool calls as TEXT
+                    # (DSML/XML) instead of natively. Parse the content and, if it
+                    # contains tool calls, run them anyway; otherwise close the turn.
+                    text_calls = _parse_xml_tool_calls(full_content) if full_content else []
+                    if text_calls:
+                        for c in text_calls:
+                            yield StreamChunk(
+                                tool_call_id=c["id"],
+                                tool_name=c["name"],
+                                tool_arguments=json.dumps(c.get("arguments", {}), ensure_ascii=False),
+                                finish_reason="tool_use",
+                            )
+                        if final_usage:
+                            yield StreamChunk(usage=final_usage)
                    else:
                        yield StreamChunk(
                            finish_reason="end_turn"