Fix AI chat signed file cache busting #21639
…d file handling
- Removed dependency on `FileUrlService` in `AgentChatStreamingService` and simplified message processing by directly mapping database parts to UI message parts.
- Introduced `inlineFilePartsAsBase64` utility to handle inlining of file parts as base64 in `ChatExecutionService`, enhancing file content retrieval and integration.
- Updated `ChatExecutionService` to utilize the new utility for processing messages with inlined file content, improving overall message handling efficiency.
2 issues found across 3 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/chat-execution.service.ts">
<violation number="1" location="packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/chat-execution.service.ts:269">
P1: Pruning decision uses stale `conversationSizeTokens` after file base64 inlining. Large attachments can push actual prompt over model context without triggering compaction.</violation>
</file>
};
- const rawModelMessages = await convertToModelMessages(processedMessages);
+ const messagesWithInlinedFiles = await inlineFilePartsAsBase64(
P1: Pruning decision uses stale conversationSizeTokens after file base64 inlining. Large attachments can push actual prompt over model context without triggering compaction.
Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/chat-execution.service.ts, line 269:
<comment>Pruning decision uses stale `conversationSizeTokens` after file base64 inlining. Large attachments can push actual prompt over model context without triggering compaction.</comment>
<file context>
@@ -263,7 +266,19 @@ export class ChatExecutionService {
};
- const rawModelMessages = await convertToModelMessages(processedMessages);
+ const messagesWithInlinedFiles = await inlineFilePartsAsBase64(
+ processedMessages,
+ (fileId) =>
</file context>
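One possible way to address the stale-size concern raised above is to re-estimate the prompt size after the file parts have been inlined, and base the compaction decision on that number. The sketch below is illustrative only: the types, `estimateTokens`, `shouldCompact`, the 4-characters-per-token heuristic, and the 0.8 threshold are all assumptions, not code from this PR or from the AI SDK. Providers also count binary attachments differently, so character length is only a crude stand-in.

```typescript
// Hypothetical shapes for illustration; not the project's or the SDK's real types.
type TextPart = { type: 'text'; text: string };
type FilePart = { type: 'file'; mediaType: string; data: string }; // data = base64
type ModelMessage = {
  role: 'user' | 'assistant';
  content: Array<TextPart | FilePart>;
};

// Assumption: roughly 4 characters per token. Real tokenizers differ.
const CHARS_PER_TOKEN = 4;

// Estimates size from the messages as they will actually be sent,
// i.e. including any inlined base64 payloads.
const estimateTokens = (messages: ModelMessage[]): number => {
  let chars = 0;

  for (const message of messages) {
    for (const part of message.content) {
      chars += part.type === 'text' ? part.text.length : part.data.length;
    }
  }

  return Math.ceil(chars / CHARS_PER_TOKEN);
};

// Decides on compaction from the post-inlining estimate rather than a
// size that was computed before attachments were expanded.
const shouldCompact = (
  messagesAfterInlining: ModelMessage[],
  contextWindowTokens: number,
  threshold = 0.8,
): boolean =>
  estimateTokens(messagesAfterInlining) > contextWindowTokens * threshold;
```

The point of the sketch is only the ordering: measure after inlining, then decide whether to prune.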
…se64 utility
- Updated `ChatExecutionService` to log warnings when AI chat attachments cannot be loaded, improving error visibility.
- Modified `inlineFilePartsAsBase64` to return a placeholder message for unavailable attachments, ensuring better user feedback in chat messages.
- These changes enhance the robustness of file content retrieval and improve overall user experience in the chat interface.
🔍 Automated Pre-Review: ✅ No issues detected. This PR is ready for human review. Automated pre-review; human approval still required.
Hi @abdulrahmancodes, I'm not sure this is the right approach (but I'm also not sure which approach we should take). For example, PDF files are parsed and provided in an XML shape at Anthropic. It would be great to compare cost and processing time before/after with a PDF of a few pages. I wonder if we should instead create a readFile tool with a specific return format according to file type.
@abdulrahmancodes I tried to find another way but found nothing really interesting for the moment. I'll test your PR and approve it if it works with the main providers!
AI Chat File Handling: Investigation Summary
The PR (
| Provider | file-url | base64 (image-data/file-data) |
|---|---|---|
| OpenAI (default = Responses API) | ✅ provider fetches it | ✅ |
| Anthropic | ✅ provider fetches it | file-data PDF-only else dropped; no file-id |
| ✅ | ✅ | |
| xAI (Responses) | ✅ | ✅ |
| Mistral | ❌ JSON.stringify'd to text | ❌ stringified (token bloat) |
| openai-compatible (DeepSeek/custom) | ❌ JSON.stringify'd to text | ❌ stringified (token bloat) |
Conclusions
- `file-url` from a tool is the worst option: where it works it relies on the provider fetching your URL → reintroduces the exact expiry/staleness/cache-busting this PR removed; where it doesn't, it's silently stringified; and the SDK never rescues you.
- Even base64 from a tool isn't universally safe (Mistral/openai-compatible stringify it; Anthropic drops non-PDF files).
- This validates the PR's choice: inlining base64 at the message-part level inherits the SDK's mature, uniform handling; moving the same content into a tool result forfeits it.
- Net guidance: a `fetch_file` tool to strip files from history is viable only for text output across all providers. Anything binary (images/PDFs) must live in a message part, exactly where this PR put it.
- The genuinely "clean" alternative (per-provider Files API + `file-id`) was deliberately skipped by the author due to the 8-provider spread and lifecycle management.
Hi @abdulrahmancodes, it works well, and it has the same performance as main, which is a good point. But I tried to break the cache by uploading many files and did not succeed in busting it. It seems weird, but since the initial issue is not confirmed, we should close this. That said, your approach is still interesting for users whose server is not publicly reachable.
@etiennejouan When I tested the cache busting, it seemed to be working, but I'll take another look to double-check |
What
When you upload a file in AI chat, the prompt cache was getting busted on every turn; any thread with a file just never hit the cache.
Why: we store only the `fileId` and re-signed a fresh URL each time the thread was loaded. That signed URL gets handed straight to the provider, and since the token changes every turn, the cached conversation prefix changes too and the cache misses.
Fix
Instead of giving the provider a signed URL, we download the file on our side and inline it as base64 bytes right before the model call. The bytes don't change between turns, so the cached prefix stays stable.
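A rough sketch of the idea, not the PR's actual `inlineFilePartsAsBase64`: the message shapes, `inlineFilesSketch`, the `loadFileBytes` callback, and the placeholder text are all hypothetical names chosen for illustration. It replaces each stored file reference with the file's base64 bytes, and falls back to a text placeholder when the file cannot be loaded. Because the output depends only on the file contents, two turns over the same thread serialize identically.

```typescript
// Hypothetical shapes for illustration; the real UI/model message types differ.
type TextPart = { type: 'text'; text: string };
type StoredFilePart = { type: 'file'; fileId: string; mediaType: string };
type InlinedFilePart = { type: 'file'; mediaType: string; data: string };
type StoredMessage = {
  role: 'user' | 'assistant';
  parts: Array<TextPart | StoredFilePart>;
};
type InlinedMessage = {
  role: 'user' | 'assistant';
  parts: Array<TextPart | InlinedFilePart>;
};

// Assumption: resolves the file's bytes, or null if it cannot be read.
type LoadFileBytes = (fileId: string) => Promise<Uint8Array | null>;

const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/';

// Dependency-free base64 encoder so the sketch runs in any JS runtime.
const toBase64 = (bytes: Uint8Array): string => {
  let out = '';

  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;

    out += BASE64_ALPHABET[b0 >> 2];
    out += BASE64_ALPHABET[((b0 & 3) << 4) | (b1 >> 4)];
    out += i + 1 < bytes.length ? BASE64_ALPHABET[((b1 & 15) << 2) | (b2 >> 6)] : '=';
    out += i + 2 < bytes.length ? BASE64_ALPHABET[b2 & 63] : '=';
  }

  return out;
};

// Swaps every file reference for its inlined bytes right before the model call.
const inlineFilesSketch = async (
  messages: StoredMessage[],
  loadFileBytes: LoadFileBytes,
): Promise<InlinedMessage[]> =>
  Promise.all(
    messages.map(async (message) => ({
      role: message.role,
      parts: await Promise.all(
        message.parts.map(
          async (part): Promise<TextPart | InlinedFilePart> => {
            if (part.type !== 'file') return part;

            const bytes = await loadFileBytes(part.fileId);

            // Hypothetical placeholder text for attachments that cannot be loaded.
            if (bytes === null) {
              return { type: 'text', text: '[Attachment unavailable]' };
            }

            return {
              type: 'file',
              mediaType: part.mediaType,
              data: toBase64(bytes),
            };
          },
        ),
      ),
    })),
  );
```

Contrast this with a signed URL, where a fresh token on every load changes the serialized message even though the file itself is unchanged.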
Follow-ups / things considered