Fix AI chat signed file cache busting by abdulrahmancodes · Pull Request #21639 · twentyhq/twenty

abdulrahmancodes · 2026-06-15T19:09:52Z

What

When you upload a file in AI chat, the prompt cache was getting busted on every turn: any thread with a file never hit the cache.

Why: we stored only the fileId and re-signed a fresh URL each time the thread was loaded. That signed URL gets handed straight to the provider, and since the token changes every turn, the cached conversation prefix changes too and the cache misses.
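To make the mechanism concrete, here is a minimal sketch of why a changing token defeats prefix caching. The message shape, the `sharedPrefixLength` helper, and the example URL are all hypothetical and only illustrate the idea; real providers match on their own serialized request format.

```typescript
// Hypothetical, simplified message shape (not the real provider payload).
type PromptMessage = { role: "user" | "assistant"; content: string };

// Counts how many leading messages are identical between two requests.
// A prefix cache can only reuse this leading portion.
function sharedPrefixLength(
  previous: PromptMessage[],
  next: PromptMessage[],
): number {
  let count = 0;
  while (
    count < previous.length &&
    count < next.length &&
    JSON.stringify(previous[count]) === JSON.stringify(next[count])
  ) {
    count++;
  }
  return count;
}

// Builds the requests sent on turn 1 and turn 2 of the same thread.
// `fileContent` decides how the attachment is represented on each turn.
function buildTurns(
  fileContent: (turn: number) => string,
): [PromptMessage[], PromptMessage[]] {
  const turn1: PromptMessage[] = [{ role: "user", content: fileContent(1) }];
  const turn2: PromptMessage[] = [
    { role: "user", content: fileContent(2) },
    { role: "assistant", content: "Here is a summary." },
    { role: "user", content: "Now translate it." },
  ];
  return [turn1, turn2];
}

// Signed URL: the token differs per turn, so the first message differs too.
const [signed1, signed2] = buildTurns(
  (turn) => `https://example.invalid/file/abc?token=jwt-turn-${turn}`,
);
// Inlined bytes: the same data URL on every turn.
const [inlined1, inlined2] = buildTurns(() => "data:text/plain;base64,aGk=");

const signedReusable = sharedPrefixLength(signed1, signed2);
const inlinedReusable = sharedPrefixLength(inlined1, inlined2);
console.log({ signedReusable, inlinedReusable });
```

With the signed URL, nothing from the previous request can be reused; with a stable representation, the earlier file message stays part of the shared prefix.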

Fix

Instead of giving the provider a signed URL, we download the file on our side and inline it as base64 bytes right before the model call. The bytes don't change between turns, so the cached prefix stays stable.
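As a rough illustration of the approach, the sketch below inlines file parts as base64 data URLs. It is not the PR's actual `inlineFilePartsAsBase64` implementation: the `Message`/`Part` types, the `inlineFilePartsSketch` name, the loader signature, and the placeholder wording are assumptions made for the example.

```typescript
// Hypothetical part and message shapes, simplified for the example.
type TextPart = { type: "text"; text: string };
type FilePart = { type: "file"; fileId: string; mediaType: string; url?: string };
type Part = TextPart | FilePart;
type Message = { role: "user" | "assistant"; parts: Part[] };

// Assumed loader: resolves to the file bytes, or null if the file is unavailable.
type LoadFileBytes = (fileId: string) => Promise<Uint8Array | null>;

async function inlineFilePartsSketch(
  messages: Message[],
  loadFileBytes: LoadFileBytes,
): Promise<Message[]> {
  return Promise.all(
    messages.map(async (message) => ({
      ...message,
      parts: await Promise.all(
        message.parts.map(async (part): Promise<Part> => {
          if (part.type !== "file") return part;

          const bytes = await loadFileBytes(part.fileId);
          if (bytes === null) {
            // Assumed fallback: a text placeholder instead of failing the turn.
            return { type: "text", text: `[Attachment ${part.fileId} is unavailable]` };
          }

          // The same bytes always encode to the same string, so the part is stable.
          const base64 = Buffer.from(bytes).toString("base64");
          return { ...part, url: `data:${part.mediaType};base64,${base64}` };
        }),
      ),
    })),
  );
}
```

The transformation runs on a copy of the messages, so the stored thread can keep holding only the `fileId`.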

Follow-ups / things considered

File byte cache (later if needed): we re-download and re-encode the thread's files from storage on every turn. It's fine for now (storage reads are cheap and the provider caches the bytes after the first turn), but if it ever shows up in latency we can cache the bytes by fileId since file content is immutable. Skipped for now to keep this focused.
Per-provider Files API (didn't do): the "cleanest" version is to upload each file to the provider once and reference it by their file id, so bytes never get re-sent. We didn't go this route because we support 8 providers, not all of them have a files API, they all differ, and they come with expiry/lifecycle we'd have to manage. base64 is the one representation that works the same across every provider. Worth revisiting if we ever narrow down the provider list.
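If the file byte cache mentioned in the first follow-up is ever needed, it could look roughly like the sketch below. The `withByteCache` wrapper, its `maxEntries` bound, and the loader signature are hypothetical; the only property taken from the description is that file content is immutable per `fileId`, which is what makes caching by that key safe.

```typescript
// Assumed loader signature: fetches a file's bytes from storage.
type LoadFileBytes = (fileId: string) => Promise<Uint8Array>;

// Wraps a loader with a small in-memory cache keyed by fileId.
function withByteCache(load: LoadFileBytes, maxEntries = 100): LoadFileBytes {
  const cache = new Map<string, Promise<Uint8Array>>();

  return (fileId) => {
    const cached = cache.get(fileId);
    if (cached) return cached;

    const pending = load(fileId);
    cache.set(fileId, pending);

    // Do not keep failed loads around, so a later turn can retry.
    pending.catch(() => cache.delete(fileId));

    // Map iterates in insertion order, so the first key is the oldest entry.
    if (cache.size > maxEntries) {
      const oldest = cache.keys().next().value;
      if (oldest !== undefined) cache.delete(oldest);
    }
    return pending;
  };
}
```

Caching the promise rather than the resolved bytes also deduplicates concurrent loads of the same file within one turn.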

…d file handling - Removed dependency on `FileUrlService` in `AgentChatStreamingService` and simplified message processing by directly mapping database parts to UI message parts. - Introduced `inlineFilePartsAsBase64` utility to handle inlining of file parts as base64 in `ChatExecutionService`, enhancing file content retrieval and integration. - Updated `ChatExecutionService` to utilize the new utility for processing messages with inlined file content, improving overall message handling efficiency.

twenty-ci-bot-public · 2026-06-15T19:10:22Z

👋 Thanks for contributing to Twenty!

Your PR has been set to draft while you work on it. Once you're done, mark it as Ready for review and our automated checks will run.

Looking forward to your contribution!

cubic-dev-ai

2 issues found across 3 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/chat-execution.service.ts">

<violation number="1" location="packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/chat-execution.service.ts:269">
P1: Pruning decision uses stale `conversationSizeTokens` after file base64 inlining. Large attachments can push actual prompt over model context without triggering compaction.</violation>
</file>


cubic-dev-ai · 2026-06-15T19:16:27Z

    };

-    const rawModelMessages = await convertToModelMessages(processedMessages);
+    const messagesWithInlinedFiles = await inlineFilePartsAsBase64(


P1: Pruning decision uses stale conversationSizeTokens after file base64 inlining. Large attachments can push actual prompt over model context without triggering compaction.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At packages/twenty-server/src/engine/metadata-modules/ai/ai-chat/services/chat-execution.service.ts, line 269: <comment>Pruning decision uses stale `conversationSizeTokens` after file base64 inlining. Large attachments can push actual prompt over model context without triggering compaction.</comment> <file context> @@ -263,7 +266,19 @@ export class ChatExecutionService { }; - const rawModelMessages = await convertToModelMessages(processedMessages); + const messagesWithInlinedFiles = await inlineFilePartsAsBase64( + processedMessages, + (fileId) => </file context>
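To illustrate the size concern raised here, the sketch below re-estimates the prompt size after inlining. The base64 length formula is exact; everything else is an assumption for the example, including the `estimateTokensAfterInlining` helper and the four-characters-per-token ratio. Providers count images and PDFs by their own rules, so this is only a crude stand-in for a real measurement.

```typescript
// Length of the padded base64 encoding of `byteLength` bytes:
// every group of up to 3 bytes becomes 4 characters.
function base64Length(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Hypothetical helper: adds a rough token estimate for the inlined files
// to a size that was measured before inlining.
function estimateTokensAfterInlining(
  conversationSizeTokens: number,
  fileByteLengths: number[],
  charsPerToken = 4, // assumed ratio, not a provider guarantee
): number {
  const inlinedChars = fileByteLengths.reduce(
    (sum, byteLength) => sum + base64Length(byteLength),
    0,
  );
  return conversationSizeTokens + Math.ceil(inlinedChars / charsPerToken);
}

// A 3,000-byte attachment adds 4,000 base64 characters to the prompt.
console.log(estimateTokensAfterInlining(1000, [3000]));
```

Whatever estimate is used, the point of the comment is that the pruning decision should be based on a size computed after the file parts are inlined, not before.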

…se64 utility - Updated `ChatExecutionService` to log warnings when AI chat attachments cannot be loaded, improving error visibility. - Modified `inlineFilePartsAsBase64` to return a placeholder message for unavailable attachments, ensuring better user feedback in chat messages. - These changes enhance the robustness of file content retrieval and improve overall user experience in the chat interface.

twenty-ci-bot-public · 2026-06-16T01:33:46Z

🔍 Automated Pre-Review

✅ No issues detected - This PR is ready for human review.

View details

Automated pre-review — human approval still required.

etiennejouan · 2026-06-17T09:25:16Z

Hi @abdulrahmancodes, I'm not sure this is the right approach (though I'm not sure which approach we should take either). For example, at Anthropic, PDF files are parsed and provided in an XML shape. It would be great to compare cost and processing time before and after with a PDF of a few pages.

I wonder if we should create a readFile tool that returns a specific format depending on the file type.

etiennejouan · 2026-06-17T14:53:58Z

@abdulrahmancodes I tried to find another way, but found nothing really interesting for the moment.

I'll test your PR and approve it if it works with the main providers!

AI Chat File Handling — Investigation Summary

The PR (`fix/ai-chat-signed-file-cache-busting`, #21639)

Before: AI-chat attachments were stored as just a fileId. On every turn the thread loader re-signed a fresh URL (${SERVER_URL}/file/...?token=<JWT>) and handed that signed URL to the LLM provider, which fetched it.

The bug: prompt caching. The JWT token changes every turn → the file part's URL changes → the cached conversation prefix changes → cache miss every turn for any thread containing a file.

The fix: inlineFilePartsAsBase64 downloads the bytes server-side and embeds them as a data: base64 part right before convertToModelMessages. Bytes are deterministic → stable cached prefix. Signing was removed from the message loader.

The Idea Explored: inline once, then inventory by `fileId` + a `fetch_file` tool

"Inline only on the first turn" alone doesn't help — LLM APIs are stateless, so the full history (including the original file message) is re-sent every turn. The real optimization is: strip the file from history → keep a lightweight reference → re-deliver via a tool on demand. This is orthogonal to the PR's base64-vs-signed-URL change.
A tool returning a signed URL string is useless — the model can't fetch URLs at inference time. (Correctly spotted: "I can't pass it to the model.")
Correction: at the AI SDK level a tool can return content parts via toModelOutput → { type: 'content', value: [...] } supporting image-data, file-data, file-url, file-id. But this codebase's Tool abstraction only returns plain JSON (ToolOutput), so it would need extending.

AI SDK Deep-Dive (the decisive part)

Tool-result content gets no SDK normalization — no URL download, no supportedUrls capability check, no fallback. It's a pure pass-through (mapToolResultOutput only renames the deprecated media type).
The download/fallback safety net (downloadAssets) applies only to user-message file/image parts, never tool results.
Behavior is therefore 100% per-provider, and Twenty's 8 providers diverge badly:

| Provider | `file-url` | base64 (`image-data`/`file-data`) |
| --- | --- | --- |
| OpenAI (default = Responses API) | ✅ provider fetches it | ✅ |
| Anthropic | ✅ provider fetches it | ⚠️ image ok; `file-data` PDF-only else dropped; no `file-id` |
| Google | ✅ | ✅ |
| xAI (Responses) | ✅ | ✅ |
| Mistral | ❌ `JSON.stringify`'d to text | ❌ stringified (token bloat) |
| openai-compatible (DeepSeek/custom) | ❌ `JSON.stringify`'d to text | ❌ stringified (token bloat) |
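One way to keep this divergence visible in code would be to encode the table as data. The sketch below is only an illustration: the `Support` labels, the provider keys, and the helper name are hypothetical, and it covers only the providers listed in the table above.

```typescript
// How a provider handles a given content kind inside a tool result.
type Support = "ok" | "partial" | "stringified";

type ToolResultSupport = { fileUrl: Support; base64: Support };

// Transcribed from the table above; keys are illustrative identifiers.
const toolResultSupport: Record<string, ToolResultSupport> = {
  openai: { fileUrl: "ok", base64: "ok" },
  anthropic: { fileUrl: "ok", base64: "partial" }, // file-data is PDF-only
  google: { fileUrl: "ok", base64: "ok" },
  xai: { fileUrl: "ok", base64: "ok" },
  mistral: { fileUrl: "stringified", base64: "stringified" },
  "openai-compatible": { fileUrl: "stringified", base64: "stringified" },
};

// Providers where binary content in a tool result is not fully reliable.
function providersUnsafeForBinaryToolResults(): string[] {
  return Object.entries(toolResultSupport)
    .filter(([, support]) => support.base64 !== "ok")
    .map(([provider]) => provider);
}

console.log(providersUnsafeForBinaryToolResults());
```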

Conclusions

file-url from a tool is the worst option: where it works it relies on the provider fetching your URL → reintroduces the exact expiry/staleness/cache-busting this PR removed; where it doesn't, it's silently stringified; and the SDK never rescues you.
Even base64 from a tool isn't universally safe (Mistral/openai-compatible stringify it; Anthropic drops non-PDF files).
This validates the PR's choice: inlining base64 at the message-part level inherits the SDK's mature, uniform handling; moving the same content into a tool result forfeits it.
Net guidance: a fetch_file tool to strip files from history is viable only for text output across all providers. Anything binary (images/PDFs) must live in a message part — exactly where this PR put it.
The genuinely "clean" alternative (per-provider Files API + file-id) was deliberately skipped by the author due to the 8-provider spread and lifecycle management.

etiennejouan · 2026-06-18T14:55:41Z

Hi @abdulrahmancodes, it works well, and it has the same performance as main, which is a good point. But I tried to bust the cache by uploading many files and did not succeed. It seems weird, but since the initial issue is not confirmed, we should close.

That said, your approach is still interesting for users whose server is not publicly reachable.

abdulrahmancodes · 2026-06-18T18:45:06Z

@etiennejouan When I tested the cache busting, it seemed to be working, but I'll take another look to double-check

abdulrahmancodes requested a review from etiennejouan June 15, 2026 19:09

twenty-eng-sync Bot added -PR: awaiting author -PR: draft labels Jun 15, 2026

twenty-eng-sync Bot assigned abdulrahmancodes Jun 15, 2026

abdulrahmancodes added -PR: awaiting review and removed -PR: draft -PR: awaiting author labels Jun 15, 2026

cubic-dev-ai Bot reviewed Jun 15, 2026


etiennejouan added -PR: awaiting author and removed -PR: awaiting review labels Jun 17, 2026

etiennejouan closed this Jun 18, 2026

abdulrahmancodes wants to merge 2 commits into main from fix/ai-chat-signed-file-cache-busting
