Users say Gemini starts forgetting long before it’s supposed to

2026-06-04

Why This Matters

The reported gap between what Gemini can ingest up front and what it retains during a chat points to a real difficulty in sustaining long conversations: if the active context is much smaller than users expect, earlier instructions fall out of scope and the assistant becomes less reliable. It underscores the need for more robust memory management in large language models that are meant to support sustained, coherent interactions.

Key Takeaways

A user reports that Gemini's active conversational memory drops to around 16k tokens, or roughly 25-30 messages on average.
Users experience rapid forgetting of earlier instructions, code blocks, and constraints within the same chat session.
If accurate, this bottleneck undermines the reliability of long chat sessions and points to a need for better memory handling.

X user @Soso_fun_yt claims that Gemini's context window is misleading for chat users:

While the backend can successfully ingest a massive static file initially on the first prompt, the active conversational memory (the dynamic context window / KV cache for the chat) appears to be severely bottlenecked, dropping significantly to a 16k~ limit. (Or 25-30 messages in average)

As a result, the model quickly suffers from amnesia within the exact same chat session, completely forgetting earlier instructions, code blocks, or constraints.
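
To make the claimed effect concrete, here is a minimal sketch of how a fixed token budget on chat history would push early instructions out of view. It is an illustration only, not Gemini's actual mechanism, which the post does not describe. The helper names (approx_tokens, trim_to_budget), the whitespace-based token count, and the 600-token turn size are all assumptions made for the example; only the 16k figure comes from the report.

```python
from collections import deque


def approx_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer (assumption): one token per word.
    return len(text.split())


def trim_to_budget(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined token count fits the budget."""
    kept: deque[str] = deque()
    used = 0
    # Walk backwards from the newest message and stop once the budget is exceeded.
    for msg in reversed(messages):
        cost = approx_tokens(msg)
        if used + cost > budget:
            break
        kept.appendleft(msg)
        used += cost
    return list(kept)


# Hypothetical session: one early instruction, then 40 turns of 600 tokens each.
history = ["INSTRUCTION: always answer in French"] + [
    f"turn {i} " + "word " * 598 for i in range(40)
]

visible = trim_to_budget(history, budget=16_000)
print(len(visible))           # 26 turns still fit
print(history[0] in visible)  # False: the early instruction has been dropped
```

Under these assumptions, a 16k budget holds 26 turns of 600 tokens, which falls inside the 25-30 message range quoted in the post, and the instruction given at the start of the session is no longer visible to the model.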