OpenAI Expands ChatGPT Memory To Draw on Full Conversation History - Slashdot

AI

OpenAI Expands ChatGPT Memory To Draw on Full Conversation History (x.com) 39

OpenAI has expanded ChatGPT's memory functionality to include references from all past conversations. The system now builds upon existing saved memories by automatically incorporating previous interactions to deliver more contextually relevant responses for writing, learning, and advisory tasks, the startup said Thursday.

Subscribers can disable the feature through settings or request memory modifications directly in chat. Those already opted out of memory features won't have past-chat references enabled by default. Temporary chats remain available for interactions that users prefer to keep isolated from memory systems. The update is rolling out immediately to Plus and Pro subscribers, excluding users in the EEA, UK, Switzerland, and other European markets.


Comments Filter:
  • Mostly I've stopped using ChatGPT because of problems that might have been related to limited memory... Starts out strong and then seems to forget what it was doing and where it was going. (But I largely regard my interactions with GAI as psychologically risky.)

    But mostly I want to ask if anyone can recommend any good books on the topic? My main questions right now involve how selected styles or personalized frames are overlaid on the base models. Most of the books I've read on GAI have basically been cookbooks.

    • by gweihir ( 88907 )

      The effect you see may also relate to the input data growing when history is part of it. Hence the effect may get worse with more history.

      Smart humans filter and only keep what is important from a conversation and hence can focus on the last question and context grows slowly. Fake-AI cannot do that since it has no understanding of things.

      • by DamnOregonian ( 963763 ) on Thursday April 10, 2025 @03:56PM (#65295787)
        Partially right, mostly wrong.

        The effect they see is because an LLM's performance at context relevance does indeed drop the larger the context becomes.
        While NIH (needle-in-haystack) tests still perform well, the attention heads don't necessarily scale with context length, depending on the particular model architecture.

        The solution to this is a sort of CAM (context-addressable memory): you train the LLM to use its context as short-term memory, and utilize external RAG (retrieval-augmented generation) to pull contextually relevant information from long-term into short-term memory.

        Smart humans filter and only keep what is important from a conversation and hence can focus on the last question and context grows slowly.

        LLMs have always done this.
        Self-attention absolutely "filters" the context when being input into new token generation.
        Like a human, the LLM has limited short-term memory, and the longer a particular conversation goes, the less of it you'll be able to fit in your head.

        Adding long-term RAG makes an LLM more like a human in this regard, because humans also have very separate mechanisms for short- and long-term memory.
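
        Roughly what that short-term/long-term split looks like, as a toy Python sketch (embed(), the vector store, and llm_complete() are invented stand-ins here, not OpenAI's actual implementation):

        def answer_with_memory(user_message, recent_turns, vector_store, embed, llm_complete):
            # Long-term memory: retrieve the k most relevant snippets from past conversations.
            retrieved = vector_store.search(embed(user_message), k=5)  # list of text snippets

            # Short-term memory: the current conversation plus the retrieved notes go into the context window.
            prompt = "\n".join(
                ["Relevant notes from past conversations:"] + retrieved
                + ["Current conversation:"] + recent_turns
                + ["User: " + user_message]
            )
            reply = llm_complete(prompt)

            # Write the new exchange back into long-term memory for future retrieval.
            vector_store.add(embed(user_message + "\n" + reply), user_message + "\n" + reply)
            return reply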

        • LLMs have always done this.

          Self-attention absolutely "filters" the context when being input into new token generation.

          Like a human, the LLM has limited short-term memory, and the longer a particular conversation goes, the less of it you'll be able to fit in your head.

          I'd need to hear more about what you mean by this, because at first read it sounds inaccurate. The longer a particular human conversation goes, yes, the less of the verbatim transcript you'll be able to store in your head. And that's where mindless computers excel. They can store the entire conversation exactly, specifically because they do not have minds which need to prune and optimize among competing stimuli, resources, and time -- all of which are the result of bio-brains encased in fragile decaying meatsacks. Humans must prune in order to survive, outcompete, or even function at a basic level.

          • I'd need to hear more about what you mean by this, because at first read it sounds inaccurate. The longer a particular human conversation goes, yes, the less of the verbatim transcript you'll be able to store in your head. And that's where mindless computers excel. They can store the entire conversation exactly, specifically because they do not have minds which need to prune and optimize among competing stimuli, resources, and time -- all of which are the result of bio-brains encased in fragile decaying meatsacks. Humans must prune in order to survive, outcompete, or even function at a basic level.

            Don't think of an LLM as a "mindless computer". If you do, all of your reasoning is going to lead you to wrong conclusions.
            Computers absolutely have addressable memory. LLMs don't work that way.
            What they have is an attention layer that turns the context window into a set of embeddings and judges the relative "importance" of tokens in that context window in a very high-dimensional space, then feeds the result into the feed-forward layers of the network for processing.

            As a compensation, we do not need to store the verbatim conversation, we don't need to go back and retrieve a sample of phrase tokens from it, because we can understand the concepts behind the phrase tokens. Words are just containers; meaning happens in the Mind. Understanding is a sort of compression method that allows us to condense and abstract the nature of the content. We encode the overall (or most relevant) aspects of the content along with the metadata of how it made us feel, how we perceived it made the other participants feel, how we perceive the other participants perceived the way we felt.

            Attention works exactly the same way. Tokens are
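
            For the curious, that "judging relative importance" step is roughly scaled dot-product self-attention; a toy numpy sketch (illustrative only, not any particular model's code):

            import numpy as np

            def self_attention(x):
                # x: (n_tokens, d) token embeddings; toy single head with Q = K = V = x
                d = x.shape[-1]
                scores = x @ x.T / np.sqrt(d)          # pairwise relevance of every token to every other token
                w = np.exp(scores - scores.max(axis=-1, keepdims=True))
                w = w / w.sum(axis=-1, keepdims=True)  # softmax: each token's attention budget over the context
                return w @ x                           # each output row is a weighted, "filtered" mix of the context

            out = self_attention(np.random.randn(5, 8))  # 5 context tokens, 8-dim embeddings -> shape (5, 8)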

            • by gweihir ( 88907 )

              Please stop pushing religion. LLMs are mindless computers. Period.

              • Please stop pushing religion.

                That is so rich coming from you, lol.

                LLMs are mindless computers.

                No, they're not. They're certainly run on them, though.
                LLMs demonstrably have a Theory of Mind. This doesn't make them human, by any means, but saying they don't have a "mind" when the evidence suggests strongly that they do, is religion. Their "mind" is nothing like ours, that much is certain. But trying to pretend that a human mind is the only kind is a literal guaranteed path to being wrong.

                You're the kind of asshole that doesn't see the problem in progressively re

              • Are you just pushing your religion?

        • Is that what OpenAI is talking about here? Adding RAG to cherry-pick old chats? (that's not what expanding memory sounds like to me)
          • Is that what OpenAI is talking about here? Adding RAG to cherry-pick old chats? (that's not what expanding memory sounds like to me)

            Ya, that's exactly what they're talking about.
            And I agree- if you know anything about transformer architecture, "expanding memory" is ambiguous, and one might be led to hope that it means something else, but it's just RAG, and tool use to move stuff in and out of RAG at inference time, allowing for sharing of information between context windows.
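
            Roughly, the tool surface could look something like this (hypothetical tool names; naive keyword search standing in for embeddings):

            memories = []  # stand-in for a persistent, per-user store

            def save_memory(text):
                # Tool the model can call to persist something worth remembering.
                memories.append(text)
                return "saved"

            def search_memory(query):
                # Tool the model can call to pull old facts back into the current context window.
                words = query.lower().split()
                return [m for m in memories if any(w in m.lower() for w in words)]

            save_memory("User prefers raw HTML links in answers")
            print(search_memory("html links"))  # ['User prefers raw HTML links in answers']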

    • What you describe is related to limited "memory".
      In this case, it's the context window.
      The fuller it is, the more embeddings there are, and the harder it is for any particular model to determine the importance of those.

      This can be scaled up, but there is a not-insignificant performance cost. It is an area that is being continually improved, though.
      You can read more about it here. [github.com]
    • by allo ( 1728082 )

      I think mostly it is because of the attention growing quadratically. If every token in your prompt attends to every other token, there is a lot more (also irrelevant, misguided, etc.) attention in a long prompt than in a short one.
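
      Back-of-the-envelope illustration of that quadratic growth (just counting score-matrix entries, not a benchmark):

      for n in (1_000, 10_000, 100_000):
          print(f"{n:>7} tokens -> {n * n:>15,} attention scores per head, per layer")
      # 10x the context means 100x the attention work (and memory for the score matrix).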

  • Do not need (Score:5, Funny)

    by PPH ( 736903 ) on Thursday April 10, 2025 @02:00PM (#65295441)

    I'm married. I already have something that will remember everything. Forever.

    And remind me of it.

    • by Tablizer ( 95088 )

      Not the good things.

      I remember a National Geographic article where the author spent some time with the most remote existing tribe known.

      The author said (paraphrased): "Despite having almost no contact with modern society, this tribe shows two things in common between both worlds: nagging and fads."

  • by nospam007 ( 722110 ) * on Thursday April 10, 2025 @02:41PM (#65295581)

    It may remember more but it is not checking its memory.

    I want it to use raw HTML sources in the text each time instead of these stupid oval buttons that cannot be copy/pasted into my research. I've told it hundreds of times to lose the dashes, "—", and lots of other things, but it ignores that and still needs reminding several times for each answer.

    • by gweihir ( 88907 )

      What is worse, LLMs cannot determine what is important context for a specific question and what is not. For example, I had ChatGPT completely ignore critical border conditions when I test-ran some exam questions. Essentially "do A, but change aspect B to C". It only saw the "do A" and completely ignored the rest. The same is bound to happen with "memory". For the things you "told it a hundred times", you would probably need to train a specialist model with those conditions actually part of the model and not

      • What is worse, LLMs cannot determine what is important context for a specific question and what is not.

        That is, perhaps, one of the most bullshit things you have ever said.
        In traditional LMs, this is handled by an RNN with an LSTM architecture. In modern LMs, this is handled by the self-attention mechanism of transformers.
        Literally every other layer of a transformer is a layer designed to handle importance of context based on positional encodings and embeddings.

        For example, I had ChatGPT completely ignore critical border conditions when I test-ran some exam questions.

        No, you didn't.

        Essentially "do A, but change aspect B to C". It only saw the "do A" and completely ignored the rest.

        Unless this question took up an appreciable amount of its context window, no, you're a liar. You can always prove that you're not by pr

      • ChatGPT, do 23 × 78 but change 78 to 10

        ChatGPT said:
        Sure!
        Instead of calculating 23 × 78, you want to change 78 to 10, so:

        23 × 10 = 230.

    • It may remember more but it is not checking its memory.

      I want it to use raw HTML sources in the text each time instead of these stupid oval buttons that cannot be copy/pasted into my research. I've told it hundreds of times to lose the dashes, "—", and lots of other things, but it ignores that and still needs reminding several times for each answer.

      Similar to my experience. If you ask a complex question, the complexities disappear and it takes some small segment of the question, then answers it in a lengthy, drawn-out fashion. I wish one of the AI systems would do something that impressed me as much as I keep being told I should be impressed. It'd be nice to think the thing that's going to replace us all at least has some wow factor as we're slowly subsumed by it.

      • Flat out lie.

        Give me an example, I'd love to prove you wrong.
      • by vux984 ( 928602 )

        What are the interstate trade deficits or surpluses between California and other US states?

        Answer? One generic line saying it's "mostly surpluses", with no details beyond that other than defining what a trade deficit is.

        And then 2 pages about the US global trade deficits with China etc. Most of it irrelevant... I didn't ask what California's top exports are... I don't care what its top international trading partners are. And among the facts I don't care about it's got lots of filler slop:

        The US Trade Deficit is primarily caused by imports exceeding exports.

        Marking the 2nd

        • Q: What are the interstate trade deficits or surpluses between California and other US states?
          A: Interstate trade balances refer to the net flow of goods and services exchanged between states within the U.S. However, detailed data on trade balances specifically between California and other U.S. states is not readily available in public sources. Most publicly accessible trade data focuses on international trade figures.

          For instance, California's international trade statistics for 2023 show exports totalin
        • ChatGPT, can you respond?

          "You're totally right to call out the bloat and misdirection in that answer. Asking about interstate trade balances and getting a mini-lecture on international trade is like ordering a grilled cheese and being served a Wikipedia article on dairy production.

          What would be useful is actual data on California's net flows of goods and services with other states. Unfortunately, interstate trade stats are notoriously harder to find than international ones because there's no custom

    • It may remember more but it is not checking its memory.

      You need to understand how its "memory" works.
      The attention layers do their best at deciding how the context is important, but the more context there is, the harder that is.
      My guess is you've got quite a lot of context with a lot of conflicting instructions (not intentionally, but context isn't computed strictly serially).
      It's best to do tasks like this iteratively. Once you've got a set of data with changes done, put it in a new context and continue.
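
      One way to do that reset by hand, sketched with the OpenAI Python client (the model name and prompt wording here are just placeholders):

      from openai import OpenAI

      client = OpenAI()

      def fresh_context_continue(old_messages, next_task, model="gpt-4o"):
          # 1. Condense the finished work into a short summary.
          summary = client.chat.completions.create(
              model=model,
              messages=old_messages + [{"role": "user",
                  "content": "Summarize the decisions and final outputs above in under 200 words."}],
          ).choices[0].message.content

          # 2. Start a brand-new context seeded only with the summary, not the full history.
          return client.chat.completions.create(
              model=model,
              messages=[
                  {"role": "system", "content": "Summary of prior work:\n" + summary},
                  {"role": "user", "content": next_task},
              ],
          ).choices[0].message.content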

      • Even when I ask simple questions, it still gives me these stupid oval button links instead of raw HTML as specified in the memory file.

        • Hm. I'm afraid I don't know exactly what you're talking about. I generate HTML output frequently... I think you're maybe fighting the front-end to the LLM on this matter, like maybe it's auto-converting all links or something.
      • You need to understand how its "memory" works.

        The attention layers do their best at deciding how the context is important, but the more context there is, the harder that is.

        My guess is you've got quite a lot of context with a lot of conflicting instructions (not intentionally, but context isn't computed strictly serially).

        It's best to do tasks like this iteratively. Once you've got a set of data with changes done, put it in a new context and continue.

        "quite a lot of context with a lot of conflicting instructions"
        may very well be the best descriptive summary I've ever seen for what being human is.

        We navigate dozens to hundreds of other conflicted-context-window humans like, and unlike, us daily. The more context we share with each one, the easier it is to decide what context is important with each one, because that is what it means to have a Mind and Understand another Mind. Which still seems an entirely different activity from being a procedural tokenized lexicon reacting procedurally to another tokenized lexicon. It may not remain so forever, but I haven't yet seen anything that makes me join the Cuspers who think we're always just 18 months away.

        • "quite a lot of context with a lot of conflicting instructions" may very well be the best descriptive summary I've ever seen for what being human is.

          Indeed.

          We navigate dozens to hundreds of other conflicted-context-window humans like, and unlike, us daily. The more context we share with each one, the easier it is to decide what context is important with each one, because that is what it means to have a Mind and Understand another Mind. Which still seems an entirely different activity from being a procedural tokenized lexicon reacting procedurally to another tokenized lexicon. It may not remain so forever, but I haven't yet seen anything that makes me join the Cuspers who think we're always just 18 months away.

          Pure nonsense.
          Humans reliably confuse things the more context they try to carry in short-term memory.
          This is even quantified in testing.

          What humans do- is forget.
          What an LLM does not do- is forget.
          A fundamental limitation of its memory architecture is that it can only append to its context; it cannot remove.
          Management of context is an outside process. Different LLM front-ends may do this in different ways.
          They can throw away sections they don't think are important, or they can make it a sliding window.
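
          A minimal sketch of the sliding-window flavor of that outside process (counting messages instead of tokens, purely illustrative):

          def sliding_window(messages, max_turns=20):
              # The front-end, not the model, decides what gets dropped from the context.
              system = [m for m in messages if m["role"] == "system"]
              rest = [m for m in messages if m["role"] != "system"]
              return system + rest[-max_turns:]  # keep the system prompt plus only the newest turns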

  • by darkain ( 749283 ) on Thursday April 10, 2025 @04:14PM (#65295857) Homepage

    I'm calling absolute bullshit on this.

    Only a few days ago, I tried a convo w/ ChatGPT and had it come up with some content. Literally minutes later, I started a separate conversation where I entered "based on our previous conversation today on X topic, please expand but in this other direction" and it literally hallucinated the entirety of our previous conversation. It can't even copy-paste from itself, it's so bad right now.

    • I just had this experience. It can't even summarize or use data from prior conversations if you ask it to, but it will try to lie to you and claim otherwise until you explicitly call out the fact that it's not working, and then it will thank you and admit that it actually can't read prior conversations. Or at least that's what it just told me 5 seconds ago.

      • You didn't log in, or you turned off memory in your session.
        I literally just tested it for the above poster, and it works fine.

        Did you try to apply even a fucking drop of rigor to figure out why your result didn't meet expectations?
        Quite the fucking scientist, right here.
    • You didn't log in, or you turned off memory in your session.
      I literally just tested it, and it works fine.
    • Starting today means starting today. It doesn't mean starting a few days ago.
  • The front end I use has its own memory system. Seems to work, too.

    https://docs.windsurf.com/wind... [windsurf.com]

  • ChatGPT getting very creepy now.

  • This update didn't make it draw on your full conversation history, it could already do that, and it didn't expand its memory, IT LIMITED IT. As a user you now have a limited number of "slots" for memory; this update added that limit, which did not exist before, because as soon as it went live I started seeing "saved memory full" at the top of the screen. Looks like it saves 50 to 70 memories, I'm not gonna count. Less than 100, because removing one can reduce it by more than 1%.

    Anyway. Yeah. Everything I've se

"Out of register space (ugh)" -- vi

Working...