A v0 artifact for Sachit · built in ~45 min after the Show HN thread

YourMemory — the page that answers your Show HN thread, by name

Your launch hit 86 points and 37 comments in 10 hours. Three of those comments were substantive technical critique (altmanaltman, cyanydeez, axeldunkel). One questioned the entire premise (SwellJoe). Your README answers none of them. This page does.

YourMemory MCP / DuckDB

For: long-running agents

Local-first MCP server. Forgetting curve + spaced repetition + graph layer over the vector store. The bet: what to forget is as critical as what to remember.

Mem0

For: chat-derived memory

Open-source SDK + hosted SaaS. Compresses chat history into "memories" and re-injects on recall. Largest community in the category by GitHub stars.

Letta ex-MemGPT

For: stateful agent runtime

OS-style runtime with three memory tiers (core / recall / archival). You instantiate a Letta agent, not a memory store.

Zep

For: temporal knowledge graphs

Compiles chat into a time-aware knowledge graph. Strong opinions about entity extraction. Cloud + community edition.

Plain RAG

For: most agents, honestly

Chunk → embed → top-k retrieve. No memory product. The thing altmanaltman was implicitly comparing you to. Sometimes it's the right answer.

Section 1

Where YourMemory actually differs from chunked RAG

altmanaltman's accusation is the thread's harshest: "the whole 'biological memory' thing seems like marketing fluff on basic cache mechanisms." The honest response isn't to defend the framing — it's to make the mechanism difference legible enough that a skeptic can verify it themselves. Three things below are not in plain RAG.

| Dimension | YourMemory | Mem0 | Letta | Zep | Plain RAG |
| --- | --- | --- | --- | --- | --- |
| Source of truth | Agent's own activity stream + recall events | Chat / message history | Agent runtime state (3 tiers) | Chat compiled into temporal graph | Whatever you chunk |
| Forgetting model | Ebbinghaus curve + spaced repetition: strength score per memory, recall reinforces, unused decays past threshold and is pruned | No native decay; relies on compression to keep recall set tractable | Eviction, tier-based (archival vs recall vs core) | Time-anchored (facts have a `valid_from`/`valid_to`) | None; manual cleanup or unbounded growth |
| Retrieval model | Vector + graph: graph layer catches "logical neighbors" that semantic search misses | Vector top-k | Tier-aware (function-call surface for recall) | Graph traversal + vector | Vector top-k |
| Storage layer | DuckDB, local-first: no service required, embeddable | SQLite or hosted | Postgres (self-hosted) or hosted | Postgres + Neo4j-style graph | Whatever vector DB you bring |
| Open source? | Yes (repo on GitHub) | Yes (Apache-2.0 + hosted) | Yes (Apache-2.0 + hosted) | Community edition + cloud | Yes, by definition |
| Reported benchmark | 52% Recall@5 on LoCoMo · ~84% token reduction vs unfiltered context | Mem0 paper claims +26% LLM-as-judge over OpenAI memory | MemGPT paper benchmarks on document QA over long contexts | DMR / LongMemEval benchmarks reported | Varies wildly by chunking + reranker |
| Caveats on the benchmark | LoCoMo has known issues (altmanaltman's point); see Section 3 | Self-reported; LLM-as-judge has variance | Older benchmark window | Newer; less third-party replication | No single number applies |
| "Hello, memory" code | `mcp.add(text, kind=…)` → recalls auto-strengthen the entry | `m.add(messages, user_id=…)` | `letta.create_agent(memory=…)` | `zep.memory.add(session_id, msgs)` | `vec.add(chunks); vec.search(q)` |
| The unique bet | "Forgetting is a first-class operation, not a side-effect of running out of tokens" | "Chat-history compression is the memory primitive" | "Agents need a typed runtime, not a vector store" | "Time-aware knowledge graphs beat embeddings" | "Just throw it in the prompt" |

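The "vector + graph" retrieval row can be made concrete with a small sketch. This is not YourMemory's actual graph layer: `retrieve`, the `edges` adjacency map, and the toy two-dimensional embeddings are all assumptions made for illustration. It shows how a one-hop expansion over explicit links can surface a memory that embedding similarity alone would rank last.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, vectors, edges, k=2):
    """Vector top-k, then add one-hop graph neighbors of each hit.

    vectors: memory id -> embedding (hypothetical toy data)
    edges:   memory id -> list of linked memory ids (hypothetical)
    """
    ranked = sorted(vectors, key=lambda mid: cosine(query_vec, vectors[mid]), reverse=True)
    hits = ranked[:k]
    result = list(hits)
    for mid in hits:
        for neighbor in edges.get(mid, []):
            if neighbor not in result:
                result.append(neighbor)  # a "logical neighbor" the vector search missed
    return result


vectors = {
    "auth_bug": [1.0, 0.0],
    "auth_fix": [0.9, 0.1],
    "db_migration": [0.0, 1.0],  # semantically far from an auth query
}
edges = {"auth_fix": ["db_migration"]}  # the fix depended on a migration

found = retrieve([1.0, 0.0], vectors, edges, k=2)
```

Here `db_migration` is orthogonal to the query and would never make a top-2 cut, yet it is returned because it is linked to a hit.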
Section 2

Use the right one for the job

Honest recommendations beat overclaiming. Below is the cohort for which each project — including "no memory product at all" — is the correct answer. Telling SwellJoe "use no memory product" earns the right to tell the next reader why YourMemory matters.

Use YourMemory

For long-running agents whose context bloats over weeks

Your agent runs against the same project / codebase / document set across many sessions. Without forgetting, the recall set grows monotonically and either tokens explode or you start hand-pruning. YourMemory makes pruning a property of the data, not a manual chore.

  • You're hitting the "the agent keeps citing yesterday's bug fix that no longer applies" problem.
  • You want a local-first MCP server, not another hosted dependency.
  • You can defend the decay-clock behavior to your team (see the cyanydeez response in Section 3).

Use Mem0

When memory is fundamentally chat-derived

You're building a chatbot, a support agent, or anything where the conversation history *is* the memory you need. Mem0's compression-and-recall loop is the most mature implementation of that pattern. Because the SDK is open source, you can self-host rather than depend on the hosted service's pricing.

  • The user's history with the agent is what matters — not your own retrieval set.
  • You want SDK and SaaS optionality.

Use Letta

When you're adopting an agent runtime, not just a memory layer

You buy into the MemGPT thesis: agents need an OS-style runtime with explicit tier management. Pick Letta when you'd rather instantiate Agents than glue a memory product into your own loop.

Use Zep

When your problem is shaped like a knowledge graph

You're modeling entities and their relationships across time — a CRM agent, a healthcare agent, a legal-events agent. Embeddings keep returning fuzz; you actually need typed nodes and edges.

Use plain RAG

When your agent's lifespan is one session

If your agent runs once, answers a question, and exits — there is no memory problem. Don't add a dependency. SwellJoe's frustration with memory products distracting agents is real for this case. Add memory the moment a session crosses a meaningful boundary.

Use manual curation (the tra3 approach)

When you're a power user who knows what to keep

tra3 in your thread already names this: "preserve all my Claude Code conversations and set the context from there." For a single expert user with strong intuitions about what's worth remembering, hand-curated memory will outperform any automated decay model. YourMemory's argument is for the case where you're not at the keyboard, or you've got many sessions to track.

Section 3

Answers to the thread, by name

The thread is the brief. Below is a direct response to each substantive critique, by HN handle, so a reader who arrived here from the comments can verify the page actually engages with what was said.

@altmanaltman
"I am sorry but the whole 'biological memory' thing seems like marketing fluff on basic cache mechanisms. You said it cuts token usage by 84% but isn't that typical for any typical chunked RAG system? And why did you specifically chose to test against the LoMoCo dataset when there's a lot of issues with it and it being very easy to cheat?"
The framing critique is fair, but the mechanism difference is real. Three things in YourMemory are not in plain chunked RAG: (1) strength scoring with recall reinforcement, so a memory's likelihood of retrieval changes based on whether agents actually reach for it; (2) decay-driven pruning past a threshold, so old unused entries are removed from the index, not just down-weighted, which keeps the search space bounded over time; and (3) a graph layer over the vector store, which catches relevant-but-not-semantically-close nodes that top-k embedding search misses.

On the 84% token reduction: chunked RAG with a tight top-k will hit similar numbers per query, but YourMemory's claim is over a long-running session. The comparison set is "RAG with no pruning" vs "RAG with decay-based pruning," not "no RAG" vs "RAG."

On LoCoMo: yes, the dataset has known issues (synthetic dialogues, narrow distribution, partial-credit scoring quirks). Honest read: a 52% number on LoCoMo is a starting signal, not a closing argument. The page that ships should name the caveat (and ideally include a second benchmark, LongMemEval or LoCoMo-Hard, for triangulation).

The "biological" framing should be the metaphor, not the headline. The headline is: "forgetting as a first-class operation."
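To let a skeptic check the first two mechanisms in this response, here is a minimal sketch of strength scoring with recall reinforcement and threshold pruning. It is not YourMemory's code: the `Memory` fields, the exponential form `exp(-elapsed / stability)`, the `boost` factor, and the `0.05` threshold are all assumed for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    stability: float = 1.0    # assumed knob: larger means slower decay
    last_recall: float = 0.0  # time of last recall, in arbitrary units


def strength(mem: Memory, now: float) -> float:
    """Exponential forgetting curve: 1.0 right after a recall, falling toward 0."""
    return math.exp(-(now - mem.last_recall) / mem.stability)


def recall(mem: Memory, now: float, boost: float = 2.0) -> None:
    """Reinforce on recall: reset the clock and slow future decay."""
    mem.stability *= boost
    mem.last_recall = now


def prune(store: list, now: float, threshold: float = 0.05) -> list:
    """Remove (not merely down-weight) entries whose strength fell below threshold."""
    return [m for m in store if strength(m, now) >= threshold]


store = [Memory("auth uses JWT"), Memory("temp debug flag was on")]
recall(store[0], now=1.0)      # only the first memory is ever reached for
kept = prune(store, now=5.0)   # the unused one has decayed past the threshold
```

Plain top-k RAG has no equivalent of `recall` changing future retrieval odds, and no equivalent of `prune` shrinking the index; that is the verifiable difference.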
@cyanydeez
"the decay rate shouldn't be based on a real clock but a lifetime of it's use within the coding session. Elsewise your memory fades even when there's no process change (eg, coder goes on vacation)."
This is a real conceptual flaw worth addressing in code, not just on the page. The Ebbinghaus curve in YourMemory's current implementation appears to use wall-clock time. For an agent that ticks unevenly (coder vacations, scheduled batch jobs, weekend pauses) this is wrong on its face: a memory used once an hour during a sprint should not "decay" because the sprint ended. The right unit is recall-events-since or session-cycles-since, not seconds-since.

The fix is small and non-breaking: add a `tick()` API that callers invoke per agent step or session boundary. Decay calculations switch from `now() - last_recall` to `tick_count - last_recall_tick`. Wall-clock decay can stay as a fallback for use cases where it does fit (e.g., "memories about the user's emotional state probably shouldn't last for years").

Page recommendation: name this out loud as a v0.2 roadmap item. Doing so neutralizes the critique and signals the project is alive.
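A sketch of what tick-based decay could look like, under stated assumptions: `TickClock`, `Entry`, and `half_life_ticks` are hypothetical names, and the half-life form is one convenient choice rather than the project's actual formula. The point it demonstrates is that elapsed wall-clock time never enters the calculation.

```python
from dataclasses import dataclass


@dataclass
class TickClock:
    """Counts agent steps or session boundaries instead of seconds (hypothetical)."""
    count: int = 0

    def tick(self) -> int:
        self.count += 1
        return self.count


@dataclass
class Entry:
    text: str
    last_recall_tick: int = 0
    half_life_ticks: float = 50.0  # assumed tuning knob


def strength(entry: Entry, clock: TickClock) -> float:
    """Decay depends on tick_count - last_recall_tick, never on now()."""
    elapsed = clock.count - entry.last_recall_tick
    return 0.5 ** (elapsed / entry.half_life_ticks)


clock = TickClock()
entry = Entry("build requires Node 20")

before_vacation = strength(entry, clock)
# Two weeks pass with no agent activity: no tick() calls, so nothing decays.
after_vacation = strength(entry, clock)

for _ in range(50):  # the agent then runs 50 steps without recalling the entry
    clock.tick()
after_work = strength(entry, clock)
```

A wall-clock fallback could be layered on by taking the minimum of the tick-based and time-based strengths for memory kinds where real time does matter; that combination is a design suggestion, not something the source specifies.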
@axeldunkel
"What concerns me more are memory chunks with errors in them — they need to be corrected/removed by some other mechanism, not by decay (since they might get retrieved often)."
Decay can't fix recurring lies; it makes them worse, because frequent recall reinforces strength. This is the most important critique in the thread for production users. Two real mechanisms (pick at least one): (1) an explicit invalidation surface, meaning an MCP tool that the agent (or a human) can call to mark a memory as stale, which immediately drops its strength below the pruning threshold; (2) contradiction detection, meaning a periodic pass that surfaces memories whose claims conflict with newer high-strength memories, then queues them for invalidation review. YourMemory should ship (1) before (2). It's a 30-line addition to the MCP surface, and it's the difference between "trust the memory" and "the memory might be wrong forever." Add it to the README's "Limitations" section now even if it is not implemented yet; it's better to admit the failure mode than to have a user discover it in a production agent.
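Mechanism (1) might look roughly like the following. This is a sketch of the storage-side logic only, with no MCP wiring; `MemoryStore`, `invalidate`, `PRUNE_THRESHOLD`, and the `+0.1` reinforcement step are hypothetical names and values, not YourMemory's API.

```python
from dataclasses import dataclass
from typing import Optional

PRUNE_THRESHOLD = 0.05  # assumed value


@dataclass
class Memory:
    id: str
    text: str
    strength: float = 1.0
    invalidated_reason: Optional[str] = None


class MemoryStore:
    def __init__(self):
        self._items = {}

    def add(self, mem: Memory) -> None:
        self._items[mem.id] = mem

    def recall(self, memory_id: str) -> Optional[Memory]:
        """Return a live memory and reinforce it; stale entries are never returned."""
        mem = self._items.get(memory_id)
        if mem is None or mem.strength < PRUNE_THRESHOLD:
            return None
        mem.strength = min(1.0, mem.strength + 0.1)
        return mem

    def invalidate(self, memory_id: str, reason: str) -> bool:
        """Drop strength below the pruning threshold immediately, keeping the reason."""
        mem = self._items.get(memory_id)
        if mem is None:
            return False
        mem.strength = 0.0
        mem.invalidated_reason = reason
        return True

    def prune(self) -> int:
        """Delete everything below the threshold; return how many were removed."""
        stale = [k for k, m in self._items.items() if m.strength < PRUNE_THRESHOLD]
        for k in stale:
            del self._items[k]
        return len(stale)


store = MemoryStore()
store.add(Memory("m1", "API base path is /v1"))
first = store.recall("m1")                        # frequently recalled, so it stays strong
ok = store.invalidate("m1", "endpoint moved to /v2")
second = store.recall("m1")                       # an invalidated entry cannot be re-strengthened
removed = store.prune()
```

The design choice worth noting is that `recall` refuses to return or reinforce an invalidated entry, so a popular but wrong memory cannot climb back above the threshold between the invalidation and the next prune.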
@SwellJoe
"I know everybody seems to want the agent to remember every conversation they've ever had with it, but I just don't see the value in that. In fact, it seems to hurt productivity to have the agent second guessing me based on something I said yesterday. Every time I've used any memory system, the agent gets distracted from the current tasks based on previous conversations and branches of development."
This is a positioning win, not a critique. SwellJoe is naming the failure mode YourMemory is built to prevent: undecayed memory drowning current context. The page response: "You're describing the problem. We agree. The fix is decay + pruning, not adding more retention." The line on YourMemory's homepage should be: "if memory has ever made your agent worse, that's because nothing was forgetting." That reframes the entire negative-experience cohort as YourMemory's natural buyer. Concrete: ship a `--no-decay` flag that turns YourMemory into a plain vector store, and instrument it. The thesis is testable: agents with decay should beat agents without decay on long-session reasoning tasks. If they don't, SwellJoe was right.
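The proposed `--no-decay` ablation could be wired up along these lines. The flag name comes from the text above; the parser, the `yourmemory-sketch` program name, and the `strength` function are hypothetical and only illustrate how one switch can turn the store into a plain, non-forgetting baseline for comparison.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="yourmemory-sketch")  # hypothetical CLI name
    parser.add_argument(
        "--no-decay",
        action="store_true",
        help="disable decay and pruning so the store behaves like a plain vector store",
    )
    return parser


def strength(age_ticks: float, half_life: float, decay_enabled: bool) -> float:
    """With decay disabled every memory keeps full strength, so nothing is pruned."""
    if not decay_enabled:
        return 1.0
    return 0.5 ** (age_ticks / half_life)


baseline_args = build_parser().parse_args(["--no-decay"])
default_args = build_parser().parse_args([])

baseline = strength(100, 10, decay_enabled=not baseline_args.no_decay)
with_decay = strength(100, 10, decay_enabled=not default_args.no_decay)
```

Running the same long-session task under both settings and logging retrieved-token counts and task success is what would make the thesis falsifiable.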
@waterbuffaloai
"How to decide effectively what to save in the memory: Is it the model to decide what is important, summarize and save it to the memory? How to avoid redundancy and categorize the memory correctly so you could get the right hit and decide what to forget?"
Two practical questions, two short answers worth putting on the page.

What to save: the agent decides via an explicit MCP tool call, not an automatic post-hoc pass. The model is already the only thing that knows the difference between "noise" and "fact worth remembering" in its current loop, and making it call `memory.add()` deliberately is honest about that. Auto-summary is a v0.3 feature, not a v0.1 feature.

Redundancy: embed-on-write + similarity check before insert. If a new memory has cosine similarity above 0.95 with an existing one, merge (sum strength, take latest text) instead of inserting. This keeps the index from drowning in near-duplicates from chatty agents.

These two answers should be in the README's "How it works" section. waterbuffaloai is your highest-intent reader in the thread: they're literally building the same thing.
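The redundancy rule (merge above 0.95 cosine, sum strength, take latest text) can be sketched as follows. `add_or_merge` and the toy two-dimensional embeddings are assumptions; a real store would use the embedding model's vectors and an index rather than a linear scan, and replacing the stored embedding with the newer one is a choice made here, not something the source specifies.

```python
import math
from dataclasses import dataclass


@dataclass
class Memory:
    text: str
    embedding: list
    strength: float = 1.0


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def add_or_merge(store: list, new: Memory, threshold: float = 0.95) -> Memory:
    """Insert `new`, or merge it into its nearest neighbor if similarity > threshold."""
    best = max(store, key=lambda m: cosine(m.embedding, new.embedding), default=None)
    if best is not None and cosine(best.embedding, new.embedding) > threshold:
        best.strength += new.strength   # sum strength
        best.text = new.text            # take latest text
        best.embedding = new.embedding  # assumption: keep the embedding of the latest text
        return best
    store.append(new)
    return new


store = []
add_or_merge(store, Memory("project uses pytest", [1.0, 0.0]))
add_or_merge(store, Memory("project uses pytest 8", [0.99, 0.05]))  # near-duplicate: merged
add_or_merge(store, Memory("deploys run on Fridays", [0.0, 1.0]))   # unrelated: inserted
```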
@xcf_seetan
"It strikes me as funny how we want to get super AI inteligence but keep trying to anthropomorphizing all AI aspects to make it more 'human'. IMHO, if we keep doing it we will create Human AI with all errors and deficiencies humans have."
The anthropomorphism is a metaphor, not a thesis. The mechanism (strength scores + decay + pruning) is engineering, not biology. The Ebbinghaus curve is just a convenient functional form (exponential decay with reinforcement) that also happens to fit human memory data. The page fix is one line: reword the README's "biological memory" framing as "decay-based memory with reinforcement (inspired by Ebbinghaus)." Same mechanism, no metaphysics. xcf_seetan's critique evaporates.