As LLM-powered agents move from research to production, one design tension is becoming harder to ignore: the more useful cloud-hosted memory becomes, the more private user data it exposes. Researchers from MemTensor (Shanghai), HONOR Device and Tongji University have released MemPrivacy, a framework that attempts to resolve this tension without sacrificing the utility that makes personalized memory valuable in the first place.
The Core Problem With Cloud Memory
When you interact with an AI agent, your conversation often contains sensitive details such as health conditions, email addresses, financial figures, and passwords. In a typical edge-cloud deployment, the user’s device (the edge) handles input, while computation-heavy memory management and reasoning happen in the cloud. This architecture is efficient, but it means raw, unfiltered user data travels to and persists in cloud systems.
The risk is not theoretical. Prior studies show that multi-turn memory attacks can induce privacy violations with success rates of up to 69%, and leakage attacks against memory systems can reach 75% success. Indirect prompt injection can even manipulate agents into actively eliciting private information from users. Once sensitive content enters cloud logs, vector databases, or external memory stores, it can remain accessible through subsequent storage, retrieval, and reuse stages well beyond the original interaction.
Prior work has tried to address this with masking, that is, replacing sensitive values with tokens like ***. The problem is that masking destroys semantics. If a user asks an agent to draft an email to their doctor and both the blood pressure reading and the email address are replaced with ***, the cloud model cannot complete the task meaningfully. More principled techniques such as differential privacy and cryptographic protection offer stronger guarantees but are difficult to integrate into interactive memory pipelines without degrading response quality.

What MemPrivacy Does Differently
Rather than masking private content, MemPrivacy replaces it with typed placeholders, structured tokens that record the type of the value they stand in for, before the input leaves the local device. The cloud model receives semantically intact text and can reason and store memories normally; it just never sees the actual values. When the cloud returns a response containing placeholders, the local device looks up the originals in a secure local database and substitutes them back in. The user sees a fully coherent, personalized response.
This design is called local reversible pseudonymization, and the full pipeline operates in three stages:
- Stage 1 (Uplink Desensitization): A lightweight on-device model identifies privacy-sensitive spans in the input, classifies each by type and sensitivity level, and replaces them with typed placeholders. The original-to-placeholder mappings are stored locally and persist across sessions so the same value always gets the same placeholder.
- Stage 2 (Cloud Processing): The sanitized input is sent to the cloud agent or memory system. The typed placeholders preserve enough semantic structure for memory formation and retrieval to function correctly.
- Stage 3 (Downlink Restoration): The cloud response, which may contain placeholders, is restored locally via a lightweight database lookup and string substitution, adding negligible latency.
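The three stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not MemPrivacy’s implementation: two regular expressions stand in for the fine-tuned on-device extraction model, an in-memory dictionary stands in for the secure local database, and the `<TYPE_n>` placeholder format, the `LocalVault` class, and the function names are all hypothetical.

```python
import re

# Assumption: regexes stand in for the on-device extraction model so the
# sketch stays self-contained. The real system uses a fine-tuned LLM.
DETECTORS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}
# Assumption: placeholders look like <TYPE_n>; the actual format may differ.
PLACEHOLDER = re.compile(r"<[A-Z]+_\d+>")


class LocalVault:
    """In-memory stand-in for the secure local mapping store."""

    def __init__(self):
        self.value_to_token = {}
        self.token_to_value = {}
        self.counters = {}

    def token_for(self, kind, value):
        # Reuse an existing placeholder so one value always maps to one token.
        if value in self.value_to_token:
            return self.value_to_token[value]
        self.counters[kind] = self.counters.get(kind, 0) + 1
        token = f"<{kind}_{self.counters[kind]}>"
        self.value_to_token[value] = token
        self.token_to_value[token] = value
        return token


def desensitize(text, vault):
    """Stage 1: replace detected spans with typed placeholders."""
    for kind, pattern in DETECTORS.items():
        text = pattern.sub(lambda m, k=kind: vault.token_for(k, m.group(0)), text)
    return text


def restore(text, vault):
    """Stage 3: substitute original values back into the cloud response."""
    return PLACEHOLDER.sub(
        lambda m: vault.token_to_value.get(m.group(0), m.group(0)), text
    )


vault = LocalVault()
uplink = desensitize("Mail jane.doe@example.com or call 555-123-4567.", vault)
print(uplink)  # Mail <EMAIL_1> or call <PHONE_1>.

# Stage 2 happens in the cloud; this reply is made up for the demo.
cloud_reply = "Draft ready. I will send it to <EMAIL_1>."
print(restore(cloud_reply, vault))  # Draft ready. I will send it to jane.doe@example.com.
```

Because the vault returns the existing token for a value it has already seen, the same email address maps to the same placeholder across messages, which is what lets a cloud memory system link facts about one entity without ever holding the raw value.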
A Four-Level Privacy Taxonomy
A key contribution by the research team is a four-level privacy taxonomy (PL1–PL4) that defines what gets protected and at what threshold:
- PL1 covers general preferences, habits, and stylistic choices that don’t identify a person and carry low risk. These are generally not protected by default.
- PL2 includes identifiable PII: real names, phone numbers, email addresses, detailed addresses, account usernames, and combinations that could identify or trace a specific individual.
- PL3 covers highly sensitive PII: government document numbers, financial account details, health data, precise location and trajectory data, biometrics, raw communication content, and sensitive identity attributes such as religious beliefs or ethnicity.
- PL4 is the highest tier, covering credentials and secrets that are immediately exploitable: passwords, PINs, verification codes, session tokens, API keys, private keys, seed phrases, and undisclosed business materials. Exposure at this level can directly result in account takeover, financial loss, or large-scale data exfiltration.
Users can configure the masking threshold, for example protecting only PL3 and PL4 or applying full protection across PL2–PL4, which gives granular control over the privacy–utility trade-off.
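A configurable threshold over the four levels might look like the following sketch. The `PrivacyLevel` enum, the `spans_to_protect` helper, and the dictionary layout of the detector output are hypothetical names chosen for illustration; the article does not describe the actual configuration interface.

```python
from enum import IntEnum


class PrivacyLevel(IntEnum):
    PL1 = 1  # general preferences and habits, low risk
    PL2 = 2  # identifiable PII
    PL3 = 3  # highly sensitive PII
    PL4 = 4  # immediately exploitable credentials and secrets


def spans_to_protect(spans, threshold=PrivacyLevel.PL2):
    """Keep only the spans whose level is at or above the threshold."""
    return [s for s in spans if s["level"] >= threshold]


# Hypothetical detector output; the field names are illustrative only.
detected = [
    {"text": "prefers short replies", "level": PrivacyLevel.PL1},
    {"text": "jane.doe@example.com", "level": PrivacyLevel.PL2},
    {"text": "blood pressure 150/95", "level": PrivacyLevel.PL3},
    {"text": "hunter2", "level": PrivacyLevel.PL4},
]

full = spans_to_protect(detected)                     # protects PL2-PL4
strict = spans_to_protect(detected, PrivacyLevel.PL3)  # protects PL3-PL4 only
print([s["text"] for s in full])
print([s["text"] for s in strict])
```

Modeling the levels as an ordered integer enum keeps the threshold check to a single comparison, and PL1 content passes through untouched under either setting.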


MemPrivacy-Bench and Model Training
To train and evaluate their approach, the research team built MemPrivacy-Bench, a dataset covering 200 synthetic user profiles and over 155,000 privacy instances (125,776 training, 29,967 test) across balanced Chinese and English dialogue, spanning 7 high-level scenario categories and 23 fine-grained subcategories. The test set contains 615 question-answer pairs across six memory task types: basic memory, temporal reasoning, adversarial questioning, dynamic updating, implicit inference, and information aggregation. Annotations were first generated by a dual-model pipeline using Gemini-3.1-Pro and GPT-5.2, then verified by six human annotators, achieving a final annotation accuracy of 98.08%.
The MemPrivacy extraction models are fine-tuned from Qwen3 base models at 0.6B, 1.7B, and 4B parameter scales using supervised fine-tuning (SFT) followed by reinforcement learning with Group Relative Policy Optimization (GRPO). GRPO estimates advantages from the relative rewards of multiple sampled outputs per input, using F1 score as the reward signal, which avoids the computational overhead of a separately trained critic. The data is split by user: 160 users for training and 40 for testing.
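The group-relative advantage idea can be illustrated with a small sketch. It assumes the commonly used GRPO normalization (each reward minus the group mean, divided by the group standard deviation) and an exact-match F1 over (span, type) pairs as the reward; the article does not give the paper’s exact reward definition, so both choices, along with the function names and toy data, are assumptions.

```python
from statistics import mean, pstdev


def span_f1(predicted, gold):
    """Exact-match F1 between two sets of (span_text, type) pairs."""
    predicted, gold = set(predicted), set(gold)
    if not predicted and not gold:
        return 1.0
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    return 2 * precision * recall / (precision + recall)


def group_relative_advantages(rewards, eps=1e-8):
    """Score each sample against the mean and std of its own group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Toy example: one input, three sampled extractions from the policy.
gold = {("jane.doe@example.com", "EMAIL"), ("hunter2", "PASSWORD")}
samples = [
    {("jane.doe@example.com", "EMAIL"), ("hunter2", "PASSWORD")},  # both found
    {("jane.doe@example.com", "EMAIL")},                            # one missed
    {("Jane", "NAME")},                                             # wrong span
]
rewards = [span_f1(s, gold) for s in samples]
advantages = group_relative_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward={r:.3f} advantage={a:+.3f}")
```

The baseline here is simply the mean reward of the sampled group, so no separate value network has to be trained: outputs that beat their siblings get a positive advantage and the rest get a negative one.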
Experimental Results
On MemPrivacy-Bench, the best-performing model, MemPrivacy-4B-RL, achieves an F1 score of 85.97%, compared to 78.41% for Gemini-3.1-Pro, the strongest general-purpose model tested. Even the smallest model, MemPrivacy-0.6B-SFT, reaches 83.09% F1, outperforming all general-purpose models evaluated. On the out-of-distribution PersonaMem-v2 benchmark, MemPrivacy-4B-RL achieves 94.48% F1, compared to 92.18% for DeepSeek-V3.2-Think, the best general model on that set.
The comparison also covers OpenAI’s recently released Privacy-Filter, an open-source bidirectional token-classification model for PII detection. It achieves 35.50% F1 on MemPrivacy-Bench, a gap of over 50 percentage points behind the best MemPrivacy model, though it operates at significantly lower latency (0.34s versus roughly 2s for MemPrivacy models on MemPrivacy-Bench).
On downstream memory utility, MemPrivacy was tested across three widely used memory systems: LangMem, Mem0, and Memobase. When protecting all PL2–PL4 content, accuracy drops relative to no-protection baselines are contained to 0.73%–1.30% on MemPrivacy-Bench and 0.71%–1.60% on PersonaMem-v2. In contrast, irreversible masking causes accuracy drops of 16.99%–41.87% on MemPrivacy-Bench, while untyped placeholder masking causes drops of 4.72%–6.67% on MemPrivacy-Bench and 2.67%–8.71% on PersonaMem-v2.
Key Takeaways
- MemPrivacy replaces sensitive user data with semantically typed placeholders on-device before cloud transmission, so the cloud memory system never receives raw private values.
- The framework introduces a four-level privacy taxonomy (PL1–PL4) ranging from general preferences to immediately exploitable credentials, with user-configurable masking thresholds.
- MemPrivacy-4B-RL achieves 85.97% F1 on MemPrivacy-Bench and 94.48% on PersonaMem-v2, outperforming GPT-5.2 (68.99%) and Gemini-3.1-Pro (78.41%) on privacy span extraction.
- Across LangMem, Mem0, and Memobase, applying MemPrivacy at the PL2–PL4 level limits memory utility loss to within 1.6%, compared to accuracy drops of up to 41.87% with irreversible masking.
- Models range from 0.6B to 4B parameters, with per-message inference under two seconds, making the framework suitable for on-device deployment without noticeable latency.
Check out the Paper and Model Weights.
