
OpenAI’s new Privacy Filter runs on your laptop so PII never hits the cloud

By primereports | April 24, 2026


OpenAI has debuted Privacy Filter, a bidirectional token-classification model for detecting and redacting personally identifiable information (PII) that can scan long-form text in a single pass, run locally, and deliver greater context-awareness. 

Scanning text in a single pass for emails, numbers, and more

For developers working with large language models (LLMs), data privacy has long been a recurring issue. But with its new Privacy Filter, released on Wednesday, OpenAI is essentially opening up access to what it uses in-house for its own privacy-preserving workflows.

So, how does it work? 

As OpenAI explains in its announcement blog post, it starts with an autoregressive pretrained checkpoint and converts it into a token classifier over a fixed taxonomy of privacy labels. 

Rather than generating one token at a time, it “labels an input sequence in one pass and then decodes coherent spans with a constrained Viterbi procedure.”
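To give a feel for what constrained decoding buys you, here is a minimal sketch, not OpenAI's implementation. It assumes a BIO tag scheme (B- opens a span, I- continues it, O is outside any span) and a hypothetical two-category tag set; the real model's label scheme and scoring may differ. The only constraint enforced is that an I- tag must follow a B- or I- tag of the same category.

```python
import math

# Hypothetical BIO tag set for illustration; the real label scheme may differ.
TAGS = ["O", "B-EMAIL", "I-EMAIL", "B-NAME", "I-NAME"]


def allowed(prev: str, curr: str) -> bool:
    """An I-X tag may only follow B-X or I-X of the same category."""
    if curr.startswith("I-"):
        category = curr[2:]
        return prev in (f"B-{category}", f"I-{category}")
    return True


def constrained_viterbi(scores: list[dict[str, float]]) -> list[str]:
    """Return the best-scoring tag sequence that obeys the BIO constraint.

    scores[t][tag] is a per-token log-score for each tag at position t.
    """
    # A sequence cannot begin in the middle of a span.
    best = {
        tag: (-math.inf if tag.startswith("I-") else scores[0][tag])
        for tag in TAGS
    }
    backpointers = []
    for step in scores[1:]:
        new_best, pointers = {}, {}
        for curr in TAGS:
            prev_score, prev_tag = max(
                (best[prev], prev) for prev in TAGS if allowed(prev, curr)
            )
            new_best[curr] = prev_score + step[curr]
            pointers[curr] = prev_tag
        best = new_best
        backpointers.append(pointers)
    # Trace back from the best final tag.
    tag = max(best, key=best.get)
    path = [tag]
    for pointers in reversed(backpointers):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


def log_scores(probs: dict[str, float]) -> dict[str, float]:
    """Turn a sparse probability dict into log-scores over all tags."""
    return {tag: math.log(probs.get(tag, 1e-6)) for tag in TAGS}


# Made-up scores for three tokens; the middle token looks most like I-EMAIL.
scores = [
    log_scores({"O": 0.9, "B-EMAIL": 0.1}),
    log_scores({"O": 0.1, "B-EMAIL": 0.3, "I-EMAIL": 0.6}),
    log_scores({"O": 0.1, "I-EMAIL": 0.9}),
]
greedy = [max(s, key=s.get) for s in scores]
print(greedy)                       # ['O', 'I-EMAIL', 'I-EMAIL']
print(constrained_viterbi(scores))  # ['O', 'B-EMAIL', 'I-EMAIL']
```

Picking the top tag per token yields a span that starts with I-EMAIL, which is not a valid span. The constrained pass trades a little per-token score for a sequence that decodes into one clean email span.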

There are eight such labels, allowing Privacy Filter to mask or redact names, addresses, emails, phone numbers, URLs, dates, account numbers, and secrets (e.g., API keys or passwords).

(It’s a decent round-up, but it doesn’t catch everything; Social Security numbers and passport numbers, for example, are overlooked.)
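Once a detector has produced labeled spans, the masking step itself is simple. The sketch below is an illustration, not Privacy Filter's API: the `Span` type, the `redact` and `find` helpers, and the exact label spellings are all assumptions, and the spans are built by hand rather than by a model.

```python
from dataclasses import dataclass

# The eight categories named in the article; the identifier spellings are assumed.
LABELS = {
    "NAME", "ADDRESS", "EMAIL", "PHONE",
    "URL", "DATE", "ACCOUNT_NUMBER", "SECRET",
}


@dataclass(frozen=True)
class Span:
    start: int  # inclusive character offset
    end: int    # exclusive character offset
    label: str


def redact(text: str, spans: list[Span]) -> str:
    """Replace each span with a [LABEL] placeholder, left to right."""
    out, cursor = [], 0
    for span in sorted(spans, key=lambda s: s.start):
        if span.label not in LABELS:
            raise ValueError(f"unknown label: {span.label}")
        if span.start < cursor:
            raise ValueError("overlapping spans")
        out.append(text[cursor:span.start])
        out.append(f"[{span.label}]")
        cursor = span.end
    out.append(text[cursor:])
    return "".join(out)


def find(text: str, needle: str, label: str) -> Span:
    """Build a span for the first occurrence of needle (stand-in for a model)."""
    start = text.index(needle)
    return Span(start, start + len(needle), label)


text = "Email Ana at ana@example.com or call 555-0100."
spans = [
    find(text, "Ana", "NAME"),
    find(text, "ana@example.com", "EMAIL"),
    find(text, "555-0100", "PHONE"),
]
print(redact(text, spans))  # Email [NAME] at [EMAIL] or call [PHONE].
```

Keeping the label in the placeholder preserves some meaning for a downstream LLM, which can still tell that a name or an email address was present.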

Greater context-awareness, run locally

OpenAI claims Privacy Filter has greater context awareness, allowing it to pick up on subtler personal information and make more nuanced decisions.

“By combining strong language understanding with a privacy-specific labeling system, it can detect a wider range of PII in unstructured text, including cases where the right decision depends on context.”  

Specifically, the AI company claims its bidirectional token-classification model is a step up from traditional PII detection tools, such as regular expressions (regex) or NLP libraries, which typically rely on deterministic rules about format.

While these approaches might get the job done for simpler cases, like phone numbers or email addresses, they’re more likely to run into problems when context introduces more subtlety.

For example, Privacy Filter should be able to distinguish between publicly available information that it can preserve and private information that it should mask or redact, such as a public business address versus a private home address. 
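To make the limits of format-based rules concrete, here is a small regex detector. Both patterns are simplified illustrations written for this example, not production-grade rules and not taken from any named tool.

```python
import re

# Simplified illustrative patterns; real-world rules are far more involved.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
ADDRESS_RE = re.compile(
    r"\b\d{1,5}\s+(?:[A-Z][a-z]+\s+)+(?:Street|Avenue|Road)\b"
)


def regex_detect(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs found purely by format."""
    hits = [("EMAIL", m.group()) for m in EMAIL_RE.finditer(text)]
    hits += [("ADDRESS", m.group()) for m in ADDRESS_RE.finditer(text)]
    return hits


text = (
    "Visit our store at 12 Market Street. "
    "I live at 48 Elm Road; reach me at sam@example.org."
)
hits = regex_detect(text)
print(hits)
# [('EMAIL', 'sam@example.org'), ('ADDRESS', '12 Market Street'),
#  ('ADDRESS', '48 Elm Road')]
```

The email is an easy win, but the two addresses come back with the same label. Nothing in the pattern knows that one is a public storefront and the other is someone's home, which is the kind of distinction a context-aware model is meant to draw.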

This focus on context also comes into play when processing lengthy documents with unstructured text. OpenAI says Privacy Filter was specifically designed to catch PII in “noisy, real-world” texts, perhaps support logs, long legal filings, and the like. To scan these long-form texts without chunking, the model supports up to 128,000 tokens of context. 

Privacy Filter is also notably small. 

At 1.5 billion total parameters with 50 million active parameters, the model is snappy enough to run locally in a browser or on a laptop. Besides efficiency gains, this means developers can use Privacy Filter to mask and redact PII in their own environments, thereby reducing exposure risks for sensitive data.

How it compares to the competition

In its announcement blog post, OpenAI boasts that Privacy Filter “achieves state-of-the-art performance on the PII-Masking-300k benchmark, when corrected for annotation issues we identified during evaluation.” 

What it calls “state of the art” is an F1 score of 96% (94.04% precision and 98.04% recall). 
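For readers who want to check the arithmetic, F1 is the harmonic mean of precision and recall. The snippet below uses only the two figures quoted above; the function name is ours.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.9404, 0.9804)
print(f"{f1:.4f}")  # 0.9600
```

The harmonic mean sits slightly below the simple average of the two numbers (96.04%), because it penalizes the gap between precision and recall.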

Of course, OpenAI isn’t the first to offer a PII detection and redaction solution.

Microsoft’s Presidio, for example, is an open-source framework for detecting, redacting, masking, and anonymizing text, images, and structured data. Here, Microsoft might win: In its blog post, OpenAI flat-out states that Privacy Filter is not an anonymization tool but “one component in a broader privacy-by-design system.” 

Amazon’s Comprehend, meanwhile, is a managed service for PII detection and redaction in AWS workflows. 

Stacked up against existing competitors, Privacy Filter stands out for its context-aware, locally run design. 

Where Microsoft may give developers more capabilities than Privacy Filter, OpenAI’s model makes up for its smaller scope with greater context-awareness and local deployment — at least against Amazon’s managed service. 

What this means for developers

For developers building RAG systems, developing customer support pipelines, or orchestrating any other workflow that requires feeding user text into an LLM, OpenAI says Privacy Filter should slot in nicely. 

It’s the option for fine-tuning that adds extra appeal to OpenAI’s model. 

And supposedly, it only takes a small amount of data to see results. In its model card, OpenAI reports that “training on 10% of the dataset is enough to drive F1 scores above 96%.” 

That means with relatively little data, developers can adapt OpenAI’s model for different data distributions, privacy policies, and domain-specific tasks. 

That said, OpenAI expresses caution about high-sensitivity domains, such as legal, medical, and financial workflows, reminding developers to keep human review in the loop and prepare for potential mistakes. 


One more piece in OpenAI’s stack

Privacy Filter is available today on Hugging Face and GitHub under the Apache 2.0 license. 

It comes alongside OpenAI’s launch of GPT-5.5, released on Thursday, a new model that OpenAI calls “a new class of intelligence.”


Meredith Shubel is a technical writer covering cloud infrastructure and enterprise software. She has contributed to The New Stack since 2022, profiling startups and exploring how organizations adopt emerging technologies. Beyond The New Stack, she ghostwrites white papers, executive bylines,…
