Artificial Intelligence

OpenAI’s new Privacy Filter runs on your laptop so PII never hits the cloud

By primereports · April 24, 2026 · 5 Mins Read


OpenAI has debuted Privacy Filter, a bidirectional token-classification model for detecting and redacting personally identifiable information (PII) that can scan long-form text in a single pass, run locally, and deliver greater context-awareness. 

Scanning text in a single pass for emails, numbers, and more

For developers working with large language models (LLMs), data privacy has been a persistent concern. But with its new Privacy Filter, released on Wednesday, OpenAI is essentially opening up access to the tooling it uses in-house for its own privacy-preserving workflows.

So, how does it work? 

As OpenAI explains in its announcement blog post, it starts with an autoregressive pretrained checkpoint and converts it into a token classifier over a fixed taxonomy of privacy labels. 

Rather than generating one token at a time, it “labels an input sequence in one pass and then decodes coherent spans with a constrained Viterbi procedure.”

There are eight such labels, allowing Privacy Filter to mask or redact names, addresses, emails, phone numbers, URLs, dates, account numbers, and secrets (e.g., API keys or passwords).

(It’s a decent round-up, but it doesn’t catch everything; social security numbers and passport numbers, for example, are overlooked.) 
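As a rough illustration of the span-decoding step described above, here is a minimal sketch that merges per-token labels into coherent spans and masks them. The label names are taken from the article's list of eight; the real model's taxonomy, output format, and constrained-Viterbi decoder are assumptions here and are certainly more involved.

```python
# Sketch: turning per-token privacy labels into masked spans.
# Label names follow the article's list; the actual model's output
# format may differ.

PII_LABELS = {"NAME", "ADDRESS", "EMAIL", "PHONE", "URL", "DATE", "ACCOUNT", "SECRET"}

def mask_spans(tokens, labels):
    """Merge runs of identically-labeled tokens into spans and mask PII."""
    out = []
    i = 0
    while i < len(tokens):
        if labels[i] in PII_LABELS:
            label = labels[i]
            j = i
            # Extend the span while the label stays the same.
            while j < len(tokens) and labels[j] == label:
                j += 1
            out.append(f"[{label}]")
            i = j
        else:
            out.append(tokens[i])
            i += 1
    return " ".join(out)

tokens = ["Contact", "alice", "@", "example.com", "by", "Friday"]
labels = ["O", "EMAIL", "EMAIL", "EMAIL", "O", "DATE"]
print(mask_spans(tokens, labels))  # Contact [EMAIL] by [DATE]
```

The point of decoding spans rather than emitting isolated token labels is that multi-token values, such as an email split across tokens, are masked as one unit.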

Greater context-awareness, run locally

OpenAI claims Privacy Filter has greater context awareness, allowing it to pick up on subtler personal information and make more nuanced decisions.

“By combining strong language understanding with a privacy-specific labeling system, it can detect a wider range of PII in unstructured text, including cases where the right decision depends on context.”  

Specifically, the AI company claims its bidirectional token-classification model is a step up from traditional PII detection tools (such as regular expressions (RegEx) or NLP libraries), which typically rely on deterministic rules for format. 

While these approaches might get the job done for simpler cases, like phone numbers or email addresses, they’re more likely to run into problems when context introduces subtlety.

For example, Privacy Filter should be able to distinguish between publicly available information that it can preserve and private information that it should mask or redact, such as a public business address versus a private home address. 
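To make the contrast concrete, here is a sketch of the deterministic approach the article describes: pattern-based redaction with regular expressions. The patterns below are illustrative, not drawn from any particular tool. They catch well-formed emails and phone numbers, but both a public office address and a private home address have the same surface shape, so a purely rule-based system cannot tell them apart.

```python
# Sketch of regex-based PII redaction, the approach the article
# contrasts with. Patterns are simplified for illustration.
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")

def regex_redact(text):
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)

# Well-formed values are caught fine...
print(regex_redact("Reach me at jane@corp.com or 555-123-4567."))
# ...but the public office address and the private home address look
# identical to a pattern matcher, so neither is handled correctly.
print(regex_redact("Visit our office at 1 Main St; Jane lives at 9 Oak Ave."))
```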

This focus on context also comes into play when processing lengthy documents with unstructured text. OpenAI says Privacy Filter was specifically designed to catch PII in “noisy, real-world” texts such as support logs and long legal filings. To scan these long-form texts without chunking, the model supports up to 128,000 tokens of context.
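Why does chunk-free scanning matter? When a long document is split into fixed-size pieces, a PII value can straddle a chunk boundary, so no single chunk contains the whole thing. A toy demonstration (the card number and chunk size are invented for illustration):

```python
# Why single-pass long-context scanning matters: naive chunking can
# split a PII span across a boundary, so a per-chunk detector never
# sees the complete value.

def chunk(text, size):
    return [text[i:i + size] for i in range(0, len(text), size)]

log = "ticket 4471: card number 4242 4242 4242 4242 was declined"
for piece in chunk(log, 30):
    print(repr(piece))
# The full card number appears in neither chunk, only fragments of it.
```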

Privacy Filter is also notably small. 

At 1.5 billion total parameters with 50 million active parameters, the model is snappy enough to run locally in a browser or on a laptop. Besides efficiency gains, this means developers can use Privacy Filter to mask and redact PII in their own environments, reducing exposure risks for sensitive data.

How it compares to the competition

In its announcement blog post, OpenAI boasts that Privacy Filter “achieves state-of-the-art performance on the PII-Masking-300k benchmark, when corrected for annotation issues we identified during evaluation.” 

What it calls “state of the art” is an F1 score of 96% (94.04% precision and 98.04% recall). 
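The F1 score is the harmonic mean of precision and recall, so the reported figures can be checked directly:

```python
# Verifying the article's numbers: F1 is the harmonic mean of
# precision and recall.
precision, recall = 0.9404, 0.9804
f1 = 2 * precision * recall / (precision + recall)
print(round(f1 * 100, 2))  # 96.0
```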

Of course, OpenAI isn’t the first to offer a PII detection and redaction solution.

Microsoft’s Presidio, for example, is an open-source framework for detecting, redacting, masking, and anonymizing text, images, and structured data. Here, Microsoft might win: In its blog post, OpenAI flat-out states that Privacy Filter is not an anonymization tool but “one component in a broader privacy-by-design system.” 

Amazon’s Comprehend, meanwhile, is a managed service for PII detection and redaction in AWS workflows. 

Stacked up against existing competitors, Privacy Filter stands out for its context-aware, locally run design. 

Microsoft’s Presidio may offer developers broader capabilities, but OpenAI’s model compensates for its smaller scope with greater context-awareness; against Amazon’s cloud-only managed service, local deployment is its clearest advantage.

What this means for developers

For developers building RAG systems, developing customer support pipelines, or orchestrating any other workflow that requires feeding user text into an LLM, OpenAI says Privacy Filter should slot in nicely. 
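Where would such a filter sit in practice? A common arrangement is to redact user text on-device before any of it reaches a hosted LLM. The sketch below assumes this placement; `local_redact` is a hypothetical stand-in, since the article does not specify Privacy Filter's API.

```python
# Sketch: redact-before-send in an LLM pipeline. `local_redact` is a
# placeholder for a local PII filter; the real model would label
# tokens and mask spans as described in the article.

def local_redact(text):
    # Hypothetical stand-in for on-device PII masking.
    return text.replace("alice@example.com", "[EMAIL]")

def answer_with_llm(user_text, call_llm):
    safe_text = local_redact(user_text)   # runs locally; raw PII stays on the machine
    return call_llm(safe_text)            # only redacted text reaches the cloud

fake_llm = lambda prompt: f"LLM saw: {prompt}"
print(answer_with_llm("My email is alice@example.com", fake_llm))
# LLM saw: My email is [EMAIL]
```

The same shape applies to RAG ingestion: documents pass through the filter before being embedded or indexed, so the vector store never holds raw PII.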

The option to fine-tune adds extra appeal to OpenAI’s model.

And supposedly, it only takes a small amount of data to see results. In its model card, OpenAI reports that “training on 10% of the dataset is enough to drive F1 scores above 96%.” 

That means with relatively little data, developers can adapt OpenAI’s model for different data distributions, privacy policies, and domain-specific tasks. 

That said, OpenAI expresses caution about high-sensitivity domains, such as legal, medical, and financial workflows, reminding developers to keep human review in the loop and prepare for potential mistakes. 


One more piece in OpenAI’s stack

Privacy Filter is available today on Hugging Face and GitHub under the Apache 2.0 license. 

It comes alongside OpenAI’s launch of GPT-5.5, released on Thursday, a new model that OpenAI calls “a new class of intelligence.”




Meredith Shubel is a technical writer covering cloud infrastructure and enterprise software. She has contributed to The New Stack since 2022, profiling startups and exploring how organizations adopt emerging technologies. Beyond The New Stack, she ghostwrites white papers, executive bylines,…
