OpenAI Open-Sources Privacy Filter, a Tiny Model That Scrubs PII Without an API Call

About this listen

This story was originally published on HackerNoon at: https://hackernoon.com/openai-open-sources-privacy-filter-a-tiny-model-that-scrubs-pii-without-an-api-call.
OpenAI open-sourced Privacy Filter, a 50M-active-parameter model that detects and masks PII locally in one pass. Here's what's actually new, and what's hype.
Find more stories about tech companies at: https://hackernoon.com/c/tech-companies. You can also find exclusive content tagged #openai, #open-source, #privacy-filter, #openai-privacy-filter, #tiny-model, #ai, #openai-open-sources, #hackernoon-top-story, and more.

This story was written by: @abstraction. Learn more about this writer by checking @abstraction's about page, and for more stories, please visit hackernoon.com.

OpenAI released Privacy Filter under Apache 2.0. It is a 1.5B-parameter (50M active) bidirectional token-classification model that detects and masks PII in text locally, in a single forward pass. It runs on a laptop, supports 128K context, hits 96% F1 out of the box, and can be fine-tuned with minimal data. It covers eight entity categories: names, addresses, emails, phones, URLs, dates, account numbers, and secrets. It is context-aware rather than regex-based, ships with a CLI and eval tooling, and slots into the same open-weight ecosystem as gpt-oss. The catch: multilingual support is thin, adversarial formatting breaks it, and the benchmark validation used OpenAI's own models to grade OpenAI's model.
