OpenAI Open-Sources Privacy Filter, a Tiny Model That Scrubs PII Without an API Call
About this listen
This story was originally published on HackerNoon at: https://hackernoon.com/openai-open-sources-privacy-filter-a-tiny-model-that-scrubs-pii-without-an-api-call.
OpenAI open-sourced Privacy Filter, a 50M-active-parameter model that detects and masks PII locally in one pass. Here's what's actually new, and what's hype.
Check more stories related to tech-companies at: https://hackernoon.com/c/tech-companies. You can also check exclusive content about #openai, #open-source, #privacy-filter, #openai-privacy-filter, #tiny-model, #ai, #openai-open-sources, #hackernoon-top-story, and more.
This story was written by: @abstraction. Learn more about this writer by checking @abstraction's about page, and for more stories, please visit hackernoon.com.
OpenAI released Privacy Filter under Apache 2.0. It is a 1.5B-parameter (50M active) bidirectional token-classification model that detects and masks PII in text locally, in a single forward pass. It runs on a laptop, supports 128K context, hits 96% F1 out of the box, and is fine-tunable with minimal data. It covers eight entity categories: names, addresses, emails, phones, URLs, dates, account numbers, and secrets. It's context-aware rather than regex-based, ships with a CLI and eval tooling, and slots into the same open-weight ecosystem as gpt-oss. The catch: multilingual support is thin, adversarial formatting breaks it, and the benchmark validation used OpenAI's own models to grade OpenAI's model.
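To make "detects and masks" concrete, here is a minimal sketch of the masking step alone, assuming a token-classification model has already returned labeled character spans. It does not call Privacy Filter or reproduce its API: the `mask_spans` helper, the `(start, end, label)` span format, and the label strings `NAME` and `EMAIL` are hypothetical, and the spans are written by hand as a stand-in for model output.

```python
from typing import List, Tuple

# Hypothetical span format: (start, end, label), with end exclusive.
Span = Tuple[int, int, str]


def mask_spans(text: str, spans: List[Span]) -> str:
    """Replace each labeled character span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Overlaps a span that was already masked; skip it.
            # (A simplification for this sketch.)
            continue
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(f"[{label}]")        # placeholder for the PII itself
        cursor = end
    pieces.append(text[cursor:])           # untouched tail
    return "".join(pieces)


text = "Contact Jane Doe at jane@example.com."

# Hand-written spans standing in for model predictions (assumption).
name_start = text.index("Jane Doe")
email_start = text.index("jane@example.com")
spans = [
    (name_start, name_start + len("Jane Doe"), "NAME"),
    (email_start, email_start + len("jane@example.com"), "EMAIL"),
]

masked = mask_spans(text, spans)
print(masked)
```

The difference from a regex scrubber lies in where the spans come from: a context-aware model can label "Jane Doe" as a name from the surrounding words, while a pattern matcher needs a rule for every format it should catch. The masking step is the same either way.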