Redact: A Transparent Privacy Filter for the AI Era
May 9, 2026
As AI tools become embedded in daily workflows — from browser-based chat interfaces to API-driven code assistants — every prompt sent to an external service is a potential data leak. Personally identifiable information (PII) routinely ends up in training data, logs, and model contexts with no easy way to recall it.
The Problem
Most privacy solutions require users to change their behavior: manually scrubbing prompts, using specialized interfaces, or routing through enterprise proxy systems that need IT involvement. This friction means they rarely get adopted outside compliance-mandated environments.
How Redact Works
Redact takes a different approach. It sits transparently between your Linux machine and AI services — both browser-based and API-based — intercepting outbound traffic and stripping PII before data reaches external servers. Built on mitmproxy for traffic interception and spaCy for NLP-based entity recognition, Redact requires no changes to existing applications or workflows.
The key design decisions:
- Transparent interception: works with any browser or API client without configuration changes
- NLP-based detection: uses spaCy models to identify names, addresses, emails, phone numbers, and other PII entities rather than relying on brittle regex patterns
- Local processing: detection and anonymization run entirely on-device, so detected PII is replaced before a request leaves the machine
- Reversible mapping: maintains a local mapping table so that placeholder tokens in responses can be mapped back to the original values for the user
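To make the reversible mapping concrete, here is a minimal Python sketch of such a table. It is an illustration under assumptions, not Redact's actual code: the class name PlaceholderMap, the [LABEL_n] token format, and the (start, end, label) span shape are all hypothetical. The spans are hard-coded here where a detector such as spaCy would normally supply them.

```python
from dataclasses import dataclass, field


@dataclass
class PlaceholderMap:
    """Hypothetical local mapping table: original value <-> placeholder token."""

    forward: dict = field(default_factory=dict)   # original value -> token
    reverse: dict = field(default_factory=dict)   # token -> original value
    counters: dict = field(default_factory=dict)  # label -> last index used

    def token_for(self, value: str, label: str) -> str:
        # Reuse an existing token so the same value always maps the same way.
        if value not in self.forward:
            n = self.counters.get(label, 0) + 1
            self.counters[label] = n
            token = f"[{label}_{n}]"
            self.forward[value] = token
            self.reverse[token] = value
        return self.forward[value]

    def anonymize(self, text: str, spans: list) -> str:
        # spans: (start, end, label) character offsets from a detector.
        # Tokens are assigned left to right so numbering follows reading
        # order, then substituted right to left so offsets stay valid.
        ordered = sorted(spans)
        tokens = [self.token_for(text[s:e], label) for s, e, label in ordered]
        for (s, e, _), token in reversed(list(zip(ordered, tokens))):
            text = text[:s] + token + text[e:]
        return text

    def deanonymize(self, text: str) -> str:
        # Swap every known token back to its original value.
        for token, value in self.reverse.items():
            text = text.replace(token, value)
        return text


table = PlaceholderMap()
prompt = "Email Ada Lovelace at ada@example.com. Ada Lovelace is on leave."
spans = [(6, 18, "PERSON"), (22, 37, "EMAIL"), (39, 51, "PERSON")]

sanitized = table.anonymize(prompt, spans)
print(sanitized)
# Email [PERSON_1] at [EMAIL_1]. [PERSON_1] is on leave.
print(table.deanonymize("Reply to [PERSON_1] via [EMAIL_1]."))
# Reply to Ada Lovelace via ada@example.com.
```

Because the same value always maps to the same token, the AI service can still tell that both mentions refer to one person, which helps keep its response coherent after the tokens are swapped back.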
Architecture
Redact runs as a local proxy (via mitmproxy) that intercepts HTTPS traffic to known AI service domains. For each outbound request:
- The request body is parsed to extract user-provided text content
- spaCy NER models identify PII entities in the text
- Detected entities are replaced with consistent placeholder tokens
- The sanitized request is forwarded to the AI service
- The response is scanned for placeholder tokens and de-anonymized before returning to the user
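The steps above can be sketched as a mitmproxy addon, which is a plain class exposing request and response hooks. This is a rough illustration under stated assumptions rather than Redact's implementation: the host list, the class name RedactAddon, and the token format are hypothetical, and the detector is a trivial email regex standing in for spaCy NER so the sketch runs without a model download. It also scrubs every string in a JSON body, where a real filter would likely target only the user-text fields.

```python
import json
import re

# Assumption: hosts to filter. The post only says "known AI service domains".
AI_HOSTS = {"api.example-ai.test"}

# Stand-in detector so the sketch runs without a model download.
# In Redact's design this step is spaCy NER, not a regex.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def detect(text):
    """Return (value, label) pairs for each PII candidate found in text."""
    return [(m.group(), "EMAIL") for m in EMAIL.finditer(text)]


class RedactAddon:
    """Duck-typed mitmproxy addon: mitmproxy calls request() and response()."""

    def __init__(self):
        self.tokens = {}  # placeholder token -> original value

    def _scrub(self, text):
        for value, label in detect(text):
            # Reuse the token if this value was seen before.
            token = next((t for t, v in self.tokens.items() if v == value), None)
            if token is None:
                token = f"[{label}_{len(self.tokens) + 1}]"
                self.tokens[token] = value
            text = text.replace(value, token)
        return text

    def _walk(self, node):
        # Scrub every string value in a parsed JSON body, leaving keys intact.
        if isinstance(node, str):
            return self._scrub(node)
        if isinstance(node, list):
            return [self._walk(item) for item in node]
        if isinstance(node, dict):
            return {key: self._walk(value) for key, value in node.items()}
        return node

    def request(self, flow):
        if flow.request.pretty_host not in AI_HOSTS:
            return  # not an AI service: pass through untouched
        try:
            body = json.loads(flow.request.text or "")
        except ValueError:
            return  # non-JSON bodies are ignored in this sketch
        flow.request.text = json.dumps(self._walk(body))

    def response(self, flow):
        if flow.request.pretty_host not in AI_HOSTS or flow.response is None:
            return
        text = flow.response.text or ""
        # Plain substitution; a fuller version would re-escape values for JSON.
        for token, value in self.tokens.items():
            text = text.replace(token, value)
        flow.response.text = text


addons = [RedactAddon()]
```

Saved under a hypothetical name such as redact_addon.py, a script like this could be loaded with `mitmdump -s redact_addon.py`. For HTTPS interception to work, clients on the machine must also trust the mitmproxy certificate authority.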
Trade-offs
Redact is designed for individual users on Linux workstations. It does not attempt to solve enterprise-scale data governance or multi-tenant privacy. NLP-based detection inevitably misses some entities (false negatives), and domain-specific PII (e.g., internal project names, proprietary identifiers) requires custom entity training. The transparent proxy approach also means Redact only protects traffic from the machine where it is installed.
Source
The project is open source and available on GitHub: drig-ai/redact