Output DLP

Scan every model response for PII and secrets before the data reaches your agent — the last line of defence against LLM data leakage.

What is Output DLP?

A post-generation filter that runs on every model response inside the gateway.

When your agent calls a model through the Cisora gateway, the response passes through the Output DLP scanner before it is returned to your code. The scanner checks the text against built-in patterns for secrets and PII, plus any custom regex patterns you have defined for your org.

Depending on the configured action, the gateway either logs an incident, replaces matched spans with [REDACTED:TYPE], or blocks the response entirely with a policy_violation error — all transparent to your SDK call.

Why it matters

LLMs can regurgitate sensitive data that should never reach downstream actions.

Large language models are trained on web-scale data and fine-tuned on real documents. When your RAG pipeline injects customer records, email threads, or internal knowledge-base articles into the context window, the model may quote or rephrase that content verbatim in its response — including secrets, credentials, and personal data embedded in those documents.

Output DLP is the last control point before a model response drives downstream agent actions: writing files, sending emails, calling external APIs. A single undetected credit-card number in a response can propagate into a database, a Slack message, or an outbound API call. Catching it here costs microseconds; cleaning it up afterward costs far more.

Common sources of leakage

  • RAG chunks containing customer PII or internal credentials
  • Conversation history carrying data from earlier tool outputs
  • Memorised training data reproduced in factual-looking responses
  • Model summarisation of confidential documents passed as context

What it detects

Built-in patterns cover the most critical categories; custom patterns extend coverage per org.

Category | Patterns
secret | Anthropic / OpenAI / AWS / GitHub / Stripe keys, JWTs, private key blocks
pii | SSN, credit / debit cards, email addresses, US phone numbers, IP addresses
custom | Per-org regex patterns added in Settings → Gateway → Output DLP

Three actions on detection

alert (default)

Log a data_leak incident. Data passes through to the agent unmodified. Use this when you want visibility without disrupting the pipeline.

redact (modifies response)

Replace every matched span with [REDACTED:TYPE] before the response reaches the agent. The agent sees a sanitised version; the original is never forwarded.

block (non-streaming only)

Return a policy_violation error. The agent never receives the response. Use in high-risk pipelines where data leakage is unacceptable. Requires non-streaming mode.
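The three actions can be pictured with a minimal Python sketch. This is an illustration of the described behaviour, not the gateway's implementation: the two patterns, the `apply_dlp` function, the `PolicyViolation` exception, and the use of the pattern label as the `TYPE` in `[REDACTED:TYPE]` are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in patterns; the gateway's built-in patterns are not shown here.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "EMPLOYEE_ID": re.compile(r"EMP-\d{6}"),
}


class PolicyViolation(Exception):
    """Stands in for the gateway's policy_violation error (name assumed)."""


@dataclass
class ScanResult:
    text: str                                          # what the agent would receive
    matched: List[str] = field(default_factory=list)   # labels of patterns that matched


def apply_dlp(text: str, action: str = "alert") -> ScanResult:
    """Scan text and apply one of the three actions: alert, redact, or block."""
    if action not in ("alert", "redact", "block"):
        raise ValueError(f"unknown action: {action}")
    matched = [label for label, rx in PATTERNS.items() if rx.search(text)]
    if not matched or action == "alert":
        # alert: the match is recorded, the text passes through unmodified.
        return ScanResult(text, matched)
    if action == "block":
        # block: the caller never sees the response.
        raise PolicyViolation(f"response blocked, matched: {matched}")
    # redact: replace every matched span before returning the text.
    for label in matched:
        text = PATTERNS[label].sub(f"[REDACTED:{label}]", text)
    return ScanResult(text, matched)
```

With `action="redact"`, a response such as `Contact jane@example.com about EMP-123456` would come back with both spans replaced, while `action="alert"` returns the same text untouched and only reports the matched labels.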

Configuration

Enable and configure DLP from Settings → Gateway → Output DLP.

  1. Navigate to Settings → Gateway in the Cisora dashboard.
  2. Scroll to the Output DLP section and toggle Enable output scanning.
  3. Select the categories to scan: Secrets, PII, or both.
  4. Pick an action (alert, redact, or block) and click Save DLP settings.

Changes take effect within 60 seconds across the gateway (the settings are cached per-org with a 60 s TTL). No code changes or redeployments are needed on the agent side.
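The 60-second propagation delay follows from the per-org cache. A rough sketch of how a TTL cache of this kind behaves is below; the `SettingsCache` class and its `load` callback are hypothetical names, and the gateway's actual cache may differ.

```python
import time
from typing import Callable, Dict, Tuple


class SettingsCache:
    """Per-org settings cache with a fixed TTL (a sketch, not the gateway's code)."""

    def __init__(
        self,
        load: Callable[[str], dict],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load = load            # fetches fresh settings for an org
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so the TTL can be tested
        self._entries: Dict[str, Tuple[float, dict]] = {}

    def get(self, org_id: str) -> dict:
        now = self._clock()
        entry = self._entries.get(org_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]          # still fresh: serve the cached copy
        settings = self._load(org_id)
        self._entries[org_id] = (now, settings)
        return settings
```

Under this model a saved change is picked up on the first request after the cached entry expires, so the worst-case delay equals the TTL.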

Streaming note

DLP has different semantics for streaming vs non-streaming responses.

For non-streaming requests the full response is buffered before being returned, so DLP runs synchronously. All three actions — alert, redact, and block — work as described.

For streaming requests, chunks are forwarded to the client as they arrive. DLP runs asynchronously once the stream completes, meaning the agent has already received the full response by the time a match is detected. alert still fires and creates an incident, but redact and block have no effect on the client-side stream.

If your threat model requires block or redact, use non-streaming mode by omitting the stream: true parameter from your SDK call.
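One way to make this visible in your own code is a small helper that reports what the client will actually observe for a given configuration. The `effective_action` function is a hypothetical helper for illustration, not part of any Cisora SDK.

```python
VALID_ACTIONS = {"alert", "redact", "block"}


def effective_action(configured: str, streaming: bool) -> str:
    """Return the action the client effectively observes.

    For streaming requests the scan runs after the stream completes, so
    redact and block cannot change what the client received; only the
    incident is created, which matches the behaviour of alert.
    """
    if configured not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {configured}")
    return "alert" if streaming else configured
```

A startup check such as `effective_action("block", streaming=True) != "block"` could be used to warn when a pipeline that depends on blocking is configured to stream.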

Custom patterns

Per-org regex patterns extend coverage beyond the built-in set.

Custom patterns are isolated per organisation and validated at save time: an invalid regex is rejected with a 400 before it can affect scanning. Each pattern must specify a name, a regex, and a category; the description is optional. Each org can define up to 20 custom patterns.

PUT /api/gateway/dlp — body example

{
  "enabled": true,
  "action": "redact",
  "scan_categories": ["secret", "pii"],
  "custom_patterns": [
    {
      "name": "internal_api_token",
      "regex": "cis_[A-Za-z0-9]{32}",
      "category": "secret",
      "description": "Cisora internal service token"
    },
    {
      "name": "employee_id",
      "regex": "EMP-\\d{6}",
      "category": "pii",
      "description": "Internal employee identifier"
    }
  ]
}
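Before sending a body like the one above, the documented constraints can be checked locally. The sketch below assumes Python's `re` syntax is close enough to the gateway's regex engine for a pre-flight check, which is not confirmed; the gateway's own validation remains authoritative. `validate_custom_patterns` is a hypothetical helper.

```python
import re
from typing import Dict, List

MAX_CUSTOM_PATTERNS = 20                       # documented per-org limit
REQUIRED_FIELDS = ("name", "regex", "category")  # description is optional


def validate_custom_patterns(patterns: List[Dict[str, str]]) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issues found."""
    errors: List[str] = []
    if len(patterns) > MAX_CUSTOM_PATTERNS:
        errors.append(
            f"{len(patterns)} patterns exceed the limit of {MAX_CUSTOM_PATTERNS}"
        )
    for index, pattern in enumerate(patterns):
        missing = [f for f in REQUIRED_FIELDS if not pattern.get(f)]
        if missing:
            errors.append(f"pattern {index}: missing {', '.join(missing)}")
            continue
        try:
            # Compiles with Python's engine; the gateway's engine may differ.
            re.compile(pattern["regex"])
        except re.error as exc:
            errors.append(f"pattern {index} ({pattern['name']}): invalid regex: {exc}")
    return errors
```

Running this in CI before the PUT call catches an unbalanced parenthesis or a missing field early, instead of surfacing it as a 400 at save time.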

REST endpoint

Manage DLP settings programmatically via the API.

Method | Path | Description
GET | /api/gateway/dlp | Return current DLP settings for the org
PUT | /api/gateway/dlp | Update DLP settings (upsert)

Auth: these endpoints authenticate with your dashboard session cookie, which the browser forwards automatically; an Authorization: Bearer <token> header is not required. They are scoped to dashboard use only and require a valid org session.
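For illustration, the two calls can be built with Python's standard library as sketched below. The base URL and the cookie string are placeholders, `build_dlp_request` is a hypothetical helper, and whether the endpoints accept a cookie supplied from outside the browser is an assumption this page does not confirm.

```python
import json
import urllib.request
from typing import Optional


def build_dlp_request(
    base_url: str,
    session_cookie: str,
    settings: Optional[dict] = None,
) -> urllib.request.Request:
    """Build a GET (no settings) or PUT (with settings) request for /api/gateway/dlp."""
    url = base_url.rstrip("/") + "/api/gateway/dlp"
    headers = {"Cookie": session_cookie, "Accept": "application/json"}
    if settings is None:
        return urllib.request.Request(url, headers=headers, method="GET")
    headers["Content-Type"] = "application/json"
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="PUT")


# Usage (placeholders; nothing is sent until urlopen is called):
# req = build_dlp_request("https://<your-dashboard-host>", "<session cookie>",
#                         {"enabled": True, "action": "alert"})
# with urllib.request.urlopen(req) as resp:
#     print(resp.status)
```

Building the request separately from sending it keeps the payload easy to inspect and test before it touches live settings.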

Incidents integration

Every DLP detection creates a data_leak incident in your dashboard.

Whenever the scanner finds a match (regardless of action), it creates or updates a data_leak incident in your Incidents dashboard. Incidents are deduplicated: at most one open incident exists per org and agent pair per hour, which prevents alert fatigue.
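The create-or-update behaviour can be sketched as follows. The sketch assumes the hour is measured from the moment the incident was opened; the page does not say whether the gateway uses that or fixed clock-hour buckets. `IncidentLog` is a hypothetical name.

```python
from datetime import datetime, timedelta
from typing import Dict, Tuple


class IncidentLog:
    """Keeps at most one open incident per (org, agent) pair per window."""

    def __init__(self, window: timedelta = timedelta(hours=1)) -> None:
        self._window = window
        self._open: Dict[Tuple[str, str], dict] = {}

    def record(self, org_id: str, agent_id: str, at: datetime) -> bool:
        """Record a detection; return True if a new incident was created."""
        key = (org_id, agent_id)
        incident = self._open.get(key)
        if incident is not None and at - incident["opened_at"] < self._window:
            incident["detections"] += 1   # update the existing incident
            return False
        self._open[key] = {"opened_at": at, "detections": 1}
        return True
```

Under this model a burst of matches from one agent produces a single incident with a growing detection count, while a second agent in the same org still gets its own incident.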

Incident severity is derived from the highest-severity match in the response:

  • critical: API key, private key block, JWT (immediate Slack alert if configured).
  • high: Google API key, Resend key, JWT with lower confidence.
  • medium: Email address, US phone number.
  • low: IP address.
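Taking the highest severity across all matches amounts to a ranked maximum, as in this short sketch. The pattern keys in `PATTERN_SEVERITY` are hypothetical labels chosen for the example, and only a few of the documented pattern types are listed.

```python
from typing import Iterable

# Order of the four documented severity levels, lowest to highest.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2, "critical": 3}

# Hypothetical pattern labels mapped to the severities described above.
PATTERN_SEVERITY = {
    "private_key_block": "critical",
    "google_api_key": "high",
    "email": "medium",
    "us_phone": "medium",
    "ip_address": "low",
}


def incident_severity(matched_patterns: Iterable[str]) -> str:
    """Return the highest severity among the matched pattern labels."""
    severities = [PATTERN_SEVERITY[name] for name in matched_patterns]
    if not severities:
        raise ValueError("at least one match is required")
    return max(severities, key=SEVERITY_RANK.__getitem__)
```

A response containing both an IP address and a private key block would therefore be filed as critical, not low.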

Each incident links to the specific action where the match occurred. From the Incidents panel you can view the matched pattern names, the configured DLP action that fired, and the raw evidence — without seeing the redacted data itself.
