Gateway

A streaming proxy that sits between your agent and your LLM provider. Enforcement happens in the request path, rather than optionally in the SDK.

What the gateway is

The Cisora gateway is a drop-in reverse proxy for Anthropic, OpenAI, AWS Bedrock, and Google Vertex AI. Your agent sends its LLM requests to cisora.io/api/gateway/… instead of the provider directly; Cisora authenticates the call, scans for prompt injection, evaluates your policies, then forwards the request and streams the response back.

Unlike the SDK wrapper, which instruments your code, the gateway is the real circuit breaker. Because your agents hold only a Cisora key and never the provider credential, a malicious or misconfigured agent cannot bypass it by swapping libraries or building its own HTTP client. Blocking happens inline, before the provider ever sees the request.

How it works

Every request follows this path:

Agent (your code)

↓

Cisora gateway (auth · injection scan · policy check)

↓

Provider (Anthropic / OpenAI / Bedrock / Vertex)

↓

Stream back (SSE passes through unchanged)

↓

Async log + ML scan (background, zero added client latency)

The gateway tees the response stream: one copy goes to your client immediately, and the other is buffered for the async ML injection scan. The response-side scan therefore adds no latency for your agent on the happy path.
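The tee can be pictured with a small asyncio sketch. This is an illustration of the general pattern, not Cisora's implementation: `fake_provider_stream`, `tee_stream`, and `scan_in_background` are hypothetical names, and the chunks are made-up data.

```python
import asyncio


async def fake_provider_stream():
    """Stand-in for the provider's SSE response (hypothetical data)."""
    for chunk in (b"data: Hel\n\n", b"data: lo!\n\n"):
        await asyncio.sleep(0)  # yield control, as a real network read would
        yield chunk


async def tee_stream(upstream, side_channel):
    """Forward each chunk to the client at once and copy it to a queue."""
    async for chunk in upstream:
        side_channel.put_nowait(chunk)  # unbounded queue: never stalls the client path
        yield chunk
    side_channel.put_nowait(None)  # sentinel: the stream has finished


async def scan_in_background(side_channel):
    """Drain the copy and return the buffered body for logging or scanning."""
    buffered = []
    while (chunk := await side_channel.get()) is not None:
        buffered.append(chunk)
    return b"".join(buffered)


async def demo():
    queue = asyncio.Queue()
    scanner = asyncio.create_task(scan_in_background(queue))
    client_chunks = [c async for c in tee_stream(fake_provider_stream(), queue)]
    scanned_body = await scanner
    return client_chunks, scanned_body


client_chunks, scanned_body = asyncio.run(demo())
```

The client path never waits on the scanner: chunks are yielded as soon as they arrive, and the background task assembles its own copy independently.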

Add your provider key

Go to Settings → Gateway and paste your Anthropic or OpenAI secret key. Cisora stores it encrypted with AES-256-GCM; it is never logged in plaintext and never visible after entry.

Cisora forwards the stored key on your behalf — your agents only ever hold a Cisora API key, never the raw provider credential.

Get your Cisora API key

Go to Settings → API Keys and create a key. Keys are prefixed cisora_live_ for production. Treat this key like a provider secret — it authenticates all gateway requests for your org.
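Since the key is a secret, one reasonable approach is to load it from the environment and fail fast if it looks wrong. The sketch below assumes the key lives in a `CISORA_API_KEY` variable; `load_cisora_key` is a hypothetical helper, not part of any Cisora SDK. It checks only the production prefix named above, so loosen the check if you use non-production keys, whose prefix is not stated here.

```python
import os

CISORA_KEY_PREFIX = "cisora_live_"  # production prefix stated in this guide


def load_cisora_key(env=os.environ, var_name="CISORA_API_KEY"):
    """Return the Cisora key from the environment, or raise if it looks wrong."""
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"{var_name} is not set")
    if not key.startswith(CISORA_KEY_PREFIX):
        raise RuntimeError(f"{var_name} does not start with {CISORA_KEY_PREFIX!r}")
    return key
```

Passing the result as `api_key` keeps the literal key out of source control.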

Point your SDK at the gateway

Change only base_url / baseURL and swap your provider key for your Cisora key. Everything else — streaming, tool use, vision — works identically.

Python · Anthropic

import anthropic

client = anthropic.Anthropic(
    api_key="cisora_live_YOUR_KEY",
    base_url="https://cisora.io/api/gateway/anthropic",
)
message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

TypeScript · Anthropic

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: process.env.CISORA_API_KEY!,
  baseURL: 'https://cisora.io/api/gateway/anthropic',
});
const msg = await client.messages.create({
  model: 'claude-opus-4-5',
  max_tokens: 1024,
  messages: [{ role: 'user', content: 'Hello!' }],
});

Python · OpenAI

from openai import OpenAI

client = OpenAI(
    api_key="cisora_live_YOUR_KEY",
    base_url="https://cisora.io/api/gateway/openai/v1",
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Python · AWS Bedrock

# The gateway speaks the Bedrock-runtime HTTP API directly.
# Skip the boto3 SigV4 dance and authenticate with your Cisora key instead.
import json

import requests

resp = requests.post(
    "https://cisora.io/api/gateway/bedrock/us-east-1"
    "/model/anthropic.claude-3-5-sonnet-20240620-v1:0/invoke",
    headers={
        "authorization": "Bearer cisora_live_YOUR_KEY",
        "content-type": "application/json",
    },
    data=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": "Hello!"}],
    }),
)
print(resp.json())

TypeScript · AWS Bedrock

// Bedrock's InvokeModel — proxied through Cisora.
// Your AWS keys never leave the gateway.
const resp = await fetch(
  'https://cisora.io/api/gateway/bedrock/us-east-1' +
  '/model/anthropic.claude-3-5-sonnet-20240620-v1:0/invoke',
  {
    method: 'POST',
    headers: {
      authorization: `Bearer ${process.env.CISORA_API_KEY!}`,
      'content-type': 'application/json',
    },
    body: JSON.stringify({
      anthropic_version: 'bedrock-2023-05-31',
      max_tokens: 1024,
      messages: [{ role: 'user', content: 'Hello!' }],
    }),
  },
);
const data = await resp.json();

Python · Google Vertex

# Vertex AI streamGenerateContent, proxied through Cisora.
# Your service account JSON stays in the gateway; agents see only the Cisora key.
import json

import requests

resp = requests.post(
    "https://cisora.io/api/gateway/vertex/us-central1"
    "/projects/YOUR-PROJECT/locations/us-central1"
    "/publishers/google/models/gemini-2.5-pro:streamGenerateContent",
    headers={
        "authorization": "Bearer cisora_live_YOUR_KEY",
        "content-type": "application/json",
    },
    data=json.dumps({
        "contents": [{
            "role": "user",
            "parts": [{"text": "Hello!"}],
        }],
    }),
    stream=True,
)
for chunk in resp.iter_lines():
    if chunk:
        print(chunk.decode())

TypeScript · Google Vertex

// Vertex AI generateContent — proxied through Cisora.
const resp = await fetch(
  'https://cisora.io/api/gateway/vertex/us-central1' +
  '/projects/YOUR-PROJECT/locations/us-central1' +
  '/publishers/google/models/gemini-2.5-pro:generateContent',
  {
    method: 'POST',
    headers: {
      authorization: `Bearer ${process.env.CISORA_API_KEY!}`,
      'content-type': 'application/json',
    },
    body: JSON.stringify({
      contents: [{ role: 'user', parts: [{ text: 'Hello!' }] }],
    }),
  },
);
const data = await resp.json();

Endpoints reference

Provider	Gateway endpoint
Anthropic	https://cisora.io/api/gateway/anthropic/v1/messages
OpenAI	https://cisora.io/api/gateway/openai/v1/chat/completions
AWS Bedrock	https://cisora.io/api/gateway/bedrock/<region>/model/<model-id>/invoke
AWS Bedrock (stream)	https://cisora.io/api/gateway/bedrock/<region>/model/<model-id>/invoke-with-response-stream
Google Vertex	https://cisora.io/api/gateway/vertex/<location>/projects/<project>/locations/<location>/publishers/google/models/<model>:generateContent
Google Vertex (stream)	https://cisora.io/api/gateway/vertex/<location>/projects/<project>/locations/<location>/publishers/google/models/<model>:streamGenerateContent
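The Bedrock and Vertex paths in the table are long enough that it may help to build them with small helpers. The following is a sketch that only assembles the URL patterns listed above; `bedrock_url` and `vertex_url` are hypothetical helper names, and no escaping is applied to the arguments.

```python
GATEWAY_BASE = "https://cisora.io/api/gateway"


def bedrock_url(region, model_id, stream=False):
    """Build the gateway URL for Bedrock invoke or its streaming variant."""
    action = "invoke-with-response-stream" if stream else "invoke"
    return f"{GATEWAY_BASE}/bedrock/{region}/model/{model_id}/{action}"


def vertex_url(location, project, model, stream=False):
    """Build the gateway URL for Vertex generateContent or its streaming variant."""
    method = "streamGenerateContent" if stream else "generateContent"
    return (
        f"{GATEWAY_BASE}/vertex/{location}/projects/{project}"
        f"/locations/{location}/publishers/google/models/{model}:{method}"
    )
```

Note that the Vertex location appears twice in the path, so deriving both occurrences from one argument avoids a mismatch.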

For Bedrock, the gateway signs the upstream request with SigV4 using your stored AWS credentials — your agents never hold AWS keys directly. For Vertex, the gateway exchanges your service-account JSON for a short-lived OAuth2 token (cached per-org for 50 minutes) before forwarding the request.
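The per-org token caching described above can be sketched as a small TTL cache. This is an illustration under stated assumptions, not the gateway's code: `OrgTokenCache` and the `fetch_token` callable (which would perform the service-account exchange) are hypothetical, and the clock is injectable so the expiry logic can be tested.

```python
import time

TOKEN_TTL_SECONDS = 50 * 60  # the 50-minute cache window described above


class OrgTokenCache:
    """Cache one access token per org and refresh it once the TTL has passed."""

    def __init__(self, fetch_token, ttl=TOKEN_TTL_SECONDS, clock=time.monotonic):
        self._fetch_token = fetch_token  # hypothetical callable: org_id -> token string
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # org_id -> (token, expires_at)

    def get(self, org_id):
        now = self._clock()
        entry = self._entries.get(org_id)
        if entry is not None and now < entry[1]:
            return entry[0]  # still fresh: reuse without calling the token endpoint
        token = self._fetch_token(org_id)
        self._entries[org_id] = (token, now + self._ttl)
        return token
```

Google access tokens typically last one hour by default, so a 50-minute window presumably leaves a margin before the cached token expires upstream.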

What's enforced inline

▸Injection scan: a regex layer (<1ms) runs inline, and an ML model (~300ms) runs asynchronously. A malicious score blocks the request before it is forwarded.
▸Policy engine: your JSON policies are evaluated against every request. A block decision returns a 403 before the provider is called.
▸Rate limiting: per-org limits are enforced at the gateway edge.
▸Output DLP: scans response content for credential patterns. Coming in Phase 3.
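On the client side, it can help to decide up front how your agent reacts to these decisions. The sketch below maps a status code to a next step; `next_action` is a hypothetical helper. Only the 403 policy block is stated in this guide. Treating 429 as the rate-limit status is an assumption, and a provider can also return 403 for its own permission errors, so inspect the response body before acting.

```python
def next_action(status_code):
    """Suggest what the caller should do after a gateway response."""
    if 200 <= status_code < 300:
        return "proceed"
    if status_code == 403:
        # Stated above: a policy block returns 403 before the provider is called.
        # Retrying the same request unchanged is unlikely to help.
        return "stop"
    if status_code == 429:
        # Assumption: rate limits surface as HTTP 429; confirm against real responses.
        return "backoff"
    return "inspect"  # anything else: look at the body and logs
```

Keeping this mapping in one place makes it easier to adjust once you have observed the gateway's actual responses.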

Streaming

Full SSE streaming passes through unchanged. The gateway tees the response stream: one half is forwarded to your client token-by-token with no buffering; the other half is consumed asynchronously for logging and ML scanning. Your client sees no added latency beyond the ~30ms p50 auth and scan overhead on the request side.

Tool use, vision, and extended thinking all stream correctly — the gateway is protocol-transparent for everything Anthropic and OpenAI expose over SSE.
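Because the stream passes through unchanged, a client that reads raw SSE can parse it the same way it would against the provider. The following is a minimal sketch of the standard event-stream framing (fields, comments, blank-line dispatch); `iter_sse_events` is a hypothetical helper, and the official SDKs already do this for you.

```python
def iter_sse_events(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; dispatch it if it carried data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)  # multiple data lines are joined with newlines
```

With `requests`, the input could be `resp.iter_lines(decode_unicode=True)` on a response opened with `stream=True`.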

SDK vs gateway

	SDK wrapper	Gateway
Can be bypassed?	Yes	No
Latency overhead	~0ms	~30ms p50
Blocking	Async (after the fact)	Inline (before provider)
Setup	npm install / pip install	Change base_url

Use the SDK for logging and observability in low-risk contexts. Use the gateway wherever blocking matters — agentic workflows with tool access, customer-facing bots, or any agent that can take irreversible actions.

Security

▸Provider keys encrypted at rest with AES-256-GCM. Decrypted in memory only at forward time.
▸Keys are never written to logs in plaintext. Log entries record key ID and last-four only.
▸Request metadata (model, token counts, latency, policy decisions) is logged. Full message content is stored only when you opt in to trace storage.
▸TLS 1.3 on both hops: agent to gateway, and gateway to provider.
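If you apply the same logging rule in your own services, a redaction helper along these lines may be useful. It is a sketch of the "key ID and last-four only" idea from the list above; `redact_key` and the field names are hypothetical, not Cisora's log schema.

```python
def redact_key(key_id, secret):
    """Build a log-safe description of a key: its ID plus the last four characters."""
    # Very short secrets would be exposed almost entirely, so show nothing for them.
    last_four = secret[-4:] if len(secret) > 4 else ""
    return {"key_id": key_id, "last_four": last_four}
```

Logging the returned dictionary instead of the secret lets you tell keys apart during an incident without exposing them.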
