Egress DLP — outbound message screening

What it is, what gets blocked, and where you see the audit trail.

Why this exists

ownify’s inbound A2A firewall (AAE envelopes, MolTrust trust gate, content sanitisation, per-tool ACL) protects your agent from messages other agents send to it. The egress layer does the opposite: it screens what your agent sends out — so a prompt-injected exchange or a confused LLM cannot leak your tokens, keys, or internal configuration into a chat, A2A reply, or any other channel.

Every outbound text message (Matrix, Signal, Telegram, Discord, A2A reply, etc.) is scanned by the cluster-internal klaw-egress-scanner service before it reaches the channel. Web-channel chat (your agent’s own UI on <slug>.ownify.ai) is local and not scanned, since it never leaves the cluster.

What patterns the scanner looks for

The scanner runs a regex set against the outbound body. Each match is tagged with a class:

Class	What it catches	Default action
private-key	PEM private-key blocks (RSA, EC, OpenSSH, PGP, etc.).	refuse
secret-yaml	YAML/JSON keys naming secrets at the start of a line — `api_key:`, `access_token:`, `shared_tokens:`, `signing_key:`, `client_secret:`, etc.	refuse
internal-path	Cluster-internal filesystem paths (`/home/microclaw/.microclaw/`, `/etc/microclaw-…`, `/run/secrets/`) and the literal `microclaw.config.yaml`.	refuse
token	Known-prefix API tokens — `sk-`, `pk-lf-`, `mt_`, `syt_`, `whsec_`, `sk_test_`/`sk_live_`, `ghp_`, `github_pat_`, `xoxb-`, `AKIA…`, `fw_…`.	redact
jwt	JSON Web Token shape (`eyJ…` three-part).	redact
high-entropy	Long hex (40+) and mixed-case base64url (40+) strings. Conservative: requires upper, lower, and digit so ordinary words do not trip.	redact

When several classes match in the same message, the strongest action wins: refuse > alert > redact > allow.
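The precedence rule can be sketched in a few lines. This is an illustration only: `PRECEDENCE`, `DEFAULT_ACTION`, and `resolve_action` are hypothetical names, and the class-to-action mapping is copied from the default column of the table above, not from the scanner's source.

```python
# Hypothetical sketch; names and structure are assumptions, not the scanner's real code.
# Ranking follows the stated order: refuse > alert > redact > allow.
PRECEDENCE = {"allow": 0, "redact": 1, "alert": 2, "refuse": 3}

# Default action per class, as listed in the table above.
DEFAULT_ACTION = {
    "private-key": "refuse",
    "secret-yaml": "refuse",
    "internal-path": "refuse",
    "token": "redact",
    "jwt": "redact",
    "high-entropy": "redact",
}


def resolve_action(matched_classes):
    """Return the strongest action among the matched classes, or 'allow' if none matched."""
    actions = [DEFAULT_ACTION[cls] for cls in matched_classes]
    return max(actions, key=PRECEDENCE.__getitem__, default="allow")
```

For example, a message that contains both a `ghp_` token and a PEM block resolves to refuse, because one refuse-class hit outranks any number of redact-class hits.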

The four actions

allow: Nothing matched. The original message is sent unchanged.
redact: A pattern matched, but the message is recoverable. The matched substring is replaced with a placeholder like [ownify-redacted-token] and the modified message is sent. The agent’s session transcript stores the redacted form, so the history matches what the channel saw.
refuse: A pattern matched and policy refuses transmission. The channel adapter returns an error to the agent loop; the message is not delivered. Use cases: PEM blocks, YAML secret blocks, cluster-internal paths.
alert: Same as refuse, plus a notification to the operator. Reserved for high-severity policy in future versions.
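As a rough illustration of the redact action, the sketch below replaces matched tokens with the placeholder quoted above. The regex is an assumption that covers only two of the documented prefixes (`ghp_`, `xoxb-`) with a guessed minimum length; `TOKEN_RE` and `redact_tokens` are hypothetical names, not part of the scanner.

```python
import re

# Illustrative pattern only: two of the documented prefixes, with an assumed
# minimum length of 16 trailing characters. The real regex set is not shown here.
TOKEN_RE = re.compile(r"\b(?:ghp_|xoxb-)[A-Za-z0-9_-]{16,}")
PLACEHOLDER = "[ownify-redacted-token]"


def redact_tokens(body):
    """Return (redacted_body, hit_count); the body is unchanged when nothing matches."""
    return TOKEN_RE.subn(PLACEHOLDER, body)
```

A hit count of zero corresponds to allow for this class; any non-zero count means the redacted body is what gets sent and stored in the transcript.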

Where you see what happened

Open the Egress tab on any agent dashboard:

https://ownify.ai/dashboard/agents/<your-agent-slug>/egress

The page lists the most recent scan events (auto-refreshing every 10 s). Each row shows:

Time — when the scan happened.
Channel — matrix / signal / telegram / a2a-reply / etc.
Action — coloured badge: allow, redact, refuse, alert.
Hits — which classes matched and how many times. Hover for the SHA-256 prefix of the matched substring.
Length — original size, plus the post-redaction size when redaction shrank the body.
Body hash — SHA-256 prefix of the full message. Lets you correlate the same message across other audit logs without exposing the content.

Counters at the top of the page summarise allow / redact / refuse / alert totals across the visible window. The page is owner-scoped: only the email that owns the agent can read it.

Privacy

The scanner records what kind of pattern matched and a SHA-256 prefix of the matched substring, never the substring itself. Even an operator inspecting the database cannot reconstruct what your message contained. The full message body is also stored only as a SHA-256 prefix, so two related events can be correlated without storing the body.
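A minimal sketch of what such an audit record could look like, assuming a 12-character hex prefix (the actual prefix length is not stated) and hypothetical field names. `sha256_prefix` and `build_event` are illustrative helpers, not the scanner's API.

```python
import hashlib

PREFIX_LEN = 12  # assumption: the real prefix length is not documented here


def sha256_prefix(text, length=PREFIX_LEN):
    """Hex SHA-256 digest of the UTF-8 text, truncated to `length` characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


def build_event(channel, action, body, hits):
    """Build an audit record; `hits` is a list of (class_name, matched_substring).

    Only hash prefixes are kept. Neither the body nor the matched
    substrings appear in the returned record.
    """
    return {
        "channel": channel,
        "action": action,
        "length": len(body),
        "body_hash": sha256_prefix(body),
        "hits": [
            {"class": cls, "match_hash": sha256_prefix(match)}
            for cls, match in hits
        ],
    }
```

Two events with the same `body_hash` can be correlated across logs, which is the purpose of the Body hash column on the dashboard.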

Failure mode

If the scanner is unreachable mid-conversation, your agent fails closed by default — the message is not sent and the agent loop sees a delivery error. This guarantees a scanner outage cannot silently disable DLP. Operators can flip a tenant to fail-open (allow on scanner outage) by setting KLAW_EGRESS_FAIL_OPEN=1 on the agent pod if availability matters more than guaranteed scanning for that workload.
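The fail-closed default and the `KLAW_EGRESS_FAIL_OPEN=1` override can be sketched as follows. Only the environment variable name comes from this page; `ScannerUnavailable`, `DeliveryError`, `screen_or_fail`, and the `scan` callable are hypothetical stand-ins for whatever the channel adapter actually uses.

```python
import os


class ScannerUnavailable(Exception):
    """Hypothetical error raised when the egress scanner cannot be reached."""


class DeliveryError(Exception):
    """Hypothetical error surfaced to the agent loop when a message is not sent."""


def screen_or_fail(body, scan, env=os.environ):
    """Return the screened body, or raise DeliveryError if scanning is impossible.

    `scan` is any callable that takes the body and returns the text to send.
    """
    try:
        return scan(body)
    except ScannerUnavailable:
        if env.get("KLAW_EGRESS_FAIL_OPEN") == "1":
            # Fail-open: availability wins, the message goes out unscanned.
            return body
        # Fail-closed (default): nothing is sent.
        raise DeliveryError("egress scanner unreachable; message not sent")
```

Passing the environment as a parameter keeps the sketch easy to test; a real adapter would more likely read the setting once at startup.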

Defence in depth — your agent’s SOUL.md also discourages leaks

Every ownify-provisioned agent ships with a SOUL.md rule that tells the LLM never to print the contents of microclaw.config.yaml, anything under /etc/microclaw-* or /run/secrets/, or inline tokens and keys. This is cheap defence-in-depth: it makes the agent less likely to generate sensitive output in the first place. The egress scanner is the load-bearing layer that catches anything the LLM still tries to send.

What if a redaction is wrong?

False positives are possible — the high-entropy class in particular catches anything that looks like a long hex or base64 string. If your workflow legitimately sends, say, a checksum or a git commit hash, you may see it redacted. Open your egress tab, look at the matched class, and if you need a tenant-level rule change, reach out to ownify support — the per-class action policy is configurable.
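To see why a commit hash trips the high-entropy class, here is a rough approximation of the heuristic as described in the table: 40 or more hex characters, or a base64url run of 40 or more characters that mixes upper case, lower case, and digits. The exact regexes are assumptions, as is the reading that the mixed-case requirement applies only to the base64url branch; `looks_high_entropy` is a hypothetical name.

```python
import re

# Assumed approximations of the documented shapes, not the scanner's real patterns.
HEX_RE = re.compile(r"\b[0-9a-fA-F]{40,}\b")
B64URL_RE = re.compile(r"[A-Za-z0-9_-]{40,}")


def looks_high_entropy(text):
    """True if the text contains a long hex run or a long mixed-case base64url run."""
    if HEX_RE.search(text):
        return True
    for match in B64URL_RE.finditer(text):
        run = match.group()
        has_upper = any(c.isupper() for c in run)
        has_lower = any(c.islower() for c in run)
        has_digit = any(c.isdigit() for c in run)
        if has_upper and has_lower and has_digit:
            return True
    return False
```

Under these assumptions a full 40-character git SHA-1 matches the hex branch, while a long all-lowercase word does not match either branch, which is consistent with the behaviour described above.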

Related