Firewall

The Humanbound Firewall is an open-source, context-aware security layer that sits between your users and your AI agent. It uses an LLM-as-a-Judge approach to evaluate incoming messages in real time -- blocking prompt injections, off-topic requests, and policy violations before they reach your agent.

Installation

pip install aiandme

Note

The Python package is currently published as aiandme (not yet renamed to humanbound). The package name will be updated in a future release.

Python Integration

from aiandme import Firewall
from aiandme import AIANDME_Firewall_CannotDecide, AIANDME_Firewall_NotAuthorised

firewall = Firewall(
    api_key="your-azure-openai-key",
    azure_endpoint="https://your-resource.openai.azure.com",
    scope="You are a customer support bot for Acme Corp...",
    permitted_intents=["order_status", "returns", "product_info"],
    restricted_intents=["competitor_comparison", "internal_pricing"]
)

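# user_input is the raw message from the end user;
# your_bot is a placeholder for your own agent.
try:
    result = firewall.filter(user_input)
    # result.verdict: "Pass" | "Off-Topic" | "Violation" | "Restriction"
    # result.reasoning: explanation of the verdict
    if result.verdict == "Pass":
        # Only input the firewall has cleared reaches the agent
        response = your_bot.chat(user_input)
    else:
        # Blocked: reply with the firewall's explanation instead of calling the agent
        response = result.reasoning
except AIANDME_Firewall_NotAuthorised:
    response = "Your request was blocked by the security firewall."
except AIANDME_Firewall_CannotDecide:
    response = "Unable to process your request. Please try again."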

Firewall Verdicts

| Verdict | Description |
| --- | --- |
| Pass | Input is safe and within scope. Forward to your agent. |
| Off-Topic | Input is outside the agent's defined scope. Reject with explanation. |
| Violation | Input contains prompt injection, jailbreak attempt, or security threat. |
| Restriction | Input touches a restricted intent (e.g., competitor comparison). |
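
One way to act on these verdicts is a small routing helper that decides whether to forward the input and, if not, what to tell the user. The sketch below is illustrative only: the four verdict strings come from the list above, while `Decision`, `decide`, and the message wording are hypothetical names and text, not part of the aiandme package. It also assumes you want any unrecognised verdict to be blocked rather than forwarded.

```python
from dataclasses import dataclass

# Verdict strings as documented above.
PASS = "Pass"
OFF_TOPIC = "Off-Topic"
VIOLATION = "Violation"
RESTRICTION = "Restriction"

# Hypothetical user-facing messages (assumption: adjust wording to your product).
BLOCK_MESSAGES = {
    OFF_TOPIC: "Sorry, that is outside what this assistant can help with.",
    VIOLATION: "Your request was blocked by the security firewall.",
    RESTRICTION: "Sorry, this assistant cannot discuss that topic.",
}

# Fallback used for any verdict string not listed above.
FALLBACK_MESSAGE = "Unable to process your request. Please try again."


@dataclass(frozen=True)
class Decision:
    forward: bool  # True if the input may be passed to the agent
    message: str   # Reply to show the user when the input is blocked


def decide(verdict: str) -> Decision:
    """Map a firewall verdict string to a forward/block decision."""
    if verdict == PASS:
        return Decision(forward=True, message="")
    # Fail closed: unknown verdicts are blocked with the fallback message.
    return Decision(forward=False, message=BLOCK_MESSAGES.get(verdict, FALLBACK_MESSAGE))
```

With the integration shown earlier, this could be called as `decide(result.verdict)`, forwarding to your agent only when `forward` is true. Using fixed messages instead of `result.reasoning` is a design choice for cases where you prefer not to show the judge's explanation to end users.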

Adaptive Context Defense (ACD)

The firewall auto-learns from your Humanbound test results. When FSLF identifies adversarial FAIL examples, those patterns are incorporated into the firewall's defense model -- creating a feedback loop between testing and runtime protection.

Guardrails -> Firewall Pipeline

Export learned guardrails and feed them into your firewall configuration:

# Export guardrails from test findings
hb guardrails -o guardrails.json

# Export in a vendor-specific format (here: OpenAI) for use in your firewall configuration
hb guardrails --vendor openai -o openai-rules.json
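
Before wiring an exported file into a deployment, it can be useful to confirm that it parses and to see its top-level shape. The following is a minimal sketch that assumes only that the export is valid JSON (as the `.json` extension suggests); it makes no assumption about the schema, and `summarize_guardrails` is a hypothetical helper, not part of the hb CLI or the aiandme package.

```python
import json
from pathlib import Path


def summarize_guardrails(path):
    """Return a small summary of an exported guardrails file.

    Assumes only that the file contains valid JSON; raises
    json.JSONDecodeError if it does not.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    summary = {
        "type": type(data).__name__,
        # Number of top-level entries, if the root is an object or array.
        "size": len(data) if isinstance(data, (dict, list)) else None,
    }
    if isinstance(data, dict):
        # Top-level keys, sorted for stable output.
        summary["keys"] = sorted(data)
    return summary
```

For example, `summarize_guardrails("guardrails.json")` can be run right after the export step as a quick sanity check in a CI job.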

Open Source

The Humanbound Firewall is Apache-2.0 licensed. Contributions welcome at github.com/Humanbound/firewall.