
You're Still Alt-Tabbing to a Security Tool

Demetris Gerogiannis
Feb 26, 2026 · 7 min read
[Image: Security engineer working in a terminal with AI-assisted tools integrated into their workflow]

The interface for AI agent security is changing — and it matters more than the tooling itself.

There's a moment every security engineer knows. You've got your terminal open, you're deep in a workflow, and then you need to test something. So you switch context. Open a browser. Log into a platform. Configure a scan. Wait. Export results. Copy them back into the system you were already working in.

That friction is so familiar it's invisible. It's also why most AI agents never get tested at all.

The gap nobody talks about

The conversation around AI security has focused almost entirely on what to test. Prompt injection. Jailbreaks. Data leakage. The OWASP LLM Top 10 gave us a taxonomy, and that was necessary.

But taxonomy doesn't solve the operational problem. Security engineers aren't short on awareness — they're short on workflow.

They know their AI agents should be tested against adversarial multi-turn attacks. They know guardrails that hold in English might collapse in French. They know a single manual red-team session doesn't constitute a security programme.

What they don't have is a way to do all of this without leaving the environment where they're already working.

Two commands to get started

The setup is intentionally boring. Install the Humanbound CLI and authenticate.
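A sketch of what that looks like; the package name and login subcommand here are assumptions, so check docs.humanbound.ai for the actual commands:

```shell
# Hypothetical install -- the real package name may differ
npm install -g @humanbound/cli

# Authenticate the CLI against your Humanbound account
humanbound login
```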

Then add it as an MCP server in Claude Code — one entry in your configuration, the same way you'd add any other tool to your AI-assisted workflow.
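In Claude Code that entry lives in a project's `.mcp.json` (or is registered with `claude mcp add`); the server name and command below are assumptions about how the Humanbound CLI exposes its MCP server:

```json
{
  "mcpServers": {
    "humanbound": {
      "command": "humanbound",
      "args": ["mcp"]
    }
  }
}
```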

That's it. No onboarding wizard. No twenty-minute setup call. From this point on, your AI coding assistant can orchestrate security tests, query posture scores, pull findings, and export guardrails — all through conversation. For full setup details, visit docs.humanbound.ai.

What happened on a Tuesday

Here's a real session. No staging, no scripted demo.

A security engineer opens their terminal. They're already using Claude Code with the Humanbound MCP server connected. They're reviewing a financial guidance agent — Acme FinBot — that the product team deployed last quarter.

They type:

"list my projects"

Eleven projects come back. AI agents across finance, health, insurance, legal, retail — all registered, all with prior test history and posture scores.

"use Acme FinBot and let me know how we can test it"

The assistant confirms the project is set, checks the model provider is configured (Azure OpenAI, gpt-4.1), and presents the options: test categories, depth levels, language.

"run a unit test in french for owasp multiturn"

The test launches. Multi-turn adversarial attacks, contextually generated against the agent's defined scope and permitted behaviours, in French. The experiment ID comes back. Status: running. Estimated duration: twenty minutes.

"check the experiment status"

Running.

That was it. Four messages. No browser tabs. No YAML. No dashboard. The security test was a by-product of a conversation that took less time than making coffee.

This isn't automation. This is disappearance.

Every tool in AI security right now talks about automation. Automated red-teaming. Automated scanning. Automated reporting. And automation matters — but it's a baseline, not a differentiator.

The actual shift is subtler. It's not about whether the test runs automatically. It's about where the test lives in your workflow.

Think about how application security evolved:

Separate team, separate tools — security was a gate at the end of the pipeline. You threw your code over the wall, waited for a report, and argued about severity ratings.

Shift left — security moved into CI/CD. Tests ran on every commit. Developers saw results without leaving their pipeline.

Shift into — this is where we are now. Security isn't a stage in the pipeline. It's a capability inside the tools you're already using. You don't go to the security tool. The security tool comes to you.

MCP — the Model Context Protocol — is what makes this possible for AI agent security. It turns testing infrastructure into something an AI assistant can orchestrate natively. The security engineer doesn't learn a new interface. They describe what they need in natural language, and the test happens.
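Under the hood, a request like "run a unit test in french for owasp multiturn" becomes an ordinary MCP tool call. The `tools/call` method is part of the MCP specification; the tool name and argument schema below are hypothetical stand-ins for whatever Humanbound actually exposes:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "run_experiment",
    "arguments": {
      "project": "Acme FinBot",
      "category": "owasp-multiturn",
      "depth": "unit",
      "language": "fr"
    }
  }
}
```

The engineer never sees this payload. The assistant constructs it from the conversation, which is the whole point.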

The tool disappears. The capability remains.

Contextual, not generic

There's a reason most prompt injection test suites feel like checkbox exercises. They throw the same generic payloads at every agent and report pass/fail. It's the equivalent of running SQLMap against every endpoint regardless of whether it touches a database.

Meaningful AI agent testing has to be contextual. An adversarial attack against a financial guidance chatbot is fundamentally different from one against a legal information assistant. The permitted behaviours are different. The restricted intents are different. The data sensitivity is different.

When the test ran against Acme FinBot, it didn't fire generic prompts. It understood the agent's scope: permitted to provide general financial guidance, review transactions, offer spending breakdowns. Restricted from transferring funds, recommending specific stocks, disclosing internal processes. The adversarial attacks were generated against those boundaries — multi-turn conversations designed to methodically probe the specific edges where the agent might fail.

And it did this in French. Because an agent that holds its boundaries in English but leaks data in French isn't secure — it's monolingual.
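A scope definition like the one the test was generated against can be imagined as a simple declaration; this structure is a hypothetical sketch, not Humanbound's actual schema:

```json
{
  "agent": "Acme FinBot",
  "permitted_behaviours": [
    "provide general financial guidance",
    "review transactions",
    "offer spending breakdowns"
  ],
  "restricted_intents": [
    "transfer funds",
    "recommend specific stocks",
    "disclose internal processes"
  ],
  "test_languages": ["en", "fr"]
}
```

The adversarial conversations are generated against those boundaries rather than pulled from a generic payload list.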

The loop that actually closes

Testing that produces findings without a path to remediation is expensive awareness.

The workflow that matters is a closed loop:

Discover → Test → Measure → Defend → Retest

Discover — You can't secure agents you don't know about. Shadow AI is the elephant in the room. Employees and teams spin up AI services — ChatGPT integrations, Copilot instances, custom agents — without going through security. Discovery means scanning your cloud environment to surface every AI service, sanctioned or not, before you can assess any of them.

Test — Multi-turn adversarial testing across OWASP LLM Top 10 categories, at varying depth levels, in multiple languages. Not once — continuously. Quick smoke tests on every deployment. Thorough assessments before major releases.

Measure — Posture isn't a binary. It's a score with dimensions: findings severity, test coverage breadth, statistical confidence from test volume, and drift over time. Security engineers need a number they can report upward and track across quarters. A letter grade — A through F — that a CISO can put on a slide without translation.

Defend — Findings should generate guardrail configurations directly. Not a PDF of recommendations that sits in a shared drive for six months. Actual rule sets, derived from actual vulnerabilities, ready to deploy. Evidence-based defence, not theoretical threat modelling.

Retest — Run the same categories again. Did the posture score improve? Did the guardrails hold? Did new drift emerge? This is where security becomes a programme, not a project.
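The loop above can be sketched in a few lines of code. Every function name and threshold here is hypothetical; the real Humanbound API may differ entirely:

```python
# A minimal sketch of the Test -> Measure -> Defend -> Retest loop.
# Names and thresholds are illustrative assumptions, not Humanbound's API.

def posture_grade(score: float) -> str:
    """Map a 0-100 posture score to a letter grade a CISO can put on a slide.

    The thresholds are illustrative, not Humanbound's actual scale.
    """
    for threshold, grade in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if score >= threshold:
            return grade
    return "F"


def closed_loop(agent, run_test, apply_guardrails):
    """Test, measure, defend, retest; report posture before and after."""
    before = run_test(agent)                   # Test + Measure
    apply_guardrails(agent, before.findings)   # Defend: findings become rules
    after = run_test(agent)                    # Retest
    return posture_grade(before.score), posture_grade(after.score)
```

The point of the sketch is the shape, not the implementation: findings feed guardrails, and the retest tells you whether the grade actually moved.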

The interface is the strategy

There's a temptation to treat the conversational interface as a convenience feature — a nicer way to do what dashboards already do. That undersells it.

The interface determines adoption. The history of developer tools proves this over and over. Git didn't win because it was the best version control system. It won because it met developers where they worked — the terminal. Containers didn't win because isolation was a new idea. They won because Docker made it one command.

AI agent security will be adopted — or ignored — based on how much friction it introduces into existing workflows. If testing means learning a new platform, configuring scan profiles, navigating a dashboard, and exporting results, it will be deprioritised every single sprint. Not because it's unimportant, but because everything else is easier.

If testing means typing "run a unit test in french" during the same session where you're reviewing code, it happens. Not because the engineer is more disciplined, but because the barrier dropped below the threshold of resistance.

The best security tooling is the kind you never have to leave your workflow to use.

The standard is being written now

AI agent security hasn't had its AppSec moment yet. There's no mature ecosystem of scanners, no universal CI/CD integration pattern, no agreed-upon metrics for posture and coverage.

That's not a problem. That's a window.

The teams that establish their security testing practices now — that build the muscle memory of continuous adversarial testing, that track posture scores and coverage gaps, that close the loop from findings to guardrails — will define what "good" looks like for everyone else.

The playbook isn't finished. But the engineers who show up now are the ones who'll write it.