
Disclaimer: Opinions expressed are solely my own and do not reflect the views or opinions of my employer or any other affiliated entities. Any sponsored content featured on this blog is independent and does not imply endorsement by, nor relationship with, my employer or affiliated organisations.

Every SecOps platform now has "agents." SOAR vendors, the AI SOC crowd, the new players. Everyone slapped the word on something and shipped it.

But "agent" is doing a lot of work in those sentences. What you actually get changes a lot from one platform to the next. And the differences matter more than the demo makes it look.

At SecOps Unpacked we track this space closely. Right now we count 22 vendors offering some form of agent builder for SecOps. That is a lot of choice, and almost none of them mean the same thing by it.

And even inside that group there is a split. Some platforms ship prebuilt agents and you use them as they are. Plug and play, no customization. That is fine for getting started, but you take what you get. The ones I care about here are the platforms that let you customize those agents or build your own from scratch. That is where the differences show up.

If you read my LinkedIn post on AI Agents vs. Playbooks, you know I care about where you draw the line between a thing that thinks and a thing that just runs steps. Anthropic draws the same line: workflows run on predefined paths, agents direct their own process and tools. This post is the next layer down. Once you decide you want agents, how the platform lets you build them is the real question.

So let me break down the flavors I keep seeing, and then the harness, which is what actually sets them apart.

Our new Product Updates Section!

Agent Builder release

Most AI SOC demos stop at chat. Fleet is interesting because it gives you an AI SOC Analyst that does the actual work: reason through evidence, run investigations inside an isolated sandbox, inspect files, execute tools, generate detections, and turn findings into repeatable workflows. The Agent Builder is the control layer on top. Teams can shape specialized agents for package triage, phishing analysis, AIR investigations, reporting, or detection engineering without waiting for a vendor roadmap. This is the kind of AI SOC capability practitioners should evaluate: not autonomy theater, but controlled, customizable execution that maps to how SecOps really works.

Threat Hunting & Threat Intel Analyst

Threat hunting programs remain out of reach for many security teams because they require specialized expertise and analyst time that is often in short supply. Dropzone's new AI Threat Intelligence Analyst and AI Threat Hunter agents aim to make proactive threat discovery more accessible. The Threat Intelligence Analyst monitors threat intelligence sources, extracts TTPs and IOCs, and builds hunt packs, while the Threat Hunter executes them across SIEM, EDR, cloud, and identity platforms. Beyond identifying threats, the agents can uncover misconfigurations, shadow IT, and exposed vulnerabilities, allowing analysts to focus on validation, decision-making, and response.

The Three Flavors

1. The agent you call inside a workflow

This is the classic. Probably the most used today, especially in SOAR and automation-style platforms. You have a workflow, and at some step you drop in an agent to do a piece of reasoning. Triage this alert. Summarize this. Decide if this IOC is bad.

It works. It is easy to reason about, because the agent lives inside a flow you already understand. The blast radius is small. If it does something dumb, it does it in one spot.

The downside is scale. You end up building a new agent per workflow. Phishing flow gets its own agent. EDR flow gets its own agent. Identity flow gets its own agent. Same logic, copied into every flow, drifting apart over time. And when you want to change how your agents behave, you are editing them one by one. Welcome back to maintenance hell, just with prompts instead of API calls.

2. The monolith agent

One big agent. You call it to do "the reasoning" inside a workflow, and it tries to handle whatever you throw at it.

Honestly I see more downsides than upsides here. The one real upside is ease of use. There is one thing to configure, so getting started is fast. It might also make sense if you are running small or local language models, where you want fewer moving parts to host and control.

But a monolith has no clear role. It is good at a bit of everything and great at nothing. It is hard to test, because the surface is huge. It is hard to guardrail, because it does too much. And when it gets something wrong, good luck figuring out why. You poke at a giant prompt and hope. For a SOC, where you need to trust and audit the decision, that is a bad trade.

3. The agent builder

This is the one I think wins. You build your own agents as reusable units. Each one gets:

  • A role. What is this agent for. A phishing triage agent. A host enrichment agent. A containment recommender.

  • Constraints. What it is not allowed to do. No isolating critical assets without approval. No closing alerts above a severity.

  • A knowledge base. Your SOPs, your asset context, your past decisions. The stuff that makes the agent yours and not a generic model guessing.

  • Skills. The actions and tools it can use to get the job done.

Then you reuse them. Build the enrichment agent once, call it from every workflow that needs enrichment. Fix it once, fixed everywhere. This is the same reason we moved from copy-pasted scripts to functions. It is not a new idea. It is just finally showing up in security tooling.

And these get a lot better when they are interactive. An agent you can talk to, that asks for input mid-investigation, that you can correct and steer, beats a fire-and-forget agent buried in a workflow. That is where the co-pilot and the autonomous agent start to merge into something useful.
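To make the reuse point concrete, here is a minimal Python sketch of an agent defined once and called from two workflows. Everything in it is an assumption for illustration: `AgentSpec`, `lookup_host`, and the two workflow functions are hypothetical names, not any vendor's API, and the skill is a stub where a real one would call an integration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentSpec:
    """Hypothetical reusable agent: role, constraints, knowledge base, skills."""
    role: str
    constraints: list[str]
    knowledge: dict[str, str]  # e.g. SOP snippets, asset context
    skills: dict[str, Callable[[str], str]] = field(default_factory=dict)

    def run(self, skill: str, target: str) -> str:
        # Only skills registered on this agent can be called.
        if skill not in self.skills:
            raise PermissionError(f"{self.role} has no skill {skill!r}")
        return self.skills[skill](target)


def lookup_host(hostname: str) -> str:
    # Stub standing in for a CMDB or EDR lookup.
    return f"{hostname}: owner=unknown"


# Built once.
enrichment_agent = AgentSpec(
    role="host enrichment",
    constraints=["read-only", "never isolate hosts"],
    knowledge={"sop": "Enrich every host before triage."},
    skills={"lookup_host": lookup_host},
)


# Called from every workflow that needs enrichment.
def phishing_workflow(host: str) -> str:
    return "phishing | " + enrichment_agent.run("lookup_host", host)


def edr_workflow(host: str) -> str:
    return "edr | " + enrichment_agent.run("lookup_host", host)
```

Swap the `lookup_host` skill on `enrichment_agent` and both workflows pick up the change. That is the whole argument for the builder in about forty lines.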

Now the Harness

Here is what I want you to take away. When you compare agent builders, you are not really comparing models. You are comparing harnesses.

The agent harness is everything wrapped around the model. Tool execution, memory, context management, state, guardrails, the loop that lets it act instead of just answer. The model sits in the middle and does the reasoning. The harness is the rest. And the rest is most of it.

People building production agents keep landing on the same conclusion. The model is the smallest part of the system. When an agent breaks in prod (it hallucinates a tool call, repeats an action it already did, or ignores an instruction it followed an hour ago), it is almost never the model that got dumber. It is the harness that was underbuilt. Mitchell Hashimoto, the Terraform guy, even named the discipline: harness engineering. Every agent mistake becomes a permanent fix to the environment, not a prompt you retry.
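A rough Python sketch of what "the rest" looks like. This is not how any particular platform implements it. `run_harness`, `scripted_model`, and the tool names are hypothetical, and the "model" is a scripted stub so the example runs without an LLM. The point is that the two failure modes above, a hallucinated tool call and a repeated action, are caught by the loop, not by the model.

```python
from typing import Callable

Action = dict  # {"tool": str, "arg": str} or {"final": str}


def run_harness(model: Callable[[list[str]], Action],
                tools: dict[str, Callable[[str], str]],
                max_steps: int = 5) -> tuple[str, list[str]]:
    """Ask the model for an action, validate it, execute it, feed the result back."""
    memory: list[str] = []             # context handed back to the model each turn
    done: set[tuple[str, str]] = set()  # actions already executed
    for _ in range(max_steps):
        action = model(memory)
        if "final" in action:
            return action["final"], memory
        tool, arg = action.get("tool"), action.get("arg", "")
        if tool not in tools:
            # Hallucinated tool call: reject it and tell the model.
            memory.append(f"error: unknown tool {tool!r}")
            continue
        if (tool, arg) in done:
            # Repeated action: skip instead of running it twice.
            memory.append(f"skipped: {tool}({arg}) already ran")
            continue
        done.add((tool, arg))
        memory.append(f"{tool}({arg}) -> {tools[tool](arg)}")
    return "escalate: step budget exhausted", memory


def scripted_model(memory: list[str]) -> Action:
    # Stub model: one scripted action per turn, including two mistakes.
    script = [
        {"tool": "detonate_url", "arg": "x"},          # tool that does not exist
        {"tool": "lookup_ip", "arg": "203.0.113.7"},
        {"tool": "lookup_ip", "arg": "203.0.113.7"},  # same action again
        {"final": "benign"},
    ]
    return script[len(memory)]


tool_calls: list[str] = []


def lookup_ip(ip: str) -> str:
    tool_calls.append(ip)  # record real executions
    return "no hits"


verdict, memory = run_harness(scripted_model, {"lookup_ip": lookup_ip})
```

The stub makes the same mistakes a real model would, and the tool still runs exactly once. Each check in the loop is the kind of permanent fix harness engineering is about.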

This reframes the whole thing. The agent you call in a workflow, the monolith, the agent builder. These are just different amounts of harness, and different amounts of control over it.

  • In the workflow agent, the harness is mostly the workflow itself. You wire context in by hand, step by step. Fine for one flow. Painful across many.

  • In the monolith, the harness is hidden inside one big config. You do not really shape it. You feed it and hope.

  • In a real agent builder, the role, constraints, knowledge base, skills, memory, and guardrails are the harness. You are configuring it directly. That is the point.

So an agent builder is not a fancy prompt box. A good one is a harness with a UI on top. That framing tells you what to actually look for.

What a Good Harness Gives You in SecOps

When you evaluate one of these platforms, look past the model name on the slide and ask about the harness:

  • Tools and integrations as skills. Can your agents reach your SIEM, EDR, IdP, TI, CMDB, and case management, and can you add your own? An agent with no skills is just a chatbot with opinions.

  • Memory and context. Does the agent remember prior alerts, prior analyst decisions, known-benign patterns? Or does every alert start from zero? SOC context is the whole game.

  • Guardrails as part of the build. Can you set hard limits per agent. Rate limits on anything that quarantines or isolates. Human approval for high-impact actions. This should be config, not vibes.

  • Observability. Can you see what the agent did and why. "Selected playbook 42 because 3 of 5 engines flagged the file." If the decision is a black box, you will never trust it, and you should not.

  • Evals. Can you test an agent against known cases before it touches prod. Shadow mode counts. Anything that lets you measure before you ship.

  • Model-agnostic. This one is underrated. The model is pluggable. A good harness lets you swap it. When a stronger model drops next quarter, and one will, you want to plug it in, not rebuild every agent you own. If your platform welds you to one model, you bought a harness with no spare parts.
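Several of these checks are easy to picture as code. The Python sketch below shows guardrails as config, a decision trace, a tiny eval, and a swappable model in one place. All names (`Guardrail`, `GuardedAgent`, `evaluate`, `model_a`, `model_b`) are hypothetical, the "models" are threshold functions standing in for an LLM, and the rate limit is a naive one-hour window. Treat it as a sketch of the questions to ask, not a reference design.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Guardrail:
    needs_approval: bool = False        # human approval for high-impact actions
    max_per_hour: Optional[int] = None  # rate limit, None means unlimited


@dataclass
class GuardedAgent:
    decide: Callable[[dict], str]       # pluggable model: alert -> action name
    guardrails: dict[str, Guardrail]
    trace: list[str] = field(default_factory=list)
    calls: dict[str, list[float]] = field(default_factory=dict)

    def handle(self, alert: dict, approved: bool = False,
               now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        action = self.decide(alert)
        rule = self.guardrails.get(action, Guardrail())
        recent = [t for t in self.calls.get(action, []) if now - t < 3600]
        if rule.needs_approval and not approved:
            outcome = "pending_approval"
        elif rule.max_per_hour is not None and len(recent) >= rule.max_per_hour:
            outcome = "rate_limited"
        else:
            self.calls[action] = recent + [now]
            outcome = "executed"
        # Observability: every decision leaves a readable record.
        self.trace.append(f"alert={alert['id']} action={action} outcome={outcome}")
        return outcome


def evaluate(decide: Callable[[dict], str], cases: list[tuple[dict, str]]) -> float:
    """Fraction of labelled cases where the model picks the expected action."""
    return sum(decide(alert) == expected for alert, expected in cases) / len(cases)


# Two stand-in "models" with different severity thresholds.
def model_a(alert: dict) -> str:
    return "isolate_host" if alert["severity"] >= 5 else "close_alert"


def model_b(alert: dict) -> str:
    return "isolate_host" if alert["severity"] >= 8 else "close_alert"


known_cases = [
    ({"id": 1, "severity": 9}, "isolate_host"),
    ({"id": 2, "severity": 6}, "close_alert"),
    ({"id": 3, "severity": 2}, "close_alert"),
]

# Pick the model by eval score; the guardrails stay the same either way.
best = max([model_a, model_b], key=lambda m: evaluate(m, known_cases))
agent = GuardedAgent(
    decide=best,
    guardrails={
        "isolate_host": Guardrail(needs_approval=True),
        "close_alert": Guardrail(max_per_hour=1),
    },
)
```

Swapping the model is one argument. The guardrails, the trace, and the eval cases do not move. If a platform cannot show you something equivalent to each of those pieces, that tells you how much harness you are actually getting.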

Bottom Line

The "agent" label tells you almost nothing. What tells you something is how you build them and how much of the harness you control.

Per-workflow agents are fine to start. Monoliths are easy and not much else. The agent builder, especially an interactive one, is where you get reuse, roles, constraints, and real control.

And the next time a vendor walks in leading with which model powers their agents, ask about the harness instead. That is the part you will actually live with.
