Evaluating AI agents for SOC
Technical considerations for Security Operations teams

Disclaimer: Opinions expressed are solely my own and do not reflect the views or opinions of my employer or any other affiliated entities. Any sponsored content featured on this blog is independent and does not imply endorsement by, nor relationship with, my employer or affiliated organisations.
Whenever a new technology pops up, the first thing we wonder is, "How does this work, and how do I choose the best one?" With the rising popularity of AI agent-based solutions in Security Operations Centers (SOCs), you're probably wondering how to evaluate these tools effectively. This edition of the cybersecurity automation blog provides a deep dive into evaluating AI agents specifically for security operations.
Remember, though, that implementing AI-based technology isn't just about introducing new tools. You need the right people, skills, and processes in place to truly benefit from these solutions. Effective AI deployment also requires verifying that your AI solution can handle interactions with stakeholders and process the diverse types of inputs (emails, calls, instant messages) your SOC regularly deals with.
Quick Refresher: What Are AI Agents?
An AI agent (or more technically, a group of cooperating, task-specific AI agents) is an intelligent system powered by artificial intelligence, particularly large language models (LLMs), designed to perform specific tasks autonomously or semi-autonomously. Within security operations, these agents significantly enhance the SOC team's efficiency by automating repetitive tasks, augmenting human decision-making, and ensuring consistent, rapid responses to threats. Currently, many of the Agentic AI solutions (also known as Agentic AI SOC Analysts) are focused on automating alert triage and investigation, specifically around Tier 1/Tier 2 investigations.
As previously discussed in our detailed exploration of SOC AI agents, we outlined four main categories tailored to cybersecurity operations:
Tool-Using Agents: Combining LLM reasoning capabilities with external tool integration (e.g., APIs, SIEMs, EDR platforms). They function as "smart SOC assistants" that handle data retrieval, enrichment, and automated actions (see the sketch after this list).
Reasoning Agents (ReAct, Chain-of-Thought): These agents explicitly outline their reasoning steps, enhancing transparency and trust in decision-making, critical for compliance-heavy environments.
Memory-Enhanced Agents: Equipped with memory capabilities, these agents learn from historical alerts, patterns, and analyst feedback, progressively refining their contextual awareness and reducing redundant analysis.
Agentic RAG (Retrieval-Augmented Generation + Autonomy): Advanced agents that autonomously retrieve and synthesise diverse data sources, perfect for complex investigations where multiple context points are essential.
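To make the tool-using and reasoning patterns above concrete, here is a minimal Python sketch of a ReAct-style triage loop. Every function name (llm_decide_next_step, query_siem, lookup_threat_intel) is a hypothetical placeholder used for illustration, not any vendor's actual API.

```python
# Minimal ReAct-style loop: the agent alternates between reasoning and tool calls.
# All function names below are hypothetical placeholders for illustration only.

def query_siem(alert_id: str) -> dict:
    """Placeholder: fetch raw alert context from your SIEM."""
    return {"alert_id": alert_id, "src_ip": "203.0.113.7", "user": "jdoe"}

def lookup_threat_intel(indicator: str) -> dict:
    """Placeholder: enrich an indicator via a threat intelligence platform."""
    return {"indicator": indicator, "reputation": "suspicious"}

TOOLS = {"query_siem": query_siem, "lookup_threat_intel": lookup_threat_intel}

def llm_decide_next_step(history: list[dict]) -> dict:
    """Stand-in for an LLM call; a real agent would prompt a model here."""
    if len(history) == 1:
        return {"thought": "Need alert context", "action": "query_siem",
                "arguments": {"alert_id": history[0]["observation"]["alert_id"]}}
    if len(history) == 2:
        return {"thought": "Check source IP reputation", "action": "lookup_threat_intel",
                "arguments": {"indicator": history[1]["observation"]["src_ip"]}}
    return {"action": "final_verdict", "verdict": "escalate",
            "reasoning": "Suspicious reputation on source IP; hand to Tier 2."}

def triage(alert_id: str, max_steps: int = 5) -> dict:
    history = [{"observation": {"alert_id": alert_id}}]
    for _ in range(max_steps):
        step = llm_decide_next_step(history)      # "thought" plus chosen action
        if step["action"] == "final_verdict":
            return step                           # e.g. {"verdict": "...", "reasoning": "..."}
        tool = TOOLS[step["action"]]
        observation = tool(**step["arguments"])   # execute the chosen tool call
        history.append({"thought": step["thought"], "observation": observation})
    return {"verdict": "escalate", "reasoning": "step budget exhausted"}

print(triage("ALRT-1042"))
```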
This edition is sponsored by Prophet Security
Key Factors to Consider When Evaluating AI Agent Solutions
Evaluating AI agents involves considering multiple critical aspects tailored to your unique operational needs. Here are expanded core considerations:
Clearly Define Your SOC Use Cases
Identify the specific responsibilities within your SOC that you want AI agents to manage. At a high level, use cases can be organised by the sources of alerts that require triage and investigation, such as cloud, endpoint, email phishing, identity, network, DLP, and more. Within each of these, define more granular use cases. These may include impossible travel (Identity), malware analysis, stolen credentials (Email Phishing), anomalous K8s infrastructure change (Cloud), risky file downloads/uploads (DLP), and the list goes on. In addition to these, consider advanced, cross-domain use cases such as threat hunting, forensic investigations and incident handling, each of which may span multiple alert sources and require broader contextual reasoning. Clearly defining your use cases ensures a more objective evaluation process and increases the likelihood that your chosen Agentic AI solution delivers maximum value and targeted capabilities.
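One lightweight way to keep that evaluation objective is to write the use-case inventory down and score each candidate solution against it. The sketch below is purely illustrative: the alert sources, use cases, and 0/1/2 scores are assumptions you would replace with your own.

```python
# Hypothetical use-case inventory used to score vendor coverage
# (0 = no coverage, 1 = partial, 2 = full), captured during a proof of value.
USE_CASES = {
    "identity":       ["impossible travel"],
    "email_phishing": ["stolen credentials"],
    "endpoint":       ["malware analysis"],
    "cloud":          ["anomalous K8s infrastructure change"],
    "dlp":            ["risky file downloads/uploads"],
}

vendor_scores = {  # example scores for one candidate solution
    ("identity", "impossible travel"): 2,
    ("email_phishing", "stolen credentials"): 1,
    ("endpoint", "malware analysis"): 2,
    ("cloud", "anomalous K8s infrastructure change"): 0,
    ("dlp", "risky file downloads/uploads"): 1,
}

total = sum(vendor_scores.get((src, uc), 0) for src, ucs in USE_CASES.items() for uc in ucs)
maximum = 2 * sum(len(ucs) for ucs in USE_CASES.values())
print(f"Coverage: {total}/{maximum} ({100 * total / maximum:.0f}%)")
```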
Integration and Data Handling
Evaluate how seamlessly the AI agent integrates with your existing security infrastructure, including SIEM, EDR, cloud service providers and cloud security tools, email providers, identity providers (IdP), threat intelligence platforms, data lakes and data storage services, and incident management tools. Additionally, assess the agent’s capability to effectively handle diverse data types, both structured (logs, databases) and unstructured (reports, threat intel documents).
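As a concrete illustration, an integration check during a proof of value might verify that the agent (or your glue code) can pull a structured alert from the SIEM and an unstructured intel report for enrichment. The endpoints and field names below are hypothetical; substitute your real SIEM and threat intelligence APIs.

```python
import requests  # assumes the vendors expose REST APIs; adjust auth to your environment

SIEM_URL = "https://siem.example.internal/api/v1"  # hypothetical endpoint
TI_URL = "https://ti.example.internal/api/v1"      # hypothetical endpoint

def fetch_alert(alert_id: str, token: str) -> dict:
    """Structured input: pull a single alert record from the SIEM."""
    resp = requests.get(f"{SIEM_URL}/alerts/{alert_id}",
                        headers={"Authorization": f"Bearer {token}"}, timeout=30)
    resp.raise_for_status()
    return resp.json()

def fetch_intel_report(indicator: str, token: str) -> str:
    """Unstructured input: pull a free-text intel report the agent must parse."""
    resp = requests.get(f"{TI_URL}/reports", params={"indicator": indicator},
                        headers={"Authorization": f"Bearer {token}"}, timeout=30)
    resp.raise_for_status()
    return resp.text

# A useful evaluation question: given both inputs, does the agent's verdict
# actually cite fields from the alert *and* facts from the free-text report?
```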
Depth and Accuracy
Depth and accuracy are paramount to the efficacy of AI Agents for security operations use cases. After all, investigations that misclassify true positives as benign or flag benign activity as threats render the AI Agents useless. High accuracy builds confidence in the AI Agents, reduces the number of false positives analysts must review, and lowers the risk that actual threats are missed by the AI system.
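One way to make this measurable during a proof of value is to label a sample of alerts with analyst ground truth and compare the agent's verdicts against it. A minimal sketch, assuming verdicts are reduced to "malicious" versus "benign":

```python
# Compare agent verdicts against analyst ground truth for a labelled alert sample.

def triage_metrics(ground_truth: list[str], agent_verdicts: list[str]) -> dict:
    pairs = list(zip(ground_truth, agent_verdicts))
    tp = sum(1 for g, a in pairs if g == "malicious" and a == "malicious")
    fp = sum(1 for g, a in pairs if g == "benign" and a == "malicious")
    fn = sum(1 for g, a in pairs if g == "malicious" and a == "benign")
    tn = sum(1 for g, a in pairs if g == "benign" and a == "benign")
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0        # missed threats hurt recall
    fp_rate = fp / (fp + tn) if fp + tn else 0.0       # wasted analyst reviews
    return {"precision": precision, "recall": recall, "false_positive_rate": fp_rate}

# Example: six labelled alerts from a pilot
print(triage_metrics(
    ["malicious", "benign", "benign", "malicious", "benign", "benign"],
    ["malicious", "benign", "malicious", "benign", "benign", "benign"],
))
```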
Configurability and Customisation
Determine the level of ease and flexibility offered by the AI solution regarding customisation and configuration. Consider whether your SOC team can easily adjust or extend AI agents to meet unique operational demands or evolving threat landscapes. An ideal solution should empower your team with autonomy for managing and fine-tuning the system without heavy reliance on vendor intervention.
Security and Compliance
Review the security architecture underpinning the AI agent solutions. Ensure robust mechanisms are in place for data protection, secure communications, and agent authentication. Verify compliance with essential security standards and regulations such as GDPR, SOC 2, ISO 27001, and others relevant to your operational jurisdiction or industry. Confirm that AI agents do not use customer data for model training, as this can introduce serious data leakage risks.
Transparency and Auditability
Assess the transparency of the AI agent’s decision-making process. Reliable AI agents should provide clear, detailed reasoning pathways that analysts and auditors can easily understand and verify. Transparency is vital for building trust in automated processes and essential for compliance and regulatory requirements.
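In practice, auditability usually means every investigation step is captured as a structured record that can be stored and replayed later. A minimal sketch of such a reasoning-trace entry, with hypothetical field names:

```python
import json
from datetime import datetime, timezone

def audit_record(alert_id: str, step: int, thought: str, action: str, result_summary: str) -> str:
    """Serialise one reasoning/action step so analysts and auditors can replay the investigation."""
    return json.dumps({
        "alert_id": alert_id,
        "step": step,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "thought": thought,              # why the agent took this step
        "action": action,                # which tool or query it executed
        "result_summary": result_summary,
    })

print(audit_record("ALRT-1042", 1,
                   "Sign-in from two distant countries within an hour; check IdP logs",
                   "query_idp_signin_logs",
                   "Second sign-in came from a known corporate VPN egress IP"))
```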

Comparing AI Offerings: From Copilots to AI Agents and Hybrid Models
In one of our previous blog posts, we explored the spectrum of AI solutions available for SOC operations, including Cyber Copilots, pure AI Agents, and Hybrid models:
Copilot Capabilities
Copilots are like your personal AI-powered assistant, responding interactively to prompts (a minimal prompting sketch follows after this list):
Alert Investigation: Quickly summarize alerts, explaining what triggered them and why they're important.
Enrichment/Context: Gather threat intel and provide context like associated domains and IP details.
Blast Radius Insights: Assess the impact of incidents.
Threat Hunting Starters: Assist in crafting queries for deeper investigations.
Understanding Attack Patterns: Align incidents to known frameworks like MITRE ATT&CK.
Automation Skeletons: Draft mini-playbooks or scripts.
Security Q&A and Compliance Tips: Answer quick questions about security controls or compliance.
Vulnerability Management: Highlight CVEs or exploits.
Alert Summaries: Condense overwhelming alert volumes into readable summaries.
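To illustrate the interactive, prompt-driven nature of a copilot, here is a minimal alert-summarisation sketch. The call_llm function is a placeholder for whichever model API your copilot uses, not a specific product's interface.

```python
def call_llm(prompt: str) -> str:
    """Placeholder for the copilot's underlying model API."""
    return "Summary: suspicious OAuth consent grant by jdoe; likely phishing follow-on."

def summarise_alert(alert: dict) -> str:
    prompt = (
        "You are a SOC copilot. Summarise the alert below in three sentences, "
        "state what triggered it, and map it to a likely MITRE ATT&CK technique.\n\n"
        f"Alert: {alert}"
    )
    return call_llm(prompt)

print(summarise_alert({"rule": "Unusual OAuth consent", "user": "jdoe", "app": "unknown-mail-sync"}))
```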
🌟 Pros | ⚠️ Cons
Pure AI Agents
These solutions operate independently, running dynamic playbooks, automating investigations, and even executing responses (a minimal pipeline sketch follows after this list):
No Prompt Needed: Automatically investigates alerts.
Pre-Populated Insights: Instantly provides comprehensive incident details.
Rapid False Positive/True Positive Determination: Efficiently separates important alerts from noise, reducing or eliminating alert backlogs.
Automated Response Actions: Can isolate compromised endpoints or disable compromised accounts.
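As a rough illustration of "no prompt needed", the sketch below shows an event-driven pipeline where every new alert is investigated and, if the verdict is confident enough, a containment action fires automatically. The function names and the confidence threshold are assumptions for illustration only.

```python
# Hypothetical autonomous pipeline: investigate every alert, then act without a human prompt.

def investigate(alert: dict) -> dict:
    """Placeholder for the agent's autonomous investigation (see the ReAct sketch earlier)."""
    return {"verdict": "malicious", "confidence": 0.94, "entity": alert["host"]}

def isolate_endpoint(host: str) -> None:
    print(f"[EDR] isolating {host}")  # placeholder for an EDR containment API call

def handle_new_alert(alert: dict, auto_response_threshold: float = 0.9) -> str:
    finding = investigate(alert)
    if finding["verdict"] == "benign":
        return "auto-closed"
    if finding["confidence"] >= auto_response_threshold:
        isolate_endpoint(finding["entity"])
        return "auto-contained"
    return "escalated to analyst"

print(handle_new_alert({"id": "ALRT-2001", "host": "laptop-17"}))
```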
🌟 Pros | ⚠️ Cons
Hybrid Models
Hybrid models combine the strengths of both Copilots and Autonomous AI Agents. They offer automation but keep humans in the loop for complex cases (a minimal approval-gating sketch follows after this list):
Balanced Autonomy: Handles routine tasks automatically but allows human oversight when things get tricky.
Flexible Exploration: Analysts can freely pivot from automated insights to interactive exploration as needed.
Customizable Playbooks: Can be tailored to your SOC's unique needs, balancing consistency with flexibility.
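A minimal sketch of the hybrid pattern: the agent proposes a response, but a human approves risky actions before anything executes. The function names and risk labels are assumptions for illustration only.

```python
# Hybrid pattern: the agent proposes, a human approves risky actions before execution.

def propose_action(finding: dict) -> dict:
    """Placeholder: the agent drafts a response plan instead of executing it directly."""
    return {"action": "disable_account", "target": finding["user"], "risk": "high"}

def request_approval(proposal: dict) -> bool:
    """Placeholder: in practice this would page an analyst via chat or the case console."""
    answer = input(f"Approve {proposal['action']} on {proposal['target']}? [y/N] ")
    return answer.strip().lower() == "y"

def respond(finding: dict) -> str:
    proposal = propose_action(finding)
    if proposal["risk"] == "high" and not request_approval(proposal):
        return "held for analyst review"
    return f"executed {proposal['action']} on {proposal['target']}"

print(respond({"user": "jdoe", "verdict": "malicious"}))
```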
🌟 Pros | ⚠️ Cons
Closing Thoughts
Remember, implementing AI-based technology isn't just about bringing in fancy new tools. You need the right people, skills, and processes to maximize their value. AI and LLM-based systems heavily depend on quality data and effective interactions with stakeholders across various teams, so always confirm that the necessary data and interactions can be effectively handled by your AI solution.
Additionally, SOC tasks don't always come neatly packaged from detection systems—many requests arrive via emails, phone calls, or instant messages. Make sure your chosen AI tools can handle these diverse, ad-hoc inputs to ensure smooth and efficient operations.
By keeping these considerations in mind, you’ll not only choose the right technology but also set your SOC up for long-term success.
Vendor Spotlight: Prophet Security
Prophet Security combines the best of both worlds. Prophet AI delivers the automation and speed of an Agentic AI SOC Analyst, capable of autonomously triaging and investigating every alert with full context without prompts or playbooks. But it doesn’t stop there.
With built-in AI chat capabilities, Prophet AI also enables human-in-the-loop threat hunting, interactive investigations, and ad hoc questioning, giving analysts the freedom to dig deeper, pivot quickly, and explore emerging threats without waiting on custom rules or workflows.
Whether you’re managing a high-volume alert queue or navigating a complex, cross-domain incident, Prophet Security empowers your team with a seamless blend of autonomy and interactivity, scaling your SOC’s capacity without sacrificing investigative depth.
Visit prophetsecurity.ai to request a demo and see Prophet AI in action in your environment.
🏷️ Blog Sponsorship: Want to sponsor a future edition of the Cybersecurity Automation Blog? Reach out to start the conversation. 🤝
🗓️ Request a Services Call: If you want to get on a call and have a discussion about security automation, you can book some time here.
Join as a top supporter of our blog to get special access to the latest content and help keep our community going.
As an added benefit, each Ultimate Supporter will receive a link to the editable versions of the visuals used in our blog posts. This exclusive access allows you to customize and utilize these resources for your own projects and presentations.