AI Detection & Response

Overview

As enterprises rapidly adopt GenAI to boost productivity and automate decision-making, security teams face unprecedented challenges. Traditional security tools aren't designed to handle the unpredictable nature of AI agents, which can introduce risks such as prompt injections, data leakage, model manipulation, and misleading outputs.

AI Detection & Response (AIDR) is Zenity's comprehensive runtime security layer offering:

Real-time threat detection across all AI agent interactions
Automated response capabilities to emerging risks
Proactive prevention with continuous monitoring
Full-spectrum visibility from governance to attack mitigation

Core Components

The AIDR solution consists of two integrated components:

1. Runtime Visibility - Complete observability into AI agent behavior with granular step-by-step tracking of all interactions—transforming the AI black box into transparent, auditable activity.

2. Threat Detection & Response - Continuous security analysis powered by an advanced detection engine mapped to industry-standard OWASP LLM and MITRE ATLAS frameworks.

Category	Section	Description
Core Capabilities	Runtime Visibility (Activity)	Complete observability into AI agent interactions
	→ Understanding Steps	Breakdown of AI agent interactions into atomic units
	→ Logs & Transcript Data	Privacy-preserving data collection and retention
	→ A2A Communication Visibility	Agent-to-agent interaction tracking and transparency
	Threat Detection (Findings)	Continuous security analysis and risk identification
	→ Detection Engine	Rule-based threat detection mapped to OWASP & MITRE
	→ Severity Levels	Risk prioritization and classification
	→ Individual Finding Details	Comprehensive finding investigation and context
	Detection Capabilities	Advanced scanning and analysis features
	→ LLM-Based Runtime Detections	Semantic and contextual threat detection
	→ File Attachment Scanning	Real-time file content security analysis
	→ AI Agent Runtime Governance	Policy-driven organizational compliance enforcement
Integration & Automation	Automate Response via AIDR API	Programmatic access to AIDR data for SIEM/SOAR integration
	How It Works	Architecture and platform support
	→ Solution Architecture	Agentless, cloud-native design
	→ Platform Coverage	Supported AI services and platforms
Setup & Configuration	Getting Started with AIDR	Step-by-step setup guide
	→ Microsoft 365 Copilot Setup	Required permissions and prerequisites
	→ Expanding Existing Integrations	Add AIDR to existing Zenity integrations
	→ Creating a New Integration	Set up a new integration from scratch
Advanced Features	Securing Homegrown Agents	Runtime guardrails for custom-built AI applications

Runtime Visibility (Activity)

Zenity’s Runtime Visibility provides near real-time observability into every AI agent interaction. By breaking down complex workflows into granular steps—both user-facing and behind-the-scenes—Zenity transforms what was once a black box into complete transparency.

Key Activity Attributes

Every interaction captured by AIDR includes the following metadata:

Attribute	Description
Timestamp	Exact time the action occurred
Actor Name	User who initiated the interaction or triggered the agent
Agent	AI agent associated with the action
Type	Specific step category (AI Message, RAG, Tool Invocation, etc.)
Client Application	Platform from which the actor interacted with the agent

Understanding Steps

Agentic AI flows are broken down into atomic units called Steps. These capture both visible user interactions and internal agent operations, providing complete context for investigation and analysis.

Step Types

Zenity tracks all AI agent activity through distinct step categories:

Step Type	Description
AI Message Step	The agent’s response to user or system input
RAG Step	Retrieval of external data to ground the agent’s response
Tool Invocation Step	Execution of a function or API call by the agent
Trigger Step	Initiation of an agent flow based on conditions or events
User Message Step	Input sent by the end user
Agent Handoff Request	Captures when Agent A requests assistance or data from Agent B
Agent Handoff Response	Captures the specific data or payload returned to the requesting agent

Step Metadata

Each step contains rich contextual information:

AI Service

The supported AI service powering the agent (e.g., Microsoft Copilot, ChatGPT Enterprise, Google Vertex AI).

Client Application

The application platform from which the actor interacted with the agent.

Agent

The AI agent associated with the step. This is clickable and links directly to Zenity’s AISPM Inventory for deeper asset context.

Logs & Transcript Data

🔒 Privacy by Design: Zenity is built with privacy at its core. While metadata is persisted for investigation, sensitive content is processed in-memory only and never stored.

Each step contains both:

Metadata: Timestamps, actors, service information, and step details (persisted)
Sensitive Content: Message text, tool parameters, file snippets (processed in-memory, never stored)

When investigation requires access to sensitive content, users can fetch this data on-demand via the source’s API directly within the Zenity UI.

Data Collection & Retention

Collection Speed: Near real-time (within minutes of occurrence)
Retention Period: Three months of runtime activity metadata

A2A Communication Visibility {#a2a-communication-visibility}

💡 Agent-to-Agent (A2A)

As AI systems evolve beyond simple chatbots into complex multi-agent workflows, understanding how agents collaborate to complete tasks becomes essential for security and operational oversight. Zenity now provides complete transparency into agent-to-agent interactions, making these previously hidden communications fully visible and auditable. With this visibility, security teams can trace the complete flow of information across multi-agent workflows and identify potential security implications in collaborative AI systems.

When agents communicate with one another, Zenity automatically identifies and tracks these interactions as dedicated Agent Handoff steps in the Activity Page. This includes both the initial request from one agent to another and the response containing the requested data or assistance.

When you select an Agent Handoff step in the UI, the side panel provides granular context for the interaction. It includes an A2A Communication section, along with common step fields and enrichments available for other steps.

Key A2A Fields:

Field	Description
Requesting Agent	Identifies the AI agent initiating the handoff request
Responding Agent	Identifies the AI agent providing the data or assistance
Related Step	Allows you to cross-reference the request and response for a specific handoff

Threat Detection (Findings)

Zenity’s advanced detection engine continuously analyzes AI agent activity to surface risks, anomalies, and suspicious behavior before they become incidents. Every finding is enriched with context and mapped to industry-standard security frameworks.

Detection Coverage Includes:

Data exposure and leakage
Prompt injection attempts
Unusual agent behavior patterns
Malicious file uploads
Policy and compliance violations

Detection Engine

At the core of AIDR is a powerful rule engine designed to surface AI runtime risks across multiple threat categories:

Prompt misuse and injection attempts
Sensitive data exposure (PII, credentials, secrets)
Unusual agent behavior patterns
Malicious inputs and obfuscation techniques
Policy and compliance violations

🔄 Continuously Evolving: Zenity's research team actively expands and tunes detection logic to stay ahead of emerging threats, ensuring broad and adaptive coverage.

Framework Mapping

All detection rules are fully mapped to industry-standard security frameworks:

MITRE ATLAS - Adversarial threat landscape for AI systems
OWASP LLM - Top security risks for LLM applications

To explore the complete ruleset, visit the Policy page in the Zenity platform and filter by the “AIDR” tag.

Severity Levels

Detection severity is calculated based on potential impact and confidence level, resulting in three priority tiers:

Severity	Description
High	Critical threats requiring immediate investigation and response
Medium	Significant risks warranting investigation and remediation
Low	Anomalies and potential concerns for awareness and monitoring

Note: Not every anomaly indicates a confirmed threat. Findings serve as indicators of suspicious behavior worth tracking and investigating, even when not yet conclusive.

Individual Finding Details

Click any finding to access comprehensive context and actionable intelligence for investigation and response.

Each finding includes:

Evidence

Exact reason for detection
Supporting data and context
Timestamp and sequence information

Context

Actor Name: User who triggered the interaction
Client Application: Platform used for the interaction
Agent: AI agent involved (linked to AISPM Inventory)

Framework Mapping

OWASP LLM category
MITRE ATLAS technique
GenAI Matrix alignment

Guidance

Investigation tips and next steps
Response recommendations
Links to related findings and thread activity

Detection Capabilities

LLM-Based Runtime Detections

AIDR extends its detection engine with LLM-based runtime detections to identify threats that require deep semantic and contextual understanding. These detections operate alongside existing mechanisms such as pattern matching, structural validation, and threshold-based conditions to identify known risk signals with high precision and predictable behavior. LLM-based detections and are designed to uncover previously unseen attack variants, nuanced misuse, and multi-step behaviors that cannot be reliably detected using static patterns alone.

LLM-based detections are applied selectively in scenarios such as determining malicious intent, detecting paraphrased or obfuscated attacks, correlating behavior across multiple steps or tool invocations, and identifying inconsistencies between user requests and agent actions. To enable deeper analysis without impacting user-facing latency, these detections run asynchronously.

Detection coverage includes:

Malicious Input such as instruction injection, jailbreaks, tool abuse, and disguised manipulation techniques
Reconnaissance attempts targeting sensitive data, agent capabilities, tools, or system instructions
Data Exfiltration via email, messaging, webhooks, external storage, public links, attachments, or encoded payloads
Destructive Actions including deletions, permission changes, and other mutating operations
Sensitive Resource Access involving non-destructive reads of PII, credentials, financial, HR, legal, customer, or IP data
Obfuscated Text using encoding or transformation techniques while excluding legitimate technical artifacts
Intent Breaking, where agent behavior deviates from the user’s request

LLM-based detections provide semantic, intent-aware analysis with probabilistic outcomes and natural-language explainability. They complement existing detection mechanisms by adding depth and adaptability while preserving Zenity’s privacy-by-design approach through in-memory processing of sensitive content.

LLM-based detections significantly improve signal quality compared to pattern-based logic. While deterministic rules are highly effective for known and well-structured indicators, they often generate false positives when context is ambiguous or language is used in a legitimate manner. By analyzing intent and semantic meaning rather than isolated keywords or patterns, LLM-based detections reduce alert noise and improve precision, especially in complex, paraphrased, or multi-step scenarios. This results in more actionable findings for security teams, minimizing investigation overhead while increasing coverage of sophisticated and previously unseen threats.

File Attachment Scanning

As document uploads become a primary interaction method with AI agents, file attachments represent a critical security blind spot. AIDR provides comprehensive near real-time scanning to identify security and compliance risks hidden within uploaded files.

⚠️ Prerequisites: This feature requires an OpenAI key with Compliance API permissions.

Supported Platforms & Formats

Primary Integration: ChatGPT Enterprise

Format Category	Supported Extensions
Text-Based Files	`.txt`, `.log`, `.csv`, `.md`, `.rtf`
Binary & Encoded Files	`.pdf`, `.docx`, `.xlsx`, `.ppt`

Security Scanning Scope

The detection engine analyzes file contents across three critical risk categories:

Risk Category	Detection Focus
PII Detection	Identifies sensitive personal data (SSN, Aadhaar, France INSEE, Taiwanese ID, UK National Insurance, Indian PAN, Italy Fiscale, Mexico CURP)
Financial Detection	Identifies sensitive financial data (Credit Cards, Iban, and PINs)
Malicious Input	Scans for prompt injection and malicious instructions embedded in files
Obfuscated Text	Detects encoding or text manipulation attempts to bypass security controls

Investigating File-Based Findings

When risks are detected in file attachments, findings include specialized metadata for forensic analysis:

Enhanced File Evidence

Field	Description
Finding Label	Marked as “User file attachment” to distinguish from chat messages
Evidence Location	Shows File Upload Step > Attachment path
Core Evidence	Highlights specific lines or sections where risk was detected
File Access	Direct download capability for offline forensic analysis

Security teams can download suspicious files directly from the finding drawer for deeper investigation and analysis.

AI Agent Runtime Governance

Alongside threat detection, AIDR supports AI governance use cases by enforcing organizational standards for how AI agents access data and interact with external systems at runtime. These detections are driven by customer-defined policy configuration, allowing security teams to translate internal AI usage rules into enforceable controls.

Runtime governance detections focus on organizational policy violations rather than malicious intent, helping reduce risk, prevent accidental data exposure, and ensure consistent AI behavior across environments.

Key governance-driven detections include:

Sensitive File Access
Detects when an AI agent accesses files classified as sensitive based on Microsoft sensitivity labels or defined SharePoint and OneDrive sensitive locations. This enables data-layer governance scenarios such as monitoring access to executive OneDrive folders or specific sites containing regulated or high-impact data.
To enable this detection, use the Policy Configuration tab to define sensitive data using Sensitive Labels and Sensitive Locations.
Disallowed Recipient Domains
Detects when an AI agent sends information via tools to recipient domains that are not permitted by organizational policy. This helps reduce unintended data egress and limit information leaving the tenant.
To enable this detection, use the Policy Configuration tab to define trusted domains. Subdomains are supported using wildcards (for example, *.main.com). Any domain not explicitly listed will trigger a detection.

These detections provide deterministic and explainable outcomes aligned with enterprise governance requirements, enabling consistent enforcement of AI usage standards at runtime without relying on probabilistic intent analysis.

Automate Response via AIDR API

Scale your security operations with programmatic access to AIDR data. The Zenity API enables automated risk processing, custom alerting, and seamless integration with existing security workflows—SIEM, SOAR, ticketing systems, and more.

Key API Endpoints

Access AIDR data through the Detection API section:

Endpoint	Purpose
List Findings	Retrieve detection findings from specific or all integrations
List Agent Steps	Retrieve agent steps from specific or all integrations

Querying Findings

Retrieve detected runtime risks with flexible filtering options:


# Example: Get findings for M365 Copilot since specific timestamp
GET /v1/detection/findings?aiService=m365Copilot&sinceTimestamp=2024-01-01T00:00:00Z

Key API Parameters

Parameter	Format	Purpose
`aiService`	`copilotStudio` / `m365Copilot` / `chatgpt`	Filter by AI service
`sinceTimestamp`	`yyyy-MM-dd'T'HH:mm:ssZ`	Get incremental changes from timestamp
`untilTimestamp`	`yyyy-MM-dd'T'HH:mm:ssZ`	Get incremental changes until timestamp
`ruleId`	string	Filter findings by specific risk or category

Cross-Referencing with AISPM

To correlate runtime findings with Zenity AISPM inventory data:

Use the toolplatforminfo.resourceid field from the listFindings endpoint
Cross-reference with the List Resources endpoint
Gain complete asset context including ownership, permissions, and configuration

How It Works

Solution Architecture

AIDR is built on a modern, cloud-native architecture designed for enterprise scale:

Key Properties:

Agentless by Design: No installation or registration required on end-user devices
Device-Agnostic: Full coverage across desktop, mobile, and web interactions
Near Real-Time Visibility: AI agent activity streamed as it's logged for immediate detection and response
Privacy-Preserving: Sensitive content processed in-memory only, never persisted

Platform Coverage

AIDR currently supports the following AI services:

Microsoft 365 Copilot
Microsoft Copilot Studio
ChatGPT Enterprise + Custom GPTs
Microsoft Azure AI Foundry
Google Vertex AI

Getting Started with AIDR

📋 Activation Required: AIDR is not enabled by default. Contact the Zenity team to activate this solution in your environment.

Microsoft 365 Copilot Setup

To enable AIDR for M365 Copilot, Zenity requires specific Microsoft Graph and Office 365 Management API permissions.

Required Permissions

Permission	Purpose	Scope
AiEnterpriseInteraction.Read.All	Retrieve Copilot interaction transcripts	Microsoft Graph
ActivityFeed.Read	Digest M365 Copilot audit logs	Office 365 Management APIs
InformationProtectionPolicy.Read.All	Retrieve MIP label data for file correlation	Microsoft Graph

🔒 Privacy Guarantee: Zenity processes transcript data in-memory only for security analysis. Sensitive content is never persisted.

Once permissions are granted, Zenity automatically starts ingesting data in real-time and analyzing it for runtime findings.

Expanding Existing Integrations

Already have a Zenity integration with Microsoft? Follow these steps to enable AIDR capabilities:

Option 1: Expand via Managed Application

Enhance your existing Zenity integration by re-consenting to the updated permission set.

Step-by-Step Instructions:

Navigate to Azure Portal > Enterprise Applications
Select the Zenity application used for your existing integration
Expand Security in the left navigation menu
Click Permissions
Click Grant admin consent for [your tenant]

Once consent is granted, Zenity automatically begins ingesting data and analyzing it for runtime findings in near real-time.

Option 2: Expand via Service Principal

For organizations using service principal-based integrations, add the required permissions directly to your Azure AD application.

Step-by-Step Instructions:

Open your Azure AD Application page
Navigate to API Permissions
Click Add a permission
Add the following permissions:

Office 365 Management APIs (Application permissions)

ActivityFeed.Read

Microsoft Graph (Application permissions)

AiEnterpriseInteraction.Read.All
InformationProtectionPolicy.Read.All

Click Grant admin consent for [your tenant] to activate the permissions

Creating a New Integration

If you don’t have an existing Zenity integration with Microsoft, create a new one using either method:

Method	Best For	Setup Guide
Managed Application	Most organizations seeking streamlined setup	Configuration Guide
Service Principal	Organizations requiring granular permission control	Configuration Guide

✅ Recommended: The Managed Application approach provides easier permission management and faster deployment for most organizations.

Securing Homegrown Agents

Expanding Beyond Defined Platforms

The Shift to Agentic Workflows: Organizations are evolving from simple chatbots to "Agentic Workflows," which are autonomous systems that execute tasks. While Zenity already covers established platforms like M365 Copilot and ChatGPT Enterprise, there is a massive growth in "Homegrown Agents" built on custom infrastructure that lack specialized security.

The Problem: Current security tools focus on final LLM outputs, missing the internal risks within micro-interactions such as RAG fetches and tool calls.

The Zenity Mission: Extend enterprise-grade security to custom-built agents with the same depth as platform-native solutions.

The Solution: Zenity Evaluation Engine for homegrown agents

Zenity provides a specialized, cloud-agnostic security decision engine that acts as a runtime guardrail for homegrown agents.

How it Works: Agents micro-interaction (prompts, tool calls, and RAG retrievals) are sent to the engine.

The Decision: The engine evaluates the interaction for security and logic, returning a clear Allow or Deny decision with full explainability.

Deployment and Integration

Cloud-Agnostic Design

The engine is deployable across AWS, GCP, and Azure, ensuring security coverage regardless of your cloud infrastructure.

Multi-Layered Defense

Zenity integrates with native services such as AWS Bedrock Guardrails, Google Model Armor, and Azure PromptShield to provide a comprehensive security strategy.

High-Performance Architecture

Scanners are optimized for high-speed operation and low latency to ensure minimal impact on the end-user experience.

Developer-First API

The solution is fully accessible via a modern REST API, supporting complex agentic workflows and automated risk processing.

Decision Explainability

The system returns clear "Allow" or "Deny" decisions; "Deny" responses include full explainability, reason codes, and evidence of the detected threat.

Detections Supported for Homegrown Agents

Zenity's homegrown agent security engine provides comprehensive protection across multiple threat categories, ensuring safe and compliant AI operations.

Category	Capability	Description
Advanced Attack Prevention	Prompt Injection Defense	Automatically identifies and blocks attempts to manipulate the AI into bypassing safety filters
Advanced Attack Prevention	Jailbreak Prevention	Stops attempts to trick the AI into performing unauthorized actions or ignoring its system instructions
Data Loss Prevention (DLP) & Privacy	Sensitive Data Blocking	Identifies and blocks the leakage of Personally Identifiable Information (PII) such as Social Security numbers and email addresses
	Financial and Secret Protection	Monitors for the exposure of financial data, including credit card numbers and IBANs, as well as technical secrets like API keys or passwords
	Regulatory Compliance	Ensures AI usage remains compliant with data privacy standards by preventing sensitive information from being sent to or returned by the model
Safety & Content Governance	Risk Filtering	Provides real-time detection of toxicity, hate speech, and offensive content
	Topic Control	Ensures the AI stays focused on business-relevant tasks by blocking off-topic or restricted subjects
	Threat Detection	Identifies malicious links, hidden text, and risky image rendering within AI responses
Context-Aware Protection	Multi-Turn Defense	Tracks the entire conversation thread to stop sophisticated attacks occurring over multiple steps rather than single messages
Context-Aware Protection	User Behavior Tracking	Identifies and blocks persistent bad actors by monitoring suspicious activity patterns across different sessions
Secure AI Agent & Tool Governance	Tool Misuse Prevention	Monitors and controls how AI agents interact with external tools, plugins, and Model Context Protocols (MCPs)
Secure AI Agent & Tool Governance	Data Exfiltration Defense	Stops compromised agents from transmitting sensitive internal data to unauthorized external domains through integrated tools