Observability for the AI Era: Introducing Kloudfuse 3.5
Monitor AI applications alongside traditional infrastructure with natural language queries, FIPS validation, and platform engineering controls
Published on
Dec 3, 2025
Kloudfuse 3.5: AI-Native Observability with Federal Security Compliance
TL;DR: Kloudfuse 3.5 introduces the industry's first Model Context Protocol (MCP) server integration for observability, FIPS 140-2/3 validation for regulated enterprises, and comprehensive platform engineering controls. Query observability data in natural language, monitor AI applications natively, and gain unprecedented control over costs and compliance, all in one unified platform.
Why We Built Kloudfuse 3.5
Since launching Kloudfuse 3.0 in November 2024, we've been on a mission to solve a fundamental problem in enterprise observability: fragmentation. As organizations deploy AI applications alongside traditional infrastructure, they face a dual challenge: observability tools built for the cloud-native era don't understand AI workloads, while AI-focused monitoring creates operational silos.
Over the past year, we've shipped over 50 major capabilities across five key areas: AI and intelligent observability, enterprise security and compliance, platform engineering controls, query and analytics power, and OpenTelemetry-native architecture. Today, we're proud to announce Kloudfuse 3.5, the first unified observability platform to natively integrate traditional and AI monitoring while achieving FIPS 140-2/3 validation.
"Observability is at an inflection point," said Pankaj Thakkar, CEO and Co-Founder of Kloudfuse. "As organizations embrace AI and intelligent applications, the old model of fragmented tools, runaway costs, and vendor-controlled data no longer works. We built Kloudfuse 3.5 to fundamentally rethink what observability should be: unified, AI-ready, and built on a foundation of data freedom."
What's New in Kloudfuse 3.5
AI-Native Observability: From Infrastructure to Intelligence
Model Context Protocol (MCP) Server: Natural Language Access to Observability
The Model Context Protocol (MCP) server represents a new approach to working with observability data. Rather than requiring teams to build custom dashboards and alerts for every question, MCP enables AI systems to query observability data directly using natural language.
Ask questions like "What caused the latency spike in checkout yesterday?" or "Show me services with error rates above 2%" directly from Claude, ChatGPT, or custom AI agents. The MCP server translates these requests into FuseQL queries, executes them against your observability data, and returns results in context.
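As a rough illustration of what travels over the wire, MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with the `tools/call` method. The sketch below only builds such a request. The tool name `query_observability` and its `question` argument are hypothetical placeholders, not documented Kloudfuse tool names.

```python
import json


def build_tool_call(request_id: int, tool_name: str, question: str) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": tool_name,  # hypothetical tool name
            "arguments": {"question": question},  # hypothetical argument schema
        },
    }
    return json.dumps(message)


request = build_tool_call(
    1, "query_observability", "Show me services with error rates above 2%"
)
```

In practice an MCP client such as an AI assistant or IDE plugin produces this message for you; the point is that the question arrives at the server as a structured tool call rather than free text.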
"On the agentic AI front, our MCP server is not just a wrapper over APIs," explained Ashish Hanwadikar, CTO and Co-Founder of Kloudfuse. "It exposes Kloudfuse's full observability model in a way that allows users to troubleshoot issues exactly as they would in the UI: unifying signals across metrics, logs, traces, events, and mapping service and infrastructure dependencies. You can ask natural language questions and get the same comprehensive, correlated insights."
Platform teams can build custom agents that leverage observability data for automated incident response, intelligent capacity planning, or cost optimization. The MCP server provides standards-based access through a protocol that works across LLM providers, avoiding vendor lock-in at the AI layer just as Kloudfuse avoids it at the observability layer through OpenTelemetry support.
For developers, this extends into IDEs through MCP integration, enabling AI-assisted debugging that understands actual production behavior rather than just code structure.
LLM Observability: Monitoring AI Applications Natively
AI monitoring in Kloudfuse 3.5 is built native, not bolted on. The platform captures prompt and output tracing as events attached to distributed traces, enabling you to correlate AI requests with application behavior.
Track token usage across prompt and completion tokens (supporting OpenAI, Anthropic, Google, AWS, and Azure), and monitor error rates from both API failures and model-level issues. Integration with LangChain and LlamaIndex frameworks, plus vector database performance monitoring, gives AI engineers visibility into their full stack.
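To make the token accounting concrete, here is a minimal sketch that sums prompt and completion tokens per model from span attributes. It assumes the spans carry the attribute keys from the OpenTelemetry GenAI semantic conventions (which are still evolving); whether Kloudfuse stores exactly these keys is an assumption, and the model names are made up.

```python
from collections import defaultdict

# Attribute keys from the OpenTelemetry GenAI semantic conventions.
# Assumption: the spans being analyzed were instrumented with these keys.
INPUT_KEY = "gen_ai.usage.input_tokens"
OUTPUT_KEY = "gen_ai.usage.output_tokens"
MODEL_KEY = "gen_ai.request.model"


def token_usage_by_model(spans):
    """Sum prompt and completion tokens per model across span attribute dicts."""
    totals = defaultdict(lambda: {"prompt": 0, "completion": 0})
    for attrs in spans:
        model = attrs.get(MODEL_KEY, "unknown")
        totals[model]["prompt"] += attrs.get(INPUT_KEY, 0)
        totals[model]["completion"] += attrs.get(OUTPUT_KEY, 0)
    return dict(totals)


# Illustrative span attributes; model names are placeholders.
sample_spans = [
    {MODEL_KEY: "model-a", INPUT_KEY: 120, OUTPUT_KEY: 40},
    {MODEL_KEY: "model-a", INPUT_KEY: 80, OUTPUT_KEY: 25},
    {MODEL_KEY: "model-b", INPUT_KEY: 300, OUTPUT_KEY: 150},
]
usage = token_usage_by_model(sample_spans)
```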
Unlike observability platforms treating AI as a separate product line, Kloudfuse integrates LLM monitoring into the existing APM framework. Teams use a unified instrumentation approach based on OpenTelemetry standards, eliminating duplicate agents and fragmented workflows.
"We integrated LLM telemetry directly into our APM using standards-based OpenTelemetry, so platform teams get the same operational control over AI workloads as they do for traditional services," said Ashish. "That's the difference between simply adding AI features and building AI-native observability."
Security and Compliance: Built for Regulated Enterprises
FIPS 140-2/3 Validated Cryptographic Modules
Achieving FIPS validation was one of our most significant technical undertakings in 3.5. Kloudfuse implements FIPS 140-2/3 validated cryptographic modules throughout the platform, covering data ingestion, storage, queries, and API access.
This federal security standard ensures that cryptographic functions protecting observability data, both at rest and in transit, meet government requirements. For organizations in defense, healthcare, finance, and other regulated sectors, FIPS validation isn't optional; it's often contractually required.
FedRAMP Authorization Pathway
Building on FIPS validation, Kloudfuse 3.5 establishes a clear FedRAMP authorization pathway. The platform implements NIST 800-53 security controls, includes continuous monitoring and automated compliance reporting, and generates Plans of Action & Milestones (POA&Ms) for vulnerability management.
Combined with Kloudfuse's VPC deployment model, these security features mean organizations retain complete data sovereignty while meeting stringent compliance requirements. Your observability data never leaves your infrastructure, eliminating entire categories of compliance concerns.
"At Zscaler's scale, observability must be both robust and compliant, and Kloudfuse 3.5 continues to excel on both fronts," said Kishore Thakur, Senior Director, Cloud Platform Engineering at Zscaler. "The platform efficiently handles our massive data volumes across global deployments while meeting stringent security requirements, including FIPS 140-2 compliance and a FedRAMP authorization pathway. For enterprises operating at scale with strict compliance needs, Kloudfuse provides an exceptional combination of flexibility, security, and innovation."
Data Governance: Enterprise Control Over Observability Data
Security and compliance extend beyond cryptography to how observability data is managed, accessed, and audited. Kloudfuse 3.5 delivers comprehensive data governance capabilities that give organizations complete control over sensitive telemetry.
Data Scrubbing Across All Streams
Kloudfuse 3.5 expands data scrubbing capabilities across all telemetry streams: Metrics, Logs, APM traces, Events, and RUM. Preview data before deletion, apply sophisticated filters and regex patterns to target specific records, and maintain comprehensive audit trails of all data management operations. This enables compliance with GDPR's right to deletion, HIPAA's data minimization requirements, and internal data retention policies.
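The preview-then-delete workflow can be pictured with a small sketch. This is not Kloudfuse's implementation or API; the record shape, function names, and email pattern are illustrative assumptions showing how a regex rule is previewed, applied, and recorded in an audit trail.

```python
import re


def preview_scrub(records, field, pattern):
    """Return the records a scrub rule would delete, without deleting anything."""
    regex = re.compile(pattern)
    return [r for r in records if regex.search(str(r.get(field, "")))]


def apply_scrub(records, field, pattern, audit_log):
    """Drop matching records and append an audit entry describing the operation."""
    regex = re.compile(pattern)
    kept = [r for r in records if not regex.search(str(r.get(field, "")))]
    audit_log.append(
        {"field": field, "pattern": pattern, "deleted": len(records) - len(kept)}
    )
    return kept


# Illustrative log records; the pattern is a simplified email matcher.
EMAIL = r"[\w.+-]+@[\w-]+\.[\w.]+"
logs = [
    {"message": "login ok for user 42"},
    {"message": "password reset sent to jane@example.com"},
    {"message": "cache warmed in 120ms"},
]
audit = []
to_delete = preview_scrub(logs, "message", EMAIL)
remaining = apply_scrub(logs, "message", EMAIL, audit)
```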
Stream-Specific RBAC and Identity Management
Enterprise access control in 3.5 enables granular data visibility based on labels, tags, or custom attributes. Development teams see logs and traces from their services without accessing other teams' data. Security teams maintain comprehensive access for investigations. Finance teams query aggregated metrics without viewing detailed traces containing sensitive information.
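Label-scoped visibility amounts to filtering telemetry by the attributes a role is allowed to see. The sketch below is a simplified model under that assumption; the record layout and the `team` label are hypothetical, and a real policy engine would be richer.

```python
def visible_records(records, allowed_labels):
    """Keep records whose labels match every key/value the role is scoped to.

    An empty scope matches everything, modeling a role with full access.
    """
    return [
        r
        for r in records
        if all(r.get("labels", {}).get(k) == v for k, v in allowed_labels.items())
    ]


# Illustrative telemetry records with hypothetical labels.
records = [
    {"msg": "checkout timeout", "labels": {"team": "payments", "env": "prod"}},
    {"msg": "index rebuilt", "labels": {"team": "search", "env": "prod"}},
    {"msg": "refund failed", "labels": {"team": "payments", "env": "staging"}},
]
payments_view = visible_records(records, {"team": "payments"})
security_view = visible_records(records, {})
```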
Kloudfuse 3.5 automatically synchronizes groups and roles with SAML and OAuth 2.0 identity providers including Okta and Google. As organizational structure evolves, access controls stay current without manual updates. This eliminates access drift and simplifies compliance audits.
Hierarchical Organization and Audit Logging
Organize dashboards, alerts, and other Kloudfuse objects in hierarchical folder structures that mirror organizational complexity. RBAC policies apply at the folder level and inherit downward, enabling delegation without chaos.
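Downward inheritance can be modeled by resolving a user's role from the nearest ancestor folder that assigns one. The source only states that policies inherit downward; the nearest-ancestor-wins rule, the policy layout, and the names below are assumptions for illustration.

```python
def effective_role(policies, user, folder_path):
    """Walk from the folder up toward the root and return the nearest assigned role."""
    parts = folder_path.strip("/").split("/")
    while parts:
        path = "/" + "/".join(parts)
        role = policies.get(path, {}).get(user)
        if role is not None:
            return role
        parts.pop()  # move to the parent folder
    return None


# Hypothetical folder policies: folder path -> {user: role}.
policies = {
    "/platform": {"alice": "admin"},
    "/platform/payments": {"bob": "editor"},
}
alice_role = effective_role(policies, "alice", "/platform/payments/dashboards")
bob_role = effective_role(policies, "bob", "/platform/payments/alerts")
bob_at_root = effective_role(policies, "bob", "/platform")
```

Here alice's admin role on `/platform` reaches every subfolder, while bob's editor role applies only within `/platform/payments`.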
Audit logging captures every configuration change and can be self-ingested into Kloudfuse itself, creating a queryable compliance trail using the same tools teams use for application logs. Search audit logs with FuseQL, build dashboards showing configuration changes over time, and alert on suspicious access patterns.
Platform Engineering: Control That Scales
As observability infrastructure grows, platform teams need the same operational controls they apply to production systems. Kloudfuse 3.5 delivers capabilities purpose-built for teams managing observability at enterprise scale.
Stream-Specific Rate Control
Engineering stream-specific rate control was a key focus for 3.5. Most observability platforms offer only account-level ingestion limits. Kloudfuse introduces granular rate control at the stream level: set different ingestion rate limits for metrics, logs, traces, events, and RUM independently.
Within each stream, use filters to prioritize business-critical data over noise. When a canary deployment starts emitting excessive metrics, you can throttle just that service's metrics stream without affecting production monitoring. This enables platform teams to prevent runaway ingestion from misbehaving services and manage costs proactively based on business priorities.
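One common way to enforce per-stream limits is a token bucket keyed by stream and filter. The source does not say how Kloudfuse implements rate control, so the sketch below is a generic technique, with hypothetical stream and service names, that shows a canary's metrics being throttled while other traffic passes.

```python
class TokenBucket:
    """Token bucket: refills at `rate_per_sec`, holds at most `capacity` tokens."""

    def __init__(self, rate_per_sec, capacity):
        self.rate = rate_per_sec
        self.capacity = capacity
        self.tokens = capacity
        self.last = 0.0

    def allow(self, now, cost=1):
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical limits: only the canary's metrics stream is throttled.
limits = {("metrics", "checkout-canary"): TokenBucket(rate_per_sec=1, capacity=2)}


def admit(stream, service, now):
    """Admit a data point unless a matching bucket is out of tokens."""
    bucket = limits.get((stream, service))
    return True if bucket is None else bucket.allow(now)


# Three canary points at t=0, one more at t=1, plus unthrottled production traffic.
decisions = [admit("metrics", "checkout-canary", 0.0) for _ in range(3)]
after_refill = admit("metrics", "checkout-canary", 1.0)
production = admit("metrics", "checkout", 0.0)
```

Timestamps are passed in explicitly so the behavior is deterministic; a live limiter would read a monotonic clock instead.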
Multi-Zone High Availability and Disaster Recovery
Designing multi-zone high availability required careful architectural decisions. Kloudfuse 3.5 offers flexible availability options to match organizational requirements.
Multi-Zone High Availability provides automatic failover with zero downtime and no manual intervention across availability zones. The platform distributes workloads in active-active configuration, maintaining continuous operations even during zone failures.
For cost-conscious deployments, Disaster Recovery configurations with manual failover triggers reduce resource requirements while enabling rapid recovery across regions. Organizations can choose the configuration that balances their availability needs with infrastructure costs.
Service Accounts and Automation
Modern platform engineering requires automation. Kloudfuse 3.5 introduces enterprise-grade service accounts with bearer token authentication, enabling secure machine-to-machine interactions. Assign RBAC policies directly to both users and service accounts, enabling GitOps approaches to observability configuration and programmatic management of dashboards, alerts, and policies.
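For machine-to-machine access, a service account token is typically sent in the standard `Authorization: Bearer` header. The sketch below constructs, but does not send, such a request using Python's standard library. The host name, the `/api/dashboards` path, and the payload shape are hypothetical; consult the Kloudfuse API documentation for the real endpoints.

```python
import json
import urllib.request


def build_dashboard_request(base_url, token, dashboard):
    """Build (without sending) an authenticated request carrying a dashboard definition."""
    body = json.dumps(dashboard).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/dashboards",  # hypothetical endpoint path
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder host and token; in a GitOps pipeline the token would come from a secret store.
req = build_dashboard_request(
    "https://kloudfuse.example.internal",
    "example-token",
    {"title": "Checkout latency"},
)
```

A pipeline would pass the request to `urllib.request.urlopen` (or an equivalent HTTP client) after the dashboard definition has been reviewed and merged.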
Real-Time Cost Visibility and Chargeback
Platform teams need real-time visibility into observability costs. Kloudfuse 3.5 provides granular breakdowns of data volumes and costs by stream, by team, and by custom tracking labels.
Enable chargeback and showback models that attribute observability costs to the teams generating them, creating accountability and enabling informed capacity planning. Finance teams gain visibility into spending patterns while platform teams identify optimization opportunities.
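A simple chargeback model splits a shared bill in proportion to each team's ingested volume. The sketch below assumes that proportional rule; the team names and figures are illustrative only, and real models may weight streams or retention differently.

```python
def chargeback(total_cost, usage_gb_by_team):
    """Split a total cost across teams in proportion to ingested volume."""
    total_gb = sum(usage_gb_by_team.values())
    if total_gb == 0:
        return {team: 0.0 for team in usage_gb_by_team}
    return {
        team: round(total_cost * gb / total_gb, 2)
        for team, gb in usage_gb_by_team.items()
    }


# Illustrative monthly figures, not real pricing.
shares = chargeback(1000.0, {"payments": 600, "search": 300, "ml": 100})
```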
Custom Metrics SLOs
Custom metrics SLOs enable monitoring what matters to your business. Extend beyond basic latency and availability SLOs with service level objectives based on any metric using PromQL queries: conversion rates, data pipeline lag, cache hit ratios, or custom business KPIs.
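The arithmetic behind a ratio-based SLO is worth spelling out. Given good and total event counts (for example from a PromQL ratio over hypothetical counters such as `cache_hits_total` and `cache_requests_total`), the sketch computes the attained ratio and how much of the error budget remains. The function and field names are assumptions, not Kloudfuse's SLO API.

```python
def slo_status(good_events, total_events, target):
    """Return attained ratio and remaining error budget for a ratio-based SLO."""
    if total_events == 0:
        return None  # no traffic, nothing to evaluate
    attained = good_events / total_events
    allowed_bad = (1 - target) * total_events  # error budget in events
    actual_bad = total_events - good_events
    remaining = 1 - actual_bad / allowed_bad if allowed_bad else 0.0
    return {
        "attained": attained,
        "met": attained >= target,
        "budget_remaining": remaining,
    }


# Cache hit ratio SLO: 9,600 hits out of 10,000 requests against a 95% target.
status = slo_status(good_events=9600, total_events=10000, target=0.95)
```

With a 95% target, 500 misses are allowed over 10,000 requests; 400 misses consume 80% of that budget and leave roughly 20%.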
Advanced Query Performance
Kloudfuse's proprietary FuseQL query language continues expanding capabilities that compete with specialized log analytics platforms while maintaining unified query syntax across all telemetry types.
Scheduled Views for Sub-Second Performance
Scheduled views are one of 3.5's most impactful features. They are pre-aggregated datasets that update at specified intervals, which dramatically improves query performance.
Define complex FuseQL queries once, configure update schedules (down to one-minute intervals), and query the precomputed results instantly instead of scanning raw data repeatedly. Queries that previously took 15-20 seconds scanning raw data now complete in under one second.
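The speedup comes from doing the expensive scan once per refresh rather than once per query. The sketch below illustrates the idea in plain Python rather than FuseQL: raw events are rolled up into one-minute buckets, and lookups then hit the small precomputed table. The event shape and function names are assumptions.

```python
from collections import Counter


def refresh_view(raw_events, bucket_seconds=60):
    """Pre-aggregate raw events into per-bucket, per-service counts."""
    view = Counter()
    for event in raw_events:
        bucket = event["ts"] - event["ts"] % bucket_seconds
        view[(bucket, event["service"])] += 1
    return view


def count_in_bucket(view, bucket, service):
    """Answer a query from the precomputed view instead of rescanning raw data."""
    return view.get((bucket, service), 0)


# Illustrative error events with epoch-second timestamps.
raw = [
    {"ts": 5, "service": "checkout"},
    {"ts": 42, "service": "checkout"},
    {"ts": 61, "service": "checkout"},
    {"ts": 70, "service": "search"},
]
view = refresh_view(raw)
first_minute = count_in_bucket(view, 0, "checkout")
```

The trade-off is freshness: results are only as current as the last refresh, which is why the update interval is configurable.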
Complement scheduled views with cron-based scheduled searches that automate routine investigations: schedule daily security audits, weekly capacity reports, or hourly compliance checks without manual intervention.
Advanced Operators for Complex Analysis
FuseQL gains a DIFF operator and JSON parsing for more sophisticated analysis. The DIFF operator compares two time ranges or result sets to identify what changed between releases or incidents.
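Conceptually, diffing two result sets means reporting the keys whose values changed. The sketch below shows that idea in Python; it is not FuseQL syntax (the source does not show the operator's syntax), and the error names and counts are illustrative.

```python
def diff_counts(before, after):
    """Return the per-key change between two result sets, omitting unchanged keys."""
    keys = set(before) | set(after)
    changes = {k: after.get(k, 0) - before.get(k, 0) for k in keys}
    return {k: delta for k, delta in changes.items() if delta != 0}


# Illustrative error counts before and after a release.
before_release = {"TimeoutError": 12, "ValidationError": 30}
after_release = {"TimeoutError": 95, "ValidationError": 30, "NullPointer": 7}
changed = diff_counts(before_release, after_release)
```

Unchanged keys drop out, so the output points straight at the new `NullPointer` errors and the jump in timeouts.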
New "matches" and "in" operators provide efficient regex matching and membership testing. JSON array parsing handles complex structured data natively. Lookup tables enrich data on the fly, enabling cost center mapping or threat intelligence enrichment without modifying ingestion pipelines or reprocessing historical data.
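Query-time enrichment with a lookup table is essentially a join against a small mapping. The following sketch shows the idea in Python, not FuseQL; the field names and cost center codes are hypothetical.

```python
def enrich(records, lookup, key, target, default="unknown"):
    """Add a `target` field to each record by looking up its `key` value."""
    return [{**r, target: lookup.get(r.get(key), default)} for r in records]


# Hypothetical lookup table mapping services to cost centers.
cost_centers = {"checkout": "CC-100", "search": "CC-200"}
rows = [
    {"service": "checkout", "gb": 40},
    {"service": "recommendations", "gb": 15},
]
enriched = enrich(rows, cost_centers, key="service", target="cost_center")
```

Because the mapping is applied when the query runs, updating the table changes results for historical data too, without reingesting anything.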
OpenTelemetry Native: No Proprietary Agents, No Vendor Lock-In
Kloudfuse doubles down on open standards as architecture, not marketing. Version 3.5 demonstrates that standardization enables innovation rather than constraining it.
OpenTelemetry-Native Kubernetes Topology
Kloudfuse 3.5 visualizes Kubernetes topology using pure OpenTelemetry. It automatically discovers pods, nodes, services, and their relationships through OTel Events support for Kubernetes, providing a living map of your infrastructure without proprietary agents.
When you evaluate alternatives or evolve your observability strategy, your instrumentation remains portable. You own your telemetry pipeline, not the vendor.
Native Histogram Support
Metrics accuracy matters, especially for histograms. Kloudfuse 3.5 implements both Prometheus native histograms and OTLP exponential histograms with complete function libraries: histogram_avg(), histogram_count(), histogram_fraction(), histogram_sum(), histogram_stddev(), and histogram_stdvar().
This dual support ensures accurate metric distributions whether you're using Prometheus or OpenTelemetry ecosystems, with dynamic bucket boundaries and improved storage efficiency compared to legacy histogram implementations.
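The "dynamic bucket boundaries" of exponential histograms follow from a single scale parameter: the base is 2^(2^-scale), and bucket i covers the range (base^i, base^(i+1)]. The sketch below maps a positive value to its bucket using a plain logarithm. It is a simplified illustration, not a production mapping; values that sit exactly on a boundary can be affected by floating-point rounding.

```python
import math


def bucket_index(value, scale):
    """Index of the exponential-histogram bucket that contains a positive value."""
    return math.ceil(math.log2(value) * 2**scale) - 1


def bucket_bounds(index, scale):
    """Return the (exclusive lower, inclusive upper) bounds of a bucket."""
    base = 2 ** (2**-scale)
    return base**index, base ** (index + 1)


# At scale 0 the base is 2, so buckets are (1, 2], (2, 4], (4, 8], ...
idx_coarse = bucket_index(5, scale=0)
# At scale 2 the base is 2**0.25, giving four buckets per power of two.
idx_fine = bucket_index(3, scale=2)
low, high = bucket_bounds(idx_fine, scale=2)
```

Raising the scale by one halves the relative width of every bucket, which is how resolution is traded against storage without fixing boundaries in advance.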
Expanded Cloud Integration
Version 3.5 adds AWS CloudFront metrics enrichment and GCP Stackdriver metrics ingestion through standard cloud APIs. GeoIP support enables location-based analysis for understanding geographic performance patterns. These integrations work through open APIs rather than proprietary connectors, ensuring observability keeps pace as cloud footprints evolve.
Proven at Enterprise Scale
Kloudfuse 3.5 builds on a foundation trusted by enterprises operating at massive scale. The platform processes millions of events per second while delivering 60-80% cost savings compared to competitors, not through feature compromise, but through architectural efficiency and customer-controlled infrastructure.
"Kloudfuse continues to deliver the scale, flexibility, and cost efficiency we need as Automation Anywhere grows globally," said Raghu Sethuraman, Vice President of Engineering at Automation Anywhere. "The enhanced FuseQL capabilities give our engineering teams powerful new ways to analyze trends and troubleshoot issues faster. We're particularly excited about the MCP Server, which will enable our developers to interact with observability data in entirely new ways. Longer data retention and complete ownership through Self-SaaS deployment are exactly what we need to confidently build and operate the next generation of AI-powered automation at global scale."
With customers including Zscaler, GE Healthcare, Workday, Tata 1mg, and Automation Anywhere, plus over 700 integrations and Gartner Magic Quadrant Honorable Mention recognition, Kloudfuse has proven that observability doesn't require choosing between capability and control.
The Self-SaaS Advantage
Underlying every capability in Kloudfuse 3.5 is the Self-SaaS architecture that fundamentally differentiates the platform.
Deploy Kloudfuse in your own VPC on AWS, Azure, or GCP. Your data never leaves your infrastructure. You control retention, you control storage costs, and you eliminate data egress fees that make competitive platforms expensive at scale.
Yet you get SaaS-like simplicity: managed control plane, automatic updates, and expert support ensuring your observability platform runs smoothly. It's control without operational burden, and it is the architectural foundation enabling everything else in version 3.5.
Get Started with Kloudfuse 3.5
See it in action:
Kloudfuse will be demonstrating version 3.5 capabilities at Gartner IT Infrastructure, Operations & Cloud Strategies Conference in Las Vegas, Nevada, from December 9-11, 2025. Visit booth 646 to see the Model Context Protocol Server in action and learn how Kloudfuse is building enterprise-ready observability for modern infrastructure.
About Kloudfuse
Kloudfuse is a unified observability platform integrating with over 700 diverse infrastructures, cloud services, and applications. By harnessing open standards like OpenTelemetry and Prometheus, Kloudfuse eliminates vendor lock-in while providing advanced capabilities across metrics, logs, traces, events, and real user monitoring. Deployed within customer VPCs, Kloudfuse ensures scalability, cost-efficiency, and enterprise security.
Trusted by leading organizations like Zscaler, GE Healthcare, Tata 1mg, Workday, and Automation Anywhere, Kloudfuse delivers observability that enterprises can operate with confidence.
Learn more at www.kloudfuse.com or follow us on LinkedIn.

