How We Built FuseQL: A Query Language for Log Analytics at Scale

You're investigating a production incident. The error is somewhere in your checkout service logs, but you need to correlate log volume by namespace, detect which patterns are anomalous, and then drill into the specific log lines. That's at least four operations: filter, aggregate by multiple dimensions, run anomaly detection, and drill down. Ideally, all of them happen in a single query session, without exporting to a notebook or switching tools.
Most log query languages force you to break this into separate steps, separate tools, or separate queries. FuseQL was built so you don't have to.
The target is log analytics, not metrics (PromQL is the standard there, and we support it natively). Log investigations demand expressiveness that existing query languages weren't designed to provide.
In this post, we'll cover why we built a new query language instead of extending LogQL, how FuseQL's pipe-based syntax handles everything from simple filters to ML-powered anomaly detection, and what the query engine architecture looks like under the hood. FuseQL currently works for logs, with plans to extend it to events and traces.
Why Build a New Query Language Instead of Extending LogQL?
We started with LogQL. Kloudfuse originally used Grafana's LogQL as the primary log query language. Three limitations pushed us toward building FuseQL:
Limited expressiveness. Most log query languages are designed around single-aggregation, single-vector output models. They let you count errors or compute a rate, but not return multiple independent aggregations as columns in a single tabular result, the way an engineer naturally thinks about analysis: show me sum, avg, and max side-by-side, grouped by service. FuseQL supports this natively. For teams used to writing multi-aggregation queries in tools like Splunk or Sumo Logic, this was the difference between adopting Kloudfuse for logs and sticking with their existing stack.
Rabiya, Customer Success Architect at Kloudfuse, noted: "The most common feedback from teams migrating from Splunk or Sumo Logic was that LogQL felt like a step backward. FuseQL changed that conversation entirely. Engineers could write the same multi-aggregation queries they were used to, and the pipe syntax meant the learning curve was days, not weeks."
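To make the multi-aggregation idea concrete, here is a minimal Python sketch of the tabular shape such a query returns: one row per group, one column per aggregate. It is an illustration only, not FuseQL's implementation; the function name multi_aggregate and the sample fields (service, duration) are assumptions made up for this example.

```python
from collections import defaultdict


def multi_aggregate(rows, group_key, value_key):
    """Return one row per group with sum, avg, and max as separate columns."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return [
        {group_key: group, "sum": sum(vals), "avg": sum(vals) / len(vals), "max": max(vals)}
        for group, vals in sorted(groups.items())
    ]


# Hypothetical sample data: request durations in seconds per service.
rows = [
    {"service": "checkout", "duration": 0.5},
    {"service": "checkout", "duration": 1.5},
    {"service": "cart", "duration": 0.25},
]
table = multi_aggregate(rows, "service", "duration")
for line in table:
    print(line)
```

A single-vector model would need three separate queries to produce these three columns, plus a join on service afterwards.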
Performance at scale. LogQL queries slow down significantly with large datasets and high-cardinality log attributes. In environments generating millions of log events per second, this creates a ceiling on what you can investigate interactively.
Licensing constraints. LogQL's reference implementation, Grafana Loki, is licensed under AGPLv3. Modified versions that are distributed or served over a network must be released under the same license, which may not align with all customers' deployment requirements. Building on AGPL code limits how you can distribute and customize the query engine.
As JT, Staff Engineer at Kloudfuse, put it: "Building a new query language is inherently complex. Our process paralleled compiler design, requiring critical architectural decisions around language grammar and execution strategies." The alternative, continuing to patch around LogQL's limitations, would have created more complexity over time, not less.
How Does FuseQL Syntax Work?
FuseQL uses a pipe-based syntax where operations chain with |. If you've written Splunk SPL, the pattern will feel familiar:
<search expression> | <operator> | <operator> ...
The result of each operator feeds into the next. Every FuseQL query produces a table following a schema defined by column headers. Here are examples that demonstrate the progression from simple to complex:
Count all logs in 5-second buckets:
* | timeslice 5s | count by (_timeslice)
Count logs grouped by severity level:
* | timeslice 30s | count by (_timeslice, level)
Average duration facet over time:
* | timeslice 5s | avg(@duration:duration_seconds) by (_timeslice)
Detect outliers in error counts using DBSCAN:
level="error" | timeslice 120s | count by (_timeslice, kube_namespace) | outlier (_count) by 120s, model=dbscan, eps=3
That last example is where FuseQL's design philosophy shows. An engineer investigating an error spike doesn't want to export data to a notebook for anomaly detection. They want to identify the outlier namespace directly in the query, using the same tool they use for everything else.
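For readers who want to see what that pipeline does step by step, here is a rough Python sketch of the same stages: filter on level, bucket into 120-second slices, count by slice and namespace, then flag outliers. It rests on several assumptions that this post does not confirm: that the outlier step compares namespaces within each time slice, that eps is an absolute distance between counts, and that min_samples is 2. The function names are hypothetical, and the DBSCAN here is a simplified one-dimensional version that only identifies noise points.

```python
from collections import Counter


def timeslice(ts, width):
    """Round a timestamp (seconds) down to the start of its bucket."""
    return ts - ts % width


def count_errors(logs, width):
    """Equivalent of: level="error" | timeslice <width> | count by (_timeslice, kube_namespace)."""
    counts = Counter()
    for log in logs:
        if log["level"] == "error":
            counts[(timeslice(log["ts"], width), log["kube_namespace"])] += 1
    return counts


def dbscan_noise(values, eps, min_samples):
    """Return indices that 1-D DBSCAN labels as noise.

    A point is noise if it is not a core point and has no core point within eps.
    """
    neighbors = [[j for j, w in enumerate(values) if abs(v - w) <= eps] for v in values]
    core = {i for i, near in enumerate(neighbors) if len(near) >= min_samples}
    return [i for i, near in enumerate(neighbors) if i not in core and not core.intersection(near)]


def outlier_namespaces(counts, eps=3, min_samples=2):
    """Assumed semantics: within each slice, flag namespaces whose count is DBSCAN noise."""
    by_slice = {}
    for (slice_start, namespace), count in counts.items():
        by_slice.setdefault(slice_start, []).append((namespace, count))
    flagged = []
    for slice_start, rows in sorted(by_slice.items()):
        for i in dbscan_noise([count for _, count in rows], eps, min_samples):
            flagged.append((slice_start, rows[i][0], rows[i][1]))
    return flagged


# Hypothetical logs: three quiet namespaces and one noisy one, all in the first slice.
logs = []
for namespace, n in [("cart", 4), ("search", 5), ("auth", 6), ("checkout", 40)]:
    logs += [{"ts": i, "level": "error", "kube_namespace": namespace} for i in range(n)]
logs.append({"ts": 10, "level": "info", "kube_namespace": "cart"})

flagged = outlier_namespaces(count_errors(logs, 120))
print(flagged)
```

With this sample data, the counts 4, 5, and 6 sit within eps of each other and form a cluster, while checkout's count of 40 has no neighbors and is flagged.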
What Operators Does FuseQL Support?
FuseQL ships with over 60 operators across several categories. This is not a comprehensive list, but it covers the categories that matter most for understanding the language's scope:
Aggregation: avg, count, count_unique, first, last, max, min, percentiles, stddev, sum. Supports multiple aggregations in a single query, which LogQL does not.
Algorithmic/ML: anomalies overlays expected behavior bands on time series. outliers highlights outlier series using DBSCAN clustering. forecast predicts future values from historical data. These operators integrate ML models including Prophet, SARIMA, Holt-Winters, Rolling Quantile, and Seasonal Decomposition directly into the query language, so engineers don't need to export data to a notebook for anomaly detection or capacity forecasting.
Parse: Variable pattern extraction with regex, anchor-based parsing, native JSON array parsing, and split operations. The parse multi operator (added in 4.0) extracts multiple patterns from log data with anchoring and nodrop options.
Subqueries (new in 4.0): Nested analysis where results from one query feed into another. Find hosts with the highest error rate, then pull their detailed logs, in a single query.
Compare: compare timeshift analyzes data across different time periods for before-and-after analysis during deployments or incidents.
DIFF: Compares two time ranges or result sets to identify additions, deletions, and modifications. Leverages columnar storage and indexing to perform differencing at the storage layer rather than in memory.
Lookup: Enriches log data at query time from external CSV-based lookup tables without re-ingesting data. Useful for mapping IP addresses to geolocations, cost center codes to team names, or threat intelligence indicators to log events.
Search: Boolean operators, regex matching, facet filtering, starts with, ends with, contains, CIDR prefix matching, IP validation. The matches operator pushes regex pattern matching down to the storage layer for indexed performance. The in operator uses hash-based membership testing for constant-time lookups.
Window: accum (running accumulation), rollingstd (rolling standard deviation), smooth (moving average), total (running total).
Miscellaneous: Over 40 utility functions including base64Decode, hexToDec, ipv4ToNumber, isPrivateIP, luhn (credit card validation), urlDecode, maskFromCIDR, and toBytes.
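Several of the search and utility operators correspond to well-known techniques, so a short Python sketch can show what they compute. These are analogues written for illustration, not FuseQL's code; the exact signatures, return types, and edge-case behavior of the FuseQL functions are not shown in this post, and the Python names below are assumptions.

```python
import ipaddress


def luhn(number: str) -> bool:
    """Luhn checksum: double every second digit from the right, sum, check mod 10."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def ipv4_to_number(address: str) -> int:
    """Interpret a dotted-quad IPv4 address as a 32-bit integer."""
    return int(ipaddress.IPv4Address(address))


def is_private_ip(address: str) -> bool:
    """True for addresses in private ranges such as 10.0.0.0/8."""
    return ipaddress.ip_address(address).is_private


def in_cidr(address: str, cidr: str) -> bool:
    """CIDR prefix match: does the address fall inside the network?"""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


# Hash-based membership, the idea behind an `in`-style filter:
# lookups in a set take constant time on average, regardless of list size.
blocked_levels = frozenset({"error", "fatal"})

print(luhn("4539 1488 0343 6467"), ipv4_to_number("1.2.3.4"), is_private_ip("10.1.2.3"))
```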
How Does the Query Engine Architecture Work?
FuseQL queries run against Apache Pinot, the distributed real-time OLAP datastore at the core of Kloudfuse's storage layer. The architecture has several properties that matter for query performance:
Schema-on-read. Logs are ingested without requiring a predefined schema. The query engine interprets structure at read time, which means new log formats are immediately queryable without schema migrations or re-indexing.
Columnar storage with fingerprinting. Kloudfuse's patent-pending log fingerprinting technology separates each log line into static components (boilerplate) and dynamic values (user IDs, timestamps, etc.). This enables up to 20x storage compression and creates indexes that the query engine exploits for faster filtering.
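The general idea of splitting a log line into a static template and its dynamic values can be sketched in a few lines of Python. This is a toy illustration of the concept, not Kloudfuse's patent-pending algorithm: it assumes dynamic values are only numbers and dotted numeric tokens such as IP addresses, and the names DYNAMIC, fingerprint, and encode are hypothetical.

```python
import re

# Assumption: only integers and dotted numeric tokens (e.g. IPs) count as dynamic.
DYNAMIC = re.compile(r"\b\d+(?:\.\d+)*\b")


def fingerprint(line):
    """Split a log line into a static template and its list of dynamic values."""
    values = DYNAMIC.findall(line)
    template = DYNAMIC.sub("<*>", line)
    return template, values


def encode(lines):
    """Store each distinct template once; each line becomes (template_id, values)."""
    templates = {}
    encoded = []
    for line in lines:
        template, values = fingerprint(line)
        template_id = templates.setdefault(template, len(templates))
        encoded.append((template_id, values))
    return templates, encoded


lines = [
    "user 42 logged in from 10.0.0.7 in 35 ms",
    "user 7 logged in from 10.0.0.9 in 120 ms",
]
templates, encoded = encode(lines)
print(templates)
print(encoded)
```

Both sample lines collapse to the same template, so the boilerplate text is stored once and each line only carries its values. That is the intuition behind the compression, and a template ID is also a cheap thing to filter on.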
Computation pushed to storage. FuseQL operators like matches and in execute at the Pinot storage layer rather than in a separate query processing layer. This is the difference between scanning and filtering at the data source versus pulling all data into memory and filtering there.
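The contrast between filtering at the source and filtering after a full fetch can be shown with a toy in-memory segment. This sketch does not use Pinot or reflect its internals; the Segment class, its inverted index, and the method names are assumptions invented to illustrate the pushdown idea.

```python
from collections import defaultdict


class Segment:
    """Toy storage segment with an inverted index on one column."""

    def __init__(self, rows, indexed_column):
        self.rows = rows
        self.index = defaultdict(list)  # column value -> row ids
        for row_id, row in enumerate(rows):
            self.index[row[indexed_column]].append(row_id)

    def scan_all(self):
        """No pushdown: hand every row to the caller, who filters in memory."""
        return list(self.rows)

    def filter_in(self, wanted):
        """Pushdown: resolve the membership filter via the index, return only matches."""
        row_ids = sorted(rid for value in set(wanted) for rid in self.index.get(value, []))
        return [self.rows[rid] for rid in row_ids]


rows = [
    {"level": "info", "msg": "ok"},
    {"level": "error", "msg": "timeout"},
    {"level": "info", "msg": "ok"},
    {"level": "fatal", "msg": "oom"},
]
segment = Segment(rows, "level")
wanted = {"error", "fatal"}

pulled_then_filtered = [row for row in segment.scan_all() if row["level"] in wanted]
pushed_down = segment.filter_in(wanted)
print(pushed_down)
```

Both paths return the same rows, but the pushdown path touches only the two matching rows instead of transferring all four. At millions of events per second, that difference is what keeps filtering interactive.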
Dual representation. FuseQL queries can produce both time series results (via getLogMetricsResultWithKfuseQl) and streaming raw log results (via getLogsWithFuseQlStream with cursor-based pagination). The same query language serves both the analytics view and the log tailing view.
Ashvin Kumaran, who led the implementation of mathematical operators in FuseQL, noted: "Enhancing the efficiency of these operations has significantly improved our system's performance." The optimization work happens at the operator level: each operator is designed to leverage Pinot's columnar storage and indexing capabilities rather than treating the storage engine as a generic data source.
FuseQL at a glance
| Capability | What it means for your investigation |
| --- | --- |
| Pipe-based syntax | Chain filter → aggregate → filter → visualize in a single query. Read left to right, like a data pipeline. |
| Multi-column aggregation | Return sum, avg, max, percentiles as separate columns in one tabular result. No need for multiple queries or post-processing. |
| Built-in ML operators | Anomaly detection, outlier detection (DBSCAN), and forecasting run inside the query, not in an external notebook or ML toolkit. |
| Subqueries (4.0) | Nest queries: find hosts with highest error rate, then pull their detailed logs, in a single query. |
| 60+ operators | Aggregation, parsing, windowing, comparison, arithmetic, trigonometry, lookup enrichment, and 40+ utility functions. |
| Schema-on-read | New log formats are immediately queryable. No schema migrations, no re-indexing. |
| Storage-layer pushdown | Operators like matches and in execute at the Pinot storage layer, not in memory. Faster filtering at scale. |
| LogQL backward compatibility | Existing LogQL queries continue to work. Migrate incrementally. |
FuseQL is purpose-built for observability log analytics. It doesn't try to be a general-purpose data language. The operators are chosen for the workflows platform teams and SREs actually run: incident investigation, anomaly detection, capacity planning, and compliance auditing. If you need a filter combination you didn't pre-configure, you write it in FuseQL and get the answer. If you need to correlate error spikes with deployment events, the compare and DIFF operators handle it in a single query.
For teams currently using LogQL, the migration path is incremental. Kloudfuse maintains full backward compatibility with LogQL queries, so existing dashboards and alerts continue working while teams adopt FuseQL's extended capabilities for more complex investigations.
What's the Relationship Between FuseQL and PromQL?
They're complementary, not competing. PromQL is the industry standard for metrics queries, and Kloudfuse supports it as a first-class query language with contextual autocomplete (added in 4.0). FuseQL handles log analytics where PromQL's time-series-oriented model doesn't apply.
The boundary is clear: if you're querying metrics, use PromQL. If you're analyzing logs, use FuseQL (or LogQL for simpler queries). The Kloudfuse MCP Server translates natural language to the appropriate query language automatically, and the platform's correlation features let you pivot between metrics and logs without manually bridging the query languages.
The Design Decision: Compiler Design for a Query Language
Building FuseQL required the same rigor as building a compiler: defining a formal grammar, implementing a parser, building an optimizer, and designing an execution engine. The alternative, a thin query translation layer on top of an existing language, would have been faster to ship but would have inherited the limitations we were trying to escape.
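As a taste of the very first step in that chain, here is a minimal Python sketch that splits a pipe-based query into its search expression and operator stages. It is far simpler than a real grammar-driven parser and is not FuseQL's parser: it assumes double quotes are the only quoting mechanism, ignores escape sequences, and treats the first word of each stage as the operator name. The function names are hypothetical.

```python
def split_pipeline(query: str) -> list[str]:
    """Split on | characters that are not inside double quotes."""
    stages, current, in_quotes = [], [], False
    for ch in query:
        if ch == '"':
            in_quotes = not in_quotes
        if ch == "|" and not in_quotes:
            stages.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    stages.append("".join(current).strip())
    return stages


def parse(query: str):
    """Return (search_expression, [(operator_name, raw_arguments), ...])."""
    search, *operators = split_pipeline(query)
    parsed = []
    for stage in operators:
        name, _, args = stage.partition(" ")
        parsed.append((name, args.strip()))
    return search, parsed


search, operators = parse('level="error" | timeslice 120s | count by (_timeslice, kube_namespace)')
print(search)
print(operators)
```

Even this toy version hints at the maintenance cost: quoting rules, argument syntax, and useful error messages all have to be designed rather than inherited.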
The tradeoff is maintenance cost. A custom query language means a custom parser, custom error messages, custom autocomplete, and custom documentation. We accepted this cost because the query language is the primary interface between engineers and their log data. A mediocre query language creates friction on every investigation, every alert configuration, every dashboard build. Getting this right compounds across every interaction with the platform.
Next Steps
The FuseQL documentation covers the full operator reference with examples. The FuseQL additional examples page provides real-world query patterns for common investigation scenarios.
For teams evaluating FuseQL alongside their existing log analytics setup, the Kloudfuse platform's LogQL compatibility means you can start with existing queries and incrementally adopt FuseQL operators as your investigations demand them.
What does your team's log investigation workflow look like? Are you writing queries from scratch every time, or have you built up a library of saved patterns? We're always interested in how teams bridge the gap between "something is wrong" and "here's the root cause."
