hodgesrm 2 days ago

It sounds as if you used your own algorithm for pre-aggregation into the schema you showed. Did you consider using a materialized view to populate it?

  • vadman97 2 days ago

    We insert from our alerts worker because we want the aggregation to happen per alert (with the aggregated data filtered by the particular alert definition). As each alert is evaluated, we run an INSERT INTO ... SELECT ... statement [1] built from the alert definition (rough sketch below). We can't aggregate with an MV since we'd need to create an MV per unique alert that a customer may set up.

    [1]: https://github.com/highlight/highlight/blob/c526daea31fdf764...
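
    A rough sketch of what one of those inserts looks like (table and column names here are illustrative, not our actual schema):

        INSERT INTO alert_metrics (alert_id, bucket, value_state)
        SELECT
            12345                                           AS alert_id,    -- the alert definition being evaluated
            toStartOfInterval(Timestamp, INTERVAL 1 MINUTE) AS bucket,
            countState()                                    AS value_state  -- -State combinator so buckets can merge later
        FROM logs
        WHERE Timestamp >= now() - INTERVAL 1 MINUTE
          AND level = 'error'                               -- filter taken from the alert definition
        GROUP BY bucket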

  • nitinreddy88 2 days ago

    Precisely, what's stopping them from using a simple incremental materialised view?

    • vadman97 2 days ago

      If there were only one alert criterion, that'd be simple. Our alerts can be configured with arbitrary data filters (e.g. only matching logs with column `level='error'`); we would have to create a unique MV for each alert's filter condition.
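
      For illustration, a dedicated MV for just that one filter would look roughly like this (hypothetical names), and we'd need a separate one for every alert a customer creates:

          CREATE MATERIALIZED VIEW alert_123_mv
          TO alert_123_counts
          AS SELECT
              toStartOfMinute(Timestamp) AS bucket,
              count() AS matches
          FROM logs
          WHERE level = 'error'  -- this alert's filter, baked into the view definition
          GROUP BY bucket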

      • atombender 10 hours ago

        You could have an alert ID be part of the MV primary key?

        An MV is really more like a trigger: it translates an insert into table A into an insert into table B, evaluating, filtering, and grouping each batch of A inserts to determine which rows to insert into B. Those inserts can be grouped by an alert ID in order to segregate the state columns by alert. To me this sounds like exactly what you're doing with manual inserts? (Sketch below.)

        That said, while MVs are super powerful and convenient, they're a convenience more than a core function. If you have an ingest flow expressed as Go code (as opposed to, say, Vector or Kafka Connect), then you're basically just "lifting" the convenience of MVs into Go code. You don't get the benefit of an MV's ability to efficiently evaluate against a batch of inserts (which gives you access to joins and dictionaries and so on), but it's functionally very similar.
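
        Roughly what I have in mind, assuming the alert definitions can be represented as data in a side table (all names here are hypothetical):

            CREATE MATERIALIZED VIEW alert_counts_mv
            TO alert_counts
            AS SELECT
                a.alert_id                   AS alert_id,
                toStartOfMinute(l.Timestamp) AS bucket,
                count()                      AS matches
            FROM logs AS l, alert_definitions AS a  -- the MV only fires on inserts into logs
            WHERE l.level = a.level_filter          -- works only when the filter is plain data
            GROUP BY alert_id, bucket

        The catch is that WHERE clause: it works when each alert's filter is a value you can compare against, not an arbitrary per-alert expression, which I suspect is the objection above.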

iampims 2 days ago

At a certain scale, exact computations (p50 for instance) become impractical. I’ve had great luck switching to approximate calculations with guaranteed error bounds.

An approachable paper on the topic is "Effective Computation of Biased Quantiles over Data Streams" http://dimacs.rutgers.edu/%7Egraham/pubs/papers/bquant-icde....
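
In ClickHouse terms (which is what the post is about), the switch is often just a function swap; e.g. against a hypothetical traces table:

    -- exact median: has to materialize all values, impractical at scale
    SELECT quantileExact(0.5)(duration_ms) FROM traces;

    -- approximate median: bounded memory, small error
    SELECT quantile(0.5)(duration_ms) FROM traces;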

  • hipadev23 2 days ago

    The quantile() function in ClickHouse is also approximate, though it uses a simpler reservoir sampling model than GK; quantileGK() is available as well. quantileExact() exists but, as you point out, becomes impractical at scale.
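
    Something like this, if I recall the parameter order right (accuracy first, then the quantile level; `traces`/`duration_ms` are placeholders):

        SELECT quantileGK(1000, 0.5)(duration_ms) FROM traces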