Reading throughput on the canvas

The topology canvas isn’t just a wiring diagram — when a collector is reporting self-telemetry, every element carries a live throughput overlay: a small sparkline (recent trend), the current rate, and — where it matters — a queue fill bar and an error flag. This page explains how to read them.

Toggle the overlays with the throughput switch in the canvas toolbar. With it off, you get the plain wiring graph.

Units: events per second, not bytes

Every rate on the canvas is in events per second (log records, metric data points, or spans), not bytes. The queue fill bar and the per-step drop figures are percentages.

That’s a deliberate choice: the OpenTelemetry collector reports its internal throughput as item counts, so events/second is what we can show accurately from the metrics the collector already emits, with no extra instrumentation. It’s also the honest axis for LinkMesh, which isn’t priced per gigabyte. If you need byte-level accounting, scrape the collector’s /metrics endpoint into your own backend alongside LinkMesh.
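To make the unit concrete, here is a minimal sketch of how an events/second figure can be derived from two samples of a cumulative item counter. It assumes the counter behaves like the collector's `otelcol_receiver_accepted_log_records` (monotonic, resetting to zero on restart); the function name and the reset handling are illustrative assumptions, not LinkMesh's implementation.

```python
def events_per_second(prev_count, prev_ts, curr_count, curr_ts):
    """Rate between two samples of a cumulative item counter.

    Timestamps are in seconds. Hypothetical helper for illustration only.
    """
    elapsed = curr_ts - prev_ts
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_count - prev_count
    if delta < 0:
        # Assumption: a lower count means the counter restarted from zero,
        # so everything in curr_count arrived since the reset.
        delta = curr_count
    return delta / elapsed


# 1,500 new log records over a 30-second interval
rate = events_per_second(1000, 0, 2500, 30)
```

With two samples taken 30 seconds apart, 1,500 new records works out to 50 events per second, regardless of how large each record is.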

The overlay matrix

Each kind of canvas element shows throughput for a specific point in the pipeline. Read it as “what flows, and which direction”:

| Element | What the sparkline shows | Direction |
| --- | --- | --- |
| Source node | events ingested per second (its receivers); error tint if events are being refused | in |
| Collector node | two lines: total received vs total sent across all its receivers and exporters | in & out |
| Route edge | offered vs matched: how many events reached the route, and how many it kept. The gap is what the route filtered out | in (offered) / out (matched) |
| Collector → Destination edge | events sent to that destination per second | out |
| Collector → Collector edge | two lines: events forwarded by the sender vs events received by the peer | out & in |
| Destination node | events sent per second, plus a queue fill bar and an error flag when the exporter is struggling | out |
| Processor step (expanded route) | each step’s in → out, with the drop % that step removes | in & out |

Sources and collectors

A source node shows how fast events are coming in. A flat line at the bottom means nothing is arriving right now; a rising line means traffic is picking up. If the source is rejecting events (a sender emitting malformed payloads, an auth failure), the sparkline tints red and shows an errors/second figure.

A collector node shows two lines: everything it receives versus everything it sends. When in and out track each other, the collector is passing traffic through cleanly. A persistent gap means events are being dropped or buffered somewhere between ingest and export — open the collector to see which route or processor is responsible.
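If you export the same two series to your own tooling, the "persistent gap" check can be sketched as below. The 5% tolerance and the four-sample window are assumptions chosen for illustration; the canvas itself leaves this judgment to you.

```python
def persistent_gap(received, sent, tolerance=0.05, min_samples=4):
    """Return True if the last `min_samples` samples all show `sent`
    trailing `received` by more than `tolerance` (a fraction of received).

    `received` and `sent` are equal-length lists of events/second,
    oldest first. Thresholds are hypothetical defaults.
    """
    pairs = list(zip(received, sent))[-min_samples:]
    if len(pairs) < min_samples:
        return False  # not enough history to call it persistent
    return all(rx > 0 and (rx - tx) / rx > tolerance for rx, tx in pairs)


steady = persistent_gap([100, 100, 100, 100], [99, 100, 98, 100])
leaking = persistent_gap([100, 100, 100, 100], [60, 55, 70, 50])
```

Requiring every recent sample to exceed the tolerance keeps a single buffered burst from being reported as a gap.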

Route edges — offered vs matched

A route’s job is to select events, so its overlay shows two numbers: how many events were offered to the route, and how many it matched and forwarded. The difference is what the route filtered out.

This is exactly what you want when tuning a filter: if a route meant to keep “only errors” is matching 95% of what’s offered, your filter is too loose. If it’s matching 0%, it’s too strict — or the upstream isn’t sending what you think.
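The same reasoning, written out as a small sketch: divide matched by offered and compare the result with what you expect the route to keep. The `loose_above` threshold and the hint strings are hypothetical; the right threshold depends entirely on what the route is meant to select.

```python
def match_ratio(offered, matched):
    """Fraction of offered events the route kept, or None with no traffic."""
    if offered <= 0:
        return None
    return matched / offered


def filter_hint(offered, matched, loose_above=0.9):
    """Rough tuning hint for a route that should keep only a small subset.

    `loose_above` is an illustrative threshold, not a LinkMesh setting.
    """
    ratio = match_ratio(offered, matched)
    if ratio is None:
        return "no traffic offered"
    if ratio == 0:
        return "too strict, or upstream is not sending what you expect"
    if ratio > loose_above:
        return "possibly too loose"
    return "ok"


hint = filter_hint(offered=200, matched=190)  # route keeps 95% of traffic
```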

Connections between collectors

When one collector forwards to another, the edge shows two lines: what the sender exported, and what the receiving collector accepted. If the sent line is healthy but the received line is flat, events are leaving the sender but not arriving — usually a missing receiver or a firewall on the target. LinkMesh also flags this specific misconfiguration directly on the edge.

Destinations — watch the queue and errors

A destination node shows egress rate, but two extra signals matter more for catching trouble early:

  • Queue fill bar — the exporter buffers events when the downstream (Grafana Cloud, Loki, your backend) can’t keep up. The bar fills as the queue grows; it turns amber past 80% and red past 95%. A filling queue is your earliest warning that data loss is coming — the collector drops events once the queue is full.
  • Error flag — appears when the exporter is failing to send (rejected requests, timeouts). The destination tints red and shows an errors/second figure.
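The colour thresholds for the queue fill bar can be expressed as a short sketch. It assumes the fill is the queue size divided by the queue capacity (the collector exposes both, as `otelcol_exporter_queue_size` and `otelcol_exporter_queue_capacity`); the function name and the "normal" label for the default state are illustrative.

```python
def queue_fill_state(queue_size, queue_capacity):
    """Return (fill fraction, colour) using the documented thresholds:
    amber past 80%, red past 95%. "normal" is a placeholder label.
    """
    if queue_capacity <= 0:
        raise ValueError("queue_capacity must be positive")
    fill = queue_size / queue_capacity
    if fill > 0.95:
        return fill, "red"
    if fill > 0.80:
        return fill, "amber"
    return fill, "normal"


fill, colour = queue_fill_state(queue_size=850, queue_capacity=1000)
```

If you alert on this yourself, alert at the amber threshold rather than red: by the time the queue is nearly full, there is little room left before events are dropped.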

Per-processor drop rate

Expand a route in a collector’s routing tab to see a Live throughput strip: the route’s own offered→kept rate, then each pipeline-processor step with its in → out and the percentage it drops. A masking step should drop ~0%; a filter step drops by design. A step shedding nearly everything is flagged — usually a mis-written filter condition eating events you meant to keep.
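The drop percentage for a step is simple arithmetic on its in and out rates. Here is a sketch, with a hypothetical 99% cut-off standing in for "shedding nearly everything" (the actual flagging threshold is not documented here).

```python
def drop_percent(events_in, events_out):
    """Percentage of events a processor step removes."""
    if events_in <= 0:
        return 0.0  # nothing came in, so nothing was dropped
    return 100.0 * (events_in - events_out) / events_in


def sheds_nearly_everything(events_in, events_out, threshold=99.0):
    """Illustrative flag; `threshold` is an assumed value."""
    return drop_percent(events_in, events_out) > threshold


# (step name, in, out) in events/second; names are made up for the example
steps = [("mask-pii", 200, 200), ("keep-errors", 200, 50), ("bad-filter", 1000, 2)]
report = [(name, drop_percent(i, o), sheds_nearly_everything(i, o)) for name, i, o in steps]
```

In this example the masking step drops 0%, the filter drops 75% by design, and the last step drops almost everything, which is the pattern worth investigating.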

Cadence and window

  • Refresh: the numbers update roughly every 30 seconds — the interval at which collectors push their self-telemetry. A brand-new collector shows a flat baseline until its first push lands (within a minute of enrollment), not a false zero.
  • Window: the canvas holds the last 24 hours of throughput. It’s a recent-trend view for spotting what’s happening now and over the last day, not a long-term history. For longer trends (30 days, say), scrape the collector with Prometheus and keep the history in your own observability stack.
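As a rough sizing exercise, a 24-hour window at one sample every 30 seconds is about 2,880 points per series. The sketch below models that as a fixed-size buffer that returns `None` (rather than zero) before the first push. The class and constant names are hypothetical, and because the refresh is only roughly 30 seconds, the count is approximate.

```python
from collections import deque

REFRESH_SECONDS = 30                 # approximate push interval
WINDOW_SECONDS = 24 * 60 * 60        # 24-hour window
MAX_SAMPLES = WINDOW_SECONDS // REFRESH_SECONDS  # 2880


class ThroughputWindow:
    """Fixed-size window of recent rates; oldest samples fall off the front."""

    def __init__(self, max_samples=MAX_SAMPLES):
        self._samples = deque(maxlen=max_samples)

    def push(self, rate):
        self._samples.append(rate)

    def latest(self):
        # None means "no data yet", which is distinct from a real rate of 0.
        return self._samples[-1] if self._samples else None

    def __len__(self):
        return len(self._samples)


window = ThroughputWindow(max_samples=3)
before_first_push = window.latest()
for rate in (10.0, 20.0, 30.0, 40.0):
    window.push(rate)
```

Keeping "no data yet" separate from zero is what lets a new collector show a baseline instead of looking like it has stopped sending.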

See also

  • Self-telemetry — where these numbers come from: the collector pushing its own otelcol_* metrics over standard OTLP.
  • Build your first pipeline — wire up a source, route and destination, then watch the overlays light up.
  • Route — how a route’s match filter selects events (the offered-vs-matched number on every route edge).