Industrial Monitor
Real-time industrial monitoring dashboard with anomaly detection and predictive maintenance.
→ Detected equipment degradation 18 days before a hard fault during the pilot phase.
A monitoring platform for an operator running geographically distributed industrial assets — the kind of fleet where every site has its own vintage of PLCs, its own preferred fieldbus, its own historian (or none at all), and a wide gap between what the equipment vendor’s app shows and what the operator actually wants to see across the whole fleet.
The shape of the problem
The work begins, as it usually does on this kind of engagement, with a walk through the existing data. Some of it lives in Modbus TCP and Modbus RTU registers. Some of it speaks OPC UA, exposed by a newer PLC. Some sits on an MQTT broker someone set up two years ago and forgot. A handful of sensors are HTTP REST endpoints behind cellular modems, polling-friendly only. One subsystem speaks BACnet because it came from a building-automation lineage. Two subsystems are S7 over Profinet because Siemens. There are a couple more buses on a couple more subsystems, none of them load-bearing for the story. None of it shares units, none of it shares timestamps, and none of it agrees on what “normal” looks like.
The brief is straightforward to state and not at all straightforward to deliver: one dashboard, one alert path, one source of truth for the operations team.
The approach
Three layers, kept deliberately boring on the boundaries:
Edge. A small Node-RED gateway at each site (the Uplink on this same engagement) speaks each protocol natively, normalises units and signal names against a canonical catalogue, dedupes, and forwards only deltas over a narrow-band link. Buffered locally to disk for the inevitable connection gaps. Same gateway runs a small offline rules pack so the loudest local faults still light up at site even when the link is down.
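The flow logic itself lives in Node-RED function nodes; as a rough Python sketch of the delta-plus-spool idea (the tag names, deadband, and publish callback are illustrative, not the production code):

```python
import json
import sqlite3
import time

class DeltaForwarder:
    """Forward a reading only when it moves past a per-tag deadband;
    spool to SQLite whenever the uplink refuses delivery (store-and-forward)."""

    def __init__(self, publish, db_path="spool.db", deadband=0.5):
        self.publish = publish        # callable(tag, payload) -> bool, True means delivered
        self.deadband = deadband
        self.last = {}                # tag -> last value actually forwarded
        self.db = sqlite3.connect(db_path)
        self.db.execute("CREATE TABLE IF NOT EXISTS spool (ts REAL, tag TEXT, payload TEXT)")

    def ingest(self, tag, value, unit, ts=None):
        ts = ts or time.time()
        prev = self.last.get(tag)
        if prev is not None and abs(value - prev) < self.deadband:
            return                    # inside the deadband: not a delta, drop it
        self.last[tag] = value
        payload = json.dumps({"tag": tag, "value": value, "unit": unit, "ts": ts})
        self._drain()                 # flush anything spooled first so ordering holds
        if not self.publish(tag, payload):
            self.db.execute("INSERT INTO spool VALUES (?, ?, ?)", (ts, tag, payload))
            self.db.commit()

    def _drain(self):
        rows = self.db.execute("SELECT rowid, tag, payload FROM spool ORDER BY ts").fetchall()
        for rowid, tag, payload in rows:
            if not self.publish(tag, payload):
                break                 # link still down, leave the rest spooled
            self.db.execute("DELETE FROM spool WHERE rowid = ?", (rowid,))
        self.db.commit()

# in practice `publish` would wrap the site's MQTT client with store-and-forward QoS
```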
Spine. A time-series store on the operator’s own infrastructure — TimescaleDB for the structured industrial signals, with an MQTT bridge in front for the streaming firehose and a Postgres-side compression policy that keeps a year of high-resolution data and many years of rollups. Per-asset metadata in a relational schema so the dashboard can group by site, line, vendor, install date, last-service date.
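Roughly, the Timescale side of that looks like the sketch below. Table names, intervals, and the rollup columns are illustrative, not the production schema:

```python
# Sketch of the spine's storage setup. Assumes Postgres with the
# timescaledb extension available; connection string is illustrative.
import psycopg2

STATEMENTS = [
    "CREATE EXTENSION IF NOT EXISTS timescaledb",

    # canonical signal catalogue: one row per normalised tag
    """CREATE TABLE IF NOT EXISTS tags (
           id    SERIAL PRIMARY KEY,
           site  TEXT NOT NULL,
           name  TEXT NOT NULL,
           unit  TEXT NOT NULL,
           UNIQUE (site, name)
       )""",

    # one narrow hypertable for every reading in the fleet
    """CREATE TABLE IF NOT EXISTS readings (
           ts     TIMESTAMPTZ      NOT NULL,
           tag_id INTEGER          NOT NULL REFERENCES tags(id),
           value  DOUBLE PRECISION NOT NULL
       )""",
    "SELECT create_hypertable('readings', 'ts', if_not_exists => TRUE)",

    # compress older chunks, segmented per tag so single-signal scans stay cheap
    """ALTER TABLE readings SET (
           timescaledb.compress,
           timescaledb.compress_segmentby = 'tag_id'
       )""",
    "SELECT add_compression_policy('readings', INTERVAL '7 days', if_not_exists => TRUE)",

    # a year of raw data; hourly rollups carry the long history
    "SELECT add_retention_policy('readings', INTERVAL '1 year', if_not_exists => TRUE)",
    """CREATE MATERIALIZED VIEW IF NOT EXISTS readings_hourly
       WITH (timescaledb.continuous) AS
       SELECT time_bucket('1 hour', ts) AS bucket, tag_id,
              avg(value) AS avg_value, min(value) AS min_value, max(value) AS max_value
       FROM readings GROUP BY bucket, tag_id
       WITH NO DATA""",
]

conn = psycopg2.connect("dbname=monitor")
conn.autocommit = True  # continuous aggregates cannot be created inside a transaction
with conn.cursor() as cur:
    for stmt in STATEMENTS:
        cur.execute(stmt)
conn.close()
```

The continuous aggregate is what lets raw retention stop at a year while the fleet view still has many years of hourly history to draw.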
Surface. A FastAPI backend, a static frontend (this stack — Astro, TypeScript, server-side rendered tiles where it helps), and a fleet view that updates a few times a second over Server-Sent Events without requiring the client to poll. Anomaly detection runs server-side: a mix of plain statistical bounds for the things that are already well-characterised, and lightweight learned models — autoencoders, isolation forests, simple LSTM forecasts — for the long-tail signals where “this isn’t right” is a phrase nobody has ever bothered to write down.
┌─[ 00 SIGNALS ]───────────────────────────────────────┐
│ > sensors · valves · temps · vibration · counters    │
└─────────────────────────┬────────────────────────────┘
                          ▼
┌─[ 01 INGEST ]────────────────────────────────────────┐
│                                                      │
│ tags ──> normalise ──> time-series store             │
│                                                      │
└─────────────────────────┬────────────────────────────┘
                          ▼
┌─[ 02 DETECT ]────────────────────────────────────────┐
│                                                      │
│ baseline ──> anomaly score ──> alert thresholds      │
│                                                      │
└─────────────────────────┬────────────────────────────┘
                          ▼
┌─[ 03 TIMELINE ]──────────────────────────────────────┐
│                                                      │
│ ▁▁▂▂▁▂▂▃▃▃▄▄▅▆▇█ drift caught at T-18d               │
│                                                      │
│ T-18d          T-12d          T-6d           T-now   │
└──────────────────────────────────────────────────────┘
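The Surface layer's push path is simple to sketch. Below is an illustrative FastAPI endpoint streaming fleet state over Server-Sent Events; the route, payload shape, and the stubbed state lookup are assumptions, not the production API:

```python
# Sketch: FastAPI pushing fleet state as Server-Sent Events, no client polling.
import asyncio
import json
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

async def latest_fleet_state() -> dict:
    # In production this would read derived state from the hot-path cache;
    # stubbed here so the sketch stands alone.
    return {"site_a/pump_3/bearing_temp": {"value": 71.2, "unit": "degC", "anomaly": 0.08}}

async def event_stream():
    while True:
        state = await latest_fleet_state()
        # SSE framing: one "data: <json>" line followed by a blank line per message
        yield f"data: {json.dumps(state)}\n\n"
        await asyncio.sleep(0.25)   # a few updates per second

@app.get("/fleet/stream")
async def fleet_stream():
    return StreamingResponse(event_stream(), media_type="text/event-stream")
```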
The integrations are the work
Most of the actual labour is not modelling and not the dashboard. It is matching a register on a thirty-year-old PLC to a measurement on a newer sensor, getting both into the same engineering units, getting both timestamped against the same NTP source, and persuading both to keep shipping when the cellular modem drops to two bars. There is a section of the codebase that exists solely to re-encode timestamps from the four or five subtly incompatible representations the field equipment emits. There is a glossary of signal names the operator’s team can edit. There is a per-site clock-skew correction. None of this is glamorous, and all of it determines whether the platform is trusted six months in.
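The timestamp corner is conceptually small, which is part of why it is easy to underestimate. A minimal sketch of the normalise-then-deskew step, with a deliberately shortened list of encodings:

```python
# Sketch of timestamp normalisation: everything becomes a UTC datetime,
# whatever the field equipment emitted, then the per-site clock-skew
# correction is applied. The encodings shown are illustrative; the real
# set is longer and uglier.
from datetime import datetime, timezone, timedelta

def normalise_timestamp(raw, skew: timedelta = timedelta(0)) -> datetime:
    if isinstance(raw, datetime):                   # already a datetime, possibly naive
        ts = raw if raw.tzinfo else raw.replace(tzinfo=timezone.utc)
    elif isinstance(raw, (int, float)):
        if raw > 1e12:                              # epoch milliseconds
            ts = datetime.fromtimestamp(raw / 1000, tz=timezone.utc)
        else:                                       # epoch seconds
            ts = datetime.fromtimestamp(raw, tz=timezone.utc)
    elif isinstance(raw, str):                      # ISO 8601, with or without an offset
        ts = datetime.fromisoformat(raw)
        ts = ts if ts.tzinfo else ts.replace(tzinfo=timezone.utc)
    else:
        raise ValueError(f"unrecognised timestamp encoding: {raw!r}")
    return ts - skew

# e.g. a gateway measured 2.5 s ahead of the NTP reference:
print(normalise_timestamp(1700000000123, skew=timedelta(seconds=2.5)))
```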
What shipped
In the pilot phase — one site, three weeks of data — the platform flagged a slow degradation on a single asset eighteen days before it would have produced a hard fault under the operator’s existing alarm thresholds. The signal was not unusual on its own; it became unusual when correlated with two other measurements that nothing else on site was watching together. The fix was a scheduled service rather than an emergency call-out.
That is the right shape of outcome for predictive maintenance: not a miracle, but enough warning to turn the response from a phone call at 22:00 into an item on next week’s maintenance schedule.
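The pattern behind that catch is ordinary multivariate scoring: a few related signals are scored together instead of being thresholded one by one. A minimal sketch with scikit-learn, using made-up signal names and synthetic data rather than anything from the pilot:

```python
# Sketch of correlated-signal scoring: three measurements that look fine
# individually are scored jointly, so drift in their relationship surfaces early.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# "healthy" history: bearing temperature, vibration RMS, motor current,
# moving together the way the asset normally does
n = 5000
temp = 60 + rng.normal(0, 1.5, n)
vib = 0.05 * temp + rng.normal(0, 0.2, n)
amps = 0.3 * temp + rng.normal(0, 1.0, n)
history = np.column_stack([temp, vib, amps])

model = IsolationForest(contamination="auto", random_state=0).fit(history)

# a new window where every signal is still inside its own alarm band,
# but the relationship between them has started to drift
window = np.column_stack([
    60 + rng.normal(0, 1.5, 100),      # temperature: unremarkable
    4.5 + rng.normal(0, 0.2, 100),     # vibration: high for this temperature
    18.0 + rng.normal(0, 1.0, 100),    # current: unchanged
])
scores = model.decision_function(window)   # negative means more anomalous
print("fraction of window flagged:", float(np.mean(scores < 0)))
```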
Stack, briefly
- Edge: Node-RED, custom protocol nodes, SQLite ring buffer, MQTT bridge with store-and-forward, signed firmware updates
- Protocols seen in production: Modbus TCP and RTU, OPC UA, MQTT, S7, BACnet, HTTP REST, plus a handful of less-common buses where the asset vintage required it
- Spine: PostgreSQL + TimescaleDB, EMQX as the broker, Redis for hot-path caching of derived state
- Surface: FastAPI, Astro, vanilla TypeScript, Server-Sent Events, Plotly for interactive charts where they earn their JS
- ML: scikit-learn for the bread-and-butter detectors, PyTorch for the LSTM forecasters, all trained on the operator’s own historical data and re-trained on a regular cadence; a minimal forecaster sketch follows this list
- Hosting: the operator’s own Linux servers, no third-party SaaS in the data path
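For the LSTM forecasters mentioned in the ML line above, the shape is roughly: predict the next sample of a signal from a short window and treat large forecast error as the anomaly signal. A toy sketch with illustrative window sizes and a synthetic signal, not the trained production models:

```python
# Sketch of a next-sample LSTM forecaster; forecast residuals feed the anomaly score.
import torch
import torch.nn as nn

class SignalForecaster(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, window: torch.Tensor) -> torch.Tensor:
        # window: (batch, seq_len, 1) -> predicted next value: (batch, 1)
        out, _ = self.lstm(window)
        return self.head(out[:, -1, :])

model = SignalForecaster()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

# toy training data: sliding 32-sample windows over a smooth, noisy signal
t = torch.linspace(0, 50, 2000)
signal = torch.sin(0.3 * t) + 0.05 * torch.randn_like(t)
windows = signal.unfold(0, 33, 1)                  # (n_windows, 33)
x, y = windows[:, :32].unsqueeze(-1), windows[:, 32:]

for _ in range(5):                                 # a handful of epochs for the sketch
    optim.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optim.step()

# at inference time, |prediction - observation| is the quantity that gets thresholded
with torch.no_grad():
    residual = (model(x) - y).abs()
print("95th percentile residual:", residual.quantile(0.95).item())
```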
Status
Live on multiple sites; the case-study page here stays generic on purpose. Specific assets, sites, vendors, and the operator’s name are covered by NDA. If you are evaluating this kind of integration work and want to talk through specifics, drift@solheimsolutions.no is the shortest path.