CASATONA

Catching defects on the line with vision models running at the edge

An industrial manufacturer · Computer Vision · On-prem/Edge · Edge

Context

An industrial manufacturer ran visual quality control on a fast-moving production line. Inspection by eye was inconsistent and couldn't keep pace with line speed, so defects were slipping through to later stages, where they are far more expensive to catch, and sometimes all the way to the customer. The manufacturer wanted automated, real-time defect detection that could keep up with the line without becoming a bottleneck itself.

Challenge

The constraints here were physical, and they ruled out the obvious cloud approach. Inspection had to happen the moment a part passed the camera: a defect flagged a few seconds late has already moved down the line. A round trip to a cloud model adds network latency and, worse, a dependency on a network connection that is not guaranteed on a factory floor. A dropped connection could not be allowed to stop inspection. So the requirement was real-time inference, on the line, that keeps working when the network doesn't.
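The timing argument above can be made concrete with a small budget check. This is a minimal sketch under stated assumptions: the `LineBudget` type, the `fits_budget` helper, and every number in it (line throughput, inference time, round-trip time) are hypothetical and chosen only for illustration, not taken from the deployment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineBudget:
    """Hypothetical description of line speed, used to derive a time budget."""
    parts_per_minute: float

    @property
    def seconds_per_part(self) -> float:
        # Time available before the next part reaches the camera.
        return 60.0 / self.parts_per_minute


def fits_budget(budget: LineBudget, inference_s: float,
                network_round_trip_s: float = 0.0) -> bool:
    """True if inference plus any network round trip fits in one part interval."""
    return inference_s + network_round_trip_s <= budget.seconds_per_part


# Illustrative figures only: 120 parts per minute leaves 0.5 s per part.
line = LineBudget(parts_per_minute=120)
local_ok = fits_budget(line, inference_s=0.2)
cloud_ok = fits_budget(line, inference_s=0.2, network_round_trip_s=0.4)
```

With these assumed figures the local path fits the per-part window and the cloud path does not, and that is before accounting for the case where the network is unavailable altogether.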

Approach

We scoped this as an edge-deployment problem from the start, because the latency and reliability constraints decided the architecture before any model choice. If the inference had to be local and offline-capable, then the whole design followed from running the model on a device at the line rather than in a data center.

That constraint shaped the model work too. A model running on edge hardware has a compute and memory budget, so we focused on a detection model efficient enough to hit real-time inference on the device while still catching the defect classes that mattered. We worked with the client's quality team to define and prioritize those defect types and to assemble representative images — including the hard, ambiguous cases — so the model was tuned against the defects that actually escaped, not just the easy ones.
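One way to check whether a candidate model is "efficient enough" is to time it on the target device and compare a high latency percentile against the per-part budget. The sketch below is an assumption about how such a check could look, not the harness used in this project; `measure`, `latency_percentile`, and `meets_budget` are hypothetical names, and `infer` stands in for whatever call runs the model on a frame.

```python
import math
import time
from typing import Callable, Iterable, List, Sequence


def measure(infer: Callable[[object], object], frames: Iterable[object],
            clock: Callable[[], float] = time.perf_counter) -> List[float]:
    """Run infer on each frame and return per-frame latencies in seconds."""
    latencies = []
    for frame in frames:
        start = clock()
        infer(frame)  # hypothetical on-device model call
        latencies.append(clock() - start)
    return latencies


def latency_percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of the latency samples."""
    if not samples:
        raise ValueError("no latency samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def meets_budget(latencies: Sequence[float], budget_s: float,
                 pct: float = 99.0) -> bool:
    """True if the chosen latency percentile is within the time budget."""
    return latency_percentile(latencies, pct) <= budget_s
```

Checking a high percentile rather than the mean is a deliberate choice here: on a moving line, the slow frames are the ones that cause a missed part.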

Architecture

The defining property: inference happens on the line, on the device, with no cloud round-trip in the detection path.

  • Edge inference: a computer-vision defect-detection model deployed directly on edge devices at the line, processing the camera feed in real time and flagging defects on the spot.
  • Optimized for the device: the model was sized and optimized to run within the edge hardware's compute budget while sustaining sub-second inference — fast enough to keep up with line speed.
  • Offline by design: because detection runs locally, the system keeps working with no network connection. The network is used only for work that can tolerate latency (pushing aggregate results, collecting flagged images for review and future retraining), not for the inference itself.
  • Improvement loop: flagged and reviewed images feed back into periodic retraining, so the model's coverage of defect types improves over time without changing the real-time path on the line.
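The separation between the detection path and the network path described in the list above can be sketched as a store-and-forward pattern. This is an illustration under assumptions, not the deployed code: `EdgeInspector`, `Result`, `detect`, and `upload` are hypothetical names, and the bounded in-memory queue is a stand-in for whatever local storage a real device would use.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque


@dataclass
class Result:
    frame_id: int
    defect: bool


class EdgeInspector:
    """Local detection with best-effort, deferred upload of results."""

    def __init__(self, detect: Callable[[object], bool],
                 upload: Callable[[Result], None], max_pending: int = 1000):
        self._detect = detect  # hypothetical local model call
        self._upload = upload  # hypothetical network call; may raise ConnectionError
        # Bounded queue: once full, the oldest pending results are dropped.
        self._pending: Deque[Result] = deque(maxlen=max_pending)

    def inspect(self, frame_id: int, frame: object) -> Result:
        # Detection path: local only, never waits on the network.
        result = Result(frame_id, bool(self._detect(frame)))
        self._pending.append(result)
        return result

    def flush(self) -> int:
        """Try to upload queued results in order; return how many were sent."""
        sent = 0
        while self._pending:
            try:
                self._upload(self._pending[0])
            except ConnectionError:
                break  # network is down: keep results queued, retry later
            self._pending.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._pending)
```

In this sketch `inspect` would be called per frame on the line, while `flush` would run separately (for example on a timer), so a network outage only delays uploads and never blocks a detection.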

When every millisecond and every network drop matters, the model belongs on the device — and that single constraint reshaped the entire architecture around it.

Results

  • Defect escape rate down sharply — defects caught on the line instead of slipping to later stages or to customers.
  • Sub-second inference at the edge, fast enough to keep pace with line speed without becoming a bottleneck.
  • Works offline — detection continues through network interruptions, because nothing in the inference path depends on the cloud.
  • An image-review and retraining loop that broadens defect coverage over time.

Stack

Computer-vision defect-detection model · edge inference devices on the production line · on-device model optimization for real-time inference · offline-capable architecture · image-collection and periodic retraining loop.


Latency-critical, network-independent, on the device: edge is where some problems have to run. See how we deploy at the edge, or read the on-prem clinical case for another deployment driven by a hard constraint.
