Engineering Blog | Nitin Mane

Edge AI

Industrial Anomaly Detection at the Edge

Why plant-floor vision systems need local inference, careful alert design and deployment-aware models.

LLM/RAG

Multi-Modal LLM/RAG for Plant Intelligence

Turning camera detections, sensor telemetry and logs into operator-ready context.

Robotics

Smart Weed Remover Robot

NVIDIA Jetson Nano perception paired with a 6-DOF parallel robot for field-level weed removal.

Industrial Anomaly Detection at the Edge

Industrial anomaly detection is not just a model problem. It is a latency, reliability, environment and trust problem wrapped around a model.

A plant-floor camera feed can reveal equipment anomalies, unsafe movement, missing PPE, blocked passages or near-miss events before they become incidents. The challenge is that the signal is noisy: lighting changes, dust, vibration, occlusion, glare and camera drift all create cases that look obvious to a human but confuse a detector.

Why edge deployment matters

Sending every frame to the cloud is rarely the right first move. Industrial networks can be bandwidth constrained, cloud round-trips add latency, and safety alerts lose value when they arrive late. Edge inference keeps the detection loop close to the machine: frames are processed locally, alerts are generated near the source, and only compressed events or summaries need to travel upstream.

A practical pipeline

The pipeline I prefer starts with a YOLO detector built for production conditions, then optimizes it for the target edge device using an inference runtime such as OpenVINO. A lightweight event layer sits after detection: it smooths short-lived false positives, applies zone rules, maps detections to equipment regions and emits events with timestamps, confidence and frame evidence.

This event layer is where many demos become systems. A single frame detection is not always an anomaly. A pattern across frames, a detection inside a restricted zone, or a sequence that violates a known operating state is much more useful to operators.
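
A minimal sketch of such an event layer, in Python, might look like the following. The names (Detection, Zone, EventLayer) and the thresholds (min_hits, window, min_conf) are illustrative assumptions, not part of any particular detector or runtime. The detector is assumed to hand over box centres in pixel coordinates, and the zone is assumed to be a simple rectangle.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Detection:
    label: str
    confidence: float
    cx: float  # box centre, x in pixels
    cy: float  # box centre, y in pixels


@dataclass
class Zone:
    name: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, det: Detection) -> bool:
        return (self.x_min <= det.cx <= self.x_max
                and self.y_min <= det.cy <= self.y_max)


class EventLayer:
    """Emit one event when a label persists inside a zone across frames."""

    def __init__(self, zone, label, min_hits=3, window=5, min_conf=0.5):
        self.zone = zone
        self.label = label
        self.min_hits = min_hits
        self.min_conf = min_conf
        self.history = deque(maxlen=window)  # best confidence per frame, or None
        self.active = False

    def update(self, timestamp, detections):
        hits = [d.confidence for d in detections
                if d.label == self.label
                and d.confidence >= self.min_conf
                and self.zone.contains(d)]
        self.history.append(max(hits, default=None))
        confs = [c for c in self.history if c is not None]
        if len(confs) < self.min_hits:
            self.active = False  # pattern ended; allow a future event
            return None
        if self.active:
            return None  # this ongoing pattern was already reported
        self.active = True
        return {
            "timestamp": timestamp,
            "zone": self.zone.name,
            "label": self.label,
            "confidence": round(sum(confs) / len(confs), 3),
            "frames": len(confs),
        }


# Usage: a person must appear in 3 of the last 5 frames inside the zone.
layer = EventLayer(Zone("restricted", 0, 0, 100, 100), "person")
inside = Detection("person", 0.9, 50, 50)
outside = Detection("person", 0.9, 500, 50)
frames = [[inside], [], [outside], [inside], [inside], [inside]]
events = [layer.update(t, dets) for t, dets in enumerate(frames)]
```

In this run only the fifth frame produces an event: the single early hit and the out-of-zone detection are absorbed, and the sixth frame stays silent because the pattern has already been reported. A real deployment would likely add per-zone polygons and attach frame evidence to the event.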

Designing for operators

The alert should say what happened, where it happened, how confident the system is and what evidence it used. That makes the system auditable. It also gives plant teams a way to tune thresholds without treating the model as magic.

The end goal is not to replace human supervision. It is to give operators a second set of tireless eyes that can watch every frame and surface the moments worth human attention.

Multi-Modal LLM/RAG for Plant Intelligence

A detector can tell you that something changed. A plant-intelligence layer should help explain what changed, why it matters and what to check next.

Industrial data is naturally multi-modal. Cameras see motion and condition. Sensors record vibration, temperature, current, speed and pressure. Logs capture alarms, operator actions and maintenance history. Each stream is useful alone, but the real value appears when the streams are fused into context.

The role of RAG

Retrieval-augmented generation gives an LLM access to the plant's own documentation and event memory: standard operating procedures, maintenance notes, equipment manuals, prior incidents and alarm definitions. Instead of producing a generic explanation, the model can ground its response in the specific machine, zone and operating condition.

From detections to narratives

A useful architecture treats computer-vision detections and sensor anomalies as structured events. Each event includes timestamp, camera or sensor ID, equipment zone, confidence, class label and any linked frame evidence. The RAG layer then retrieves relevant context: what equipment lives in that zone, what the normal operating envelope looks like and what previous events resembled the current one.
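
As a rough sketch of that flow, the Python below models a structured event and a toy retrieval step. Everything here is an assumption for illustration: the PlantEvent and ContextDoc types, the document texts, and especially the keyword-overlap scoring, which stands in for whatever embedding or vector search a real RAG layer would use.

```python
from dataclasses import dataclass, field


@dataclass
class PlantEvent:
    timestamp: str
    source_id: str      # camera or sensor ID
    zone: str
    label: str
    confidence: float
    evidence: list = field(default_factory=list)  # e.g. frame references


@dataclass
class ContextDoc:
    doc_id: str
    zone: str
    text: str


def tokenize(text):
    """Lowercase words with trailing punctuation stripped."""
    return {w.strip(".,:").lower() for w in text.split()}


def retrieve(event, docs, top_k=2):
    """Return same-zone documents ranked by word overlap with the label."""
    query = tokenize(event.label)
    scored = []
    for doc in docs:
        if doc.zone != event.zone:
            continue  # only context for the equipment zone of the event
        overlap = len(query & tokenize(doc.text))
        if overlap > 0:
            scored.append((overlap, doc.doc_id, doc))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [doc for _, _, doc in scored[:top_k]]


# Usage with made-up documents and a made-up event.
docs = [
    ContextDoc("doc-1", "B", "Prior incident: rod misalignment associated with roller wear."),
    ContextDoc("doc-2", "B", "Normal operating envelope for roller speed."),
    ContextDoc("doc-3", "A", "Rod misalignment checklist."),
]
event = PlantEvent("14:02", "cam-B3", "B", "rod misalignment", 0.87)
results = retrieve(event, docs)
retrieved_ids = [doc.doc_id for doc in results]
```

Filtering by zone before scoring keeps the retrieved context tied to the equipment that raised the event, which is the point of treating detections as structured events rather than free text.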

The LLM's job is not to invent a diagnosis. Its job is to summarize evidence, show the retrieved basis, rank possible causes and suggest checks. For example: "Repeated rod misalignment detected near Zone B between 14:02 and 14:04. Similar historical events were associated with roller wear. Check camera feed B3, roller alignment and vibration trend."

Guardrails for the plant floor

Plant intelligence needs conservative behavior. Outputs should preserve uncertainty, cite retrieved evidence and avoid unsupported action commands. The best user experience is a compact alert first, with expandable evidence for engineers who need the full chain.
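
One way to enforce part of this is a post-generation check on the draft text before it reaches an operator. The sketch below is an assumption-heavy illustration: the bracketed citation format ("[doc-1]"), the BLOCKED_PHRASES list and the check_response function are all hypothetical, and the check covers only citation and action-command rules, not uncertainty wording.

```python
import re

# Hypothetical phrases treated as unsupported action commands.
BLOCKED_PHRASES = ("shut down", "override", "bypass", "disable")


def check_response(text, retrieved_ids):
    """Return a list of guardrail issues; an empty list means the draft passes."""
    issues = []
    cited = set(re.findall(r"\[([\w-]+)\]", text))
    if not cited:
        issues.append("no evidence cited")
    unknown = cited - set(retrieved_ids)
    if unknown:
        issues.append("cites unretrieved sources: " + ", ".join(sorted(unknown)))
    lowered = text.lower()
    for phrase in BLOCKED_PHRASES:
        if phrase in lowered:
            issues.append("contains action command: " + phrase)
    return issues


# Usage with made-up drafts.
good = "Possible roller wear [doc-1]. Check the vibration trend."
bad = "Shut down line B immediately."
good_issues = check_response(good, ["doc-1"])
bad_issues = check_response(bad, ["doc-1"])
```

A draft that fails the check can be regenerated or replaced with the compact alert alone, so the operator never sees an ungrounded instruction.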

Done well, a multi-modal LLM/RAG system becomes a translation layer between raw machine signals and human operational judgment.

Smart Weed Remover Robot: NVIDIA Jetson Nano with 6-DOF Parallel Robot

A weed-removal robot is a good mechatronics problem because perception and actuation have to agree in real space, not just on a screen.

The concept is straightforward: a mobile platform moves through crop rows, a camera identifies weeds, and a robotic mechanism removes the target plant without damaging the crop. The hard part is precision. A false positive can damage a crop. A delayed command can miss the weed. A weak mechanism can fail physically even when the detection is correct.

System architecture

The NVIDIA Jetson Nano is a practical edge-compute choice for this class of prototype because it can run lightweight vision models close to the camera while still fitting the power and size constraints of a field robot. The vision stack detects weed candidates, estimates their image coordinates and projects them into the robot's working frame.

The actuation layer uses a 6-DOF parallel robot to position the end effector over the weed. A parallel mechanism is attractive here because it can offer stiffness and precise positioning inside a bounded workspace. That matters when the tool has to strike, cut, pull or disturb soil at a small target point.

Calibration is the quiet hero

The camera, mobile base and manipulator each have their own coordinate frame. The robot only works when those frames are calibrated well enough that a pixel-level detection turns into a reliable physical target. Camera calibration, height estimation, crop-row geometry and end-effector offset all become part of the intelligence pipeline.
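
To make the chain of frames concrete, here is a minimal Python sketch under strong simplifying assumptions: a pinhole camera pointing straight down at flat ground from a known height, and a planar rigid transform (yaw plus translation) from the camera frame to the robot frame. The function names, the intrinsics (fx, fy, cx, cy) and the numbers in the usage lines are illustrative, not measured values from the robot.

```python
import math


def pixel_to_ground(u, v, fx, fy, cx, cy, height_m):
    """Back-project a pixel onto flat ground for a downward-facing pinhole camera.

    fx, fy are focal lengths in pixels; cx, cy is the principal point;
    height_m is the camera height above the ground plane.
    """
    x_cam = (u - cx) * height_m / fx
    y_cam = (v - cy) * height_m / fy
    return x_cam, y_cam


def camera_to_robot(x_cam, y_cam, yaw_rad, tx, ty):
    """Apply a planar rotation and translation into the robot working frame."""
    x = math.cos(yaw_rad) * x_cam - math.sin(yaw_rad) * y_cam + tx
    y = math.sin(yaw_rad) * x_cam + math.cos(yaw_rad) * y_cam + ty
    return x, y


# Usage: a weed detected 120 px right of the image centre, camera 0.5 m up,
# camera origin mounted 0.2 m ahead of the manipulator origin (assumed values).
x_cam, y_cam = pixel_to_ground(440, 240, fx=600, fy=600, cx=320, cy=240, height_m=0.5)
target = camera_to_robot(x_cam, y_cam, yaw_rad=0.0, tx=0.2, ty=0.0)
```

With these assumed numbers the pixel offset maps to roughly 0.1 m on the ground and a target about 0.3 m along the robot's x axis. Lens distortion, camera tilt, uneven soil and the end-effector offset would each add a term to this chain, which is why calibration carries so much of the accuracy.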

Control loop

A robust loop looks like this: detect the weed, verify it across a few frames, estimate the target point, pause or slow the base, move the 6-DOF mechanism into position, actuate, and then confirm that the target has changed. The confirmation step is important because agriculture is messy. Soil texture, leaf overlap and shadows can fool a one-shot system.
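
The loop can be sketched as a single function over a robot interface. The interface here (detect, slow_base, move_to, actuate, resume_base) is hypothetical, as are the frame count and distance tolerance. FakeRobot is only a stand-in that replays scripted detections so the sketch can run without hardware.

```python
import math


def run_cycle(robot, verify_frames=3, tolerance_m=0.02):
    """Detect, verify across frames, position, actuate, then confirm."""
    first = robot.detect()
    if first is None:
        return "no_target"
    sightings = [first]
    for _ in range(verify_frames - 1):
        seen = robot.detect()
        if seen is None or math.hypot(seen[0] - first[0], seen[1] - first[1]) > tolerance_m:
            return "unverified"  # not stable across frames; do not act
        sightings.append(seen)
    # Estimate the target point as the mean of the verified sightings.
    target = (sum(p[0] for p in sightings) / len(sightings),
              sum(p[1] for p in sightings) / len(sightings))
    robot.slow_base()
    try:
        robot.move_to(target)
        robot.actuate()
        after = robot.detect()
        removed = (after is None or
                   math.hypot(after[0] - target[0], after[1] - target[1]) > tolerance_m)
    finally:
        robot.resume_base()  # always release the base, even on failure
    return "removed" if removed else "retry_needed"


class FakeRobot:
    """Scripted stand-in for hardware: replays detections and logs commands."""

    def __init__(self, frames):
        self.frames = list(frames)
        self.log = []

    def detect(self):
        return self.frames.pop(0) if self.frames else None

    def slow_base(self):
        self.log.append("slow")

    def move_to(self, point):
        self.log.append("move")

    def actuate(self):
        self.log.append("actuate")

    def resume_base(self):
        self.log.append("resume")


# Usage: three consistent sightings, then nothing at the target after actuation.
robot = FakeRobot([(0.10, 0.20), (0.11, 0.20), (0.10, 0.21), None])
outcome = run_cycle(robot)
```

Returning "retry_needed" instead of assuming success is the code-level version of the confirmation step: the caller decides whether to try again or flag the plant for a later pass.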

The engineering goal is a robot that uses fewer chemicals, handles weeds selectively and gives farmers a tool that is precise enough to be trusted. This is neither purely an AI project nor purely a mechanical one. The integration is what makes it useful.