
Building a Smart Trajectory System

computer-vision · embedded-systems · ballistics · jetson · gstreamer


I wanted to build a system that could automatically account for wind, distance, and environmental conditions in real time. So I put together a pipeline that uses computer vision to detect targets and estimate their distance, then runs ballistics calculations to predict where shots will land.

System Architecture

The system runs on a Jetson device and processes video frames through multiple stages:

Camera Feed
    ↓
[1] Object Detection (YOLOv7)
    ↓
    Detections + Distance Estimation
    ↓
[2] Pose Estimation (3D Pose)
    ↓
    Keypoint Locations
    ↓
[3] Trajectory Calculation
    ↓
    Windage + Elevation Adjustments
    ↓
[4] Display Overlay
    ↓
    Annotated Video Output
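
Before going stage by stage, here's a simplified sketch of how the pieces are wired together. Detector and PoseDetector are shown below; read_frame, estimate_distance, and draw_overlay are hypothetical stand-ins for the real helpers, and the Redis key names are assumptions.

import redis

r = redis.Redis(decode_responses=True)
detector = Detector()
pose = PoseDetector()

while True:
    frame = read_frame()                                   # frame from the GStreamer pipeline
    detections = detector.detect(frame)                    # stage 1: object detection
    keypoints = pose.detect(frame)                         # stage 2: pose estimation
    r.set("distance", estimate_distance(detections, keypoints))
    windage, elevation = r.get("windage"), r.get("elevation")  # written by the stage 3 worker
    draw_overlay(frame, detections, windage, elevation)    # stage 4: display overlay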

Stage 1: Object Detection and Distance Estimation

I use YOLOv7 running on Triton Inference Server to detect people and objects in each frame. The detector runs at 416x416 resolution for speed, and I filter detections by confidence threshold (0.55).

import tritonclient.grpc as grpcclient

class Detector:
    def __init__(self, dim=416, triton_url="localhost:8001", threshold=0.55):
        self.dim = dim
        self.threshold = threshold
        self.triton = grpcclient.InferenceServerClient(url=triton_url)
        # ... setup inputs/outputs

    def detect(self, frame):
        batch = self._preprocess(frame)        # resize to dim x dim and normalize
        results = self._inference(batch)       # run YOLOv7 on Triton over gRPC
        return self._postprocess(*results)     # NMS + confidence filtering at self.threshold

Distance estimation uses the bounding box size and a calibration model. I trained XGBoost regressors on real-world data to predict distance from bounding box dimensions, accounting for the camera's field of view.
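
The exact feature set isn't shown here, but a minimal version of that calibration model might look like the sketch below, assuming the inputs are just bounding box width and height in pixels (the real model also accounts for field of view, and these values are hypothetical):

import numpy as np
import xgboost as xgb

# Calibration pairs of bbox size -> measured distance (hypothetical values)
X = np.array([[220, 480], [110, 250], [55, 130], [28, 66]], dtype=float)  # bbox width, height (px)
y = np.array([5.0, 10.0, 20.0, 40.0])                                     # measured distance (m)

dist_regressor = xgb.XGBRegressor(n_estimators=200, max_depth=4)
dist_regressor.fit(X, y)

# At runtime: predict distance from a new detection's bounding box
w, h = 60, 140
print(dist_regressor.predict(np.array([[w, h]], dtype=float))[0])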

Stage 2: Pose Estimation

For more precise targeting, I use a 3D pose estimation model to extract keypoints from detected people. The model outputs 18 keypoints (neck, shoulders, elbows, wrists, hips, knees, ankles, eyes, ears) along with heatmaps and part affinity fields.

class PoseDetector:
    def detect(self, frame):
        batch = self._preprocess(frame)
        features, heatmaps, pafs = self._inference(batch)    # backbone features, heatmaps, PAFs
        # Group keypoints from the heatmaps/PAFs and lift them to 3D using the camera extrinsics
        keypoints = pose3d_postprocess(heatmaps, pafs, self.eR, self.et)
        return keypoints

I use the keypoints to calculate distance more accurately - the spacing between shoulders or hips gives a better distance estimate than bounding boxes alone. The pose model also helps with orientation detection, which matters for lead calculations.
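
As an illustration of the idea (not the actual model), the pinhole-camera relation gives a quick distance estimate from shoulder spacing; the focal length and average shoulder width below are assumptions:

import numpy as np

FOCAL_PX = 1400.0          # assumed focal length in pixels
SHOULDER_WIDTH_M = 0.41    # assumed average shoulder width in meters

def distance_from_shoulders(left_shoulder, right_shoulder):
    # left/right_shoulder are (x, y) pixel coordinates from the pose model
    pixel_width = np.linalg.norm(np.array(left_shoulder) - np.array(right_shoulder))
    if pixel_width < 1:
        return None  # keypoints missing or overlapping
    return FOCAL_PX * SHOULDER_WIDTH_M / pixel_width

print(distance_from_shoulders((610, 320), (655, 322)))  # ~12.7 m under these assumptions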

Stage 3: Trajectory Calculation

This is where the physics comes in. I use XGBoost regressors trained on Hornady's 4DOF ballistic calculator to predict windage and elevation adjustments. The models take the target distance and the current environmental readings as input:

import xgboost as xgb

def worker():
    cu_regressor = xgb.XGBRegressor()                    # elevation (come-up) model
    cu_regressor.load_model("xgb_cu_regressor-v2.json")
    wd_regressor = xgb.XGBRegressor()                    # windage model
    wd_regressor.load_model("xgb_wd_regressor-v2.json")

    last_distance = None
    while True:
        distance = float(rv("distance"))                 # rv() reads a value from Redis
        if distance != last_distance:
            # Calculate windage and elevation from distance + sensor readings
            windage = wd_regressor.predict([features])
            elevation = cu_regressor.predict([features])
            # Update Redis with adjustments
            last_distance = distance

I read sensor data from Redis, which is kept up to date by separate sensor-polling processes.

The trajectory calculation runs in a loop, checking for distance changes and recalculating adjustments when needed.
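
The rv() helper above isn't shown in the post; a minimal version of it, along with the feature vector assembly, might look like this (the key names are assumptions, not the actual schema):

import redis

r = redis.Redis(decode_responses=True)

def rv(key, default=0.0):
    # Read a single sensor value from Redis, falling back to a default if it's missing
    value = r.get(key)
    return float(value) if value is not None else default

def build_features():
    # Distance plus whatever environmental readings the regressors were trained on
    return [rv("distance"), rv("wind_speed"), rv("wind_direction"),
            rv("temperature"), rv("pressure")]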

Stage 4: Video Processing Pipeline

The whole thing runs on GStreamer for efficient video processing. I use shared memory between processes to pass frames around without copying:

import numpy as np
from multiprocessing import shared_memory

scan_shm = shared_memory.SharedMemory(name="scan_frame")
scan_shm_frame = np.ndarray((SCAN_DIM, SCAN_DIM, 3), dtype=np.uint8, buffer=scan_shm.buf)

The pipeline has two video streams: a wide scan stream that feeds detection, and a zoomed stream cropped around the current target.

I use Redis to coordinate crop regions - when an object is detected, the system updates the zoom crop to center on it.
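
A sketch of that coordination, assuming the crop is published to Redis as a simple "x,y,w,h" string (the key name, encoding, and resolutions are assumptions):

import redis

r = redis.Redis(decode_responses=True)
FRAME_W, FRAME_H = 1920, 1080   # assumed source resolution
CROP_W, CROP_H = 640, 640       # assumed zoom crop size

def update_zoom_crop(bbox):
    # bbox = (x1, y1, x2, y2) in source-frame pixels
    cx = (bbox[0] + bbox[2]) // 2
    cy = (bbox[1] + bbox[3]) // 2
    x = min(max(cx - CROP_W // 2, 0), FRAME_W - CROP_W)
    y = min(max(cy - CROP_H // 2, 0), FRAME_H - CROP_H)
    r.set("zoom_crop", f"{x},{y},{CROP_W},{CROP_H}")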

The Sensor Stack

Environmental data comes from a handful of dedicated sensors, each polled by its own process.

All sensor data gets written to Redis, where the trajectory calculation service reads it. This decouples sensor polling from the main video processing pipeline.
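
Each sensor process follows the same pattern: read the hardware, timestamp the reading, write it to Redis. A minimal sketch (the key names, poll rate, and read_wind_sensor() driver call are assumptions):

import time
import redis

r = redis.Redis()

def poll_wind_sensor():
    while True:
        speed, direction = read_wind_sensor()    # hypothetical hardware read
        r.mset({
            "wind_speed": speed,
            "wind_direction": direction,
            "wind_ts": time.time(),              # timestamp for staleness checks
        })
        time.sleep(0.5)                          # assumed poll interval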

Real-Time Performance

The system runs in real time on a Jetson Xavier NX.

The bottleneck is the neural network inference. I use TensorRT-optimized models where possible, and Triton Inference Server for efficient batching and GPU utilization.

Challenges

Sensor synchronization: Different sensors update at different rates. I use Redis with timestamps to handle this, but there's still some jitter in the measurements.
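
The timestamp check is roughly this (the staleness threshold is an assumption):

import time
import redis

r = redis.Redis(decode_responses=True)
MAX_AGE_S = 2.0   # assumed staleness threshold

def fresh_value(key, ts_key):
    value, ts = r.get(key), r.get(ts_key)
    if value is None or ts is None or time.time() - float(ts) > MAX_AGE_S:
        return None   # stale or missing; fall back to the last good reading
    return float(value)

wind = fresh_value("wind_speed", "wind_ts")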

Distance estimation accuracy: The XGBoost models work well within their training range, but performance degrades outside of it. I had to collect a lot of calibration data at different distances to get good coverage.

Frame rate consistency: GStreamer pipelines can drop frames under load. I use frame queues and drop policies to maintain real-time performance, but this means some frames get skipped during heavy processing.
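
In GStreamer terms, that means leaky queues in front of the expensive elements and an appsink that drops old buffers. The fragment below is illustrative, not the actual pipeline:

import gi
gi.require_version("Gst", "1.0")
from gi.repository import Gst

Gst.init(None)
pipeline = Gst.parse_launch(
    "v4l2src ! videoconvert ! "
    "queue leaky=downstream max-size-buffers=1 ! "   # keep only the newest frame under load
    "videoscale ! video/x-raw,width=416,height=416 ! "
    "appsink name=scan_sink drop=true max-buffers=1 sync=false"
)
pipeline.set_state(Gst.State.PLAYING)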

Coordinate system alignment: Converting between camera coordinates, world coordinates, and device orientation is tricky. I use extrinsic calibration matrices to transform between coordinate systems, but small errors compound over distance.
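
The transform itself is the standard rigid-body relation; keeping R and t calibrated is the hard part. A sketch with placeholder values:

import numpy as np

R = np.eye(3)                     # placeholder extrinsic rotation (camera -> world)
t = np.array([0.0, 0.0, 0.05])    # placeholder extrinsic translation in meters

def cam_to_world(p_cam):
    return R @ np.asarray(p_cam) + t

print(cam_to_world([0.1, -0.2, 12.0]))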

What I Learned

This project taught me a lot about building real-time computer vision systems.

The system works well for its intended use case - automatically adjusting for environmental conditions in real time. It's also a good example of how to combine computer vision, sensor data, and machine learning in an embedded system.