YOLO Basic Usage
This tutorial will guide you through using trtutils with YOLO models. We will cover:
- Building a TensorRT engine from ONNX weights
- Running inference with the engine
- Advanced features like parallel execution and benchmarking
Building TensorRT Engines
trtutils provides a simple interface for building TensorRT engines from ONNX weights.
The find_trtexec() function locates your trtexec installation, and build_engine()
builds the engine.
Example:
from trtutils.trtexec import build_engine

# Build a TensorRT engine from ONNX weights
build_engine(
    "yolo.onnx",
    "yolo.engine",
    precision="fp16",
    workspace_size=1 << 30,  # 1 GiB
)
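Building an engine can take a while, so a common pattern is to rebuild only when the engine file is missing or older than the ONNX file. The following is a minimal sketch of that check. The names `needs_rebuild`, `ensure_engine`, and the `builder` parameter are hypothetical and not part of trtutils; the builder is passed in as a callable so the sketch runs without TensorRT installed.

```python
from pathlib import Path
from typing import Callable


def needs_rebuild(onnx_path: Path, engine_path: Path) -> bool:
    """Return True when the engine is missing or older than the ONNX file."""
    if not engine_path.exists():
        return True
    return engine_path.stat().st_mtime < onnx_path.stat().st_mtime


def ensure_engine(
    onnx_path: str,
    engine_path: str,
    builder: Callable[[str, str], None],
) -> bool:
    """Call builder(onnx_path, engine_path) only if a rebuild is needed.

    Returns True if the builder ran. `builder` is a stand-in for any
    two-argument build function (hypothetical helper, not trtutils API).
    """
    onnx, engine = Path(onnx_path), Path(engine_path)
    if not onnx.exists():
        raise FileNotFoundError(f"ONNX file not found: {onnx}")
    if needs_rebuild(onnx, engine):
        builder(str(onnx), str(engine))
        return True
    return False
```

With trtutils installed, the builder could be a small wrapper around the call shown above, for example `lambda o, e: build_engine(o, e, precision="fp16")`.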
Running YOLO Inference
The YOLO class provides a high-level interface
for running YOLO inference. It handles all the preprocessing and postprocessing
steps automatically.
Example:
import cv2
from trtutils.models import YOLO

# Load the YOLO model
yolo = YOLO("yolo.engine")

# Read and process an image
img = cv2.imread("example.jpg")
detections = yolo.end2end(img)

# Print results
for bbox, confidence, class_id in detections:
    print(f"Class: {class_id}, Confidence: {confidence}")
    print(f"Bounding Box: {bbox}")
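Raw detections are often filtered and labeled before use. The sketch below assumes each detection is a (bbox, confidence, class_id) tuple, as unpacked in the loop above. The helper `summarize_detections`, the class-name mapping, and the sample values are hypothetical and only illustrate the idea.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# One detection, mirroring the (bbox, confidence, class_id) tuples unpacked above.
Detection = Tuple[Sequence[float], float, int]


def summarize_detections(
    detections: Sequence[Detection],
    min_confidence: float = 0.5,
    class_names: Optional[Dict[int, str]] = None,
) -> List[Tuple[str, float, Sequence[float]]]:
    """Keep confident detections and return (label, confidence, bbox), best first."""
    names = class_names or {}
    kept = [
        # Fall back to the numeric class id when no name is known.
        (names.get(int(class_id), str(class_id)), float(confidence), bbox)
        for bbox, confidence, class_id in detections
        if confidence >= min_confidence
    ]
    # Highest confidence first so the most reliable detections come first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Hand-written sample values, not real model output.
sample = [
    ((10, 20, 110, 220), 0.91, 0),
    ((5, 5, 50, 50), 0.32, 2),
    ((30, 40, 90, 120), 0.67, 2),
]
for label, confidence, bbox in summarize_detections(sample, 0.5, {0: "person"}):
    print(f"{label}: {confidence:.2f} at {bbox}")
```

In practice, `sample` would be replaced by the output of `yolo.end2end(img)`, and the mapping by the class names of your model.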
Advanced Features
Parallel Execution
You can run multiple YOLO models in parallel using the ParallelYOLO class:
import cv2
from trtutils.models import ParallelYOLO

# Create a parallel YOLO instance with multiple engines
yolo = ParallelYOLO(["yolo1.engine", "yolo2.engine"])

# Run inference on multiple images
images = [cv2.imread(f"image{i}.jpg") for i in range(2)]
results = yolo.end2end(images)

# OR
yolo.submit(images)
results = yolo.retrieve()

# OR
yolo.submit_model(images[0], 0)
single_result = yolo.retrieve_model(0)

# Print results
for i, result in enumerate(results):
    print(f"Results for model {i}:")
    for bbox, confidence, class_id in result:
        print(f"Class: {class_id}, Confidence: {confidence}")
        print(f"Bounding Box: {bbox}")
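Separating submit from retrieve lets the caller do other work while the models run. The toy class below illustrates that general pattern with Python threads and plain callables standing in for models. It is an illustration only, not how ParallelYOLO is implemented; `ParallelRunner` and its methods are hypothetical names.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Any, Callable, List, Sequence


class ParallelRunner:
    """Toy submit/retrieve wrapper around several callables (not trtutils code)."""

    def __init__(self, models: Sequence[Callable[[Any], Any]]) -> None:
        if not models:
            raise ValueError("at least one model is required")
        self._models = list(models)
        self._pool = ThreadPoolExecutor(max_workers=len(self._models))
        self._pending: List[Future] = []

    def submit(self, inputs: Sequence[Any]) -> None:
        """Start one job per model and return immediately."""
        if len(inputs) != len(self._models):
            raise ValueError("expected one input per model")
        self._pending = [
            self._pool.submit(model, item)
            for model, item in zip(self._models, inputs)
        ]

    def retrieve(self) -> List[Any]:
        """Block until every job finishes; results are in model order."""
        results = [future.result() for future in self._pending]
        self._pending = []
        return results

    def close(self) -> None:
        """Shut down the worker threads."""
        self._pool.shutdown(wait=True)


# Two stand-in "models" that just transform a number.
runner = ParallelRunner([lambda x: x + 1, lambda x: x * 2])
runner.submit([1, 5])
# ... other work could happen here while the jobs run ...
outputs = runner.retrieve()
runner.close()
```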
Benchmarking
You can benchmark YOLO models using the built-in benchmarking utilities. On Jetson devices, enable the MAXN power mode and jetson_clocks before benchmarking to get the fastest and most stable results.
If using Python API:
from trtutils import benchmark_engine

# Run 1000 iterations
results = benchmark_engine("yolo.engine", iterations=1000)
print(f"Average latency: {results.latency.mean:.2f}ms")
print(f"Throughput: {1000/results.latency.mean:.2f} FPS")

# On Jetson devices, you can also measure power consumption
from trtutils.jetson import benchmark_engine as jetson_benchmark

results = jetson_benchmark(
    "yolo.engine",
    iterations=1000,
    tegra_interval=1,  # More frequent power measurements
)
print(f"Average power draw: {results.power_draw.mean:.2f}W")
print(f"Total energy used: {results.energy.mean:.2f}J")
If using CLI:
python3 -m trtutils benchmark yolo.engine --iterations 1000 --tegra_interval 1
# On Jetson devices, you can also measure power consumption by adding the --jetson flag
# This requires that jetsontools be installed via pip
# pip install jetsontools
# This is installed by default as a dependency from 0.6.0 onwards
python3 -m trtutils benchmark yolo.engine --iterations 1000 --tegra_interval 1 --jetson
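To make the reported numbers easier to interpret, the sketch below shows the arithmetic behind them: throughput in FPS is 1000 divided by the mean latency in milliseconds, and energy per inference in joules is mean power in watts multiplied by latency in seconds. The functions `time_callable` and `energy_per_inference_j` are hypothetical helpers for illustration, not part of trtutils, and they time an arbitrary Python callable instead of an engine.

```python
import statistics
import time
from typing import Callable, Dict


def time_callable(
    fn: Callable[[], object],
    iterations: int = 100,
    warmup: int = 10,
) -> Dict[str, float]:
    """Time fn() and report mean/median latency in ms plus derived FPS."""
    # Warm-up runs are discarded so one-time setup costs do not skew results.
    for _ in range(warmup):
        fn()
    samples_ms = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    mean_ms = statistics.fmean(samples_ms)
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(samples_ms),
        # Same conversion as above: FPS = 1000 / mean latency in ms.
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def energy_per_inference_j(mean_power_w: float, mean_latency_ms: float) -> float:
    """Energy (J) = power (W) * time (s) for a single inference."""
    return mean_power_w * (mean_latency_ms / 1000.0)


# Stand-in workload; replace with a call that runs your model.
stats = time_callable(lambda: time.sleep(0.001), iterations=20, warmup=2)
print(f"mean {stats['mean_ms']:.2f} ms, {stats['fps']:.1f} FPS")
```

The median is reported alongside the mean because a few slow iterations can pull the mean up noticeably, which is one reason stable clocks matter when benchmarking.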
Troubleshooting
Common issues and solutions:
ONNX Export Fails
- Ensure you’re using the correct virtual environment
- Ensure you have the latest version of ultralytics or whichever package you use to export YOLO models
- Check that your PyTorch weights are valid

Engine Creation Fails
- Ensure you have enough GPU memory (see the workspace_size parameter)
- Check that the ONNX weights are valid
- Try different ONNX opset versions

Incorrect Detections
- Verify that the input image preprocessing matches the preprocessing used during training
- Check that the confidence and IoU thresholds are appropriate

Performance Issues
- Try enabling FP16 precision
- On Jetson devices, enable the MAXN power mode and jetson_clocks

Dynamic Shape Issues
- Always specify the input shape when building an engine from a model with dynamic input shapes
- The shape must match the img-size used during export
- If you need multiple input sizes, build separate engines
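When tuning the IoU threshold mentioned under Incorrect Detections, it can help to compute the overlap between two suspicious boxes by hand. In YOLO pipelines this threshold typically controls how much two boxes may overlap before non-maximum suppression treats them as duplicates. The sketch below assumes boxes in (x1, y1, x2, y2) corner format, which is an assumption and may not match the format your model returns; `iou` is a hypothetical helper.

```python
from typing import Sequence


def iou(box_a: Sequence[float], box_b: Sequence[float]) -> float:
    """Intersection over union of two (x1, y1, x2, y2) boxes (assumed format)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = max(0.0, ax2 - ax1) * max(0.0, ay2 - ay1)
    area_b = max(0.0, bx2 - bx1) * max(0.0, by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 10x10 boxes offset by 5 pixels horizontally share half their area.
overlap = iou((0, 0, 10, 10), (5, 0, 15, 10))
print(f"IoU: {overlap:.3f}")
```

If two boxes on the same object have an IoU below your threshold, both will usually survive suppression, which shows up as duplicate detections.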