YOLOv12 is a new, state-of-the-art object detection model. The model uses an attention-based YOLO implementation that "matches the speed of previous CNN-based ones while harnessing the performance benefits of
attention mechanisms" according to the paper accompanying the model.
You can run YOLOv12 models on an NVIDIA Jetson, NVIDIA GPUs, and macOS systems with Roboflow Inference, an open source Python package for running vision models.
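As a sketch, running a hosted model with Roboflow Inference looks like the following. This assumes the `inference` package's `get_model` entry point; the model ID and image path below are placeholders for your own project and data.

```python
# Minimal sketch of running a model with Roboflow Inference.
# NOTE: "your-project/1" is a placeholder model ID -- replace it with the
# ID of your own trained model, and "image.jpg" with a real image path.
from inference import get_model

model = get_model(model_id="your-project/1")

# Run the model on an image and inspect the predictions.
results = model.infer("image.jpg")
print(results)
```

The same `infer` call accepts URLs and NumPy arrays as well as file paths, so it fits into most existing pipelines.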
To learn more about the architecture of the model, refer to the YOLOv12 paper.
You can train a YOLOv12 model using the YOLOv12 Python package.
To train a model, install the project from source:
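A from-source install typically looks like the commands below. The repository URL is assumed from the public YOLOv12 project; adjust it if you are working from a fork.

```shell
# Clone the YOLOv12 repository (URL assumed) and install it in editable
# mode so the package and its dependencies are available locally.
git clone https://github.com/sunsmarterjie/yolov12.git
cd yolov12
pip install -e .
```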
Then, use the following code to train your model:
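A training run can be sketched as follows, assuming the YOLOv12 package exposes the Ultralytics-style `YOLO` API (the project builds on the Ultralytics codebase). The configuration name, dataset path, and hyperparameters are illustrative placeholders.

```python
# Sketch of a YOLOv12 training run, assuming the Ultralytics-style API.
from ultralytics import YOLO

# Load a model configuration. "yolov12n.yaml" (the nano variant) is an
# assumed name -- substitute the variant you want to train.
model = YOLO("yolov12n.yaml")

# Train on your dataset. "data.yaml" should point to your YOLOv12-formatted
# dataset; the epoch count and image size are illustrative values.
results = model.train(data="data.yaml", epochs=100, imgsz=640)
```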
Replace `data` with the path to your YOLOv12-formatted dataset. Learn more about the YOLOv12 format.
You can then test your model on images in your test dataset with the following command:
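Testing the trained weights can be sketched as below, again assuming the Ultralytics-style API; the checkpoint and image paths are placeholders for the outputs of your own training run.

```python
# Sketch of running trained YOLOv12 weights on a test set.
from ultralytics import YOLO

# Load the best checkpoint produced by training (path is a placeholder;
# Ultralytics-style runs typically save under runs/detect/<run>/weights/).
model = YOLO("runs/detect/train/weights/best.pt")

# Run inference on the test split and save annotated images to disk.
results = model.predict(source="path/to/test/images", save=True)
```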
YOLOv12 uses an attention-based implementation of YOLO that "matches the speed" of previous CNN-based YOLO implementations. The YOLOv12 model architecture is able to achieve higher accuracy when validated against the Microsoft COCO benchmark.
YOLOv12 was developed by Yunjie Tian (University at Buffalo, SUNY), Qixiang Ye (University of Chinese Academy of Sciences), and David Doermann (University at Buffalo, SUNY).