Explore YOLOv12

YOLOv12 is a state-of-the-art object detection model with additional support for classification, segmentation, and more.

The model was introduced in the paper "YOLOv12: Attention-Centric Real-Time Object Detectors" by a team of researchers from the University at Buffalo, SUNY, and the University of Chinese Academy of Sciences.

In this guide, we will walk through how to use YOLOv12.

What is YOLOv8?

YOLOv8 is a computer vision model architecture developed by Ultralytics, the creators of YOLOv5. You can deploy YOLOv8 models on a wide range of devices, including NVIDIA Jetson, NVIDIA GPUs, and macOS systems with Roboflow Inference, an open source Python package for running vision models.

What is YOLOv12?

YOLOv12 is a new, state-of-the-art object detection model. The model uses an attention-based YOLO implementation that "matches the speed of previous CNN-based ones while harnessing the performance benefits of attention mechanisms" according to the paper accompanying the model.

You can run YOLOv12 models on NVIDIA Jetson devices, NVIDIA GPUs, and macOS systems with Roboflow Inference, an open source Python package for running vision models.

To learn more about the architecture of the model, refer to the YOLOv12 paper.

Deploy a YOLOv12 Model

To deploy a YOLOv12 model, first install Inference with pip install inference. Then, use the following code:

from inference import get_model
from inference.core.utils.image_utils import load_image_bgr
import supervision as sv

# Load a sample image (BGR) and a pre-trained YOLOv12 nano model
image = load_image_bgr("https://media.roboflow.com/inference/vehicles.png")
model = get_model(model_id="yolov12n-640")

# Run inference and convert the first result to supervision Detections
results = model.infer(image)[0]
detections = sv.Detections.from_inference(results)

# Draw bounding boxes, then class labels, on the image
box_annotator = sv.BoxAnnotator(thickness=4)
annotated_image = box_annotator.annotate(image, detections)
label_annotator = sv.LabelAnnotator(text_scale=2, text_thickness=2)
annotated_image = label_annotator.annotate(annotated_image, detections)

sv.plot_image(annotated_image)

To run the model on a webcam or video stream instead, use InferencePipeline:

from inference import InferencePipeline
from inference.core.interfaces.stream.sinks import render_boxes

# video_reference=0 selects the default webcam; render_boxes draws the predicted boxes on each frame
pipeline = InferencePipeline.init(
    model_id="yolov12n-640",
    video_reference=0,
    on_prediction=render_boxes,
)
pipeline.start()
pipeline.join()
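
The on_prediction argument above accepts any callable, so render_boxes can be swapped for your own handler. The sketch below is one possible handler that counts detections per class. It assumes each prediction arrives as a dictionary with a "predictions" list whose items carry "class" and "confidence" keys; that shape, the function name count_by_class, and the min_confidence parameter are illustrative assumptions, so check them against the output you actually receive.

```python
from collections import Counter


def count_by_class(predictions: dict, video_frame=None, min_confidence: float = 0.5) -> Counter:
    """Count detections per class label at or above a confidence threshold.

    The dictionary layout is an assumption for this sketch; video_frame is
    accepted but unused so the signature matches a two-argument callback.
    """
    counts = Counter()
    for detection in predictions.get("predictions", []):
        if detection.get("confidence", 0.0) >= min_confidence:
            counts[detection["class"]] += 1
    return counts


# Hand-written sample in the assumed response shape (not real model output).
sample = {
    "predictions": [
        {"class": "car", "confidence": 0.91, "x": 120, "y": 80, "width": 60, "height": 40},
        {"class": "car", "confidence": 0.42, "x": 300, "y": 90, "width": 55, "height": 38},
        {"class": "truck", "confidence": 0.77, "x": 500, "y": 70, "width": 90, "height": 60},
    ]
}

counts = count_by_class(sample)
print(dict(counts))  # {'car': 1, 'truck': 1}
```

The low-confidence car is dropped by the default threshold of 0.5. Since a callback's return value is not shown anywhere, a real handler would likely need to print or store the counts itself.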

Find YOLOv12 Datasets

Using Roboflow Universe, you can find datasets for use in training YOLOv12 models, and pre-trained models you can use out of the box.

Search for Datasets

Search for YOLOv12 Models on Roboflow Universe, the world's largest collection of open source computer vision datasets and APIs

Train a YOLOv12 Model

You can train a YOLOv12 model using the YOLOv12 Python package.

To train a model, install the project from source:

pip install -q git+https://github.com/sunsmarterjie/yolov12.git flash-attn supervision

Then, use the following code to train your model:

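from ultralytics import YOLO

# Build a YOLOv12 small model from its config file
model = YOLO('yolov12s.yaml')
# Train for 100 epochs on the dataset described by data.yaml
results = model.train(data='/path/to/dataset/data.yaml', epochs=100)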

Replace the data value with the path to the data.yaml file of your YOLOv12-formatted dataset. Learn more about the YOLOv12 format.
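
If your annotations are stored as pixel coordinates, they need converting before training. The sketch below assumes the YOLOv12 project reads the standard YOLO detection label layout: one line per object, in the form class_id x_center y_center width height, with all four box values normalized to the range 0 to 1. The helper name to_yolo_label is hypothetical and not part of any package.

```python
def to_yolo_label(class_id, x_min, y_min, x_max, y_max, image_width, image_height):
    """Convert a pixel-space corner box to one YOLO-format label line.

    Assumes the standard YOLO detection layout:
    class_id x_center y_center width height, each normalized by image size.
    """
    x_center = (x_min + x_max) / 2 / image_width
    y_center = (y_min + y_max) / 2 / image_height
    width = (x_max - x_min) / image_width
    height = (y_max - y_min) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# A 200x200 pixel box with its top-left corner at (100, 200) in a 640x480 image.
line = to_yolo_label(0, 100, 200, 300, 400, image_width=640, image_height=480)
print(line)  # 0 0.312500 0.625000 0.312500 0.416667
```

Each image would then get a .txt file of such lines under the labels directory, with the same base name as the image.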

You can then test your model on images in your test dataset with the following code:

import random

import supervision as sv
from ultralytics import YOLO

dataset = "/path/to/dataset"
# Load the best weights saved by the training run
model = YOLO('runs/detect/train/weights/best.pt')

# Load the test split in YOLO format
ds = sv.DetectionDataset.from_yolo(
    images_directory_path=f"{dataset}/test/images",
    annotations_directory_path=f"{dataset}/test/labels",
    data_yaml_path=f"{dataset}/data.yaml",
)

# Pick a random test image; randrange excludes len(ds), so the index is always valid
i = random.randrange(len(ds))
image_path, image, target = ds[i]

# Run the model and apply non-maximum suppression to the detections
results = model(image, verbose=False)[0]
detections = sv.Detections.from_ultralytics(results).with_nms()

# Draw boxes and labels on a copy of the image
box_annotator = sv.BoxAnnotator()
label_annotator = sv.LabelAnnotator()
annotated_image = image.copy()
annotated_image = box_annotator.annotate(scene=annotated_image, detections=detections)
annotated_image = label_annotator.annotate(scene=annotated_image, detections=detections)

sv.plot_image(annotated_image)
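
The .with_nms() call in the code above applies non-maximum suppression, which discards boxes that overlap heavily with a higher-scoring box. The sketch below shows the general idea in plain Python. It is not supervision's implementation: it ignores class labels, and the function names and the 0.5 threshold are chosen only for illustration.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of boxes to keep, highest score first (class-agnostic)."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two near-duplicate boxes and one separate box (made-up values).
boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.92, so it is suppressed, while the third box has no overlap and survives.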

YOLOv12 Model Sizes

There are five sizes of YOLOv12 models available for each task type. The model sizes are nano, small, medium, large, and extra-large.

When benchmarked on the COCO dataset for object detection, YOLOv12 performs as follows:

Model      Size (px)   mAP 50-95 (val)
YOLOv12n   640         40.6
YOLOv12s   640         48.0
YOLOv12m   640         52.5
YOLOv12l   640         53.7
YOLOv12x   640         55.2
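
One way to use these figures is to pick the smallest model that reaches an accuracy target. The sketch below copies the COCO mAP values from the benchmark above into a dictionary; the function name smallest_model_meeting is hypothetical. COCO scores are only a rough guide, and accuracy on your own dataset may rank the sizes differently.

```python
# COCO mAP 50-95 (val) at 640 px, copied from the benchmark figures above,
# listed from the smallest model to the largest.
COCO_MAP = {
    "YOLOv12n": 40.6,
    "YOLOv12s": 48.0,
    "YOLOv12m": 52.5,
    "YOLOv12l": 53.7,
    "YOLOv12x": 55.2,
}


def smallest_model_meeting(target_map):
    """Return the first (smallest) model whose COCO mAP meets the target, or None."""
    for name, score in COCO_MAP.items():  # dicts preserve insertion order
        if score >= target_map:
            return name
    return None


print(smallest_model_meeting(50.0))  # YOLOv12m
```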

Frequently Asked Questions

What are the main features in YOLOv12?

YOLOv12 uses an attention-based implementation of YOLO that, according to the paper, "matches the speed of previous CNN-based ones while harnessing the performance benefits of attention mechanisms". The YOLOv12 architecture achieves higher accuracy than earlier CNN-based YOLO models when validated on the Microsoft COCO benchmark.

What is the license for YOLOv12?

Who created YOLOv12?

YOLOv12 was developed by Yunjie Tian (University at Buffalo, SUNY), Qixiang Ye (University of Chinese Academy of Sciences), and David Doermann (University at Buffalo, SUNY).

© 2025 Roboflow, Inc. All rights reserved.