Supported Models

Roboflow Inference supports a wide range of models for computer vision tasks. You can run fine-tuned models trained on your own data, pre-trained models from Roboflow Universe, and foundation models for zero-shot tasks.

Quick Start

Load a model by its ID with get_model, then call infer on an image:

from inference import get_model

# Load the model identified by model_id
model = get_model(model_id="rfdetr-small")

# Run inference on an image given by URL
results = model.infer("https://media.roboflow.com/inference/people-walking.jpg")
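
As an illustration of a typical next step, the following minimal sketch filters detections by confidence. It assumes each prediction is available as a dictionary with `class` and `confidence` keys, as in Roboflow's JSON detection responses. The helper name `filter_predictions` and the sample data are hypothetical, and the objects returned by `model.infer` may need converting to dictionaries first.

```python
from typing import Any


def filter_predictions(
    predictions: list[dict[str, Any]], min_confidence: float = 0.5
) -> list[dict[str, Any]]:
    """Keep predictions whose confidence is at least min_confidence."""
    return [p for p in predictions if p["confidence"] >= min_confidence]


# Hand-written sample data for illustration only (not real model output).
sample_predictions = [
    {"class": "person", "confidence": 0.91},
    {"class": "person", "confidence": 0.34},
    {"class": "dog", "confidence": 0.78},
]

confident = filter_predictions(sample_predictions, min_confidence=0.5)
for p in confident:
    print(f'{p["class"]}: {p["confidence"]:.2f}')
```

Keeping the filter as a plain function over dictionaries means it can be checked without downloading a model or calling the network.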

Pre-Trained Model Aliases

Inference provides convenient aliases for common pre-trained models. See the Pre-Trained Aliases page for a full list of available model IDs.

Model Categories

Fine-Tuned Models

Fine-tuned models are trained on your own data using Roboflow or external training pipelines. Inference supports the following fine-tuned model architectures:

Architecture	Tasks
RF-DETR	Object Detection
YOLO26	Object Detection, Instance Segmentation, Keypoint Detection
YOLOv11	Object Detection, Instance Segmentation, Keypoint Detection
YOLOv10	Object Detection
YOLOv9	Object Detection
YOLOv8	Object Detection, Classification, Instance Segmentation, Keypoint Detection
YOLOv7	Classification
YOLOv5	Object Detection, Classification, Instance Segmentation
YOLO-NAS	Object Detection, Keypoint Detection
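
A fine-tuned model is usually addressed by a project ID and a version number. The sketch below builds such an ID and loads the model. It assumes the `project-id/version` ID format and that the API key is read from the `ROBOFLOW_API_KEY` environment variable; the helper names and the `my-project` ID are hypothetical.

```python
import os


def build_model_id(project_id: str, version: int) -> str:
    """Return a model ID in the assumed 'project-id/version' format."""
    if not project_id or "/" in project_id:
        raise ValueError("project_id must be non-empty and contain no '/'")
    if version < 1:
        raise ValueError("version must be a positive integer")
    return f"{project_id}/{version}"


def load_fine_tuned_model(project_id: str, version: int):
    """Load a fine-tuned model (requires the inference package and an API key)."""
    # Imported lazily so build_model_id works without the inference package.
    from inference import get_model

    return get_model(
        model_id=build_model_id(project_id, version),
        api_key=os.environ.get("ROBOFLOW_API_KEY"),
    )


# Hypothetical project ID, shown only to illustrate the format.
print(build_model_id("my-project", 3))
```

Calling `load_fine_tuned_model("my-project", 3)` would then return a model object with the same `infer` method used in the Quick Start, provided the project and version exist in your workspace.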

Foundation Models

Foundation models are pre-trained on large datasets and can be used out of the box for a variety of tasks without additional training. See the Foundation Models overview for more details.

Model	Use Case
CLIP	Image classification, embedding similarity
Grounding DINO	Zero-shot object detection
SAM 3	Open-vocabulary segmentation
SAM 2	Interactive image segmentation
SAM	Image segmentation
Florence-2	Object detection, captioning, OCR, and more
YOLO-World	Zero-shot object detection
PaliGemma	VQA, object detection, segmentation
Qwen 3.5-VL	Multimodal understanding, VQA
SmolVLM2	VQA, document OCR
Moondream2	Image captioning, VQA
DocTR	OCR
TrOCR	OCR (single-line text)
GLM-OCR	OCR
OWLv2	Few-shot object detection
Perception Encoder	Image/text embeddings
Depth Estimation	Depth maps
Gaze Detection	Gaze estimation

Universe Models

Roboflow Universe hosts over 50,000 pre-trained models shared by the community. You can use any of these models with Inference.

Pre-Trained Model Aliases

Inference provides convenient IDs for common pre-trained models that do not require an API key. See the full list on the Pre-Trained Aliases page.