Training Resolutions by Model Type

Training resolution affects model accuracy, inference speed, and training time. Each model architecture has a default resolution that balances these factors, and Roboflow suggests that default when you select an architecture.
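To get a feel for why resolution matters for speed, it can help to compare pixel counts. The snippet below is a minimal sketch, not part of Roboflow's tooling: `pixel_ratio` is a hypothetical helper, and the 640x640 baseline is only an assumed reference point. Pixel count is a rough proxy for compute; actual training and inference cost depends on the architecture.

```python
def pixel_ratio(resolution: int, baseline: int = 640) -> float:
    """Return how many times more pixels a square image of side
    `resolution` has compared with a square image of side `baseline`."""
    return (resolution * resolution) / (baseline * baseline)


# Compare a few square resolutions against the assumed 640x640 baseline.
for side in (384, 640, 1008):
    print(f"{side}x{side}: {pixel_ratio(side):.2f}x the pixels of 640x640")
```

Because pixel count grows with the square of the side length, a moderate increase in resolution can mean a much larger input per image.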

The tables below show the default training resolution for each model architecture and size, grouped by model type. You can override these defaults by configuring the resize preprocessing step when creating a new Dataset Version.

Object Detection

Model Type	Family & Size	Default Training Resolution
Object Detection	RF-DETR Nano	384x384
Object Detection	RF-DETR Small	512x512
Object Detection	RF-DETR Medium	576x576
Object Detection	RF-DETR Large	704x704
Object Detection	RF-DETR X Large	700x700
Object Detection	RF-DETR 2X Large	880x880
Object Detection	Roboflow 3.0 - Fast	640x640
Object Detection	Roboflow 3.0 - Accurate	640x640
Object Detection	Roboflow 3.0 - Medium	640x640
Object Detection	Roboflow 3.0 - Large	640x640
Object Detection	Roboflow 3.0 - Extra Large	640x640
Object Detection	YOLOv26 (n/s/m/l/x)	640x640
Object Detection	YOLOv12 (n/s/m/l/x)	640x640
Object Detection	YOLOv11 (n/s/m/l/x)	640x640
Object Detection	YOLOv10 (n/s/m/b/l/x)	640x640
Object Detection	YOLOv9 (s/m/c/e)	640x640
Object Detection	YOLOv8 (n/s/m/l/x)	640x640
Object Detection	YOLOv5 (n/s/m/l/x)	640x640
Object Detection	YOLOv7 (legacy)	640x640
Object Detection	YOLO-NAS Small	640x640
Object Detection	YOLO-NAS Medium	640x640
Object Detection	Roboflow Instant	1008x1008

Instance Segmentation

Model Type	Family & Size	Default Training Resolution
Instance Segmentation	RF-DETR Seg Nano	312x312
Instance Segmentation	RF-DETR Seg Small	384x384
Instance Segmentation	RF-DETR Seg Medium	432x432
Instance Segmentation	RF-DETR Seg Large	504x504
Instance Segmentation	RF-DETR Seg X Large	624x624
Instance Segmentation	RF-DETR Seg 2X Large	768x768
Instance Segmentation	Roboflow 3.0 - Fast (Seg)	640x640
Instance Segmentation	Roboflow 3.0 - Accurate (Seg)	640x640
Instance Segmentation	Roboflow 3.0 - Medium (Seg)	640x640
Instance Segmentation	Roboflow 3.0 - Large (Seg)	640x640
Instance Segmentation	Roboflow 3.0 - Extra Large (Seg)	640x640
Instance Segmentation	YOLO-seg (v8/10/11/12)	640x640
Instance Segmentation	SAM3 (Segment Anything 3)	1008x1008
Instance Segmentation	Semantic segmentation (DeepLabV3+)	>= 512x512

Classification & Pose

Model Type	Family & Size	Default Training Resolution
Classification & Pose	ResNet-18/34/50	224x224
Classification & Pose	YOLO-cls (v8/11)	224x224
Classification & Pose	Vision Transformer (ViT)	224x224
Classification & Pose	YOLO-pose (keypoints)	640x640

Multimodal/VLM

Model Type	Family & Size	Default Training Resolution
Multimodal/VLM	PaliGemma 2 - 3B	448x448
Multimodal/VLM	PaliGemma 2 - 10B/28B	448x448
Multimodal/VLM	Florence-2	448x448
Multimodal/VLM	Qwen2.5-VL	448x448
Multimodal/VLM	SmolVLM2	384x384
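
If you track training settings in your own scripts, the defaults above can be captured in a small lookup. The following is a minimal sketch under stated assumptions: `DEFAULT_RESOLUTION` and `training_resolution` are hypothetical names for illustration, not a Roboflow API, and the dictionary holds only a subset of the rows listed in the tables. In Roboflow itself, the override is set through the resize preprocessing step of a Dataset Version.

```python
from typing import Optional, Tuple

# Subset of the defaults listed in the tables above (side length in pixels).
# Hypothetical lookup for illustration; extend it with the rows you need.
DEFAULT_RESOLUTION = {
    "RF-DETR Nano": 384,
    "RF-DETR Medium": 576,
    "Roboflow Instant": 1008,
    "RF-DETR Seg Nano": 312,
    "YOLO-pose (keypoints)": 640,
    "Florence-2": 448,
    "SmolVLM2": 384,
}


def training_resolution(model: str, override: Optional[int] = None) -> Tuple[int, int]:
    """Return (width, height) for training: the override if one is given,
    otherwise the default listed for `model`."""
    if override is not None:
        if override <= 0:
            raise ValueError("override must be a positive number of pixels")
        return (override, override)
    if model not in DEFAULT_RESOLUTION:
        raise KeyError(f"no default resolution recorded for {model!r}")
    side = DEFAULT_RESOLUTION[model]
    return (side, side)


print(training_resolution("RF-DETR Medium"))                # default from the table
print(training_resolution("RF-DETR Medium", override=640))  # explicit override
```

Keeping the override as an explicit argument makes it easy to see in a training script when a run departs from the suggested default.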