Dedicated Deployments
What are Dedicated Deployments?
Dedicated Deployments are private cloud servers managed by Roboflow, designed specifically to run your computer vision workloads. These can include:
- Object detection
- Image segmentation
- Classification
- Keypoint detection
- Foundation models like CLIP (if trained on Roboflow)
- Roboflow Workflows (low-code vision applications)
- ...and many others
Benefits of Dedicated Deployments
- Focus on your machine vision business problem, leave the infrastructure to us: Spin up inference-serving infrastructure in a few clicks, without signing up with cloud providers, installing and securing servers, managing TLS certificates, or worrying about patching and updates.
- Dedicated Resources: Get cloud servers allocated specifically for your use, ensuring consistent performance for your models.
- Secure Access: Dedicated Deployments are accessible with your workspace's unique API key and utilize HTTPS for secure communication.
- Easy Integration: Each deployment receives a subdomain within roboflow.cloud, simplifying integration with your applications.
- Pay-Per-Hour: You are only charged for the duration of the server's existence (billed in 1-minute intervals).
- Auto Pause & Resume: Your Dedicated Deployments will automatically pause after a configurable period of inactivity. For dev-cpu or dev-gpu deployment types, this period is fixed at 1 hour. A paused deployment can be quickly resumed by sending a request with your API key. This feature is designed to help you save on costs.
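As a sketch of the resume behavior described above: any authenticated request wakes a paused deployment. The snippet below builds such a request with Python's standard library. The deployment subdomain, the `/info` route, and the API key are placeholders (assumptions), so substitute your own deployment's details.

```python
from urllib.parse import urlencode
import urllib.request

# Placeholders (assumptions) -- substitute your own values.
DEPLOYMENT_URL = "https://my-deployment.roboflow.cloud"  # hypothetical subdomain
API_KEY = "YOUR_ROBOFLOW_API_KEY"

# Any authenticated request resumes a paused deployment; here we
# target the server's lightweight info route as a wake-up probe.
wake_url = f"{DEPLOYMENT_URL}/info?{urlencode({'api_key': API_KEY})}"

# Uncomment to actually send the request against a live deployment:
# with urllib.request.urlopen(wake_url) as resp:
#     print(resp.status)
print(wake_url)
```

The first request after a pause may take a little longer to respond while the deployment spins back up.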
Current Limitations
- All Dedicated Deployments are currently hosted in US-based data centers, so users in other geographies may see higher latencies. If you are outside the US, please contact us; we can help you reduce network latency with a customized solution.
- Dedicated Deployments are available to Core and Enterprise plan workspaces. See Roboflow plans.
Types of Dedicated Deployments
Roboflow offers 4 different types of Dedicated Deployments: dev-cpu, dev-gpu, prod-cpu, and prod-gpu. dev-cpu and dev-gpu are designed for development and testing and are deleted automatically after 3 hours, while prod-cpu and prod-gpu are persistent and ideal for serving large-scale production traffic.
| Type | Features |
|---|---|
| dev-cpu | Ephemeral: will be automatically deleted after 3 hours. CPU: model inference runs on the CPU. Ideal for testing integrations and prototyping applications. |
| dev-gpu | Ephemeral: will be automatically deleted after 3 hours. GPU: models need GPU acceleration (like Florence 2). Ideal for testing integrations and prototyping applications. |
| prod-cpu | Persistent: dedicated subdomain <some-name>.roboflow.cloud. CPU: model inference runs on the CPU. Ideal for serving production traffic. |
| prod-gpu | Persistent: dedicated subdomain <some-name>.roboflow.cloud. GPU: models need GPU acceleration (like Florence 2). Ideal for serving production traffic. |
Billing
The rate for GPU deployments (dev-gpu, prod-gpu) is 1 credit/hour, while the rate for CPU deployments (dev-cpu, prod-cpu) is 0.25 credits/hour.
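Because charges accrue in 1-minute intervals, the cost of a deployment is its hourly rate prorated by the minutes it exists. A small worked example under the rates stated above:

```python
# Worked example of the per-minute billing described above.
GPU_RATE = 1.0    # credits/hour for dev-gpu and prod-gpu
CPU_RATE = 0.25   # credits/hour for dev-cpu and prod-cpu

def cost_in_credits(rate_per_hour: float, minutes: int) -> float:
    """Charges accrue in 1-minute intervals, prorated from the hourly rate."""
    return rate_per_hour * minutes / 60

# A prod-gpu deployment that exists for 90 minutes:
print(cost_in_credits(GPU_RATE, 90))   # 1.5 credits
# A dev-cpu deployment that exists for 3 hours (the auto-delete limit):
print(cost_in_credits(CPU_RATE, 180))  # 0.75 credits
```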
If you prefer to be billed based on the number of requests sent to your Dedicated Deployment server, please contact our sales team.
All Dedicated Deployment servers run Roboflow Inference, our open-source inference server. Review the Roboflow Inference documentation to learn more about all of the features available.
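To illustrate integrating with a deployment at its roboflow.cloud subdomain, here is a minimal sketch that builds an inference request using only the Python standard library. The deployment name, model ID, and route shape are assumptions modeled on Roboflow's hosted inference convention (a POST of a base64-encoded image to `/{model_id}?api_key=...`); consult the Roboflow Inference documentation for the exact endpoints your server version exposes.

```python
import base64
import urllib.request

# Hypothetical values (assumptions) -- substitute your own deployment,
# model ID, and API key.
DEPLOYMENT_URL = "https://my-deployment.roboflow.cloud"
MODEL_ID = "my-project/3"
API_KEY = "YOUR_ROBOFLOW_API_KEY"

def build_infer_request(image_bytes: bytes) -> urllib.request.Request:
    """Build a POST carrying a base64-encoded image to the model route."""
    return urllib.request.Request(
        f"{DEPLOYMENT_URL}/{MODEL_ID}?api_key={API_KEY}",
        data=base64.b64encode(image_bytes),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Sending requires a live deployment, e.g.:
# with open("image.jpg", "rb") as f:
#     req = build_infer_request(f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```

Because the deployment speaks the same API as any other Roboflow Inference server, client code written against a local or hosted inference server should work against a Dedicated Deployment by swapping the base URL.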