Dedicated Deployments
What are Dedicated Deployments?
Dedicated Deployments are private cloud servers managed by Roboflow, designed specifically to run your computer vision workloads. These can include:
- Object detection
- Image segmentation
- Classification
- Keypoint detection
- Foundation models like CLIP (if trained on Roboflow)
- Roboflow Workflows (low-code vision applications)
- ...and many others
Benefits of Dedicated Deployments
- Focus on your machine vision business problem, leave the infrastructure to us: Spin up inference-serving infrastructure in a few clicks, without signing up with cloud providers, installing and securing servers, managing TLS certificates, or worrying about patching and updates.
- Dedicated Resources: Get cloud servers allocated specifically for your use, ensuring consistent performance for your models.
- Secure Access: Dedicated Deployments are accessible with your workspace's unique API key and utilize HTTPS for secure communication.
- Easy Integration: Each deployment receives a subdomain within roboflow.cloud, simplifying integration with your applications.
- Pay-Per-Hour: You are only charged for the duration of the server's existence (billed in 1-minute intervals).
- Auto Pause & Resume: Your Dedicated Deployments will automatically pause after a configurable period of inactivity. For dev-cpu or dev-gpu deployment types, this period is fixed at 1 hour. A paused deployment can be quickly resumed by sending a request with your API key. This feature is designed to help you save on costs.
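As a sketch of the resume behavior described above: any authenticated request wakes a paused deployment. The snippet below builds such a request with Python's standard library. The deployment subdomain, the `/info` route, and the API key are placeholders (assumptions), so substitute your own deployment's details.

```python
from urllib.parse import urlencode
import urllib.request

# Placeholders (assumptions) -- substitute your own values.
DEPLOYMENT_URL = "https://my-deployment.roboflow.cloud"  # hypothetical subdomain
API_KEY = "YOUR_ROBOFLOW_API_KEY"

# Any authenticated request resumes a paused deployment; here we
# target the server's lightweight info route as a wake-up probe.
wake_url = f"{DEPLOYMENT_URL}/info?{urlencode({'api_key': API_KEY})}"

# Uncomment to actually send the request against a live deployment:
# with urllib.request.urlopen(wake_url) as resp:
#     print(resp.status)
print(wake_url)
```

The first request after a pause may take a little longer to respond while the deployment spins back up.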
Current Limitations
- All Dedicated Deployments are currently hosted in US-based data centers, so users in other geographies may see higher latencies. If you are outside the US, please contact us; we can help you reduce network latency with a customized solution.
- Dedicated Deployments are available to Core and Enterprise plan workspaces. See Roboflow plans.
Types of Dedicated Deployments
Roboflow offers 4 different types of Dedicated Deployments: dev-cpu, dev-gpu, prod-cpu, and prod-gpu. dev-cpu and dev-gpu are designed for development and testing and are deleted automatically after 3 hours, while prod-cpu and prod-gpu are persistent and ideal for serving large-scale production traffic.
| Type | Features |
|---|---|
| dev-cpu | Ephemeral: will be automatically deleted after 3 hours. CPU: model inference runs on the CPU. Ideal for testing integrations and prototyping applications. |
| dev-gpu | Ephemeral: will be automatically deleted after 3 hours. GPU: models need GPU acceleration (like Florence 2). Ideal for testing integrations and prototyping applications. |
| prod-cpu | Persistent: dedicated subdomain <some-name>.roboflow.cloud. CPU: model inference runs on the CPU. Ideal for serving production traffic. |
| prod-gpu | Persistent: dedicated subdomain <some-name>.roboflow.cloud. GPU: models need GPU acceleration (like Florence 2). Ideal for serving production traffic. |
Billing
The rate for GPU deployments (dev-gpu, prod-gpu) is 1 credit/hour, while the rate for CPU deployments (dev-cpu, prod-cpu) is 0.25 credits/hour.
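Because charges accrue in 1-minute intervals, the cost of a deployment is its hourly rate prorated by the minutes it exists. A small worked example under the rates stated above:

```python
# Worked example of the per-minute billing described above.
GPU_RATE = 1.0    # credits/hour for dev-gpu and prod-gpu
CPU_RATE = 0.25   # credits/hour for dev-cpu and prod-cpu

def cost_in_credits(rate_per_hour: float, minutes: int) -> float:
    """Charges accrue in 1-minute intervals, prorated from the hourly rate."""
    return rate_per_hour * minutes / 60

# A prod-gpu deployment that exists for 90 minutes:
print(cost_in_credits(GPU_RATE, 90))   # 1.5 credits
# A dev-cpu deployment that exists for 3 hours (the auto-delete limit):
print(cost_in_credits(CPU_RATE, 180))  # 0.75 credits
```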
If you prefer to be billed based on the number of requests sent to your Dedicated Deployment server, please contact our sales team.
All Dedicated Deployment servers run Roboflow Inference, our open-source inference server. Review the Roboflow Inference documentation to learn more about all of the features available.
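To illustrate integrating with a deployment at its roboflow.cloud subdomain, here is a minimal sketch that builds an inference request using only the Python standard library. The deployment name, model ID, and route shape are assumptions modeled on Roboflow's hosted inference convention (a POST of a base64-encoded image to `/{model_id}?api_key=...`); consult the Roboflow Inference documentation for the exact endpoints your server version exposes.

```python
import base64
import urllib.request

# Hypothetical values (assumptions) -- substitute your own deployment,
# model ID, and API key.
DEPLOYMENT_URL = "https://my-deployment.roboflow.cloud"
MODEL_ID = "my-project/3"
API_KEY = "YOUR_ROBOFLOW_API_KEY"

def build_infer_request(image_bytes: bytes) -> urllib.request.Request:
    """Build a POST carrying a base64-encoded image to the model route."""
    return urllib.request.Request(
        f"{DEPLOYMENT_URL}/{MODEL_ID}?api_key={API_KEY}",
        data=base64.b64encode(image_bytes),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )

# Sending requires a live deployment, e.g.:
# with open("image.jpg", "rb") as f:
#     req = build_infer_request(f.read())
# with urllib.request.urlopen(req) as resp:
#     print(resp.read())
```

Because the deployment speaks the same API as any other Roboflow Inference server, client code written against a local or hosted inference server should work against a Dedicated Deployment by swapping the base URL.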