Install on Linux
The easiest way to start the right container for your machine, with good default settings (such as a cache volume and a secure, non-privileged execution mode), is to let the CLI choose and start it for you with the inference server start command. Docker must be installed first:
pip install inference-cli
inference server start
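Both commands above assume that the docker and inference executables are on your PATH. As a minimal preflight sketch (the helper name missing_tools is hypothetical and not part of inference-cli), you could check for them before starting the server:

```python
import shutil
from typing import Iterable, List


def missing_tools(tools: Iterable[str]) -> List[str]:
    """Return the names from `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    # "docker" is required by the server; "inference" is installed by inference-cli.
    missing = missing_tools(["docker", "inference"])
    if missing:
        print("Missing from PATH: " + ", ".join(missing))
    else:
        print("Docker and the inference CLI were both found.")
```

This only confirms the executables exist; it does not verify that the Docker daemon is running or that your user is allowed to talk to it.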
Manually Starting the Container
If you want more control over the container settings, you can also start it manually.
The core CPU Docker image includes support for OpenVINO acceleration on x64 CPUs via onnxruntime. Heavy models like SAM2 may still run too slowly on a CPU to be practical (dozens of seconds per image); if you want to use them, consider getting a CUDA-capable GPU.
The primary use cases for CPU inference are processing still images (e.g. NSFW classification of uploads or document verification) and infrequently sampling frames from a video (e.g. occupancy tracking of a parking lot).
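To make "infrequent sampling" concrete, here is a minimal sketch of choosing which frames of a video to send for inference. The function name and parameters are illustrative assumptions, not part of any Roboflow API; it only computes frame indices and leaves decoding to whatever video library you use.

```python
from typing import List


def sample_frame_indices(total_frames: int, fps: float, every_seconds: float) -> List[int]:
    """Return indices of frames spaced roughly `every_seconds` apart."""
    if total_frames < 0 or fps <= 0 or every_seconds <= 0:
        raise ValueError("total_frames must be >= 0; fps and every_seconds must be > 0")
    # Number of frames to skip between samples; never less than one frame.
    step = max(1, round(fps * every_seconds))
    return list(range(0, total_frames, step))


if __name__ == "__main__":
    # A 10-second clip at 30 fps, sampled once every 2 seconds.
    print(sample_frame_indices(total_frames=300, fps=30, every_seconds=2))
    # prints [0, 60, 120, 180, 240]
```

Sampling one frame every few seconds instead of every frame keeps the request rate low enough that CPU latency is less likely to become the bottleneck.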
To get started with CPU inference, use the roboflow/roboflow-inference-server-cpu:latest container.
sudo docker run -d \
    --name inference-server \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    roboflow/roboflow-inference-server-cpu:latest
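Because the container is started in detached mode (-d), the command returns before the server is necessarily ready to accept requests. The sketch below polls the published port until something answers over HTTP. It assumes the server is reachable at http://localhost:9001/ as mapped above; the function names are hypothetical, and any HTTP response (even an error status) is treated as "up" rather than relying on a specific endpoint.

```python
import http.client
import time
import urllib.error
import urllib.request


def server_is_answering(url: str, timeout_s: float = 2.0) -> bool:
    """Return True if any HTTP response comes back from `url`."""
    try:
        with urllib.request.urlopen(url, timeout=timeout_s):
            return True
    except urllib.error.HTTPError:
        # An HTTP error status still means a server is responding.
        return True
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, reset, timed out, or malformed response.
        return False


def wait_for_server(url: str = "http://localhost:9001/",
                    timeout_s: float = 60.0,
                    interval_s: float = 1.0) -> bool:
    """Poll `url` until it answers or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if server_is_answering(url):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


if __name__ == "__main__":
    ready = wait_for_server()
    print("Server is answering." if ready else "Timed out waiting for the server.")
```

An HTTP-level check is used instead of a bare TCP connect because Docker can accept connections on a published port before the process inside the container is listening.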
Docker Compose
If you are using Docker Compose for your application, the equivalent YAML is:
version: "3.9"
services:
  inference-server:
    container_name: inference-server
    image: roboflow/roboflow-inference-server-cpu:latest
    read_only: true
    ports:
      - "9001:9001"
    volumes:
      - "${HOME}/.inference/cache:/tmp:rw"
    security_opt:
      - no-new-privileges
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
Using Your New Server
See Using Your New Server for next steps.
Enterprise Considerations
A Helm Chart is available for enterprise cloud deployments. Enterprise networking solutions to support deployment in OT networks are also available upon request.
Roboflow also offers customized support and installation packages and a pre-configured Jetson-based edge device suitable for rapid prototyping. Contact our sales team if you're part of a large organization and interested in learning more.