Download Model Weights
Automatic Download via Roboflow Inference (Recommended)
Roboflow Inference is our open-source, scalable system for running models locally on CPU and GPU devices.
This is the fastest and most reliable way to get started. When you use Inference, you do not need to manage files or versioning; Roboflow Inference automatically fetches and caches your model weights the first time you run your code.
- How it works: On your first inference request, the weights are downloaded from Roboflow's servers and stored locally. All future predictions use this local cache -- images are not sent to the cloud.
- Deployment options:
- Workflows: For production-ready multi-step computer vision workflows
- Python Inference SDK: For Python integration
By default, model weights are stored in /tmp/cache, which is cleared on system reboot. For production deployments or scenarios where you need weights to persist across reboots, you must configure a persistent cache directory using the MODEL_CACHE_DIR environment variable (see Cache Location below).
For enterprise deployments that must run completely disconnected, see our Enterprise Offline Mode documentation. This guide covers downloading and caching model weights for deployments that retain connectivity for usage tracking, billing, and workflow updates.
Cache Location
By default, model weights are cached in /tmp/cache. This directory is cleared on system reboot, which means you will need to re-download model weights after each restart.
For production deployments or any scenario where you need weights to persist across reboots, you must configure a persistent cache directory using the MODEL_CACHE_DIR environment variable:
import os
# Set to a persistent directory (not /tmp)
os.environ["MODEL_CACHE_DIR"] = "/home/user/.roboflow/cache"
from inference import get_model
# ... rest of your code
Alternatively, set it system-wide:
export MODEL_CACHE_DIR="/home/user/.roboflow/cache"
Make sure the directory exists and has appropriate permissions:
mkdir -p /home/user/.roboflow/cache
chmod 755 /home/user/.roboflow/cache
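The same setup can also be done at application startup from Python; a minimal sketch, using a path under the current user's home directory (adjust for your environment):

```python
import os

# Example cache path, mirroring the shell commands above
cache_dir = os.path.expanduser("~/.roboflow/cache")

os.makedirs(cache_dir, exist_ok=True)  # create the directory if it is missing
os.chmod(cache_dir, 0o755)             # rwxr-xr-x, equivalent to chmod 755
```

Running this before importing `inference` ensures the directory exists before any weights are fetched.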
When running Inference in Docker, mount a persistent cache volume to preserve weights across container restarts. See Docker Configuration Options for details.
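As a sketch, running the CPU inference server with a host directory mounted over the container's default cache path might look like the following; the host path is illustrative, and the image tag and port are the commonly used defaults:

```shell
# Mount a host directory over the container's default cache path so weights
# survive container restarts (host path is an example; adjust as needed)
docker run -d \
  -p 9001:9001 \
  -v /opt/roboflow/cache:/tmp/cache \
  roboflow/roboflow-inference-server-cpu:latest
```

Because the mount covers `/tmp/cache`, no `MODEL_CACHE_DIR` override is needed inside the container.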
Native Python API
The native Python API automatically downloads and caches weights when you load a model with get_model().
Pre-downloading Weights
from inference import get_model
# Load model (downloads and caches weights)
model = get_model(
model_id="rfdetr-base",
api_key="YOUR_ROBOFLOW_API_KEY"
)
print("Model weights cached!")
Running Inference
from inference import get_model
# Uses cached weights for on-device inference
model = get_model(
model_id="rfdetr-base",
api_key="YOUR_ROBOFLOW_API_KEY"
)
results = model.infer("path/to/image.jpg")
Manual Model Weights Download
Sometimes you may need the raw weights file (e.g., a PyTorch .pt file) to run on devices Roboflow does not yet natively support, such as custom Android implementations.
Premium Feature: Manual weights download is only available for paid users on Core plans and certain Enterprise customers. Read more on the Pricing page.
Method A: Roboflow Platform
Navigate to the model version within your Project. If your plan allows it, click the "Download Weights" button to download the weights file, which you can then convert for use on embedded devices.
Method B: Python SDK
You can also use the Roboflow Python package to download the weights file directly into your working directory:
import roboflow
rf = roboflow.Roboflow(api_key="YOUR_API_KEY")
model = rf.workspace().project("PROJECT_ID").version("1").model
model.download() # Downloads 'weights.pt' to your local folder
Roboflow does not provide technical support for model weights used outside of the Roboflow Inference ecosystem. For the best experience, we recommend using the Inference path outlined above.
Best Practices
- Configure Persistent Cache First: Before downloading any weights, configure MODEL_CACHE_DIR to point to a persistent directory (not /tmp). This is essential for production deployments.
- Pre-download During Setup: Download all required model weights during your deployment setup phase to ensure they are cached and ready.
- Use Persistent Cache in Docker: Always mount a persistent volume when running in Docker containers.
- Verify Before Deployment: Always verify that models are properly cached and that the cache directory persists across reboots before deploying to production environments.
- Document Model IDs: Keep a list of all model IDs and versions your application requires for easier pre-caching and troubleshooting.
- Consider Storage: Model weights can be large (100MB - 1GB+ per model). Ensure sufficient disk space is available in your persistent cache directory.
- Test Reboot Behavior: After caching weights, test that they persist after a system reboot.
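To act on the storage and verification points above, a small helper can report how much disk the cache is currently using. This is a sketch; the default path mirrors the `/tmp/cache` fallback described earlier:

```python
import os

def cache_size_bytes(cache_dir):
    """Return the total size in bytes of all files under cache_dir (0 if missing)."""
    total = 0
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):
                total += os.path.getsize(path)
    return total

cache_dir = os.environ.get("MODEL_CACHE_DIR", "/tmp/cache")
print(f"{cache_dir}: {cache_size_bytes(cache_dir) / 1e6:.1f} MB")
```

Running this before and after a reboot is a quick way to confirm the cache actually persisted.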
Troubleshooting
Weights Disappear After Reboot
The default cache location (/tmp/cache) is cleared on reboot. Configure a persistent cache directory as described in the Cache Location section, or use a persistent volume mount for Docker.
Model Not Found Error
- Verify the model was actually downloaded (check the cache directory with ls -lh $MODEL_CACHE_DIR)
- Ensure you are using the exact same model_id as when downloading
- Check that MODEL_CACHE_DIR is set correctly if using a custom location
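A quick diagnostic along these lines can confirm what is actually on disk. This sketch only inspects the filesystem; the exact layout of cache entries may vary between Inference versions:

```python
import os

def list_cached_entries(cache_dir):
    """Return sorted cache entries, or None if the directory is missing."""
    if not os.path.isdir(cache_dir):
        return None
    return sorted(os.listdir(cache_dir))

cache_dir = os.environ.get("MODEL_CACHE_DIR", "/tmp/cache")
entries = list_cached_entries(cache_dir)
if entries is None:
    print(f"cache directory does not exist: {cache_dir}")
else:
    print(f"{len(entries)} cached entries in {cache_dir}")
```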
Permission Issues
Ensure the application has read/write permissions to the cache directory:
chmod -R 755 /path/to/cache
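Before reaching for chmod, you can check what the current process is actually allowed to do with the directory; a minimal sketch using os.access:

```python
import os

cache_dir = os.environ.get("MODEL_CACHE_DIR", "/tmp/cache")
can_read = os.access(cache_dir, os.R_OK)   # can the process read the cache?
can_write = os.access(cache_dir, os.W_OK)  # can it write new weights?
print(f"{cache_dir}: read={can_read}, write={can_write}")
```

If either check fails, fix ownership or permissions on the directory (or choose a different MODEL_CACHE_DIR) rather than running the application as root.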