Download Model Weights

Automatic Download via Roboflow Inference (Recommended)

Roboflow Inference is our open-source, scalable system for running models locally on CPU and GPU devices.

This is the fastest and most reliable way to get started. When you use Inference, you do not need to manage files or versioning; Roboflow Inference automatically fetches and caches your model weights the first time you run your code.

  • How it works: On your first inference request, the weights are downloaded from Roboflow's servers and stored locally. All future predictions use this local cache, and images are not sent to the cloud.
Warning

By default, model weights are stored in /tmp/cache, which is cleared on system reboot. For production deployments or scenarios where you need weights to persist across reboots, you must configure a persistent cache directory using the MODEL_CACHE_DIR environment variable (see Cache Location below).

Note

For enterprise deployments requiring completely disconnected operation, see our Enterprise Offline Mode documentation. This guide covers downloading and caching model weights on a device that stays connected for usage tracking, billing, and workflow updates.

Cache Location

By default, model weights are cached in /tmp/cache. This directory is cleared on system reboot, which means you will need to re-download model weights after each restart.

For production deployments or any scenario where you need weights to persist across reboots, you must configure a persistent cache directory using the MODEL_CACHE_DIR environment variable:

import os
# Set to a persistent directory (not /tmp)
os.environ["MODEL_CACHE_DIR"] = "/home/user/.roboflow/cache"

from inference import get_model
# ... rest of your code
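The two steps above (create a persistent directory, export the variable before importing `inference`) can be wrapped in a small stdlib-only helper. The helper name and the example path are illustrative, not part of the Inference API:

```python
import os
from pathlib import Path

def configure_model_cache(path: str) -> str:
    """Create the cache directory if needed and point MODEL_CACHE_DIR at it.

    Call this before importing from `inference`, so the cache location is
    already in place when model weights are first fetched.
    """
    cache_dir = Path(path).expanduser()
    cache_dir.mkdir(parents=True, exist_ok=True)
    os.environ["MODEL_CACHE_DIR"] = str(cache_dir)
    return str(cache_dir)

# Example: a persistent, per-user location (not /tmp)
configure_model_cache("~/.roboflow/cache")
```

Keeping directory creation and the environment variable in one function avoids the common mistake of pointing MODEL_CACHE_DIR at a directory that does not exist yet.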

Alternatively, set it system-wide:

export MODEL_CACHE_DIR="/home/user/.roboflow/cache"

Make sure the directory exists and has appropriate permissions:

mkdir -p /home/user/.roboflow/cache
chmod 755 /home/user/.roboflow/cache
Note

When running Inference in Docker, mount a persistent cache volume to preserve weights across container restarts. See Docker Configuration Options for details.
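For example, with the CPU server image (the image name and port shown are the commonly documented defaults; treat the host path as an example for your setup):

```shell
# Mount a persistent host directory over the container's default cache
# location, so cached weights survive container restarts
docker run -d \
  -p 9001:9001 \
  -v "$HOME/.roboflow/cache:/tmp/cache" \
  roboflow/roboflow-inference-server-cpu
```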

Native Python API

The native Python API automatically downloads and caches weights when you load a model with get_model().

Pre-downloading Weights

from inference import get_model

# Load model (downloads and caches weights)
model = get_model(
    model_id="rfdetr-base",
    api_key="YOUR_ROBOFLOW_API_KEY"
)
print("Model weights cached!")

Running Inference

from inference import get_model

# Uses cached weights for on-device inference
model = get_model(
    model_id="rfdetr-base",
    api_key="YOUR_ROBOFLOW_API_KEY"
)

results = model.infer("path/to/image.jpg")

Manual Model Weights Download

Sometimes you may need the raw weights file (e.g., a PyTorch .pt file) to run on devices Roboflow does not yet natively support, such as custom Android implementations.

Warning

Premium Feature: Manual weights download is only available for paid users on Core plans and certain Enterprise customers. Read more on the Pricing page.

Method A: Roboflow Platform

Navigate to the model version within your Project. If your plan includes weights download, click the "Download Weights" button to retrieve the weights file, which you can then convert for use on embedded devices.

Method B: Python SDK

You can also use the Roboflow Python package to download weights directly to your directory:

import roboflow

rf = roboflow.Roboflow(api_key="YOUR_API_KEY")
model = rf.workspace().project("PROJECT_ID").version("1").model
model.download()  # Downloads 'weights.pt' to your local folder
Note

Roboflow does not provide technical support for model weights used outside of the Roboflow Inference ecosystem. For the best experience, we recommend using the Inference path outlined above.

Best Practices

  1. Configure Persistent Cache First: Before downloading any weights, configure MODEL_CACHE_DIR to point to a persistent directory (not /tmp). This is essential for production deployments.
  2. Pre-download During Setup: Download all required model weights during your deployment setup phase to ensure they are cached and ready.
  3. Use Persistent Cache in Docker: Always mount a persistent volume when running in Docker containers.
  4. Verify Before Deployment: Always verify that models are properly cached and that the cache directory persists across reboots before deploying to production environments.
  5. Document Model IDs: Keep a list of all model IDs and versions your application requires for easier pre-caching and troubleshooting.
  6. Consider Storage: Model weights can be large (100MB - 1GB+ per model). Ensure sufficient disk space is available in your persistent cache directory.
  7. Test Reboot Behavior: After caching weights, test that they persist after a system reboot.
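Practices 2 and 5 can be combined into a small pre-download script run during deployment setup. This is a sketch: the `REQUIRED_MODELS` list and the `loader` hook are illustrative, and by default the function falls back to `inference.get_model` as shown earlier in this guide.

```python
from typing import Callable, Iterable, List, Optional

# Hypothetical list of the model IDs your application requires
REQUIRED_MODELS = ["rfdetr-base"]

def warm_cache(
    model_ids: Iterable[str],
    api_key: str,
    loader: Optional[Callable[[str], object]] = None,
) -> List[str]:
    """Load each model once so its weights are downloaded and cached."""
    if loader is None:
        # Imported lazily so MODEL_CACHE_DIR can be set beforehand
        from inference import get_model
        loader = lambda mid: get_model(model_id=mid, api_key=api_key)
    cached = []
    for model_id in model_ids:
        loader(model_id)  # first call downloads; later calls hit the cache
        cached.append(model_id)
    return cached
```

Run this once during setup, then verify the cache directory is populated before routing production traffic to the deployment.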

Troubleshooting

Weights Disappear After Reboot

The default cache location (/tmp/cache) is cleared on reboot. Configure a persistent cache directory as described in the Cache Location section, or use a persistent volume mount for Docker.
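A quick way to check the fix, sketched for a bash shell (the path is an example):

```shell
# Use a persistent location instead of the default /tmp/cache
export MODEL_CACHE_DIR="$HOME/.roboflow/cache"
mkdir -p "$MODEL_CACHE_DIR"

# After running inference once, the weights should appear here,
# and the same listing should survive a reboot
ls -lh "$MODEL_CACHE_DIR"
```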

Model Not Found Error

  • Verify the model was actually downloaded (check cache directory with ls -lh $MODEL_CACHE_DIR)
  • Ensure you are using the exact same model_id as when downloading
  • Check that MODEL_CACHE_DIR is set correctly if using a custom location

Permission Issues

Ensure the application has read/write permissions to the cache directory:

chmod -R 755 /path/to/cache