Inference Features
Inference aspires to be a one-stop shop for all of your computer vision needs. Where the feature you seek is not in stock, it's straightforward to extend the functionality yourself. A non-exhaustive list of features is captured here.
Model Serving
At its core, Inference serves computer vision models. It implements architectures for tasks like Object Detection, Image Classification, Instance Segmentation, Keypoint Detection, Image Embedding, OCR, Visual Question Answering, Gaze Detection, and more.
Image Processing
A model is only as good as the image it receives. As the old adage says, "Garbage In, Garbage Out", which is why Inference includes and applies the same image pre- and post-processing methods that models use during training. It also applies these steps efficiently to avoid unnecessary latency.
Video Stream Management
Doing video inference properly requires attention to the image ingestion pipeline. Inference spawns separate threads to process video streams so that your model is never left hanging and always gets the most recent frame possible.
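The "most recent frame" idea can be illustrated with a small latest-frame buffer. This is a minimal sketch using only the Python standard library, not Inference's actual implementation; `LatestFrameBuffer` and `fake_camera` are hypothetical names, and strings stand in for decoded frames.

```python
import threading


class LatestFrameBuffer:
    """Holds only the newest frame; unread older frames are overwritten."""

    def __init__(self):
        self._cond = threading.Condition()
        self._frame = None
        self._seq = 0          # number of frames published so far
        self._closed = False

    def put(self, frame):
        # Called by the capture thread: replace the frame, never queue it.
        with self._cond:
            self._frame = frame
            self._seq += 1
            self._cond.notify_all()

    def close(self):
        # Signal that the stream has ended so waiting readers wake up.
        with self._cond:
            self._closed = True
            self._cond.notify_all()

    def get(self, last_seq=0, timeout=None):
        """Wait for a frame newer than last_seq.

        Returns (seq, frame), or None on timeout or when the stream
        closed without publishing anything newer.
        """
        with self._cond:
            self._cond.wait_for(
                lambda: self._seq > last_seq or self._closed, timeout
            )
            if self._seq <= last_seq:
                return None
            return self._seq, self._frame


def fake_camera(buffer, n_frames):
    # Hypothetical stand-in for a decoder thread reading a video stream.
    for i in range(n_frames):
        buffer.put(f"frame-{i}")
    buffer.close()
```

A consumer that calls `get(last_seq=...)` in a loop will skip any frames that arrived while the model was busy, which trades completeness for freshness.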
Workflows
If models are the brains and cameras are the eyes, Workflows are the nervous system, and Workflow Blocks are the body's other organs and tools. By declaring a computation graph, Workflows efficiently pipe and parallelize data through models, logic, integrations, and custom code. Over 100 Workflow Blocks are already included, with new ones contributed regularly.
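To make the computation-graph idea concrete, here is a minimal sketch of running a declared graph of steps in dependency order, with independent steps executed in parallel. It uses only the Python standard library and does not use the Workflows definition format or execution engine; `run_graph`, `STEPS`, and the step functions are hypothetical names chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter


def run_graph(steps, inputs):
    """Run steps in dependency order.

    steps:  name -> (function, [names of inputs or other steps])
    inputs: name -> value supplied by the caller
    Returns a dict with the inputs plus every step's result.
    """
    results = dict(inputs)
    # Only dependencies on other steps constrain the ordering.
    graph = {
        name: [d for d in deps if d not in inputs]
        for name, (_, deps) in steps.items()
    }
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    with ThreadPoolExecutor() as pool:
        while sorter.is_active():
            # Every step returned here has all dependencies satisfied,
            # so the whole group can run concurrently.
            ready = sorter.get_ready()
            futures = {
                name: pool.submit(
                    steps[name][0], *[results[d] for d in steps[name][1]]
                )
                for name in ready
            }
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
    return results


def detect(image):
    # Stand-in for a model: the "image" already carries its detections.
    return image["boxes"]


def count(detections):
    return len(detections)


def labels(detections):
    return sorted({d["label"] for d in detections})


def summarize(n, names):
    return f"{n} objects: {', '.join(names)}"


STEPS = {
    "detect": (detect, ["image"]),
    "count": (count, ["detect"]),
    "labels": (labels, ["detect"]),
    "summary": (summarize, ["count", "labels"]),
}
```

In this example `count` and `labels` depend only on `detect`, so they are submitted together, while `summary` waits for both.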
Server
Inference exposes an HTTP server for interfacing with its functionality. This lets you use it as a micro-service in a larger system and programmatically command its operations.
SDK
The included Python Client makes it easy to interact with the server from your applications.
CLI
The Command Line Interface provides convenience methods for starting the server, running benchmarks, cloud deployment, and testing.
Speed
Built-in optimizations like automatic parallelization via multiprocessing, hardware acceleration, and dynamic batching help you get the most out of your hardware. With the optional TensorRT flag you can also take advantage of quantization and device-specific layer fusion optimizations on supported GPUs. This lets you run more streams and process them at higher resolution and lower latency.
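Dynamic batching, in general terms, means grouping requests that arrive close together so the model can process them in one pass. The sketch below shows one common form of the technique, a size limit combined with a wait deadline. It is an illustration under those assumptions and not Inference's own batching code; `collect_batch` and its parameters are hypothetical.

```python
import queue
import time


def collect_batch(requests, max_batch_size, max_wait_s):
    """Block for the first request, then gather more until the batch
    is full or max_wait_s has elapsed since the first one arrived."""
    batch = [requests.get()]
    deadline = time.monotonic() + max_wait_s
    while len(batch) < max_batch_size:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        try:
            batch.append(requests.get(timeout=remaining))
        except queue.Empty:
            # Deadline passed with no new request: ship what we have.
            break
    return batch
```

A larger `max_wait_s` tends to produce fuller batches and better throughput at the cost of added latency for the first request in each batch.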
Offline Cache
By pulling down models and Workflow definitions and storing them locally, the Inference Server can operate in offline mode.
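The general pattern behind an offline cache is to read from local storage first and reach for the network only on a miss. The following is a minimal sketch of that pattern, not the Inference Server's cache layout or API; `load_definition`, the JSON file naming, and the `fetch` callback are all assumptions made for illustration.

```python
import json
from pathlib import Path


def load_definition(name, cache_dir, fetch=None):
    """Return the cached definition if present; otherwise fetch it
    once, store it locally, and return it.

    fetch is a callable taking the name and returning a
    JSON-serializable object. Pass None to simulate offline mode.
    """
    path = Path(cache_dir) / f"{name}.json"
    if path.exists():
        return json.loads(path.read_text())
    if fetch is None:
        raise FileNotFoundError(
            f"{name} is not cached and no fetch function was provided"
        )
    definition = fetch(name)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(definition))
    return definition
```

Once a definition has been loaded with a `fetch` callback, later calls succeed with `fetch=None`, which mirrors running without network access.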
Insights
Inference can connect with Roboflow's platform to help you monitor and improve your models by uploading outlier data, accessing stats and telemetry, and tying into downstream data sinks and business intelligence suites.
Roboflow Integration
With over 100k fine-tuned models available, Roboflow Universe is the largest repository of computer vision models in the world. Roboflow also provides the most popular platform for training custom computer vision models. Inference's optional integration with the Roboflow platform supercharges CV applications by providing access to the best and most relevant models for solving computer vision problems.
Portability
We support running Inference on a myriad of devices, from macOS development machines to beefy cloud servers to tiny edge devices and AI appliances. Simply swap out the Docker tag and the same code you're running locally can be deployed to another platform.
Extensibility
As an open source project, Inference can be forked and extended to add new models, Workflow Blocks, backends, and more. We also support using Dynamic Python Blocks to seamlessly bridge the gaps between blocks.
Cutting Edge Updates
With over 60 releases last year, Inference is constantly being updated. It often adopts new state-of-the-art models within days of their release, and new functionality is added weekly.
Stability and Scalability
Inference has a suite of over 350 tests that run on every Pull Request across each supported target. It's used in production by many of the world's largest companies and is backed by the creator of the industry-standard computer vision platform. It has powered billions of inferences across hundreds of thousands of models. It also powers Roboflow's core model serving infrastructure and products.
Security
We undergo an annual pen test and SOC 2 Type II audit, and we have an infrastructure and security team working to mitigate risks and respond to vulnerabilities and issues flagged by automated scans and external parties.
Support
Roboflow has a fully staffed customer support and professional services organization available for enterprise customers.