inference_sdk API Reference

Top-level

Top-level SDK configuration: API URLs, timeouts, environment variable loading, and remote execution settings.

inference_sdk.config

Module could not be imported for documentation.

http

Core HTTP client for making inference requests. InferenceHTTPClient supports object detection, classification, segmentation, keypoint detection, OCR, CLIP embeddings, and workflow execution.
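A minimal usage sketch. The URL, API key, and model ID below are placeholders, and the exact set of accepted image input types (file paths, URLs, numpy arrays, PIL images) may vary by SDK version:

```python
from inference_sdk import InferenceHTTPClient

# Placeholder URL and credentials -- substitute your own.
client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",
    api_key="YOUR_API_KEY",
)

# Run object detection on a local image against a hosted model.
result = client.infer("path/to/image.jpg", model_id="your-project/1")
print(result["predictions"])
```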

inference_sdk.http.client

Module could not be imported for documentation.

inference_sdk.http.entities

Module could not be imported for documentation.

inference_sdk.http.errors

class APIKeyNotProvided

Error for API key not provided.

class EncodingError

Error for encoding errors.

class HTTPCallErrorError

Error for HTTP call errors.

Attributes: description: The description of the error. status_code: The status code of the error. api_message: The API message of the error.

__init__(self, description: str, status_code: int, api_message: str | None)

Initialize the error with a description, an HTTP status code, and an optional API message.

class HTTPClientError

Base class for HTTP client errors.

class InvalidInputFormatError

Error for invalid input format.

class InvalidModelIdentifier

Error for invalid model identifier.

class InvalidParameterError

Error for invalid parameter.

class ModelNotInitializedError

Error for model not initialized.

class ModelNotSelectedError

Error for model not selected.

class ModelTaskTypeNotSupportedError

Error for model task type not supported.

class RetryError

Error raised when an operation still fails after exhausting its retries.

__init__(self, description: str, status_code: int | None = None, inner_error: Exception | None = None)

Initialize the error with a description, an optional status code, and an optional inner exception.

class WrongClientModeError

Error for wrong client mode.

http/utils

Internal utilities for request building, image encoding/decoding, response post-processing, retries, and API key handling.

inference_sdk.http.utils.aliases

resolve_ocr_path(model_name: str) -> str

Resolve an OCR model name to its corresponding endpoint path.

Args: model_name: The name of the OCR model.

Returns: The endpoint path for the OCR model.

resolve_roboflow_model_alias(model_id: str) -> str

Resolve a Roboflow model alias to a registered model ID.

Args: model_id: The model alias to resolve.

Returns: The registered model ID.
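The alias lookup can be sketched as a plain dictionary get that falls back to the input when no alias is registered. The alias table below is a hypothetical example; the real registry lives inside the SDK:

```python
# Hypothetical alias table -- the SDK ships its own registry.
REGISTERED_ALIASES = {
    "yolov8n-640": "coco/3",
}

def resolve_roboflow_model_alias(model_id: str) -> str:
    # Fall back to the original ID when no alias is registered.
    return REGISTERED_ALIASES.get(model_id, model_id)

print(resolve_roboflow_model_alias("my-project/1"))  # passes through unchanged
```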

inference_sdk.http.utils.encoding

bytes_to_opencv_image(payload: bytes, array_type: type = numpy.uint8) -> numpy.ndarray

Decode a bytes object to an OpenCV image.

Args: payload: The bytes object to decode. array_type: The type of the array.

Returns: The OpenCV image.

bytes_to_pillow_image(payload: bytes) -> PIL.Image.Image

Decode a bytes object to a PIL image.

Args: payload: The bytes object to decode.

Returns: The PIL image.

encode_base_64(payload: bytes) -> str

Encode a bytes object to a base64 string.

Args: payload: The bytes object to encode.

Returns: The base64 string.
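A sketch of the helper using the standard library's base64 module:

```python
import base64

def encode_base_64(payload: bytes) -> str:
    # Standard base64 encoding, returned as ASCII text.
    return base64.b64encode(payload).decode("ascii")

print(encode_base_64(b"hello"))  # aGVsbG8=
```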

numpy_array_to_base64_jpeg(image: numpy.ndarray) -> str

Encode a numpy array to a base64 JPEG string.

Args: image: The numpy array to encode.

Returns: The base64 JPEG string.

pillow_image_to_base64_jpeg(image: PIL.Image.Image) -> str

Encode a PIL image to a base64 JPEG string.

Args: image: The PIL image to encode.

Returns: The base64 JPEG string.
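A sketch of the Pillow variant, assuming the image is converted to RGB before JPEG encoding (JPEG has no alpha channel) and that the result is a plain base64 string with no data-URI prefix:

```python
import base64
from io import BytesIO

from PIL import Image

def pillow_image_to_base64_jpeg(image: Image.Image) -> str:
    # Serialize to JPEG in memory, then base64-encode the bytes.
    buffer = BytesIO()
    image.convert("RGB").save(buffer, format="JPEG")
    return base64.b64encode(buffer.getvalue()).decode("ascii")

img = Image.new("RGB", (4, 4), color=(255, 0, 0))
encoded = pillow_image_to_base64_jpeg(img)
```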

inference_sdk.http.utils.executors

Module could not be imported for documentation.

inference_sdk.http.utils.iterables

make_batches(iterable: Iterable[T], batch_size: int) -> Generator[List[T], None, None]

Make batches from an iterable.

Args: iterable: The iterable to make batches from. batch_size: The size of the batches.

Returns: The batches.
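A self-contained sketch of the batching generator; clamping the batch size to at least 1 is an assumption about how degenerate inputs are handled:

```python
from itertools import islice
from typing import Generator, Iterable, List, TypeVar

T = TypeVar("T")

def make_batches(
    iterable: Iterable[T], batch_size: int
) -> Generator[List[T], None, None]:
    # Yield successive lists of up to batch_size elements.
    batch_size = max(batch_size, 1)
    iterator = iter(iterable)
    while chunk := list(islice(iterator, batch_size)):
        yield chunk

print(list(make_batches(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```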

remove_empty_values(dictionary: dict) -> dict

Remove empty values from a dictionary.

Args: dictionary: The dictionary to remove empty values from.

Returns: The dictionary with empty values removed.
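A sketch, assuming "empty" means None (so falsy-but-meaningful values such as 0 or an empty string are kept):

```python
def remove_empty_values(dictionary: dict) -> dict:
    # Drop only entries whose value is None.
    return {key: value for key, value in dictionary.items() if value is not None}

print(remove_empty_values({"confidence": 0.5, "iou_threshold": None}))
```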

unwrap_single_element_list(sequence: List[T]) -> T | List[T]

Unwrap a single-element list.

Args: sequence: The list to unwrap.

Returns: The sole element if the list has exactly one item; otherwise the original list.
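A sketch of the unwrapping helper:

```python
from typing import List, TypeVar, Union

T = TypeVar("T")

def unwrap_single_element_list(sequence: List[T]) -> Union[T, List[T]]:
    # Collapse [x] to x; leave every other length untouched.
    if len(sequence) == 1:
        return sequence[0]
    return sequence

print(unwrap_single_element_list([42]))  # 42
```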

inference_sdk.http.utils.loaders

Module could not be imported for documentation.

inference_sdk.http.utils.post_processing

Module could not be imported for documentation.

inference_sdk.http.utils.pre_processing

determine_scaling_aspect_ratio(image_height: int, image_width: int, max_height: int, max_width: int) -> float | None

Determine the scaling aspect ratio.

Args: image_height: The height of the image. image_width: The width of the image. max_height: The maximum height of the image. max_width: The maximum width of the image.

Returns: The scaling factor needed to fit the image within the given bounds, or None if no scaling is required.
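One plausible implementation, assuming downscale-only behavior where None signals that the image already fits:

```python
from typing import Optional

def determine_scaling_aspect_ratio(
    image_height: int, image_width: int, max_height: int, max_width: int
) -> Optional[float]:
    # No scaling needed when both dimensions fit within the bounds.
    if image_height <= max_height and image_width <= max_width:
        return None
    # Otherwise scale uniformly so that both dimensions fit.
    return min(max_height / image_height, max_width / image_width)

print(determine_scaling_aspect_ratio(1080, 1920, 540, 960))  # 0.5
```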

resize_opencv_image(image: numpy.ndarray, max_height: int | None, max_width: int | None) -> Tuple[numpy.ndarray, float | None]

Resize an OpenCV image.

Args: image: The image to resize. max_height: The maximum height of the image. max_width: The maximum width of the image.

Returns: The resized image and the scaling factor.

resize_pillow_image(image: PIL.Image.Image, max_height: int | None, max_width: int | None) -> Tuple[PIL.Image.Image, float | None]

Resize a Pillow image.

Args: image: The image to resize. max_height: The maximum height of the image. max_width: The maximum width of the image.

Returns: The resized image and the scaling factor.

inference_sdk.http.utils.profilling

save_workflows_profiler_trace(directory: str, profiler_trace: List[dict]) -> None

Save a workflow profiler trace.

Args: directory: The directory to save the profiler trace. profiler_trace: The profiler trace.

inference_sdk.http.utils.request_building

class ImagePlacement

Enumeration specifying where encoded images are placed in an assembled request (for example, in the request data versus inside the JSON payload).

class RequestData

Data class for request data.

Attributes: url: The URL of the request. request_elements: The number of request elements. headers: The headers of the request. parameters: The parameters of the request. data: The data of the request. payload: The payload of the request. image_scaling_factors: The scaling factors of the images.

__init__(self, url: str, request_elements: int, headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, data: str | bytes | None, payload: Dict[str, Any] | None, image_scaling_factors: List[float | None]) -> None

Initialize the request data with its URL, element count, headers, parameters, data, payload, and image scaling factors.

assembly_request_data(url: str, batch_inference_inputs: List[Tuple[str, float | None]], headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, payload: Dict[str, Any] | None, image_placement: inference_sdk.http.utils.request_building.ImagePlacement) -> inference_sdk.http.utils.request_building.RequestData

Assemble request data.

Args: url: The URL of the request. batch_inference_inputs: The batch inference inputs. headers: The headers of the request. parameters: The parameters of the request. payload: The payload of the request. image_placement: The image placement.

Returns: The request data.

prepare_requests_data(url: str, encoded_inference_inputs: List[Tuple[str, float | None]], headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, payload: Dict[str, Any] | None, max_batch_size: int, image_placement: inference_sdk.http.utils.request_building.ImagePlacement) -> List[inference_sdk.http.utils.request_building.RequestData]

Prepare requests data.

Args: url: The URL of the request. encoded_inference_inputs: The encoded inference inputs. headers: The headers of the request. parameters: The parameters of the request. payload: The payload of the request. max_batch_size: The maximum batch size. image_placement: The image placement.

Returns: The list of request data.
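The two helpers compose: inputs are split into batches of at most max_batch_size, and one request-data record is assembled per batch. A simplified sketch, with RequestData reduced to a dict and only payload-based image placement modeled:

```python
from typing import Any, Dict, List, Optional, Tuple

def make_batches(items: list, batch_size: int) -> List[list]:
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]

def prepare_requests_data(
    url: str,
    encoded_inference_inputs: List[Tuple[str, Optional[float]]],
    payload: Optional[Dict[str, Any]],
    max_batch_size: int,
) -> List[Dict[str, Any]]:
    requests_data = []
    for batch in make_batches(encoded_inference_inputs, max_batch_size):
        batch_payload = dict(payload or {})
        # Place encoded images in the payload; record their scaling factors.
        batch_payload["image"] = [image for image, _ in batch]
        requests_data.append(
            {
                "url": url,
                "request_elements": len(batch),
                "payload": batch_payload,
                "image_scaling_factors": [factor for _, factor in batch],
            }
        )
    return requests_data

inputs = [("img0", None), ("img1", 0.5), ("img2", None)]
batches = prepare_requests_data(
    "http://localhost:9001/infer", inputs, {"confidence": 0.4}, 2
)
print(len(batches))  # 2
```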

inference_sdk.http.utils.requests

api_key_safe_raise_for_status(response: requests.models.Response) -> None

Raise an exception if the response indicates a failure, redacting any API key from the error details.

Args: response: The response of the request.

deduct_api_key(match: re.Match) -> str

Redact the API key captured by a regex match.

Args: match: The regex match containing the API key.

Returns: The matched text with the API key masked.

deduct_api_key_from_string(value: str) -> str

Redact any API key that appears in the string.

Args: value: The string to redact the API key from.

Returns: The string with the API key masked.
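Despite the "deduct" naming, these helpers mask the key rather than remove it. A sketch under the assumption that keys appear as api_key query parameters; the regex and masking scheme below are illustrative, not the SDK's exact ones:

```python
import re

# Illustrative pattern: capture the value following "api_key=".
API_KEY_PATTERN = re.compile(r"(api_key=)([^\s&]+)")

def deduct_api_key(match: re.Match) -> str:
    # Keep at most the first two characters; mask the rest.
    key = match.group(2)
    masked = key[:2] + "***" if len(key) > 2 else "***"
    return match.group(1) + masked

def deduct_api_key_from_string(value: str) -> str:
    return API_KEY_PATTERN.sub(deduct_api_key, value)

print(deduct_api_key_from_string("https://x.com/infer?api_key=abcd1234&confidence=0.5"))
```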

inject_images_into_payload(payload: dict, encoded_images: List[Tuple[str, float | None]], key: str = 'image') -> dict

Inject images into the payload.

Args: payload: The payload to inject the images into. encoded_images: The encoded images. key: The key of the images.

Returns: The payload with the images injected.
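A sketch of payload injection, assuming each base64 string is wrapped in the {"type": "base64", "value": ...} structure the inference server expects and that a single image is sent unwrapped rather than as a one-element list:

```python
from typing import List, Optional, Tuple

def inject_images_into_payload(
    payload: dict,
    encoded_images: List[Tuple[str, Optional[float]]],
    key: str = "image",
) -> dict:
    # Wrap each base64 string; scaling factors are carried separately
    # and are ignored here.
    images = [{"type": "base64", "value": image} for image, _ in encoded_images]
    payload[key] = images[0] if len(images) == 1 else images
    return payload

payload = inject_images_into_payload({"confidence": 0.4}, [("aGVsbG8=", None)])
print(payload["image"])  # {'type': 'base64', 'value': 'aGVsbG8='}
```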

inject_nested_batches_of_images_into_payload(payload: dict, encoded_images: list | Tuple[str, float | None], key: str = 'image') -> dict

Inject nested batches of images into the payload.

Args: payload: The payload to inject the images into. encoded_images: The encoded images. key: The key of the images.

Returns: The payload with the images injected.

utils

General-purpose helpers: lifecycle decorators, environment variable parsing, and SDK logging.

inference_sdk.utils.decorators

deprecated(reason: str)

Create a decorator that marks functions as deprecated.

This decorator will emit a warning when the decorated function is called, indicating that the function is deprecated and providing a reason.

Args: reason (str): The reason why the function is deprecated.

Returns: callable: A decorator function that can be applied to mark functions as deprecated.
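A sketch of the decorator factory, assuming the standard functools/warnings pattern; the exact warning message format is illustrative:

```python
import functools
import warnings

def deprecated(reason: str):
    # Returns a decorator that warns on each call to the wrapped function.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                category=DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator

@deprecated(reason="use new_api() instead")
def old_api() -> int:
    return 1
```

The experimental decorator follows the same shape with a different warning category and message.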

experimental(info: str)

Create a decorator that marks functions as experimental.

This decorator will emit a warning when the decorated function is called, indicating that the function is experimental and providing additional information.

Args: info (str): Information about the experimental status of the function.

Returns: callable: A decorator function that can be applied to mark functions as experimental.

inference_sdk.utils.environment

str2bool(value: str | bool) -> bool

Convert a string or boolean value to a boolean.

Args: value (Union[str, bool]): The value to convert. Can be either a string ('true'/'false') or a boolean value.

Returns: bool: The boolean value. Returns True for 'true' (case-insensitive) or True input, False for 'false' (case-insensitive) or False input.

Raises: ValueError: If the input string is not 'true' or 'false' (case-insensitive).
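A sketch matching the documented behavior (case-insensitive 'true'/'false', booleans passed through, anything else rejected):

```python
from typing import Union

def str2bool(value: Union[str, bool]) -> bool:
    # Booleans pass through unchanged.
    if isinstance(value, bool):
        return value
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f"Expected 'true' or 'false', got: {value!r}")

print(str2bool("True"), str2bool(False))  # True False
```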

inference_sdk.utils.logging

Centralized logging configuration for the Inference SDK.

get_logger(module_name: str) -> logging.Logger

Get a logger for the specified module.

Automatically configures basic logging on first use if no handlers exist.

Args: module_name: Name of the module requesting the logger.

Returns: logging.Logger: Configured logger for the module.
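A sketch of the lazy-configuration behavior described above; the check against root handlers is an assumption about how "no handlers exist" is detected:

```python
import logging

def get_logger(module_name: str) -> logging.Logger:
    # Configure root logging once, only if nothing else has installed handlers.
    if not logging.getLogger().handlers:
        logging.basicConfig(level=logging.INFO)
    return logging.getLogger(module_name)

logger = get_logger("inference_sdk.demo")
logger.info("logger ready")
```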

webrtc

WebRTC streaming client for real-time video inference over peer connections.

inference_sdk.webrtc.client

Module could not be imported for documentation.

inference_sdk.webrtc.config

Module could not be imported for documentation.

inference_sdk.webrtc.datachannel

Module could not be imported for documentation.

inference_sdk.webrtc.session

Module could not be imported for documentation.

inference_sdk.webrtc.sources

Module could not be imported for documentation.