inference_sdk API Reference

Top-level

Top-level SDK configuration: API URLs, timeouts, environment variable loading, and remote execution settings.

inference_sdk.config

Module could not be imported for documentation.

http

Core HTTP client for making inference requests. InferenceHTTPClient supports object detection, classification, segmentation, keypoint detection, OCR, CLIP embeddings, and workflow execution.
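A minimal usage sketch. The URL, API key, and model ID below are placeholders, and the exact set of accepted image input types (file paths, URLs, numpy arrays, PIL images) may vary by SDK version:

```python
from inference_sdk import InferenceHTTPClient

# Placeholder URL and credentials -- substitute your own.
client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",
    api_key="YOUR_API_KEY",
)

# Run object detection on a local image against a hosted model.
result = client.infer("path/to/image.jpg", model_id="your-project/1")
print(result["predictions"])
```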

inference_sdk.http.client

Module could not be imported for documentation.

inference_sdk.http.entities

Module could not be imported for documentation.

inference_sdk.http.errors

class APIKeyNotProvided

Error for API key not provided.

class EncodingError

Error for encoding errors.

class HTTPCallErrorError

Error for HTTP call errors.

Attributes: description: The description of the error. status_code: The status code of the error. api_message: The API message of the error.

__init__(self, description: str, status_code: int, api_message: str | None)

Initialize the error with a description, an HTTP status code, and an optional API message.

class HTTPClientError

Base class for HTTP client errors.

class InvalidInputFormatError

Error for invalid input format.

class InvalidModelIdentifier

Error for invalid model identifier.

class InvalidParameterError

Error for invalid parameter.

class ModelNotInitializedError

Error for model not initialized.

class ModelNotSelectedError

Error for model not selected.

class ModelTaskTypeNotSupportedError

Error for model task type not supported.

class RetryError

Error raised when an operation still fails after exhausting its retries.

__init__(self, description: str, status_code: int | None = None, inner_error: Exception | None = None)

Initialize the error with a description, an optional status code, and an optional inner exception.

class WrongClientModeError

Error for wrong client mode.

http/utils

Internal utilities for request building, image encoding/decoding, response post-processing, retries, and API key handling.

inference_sdk.http.utils.aliases

resolve_ocr_path(model_name: str) -> str

Resolve an OCR model name to its corresponding endpoint path.

Args: model_name: The name of the OCR model.

Returns: The endpoint path for the OCR model.

resolve_roboflow_model_alias(model_id: str) -> str

Resolve a Roboflow model alias to a registered model ID.

Args: model_id: The model alias to resolve.

Returns: The registered model ID.
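The alias lookup can be sketched as a plain dictionary get that falls back to the input when no alias is registered. The alias table below is a hypothetical example; the real registry lives inside the SDK:

```python
# Hypothetical alias table -- the SDK ships its own registry.
REGISTERED_ALIASES = {
    "yolov8n-640": "coco/3",
}

def resolve_roboflow_model_alias(model_id: str) -> str:
    # Fall back to the original ID when no alias is registered.
    return REGISTERED_ALIASES.get(model_id, model_id)

print(resolve_roboflow_model_alias("my-project/1"))  # passes through unchanged
```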

inference_sdk.http.utils.encoding

bytes_to_opencv_image(payload: bytes, array_type: type = numpy.uint8) -> numpy.ndarray

Decode a bytes object to an OpenCV image.

Args: payload: The bytes object to decode. array_type: The type of the array.

Returns: The OpenCV image.

bytes_to_pillow_image(payload: bytes) -> PIL.Image.Image

Decode a bytes object to a PIL image.

Args: payload: The bytes object to decode.

Returns: The PIL image.

encode_base_64(payload: bytes) -> str

Encode a bytes object to a base64 string.

Args: payload: The bytes object to encode.

Returns: The base64 string.
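A sketch of the helper using the standard library's base64 module:

```python
import base64

def encode_base_64(payload: bytes) -> str:
    # Standard base64 encoding, returned as ASCII text.
    return base64.b64encode(payload).decode("ascii")

print(encode_base_64(b"hello"))  # aGVsbG8=
```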

numpy_array_to_base64_jpeg(image: numpy.ndarray) -> str

Encode a numpy array to a base64 JPEG string.

Args: image: The numpy array to encode.

Returns: The base64 JPEG string.

pillow_image_to_base64_jpeg(image: PIL.Image.Image) -> str

Encode a PIL image to a base64 JPEG string.

Args: image: The PIL image to encode.

Returns: The base64 JPEG string.
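A sketch of the Pillow variant, assuming the image is converted to RGB before JPEG encoding (JPEG has no alpha channel) and that the result is a plain base64 string with no data-URI prefix:

```python
import base64
from io import BytesIO

from PIL import Image

def pillow_image_to_base64_jpeg(image: Image.Image) -> str:
    # Serialize to JPEG in memory, then base64-encode the bytes.
    buffer = BytesIO()
    image.convert("RGB").save(buffer, format="JPEG")
    return base64.b64encode(buffer.getvalue()).decode("ascii")

img = Image.new("RGB", (4, 4), color=(255, 0, 0))
encoded = pillow_image_to_base64_jpeg(img)
```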

inference_sdk.http.utils.executors

Module could not be imported for documentation.

inference_sdk.http.utils.iterables

make_batches(iterable: Iterable[T], batch_size: int) -> Generator[List[T], None, None]

Make batches from an iterable.

Args: iterable: The iterable to make batches from. batch_size: The size of the batches.

Returns: The batches.
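A self-contained sketch of the batching generator; clamping the batch size to at least 1 is an assumption about how degenerate inputs are handled:

```python
from itertools import islice
from typing import Generator, Iterable, List, TypeVar

T = TypeVar("T")

def make_batches(
    iterable: Iterable[T], batch_size: int
) -> Generator[List[T], None, None]:
    # Yield successive lists of up to batch_size elements.
    batch_size = max(batch_size, 1)
    iterator = iter(iterable)
    while chunk := list(islice(iterator, batch_size)):
        yield chunk

print(list(make_batches(range(7), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```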

remove_empty_values(dictionary: dict) -> dict

Remove empty values from a dictionary.

Args: dictionary: The dictionary to remove empty values from.

Returns: The dictionary with empty values removed.
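A sketch, assuming "empty" means None (so falsy-but-meaningful values such as 0 or an empty string are kept):

```python
def remove_empty_values(dictionary: dict) -> dict:
    # Drop only entries whose value is None.
    return {key: value for key, value in dictionary.items() if value is not None}

print(remove_empty_values({"confidence": 0.5, "iou_threshold": None}))
```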

unwrap_single_element_list(sequence: List[T]) -> T | List[T]

Unwrap a single-element list.

Args: sequence: The list to unwrap.

Returns: The sole element if the list has exactly one item; otherwise the original list.
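A sketch of the unwrapping helper:

```python
from typing import List, TypeVar, Union

T = TypeVar("T")

def unwrap_single_element_list(sequence: List[T]) -> Union[T, List[T]]:
    # Collapse [x] to x; leave every other length untouched.
    if len(sequence) == 1:
        return sequence[0]
    return sequence

print(unwrap_single_element_list([42]))  # 42
```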

inference_sdk.http.utils.loaders

Module could not be imported for documentation.

inference_sdk.http.utils.post_processing

Module could not be imported for documentation.

inference_sdk.http.utils.pre_processing

determine_scaling_aspect_ratio(image_height: int, image_width: int, max_height: int, max_width: int) -> float | None

Determine the scaling aspect ratio.

Args: image_height: The height of the image. image_width: The width of the image. max_height: The maximum height of the image. max_width: The maximum width of the image.

Returns: The scaling factor needed to fit the image within the given bounds, or None if no scaling is required.
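One plausible implementation, assuming downscale-only behavior where None signals that the image already fits:

```python
from typing import Optional

def determine_scaling_aspect_ratio(
    image_height: int, image_width: int, max_height: int, max_width: int
) -> Optional[float]:
    # No scaling needed when both dimensions fit within the bounds.
    if image_height <= max_height and image_width <= max_width:
        return None
    # Otherwise scale uniformly so that both dimensions fit.
    return min(max_height / image_height, max_width / image_width)

print(determine_scaling_aspect_ratio(1080, 1920, 540, 960))  # 0.5
```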

resize_opencv_image(image: numpy.ndarray, max_height: int | None, max_width: int | None) -> Tuple[numpy.ndarray, float | None]

Resize an OpenCV image.

Args: image: The image to resize. max_height: The maximum height of the image. max_width: The maximum width of the image.

Returns: The resized image and the scaling factor.

resize_pillow_image(image: PIL.Image.Image, max_height: int | None, max_width: int | None) -> Tuple[PIL.Image.Image, float | None]

Resize a Pillow image.

Args: image: The image to resize. max_height: The maximum height of the image. max_width: The maximum width of the image.

Returns: The resized image and the scaling factor.

inference_sdk.http.utils.profilling

save_workflows_profiler_trace(directory: str, profiler_trace: List[dict]) -> None

Save a workflow profiler trace.

Args: directory: The directory to save the profiler trace. profiler_trace: The profiler trace.

inference_sdk.http.utils.request_building

class ImagePlacement

Enumeration specifying where encoded images are placed in an assembled request (for example, in the request data versus inside the JSON payload).

class RequestData

Data class for request data.

Attributes: url: The URL of the request. request_elements: The number of request elements. headers: The headers of the request. parameters: The parameters of the request. data: The data of the request. payload: The payload of the request. image_scaling_factors: The scaling factors of the images.

__init__(self, url: str, request_elements: int, headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, data: str | bytes | None, payload: Dict[str, Any] | None, image_scaling_factors: List[float | None]) -> None

Initialize the request data with its URL, element count, headers, parameters, data, payload, and image scaling factors.

assembly_request_data(url: str, batch_inference_inputs: List[Tuple[str, float | None]], headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, payload: Dict[str, Any] | None, image_placement: inference_sdk.http.utils.request_building.ImagePlacement) -> inference_sdk.http.utils.request_building.RequestData

Assemble request data.

Args: url: The URL of the request. batch_inference_inputs: The batch inference inputs. headers: The headers of the request. parameters: The parameters of the request. payload: The payload of the request. image_placement: The image placement.

Returns: The request data.

prepare_requests_data(url: str, encoded_inference_inputs: List[Tuple[str, float | None]], headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, payload: Dict[str, Any] | None, max_batch_size: int, image_placement: inference_sdk.http.utils.request_building.ImagePlacement) -> List[inference_sdk.http.utils.request_building.RequestData]

Prepare requests data.

Args: url: The URL of the request. encoded_inference_inputs: The encoded inference inputs. headers: The headers of the request. parameters: The parameters of the request. payload: The payload of the request. max_batch_size: The maximum batch size. image_placement: The image placement.

Returns: The list of request data.
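The two helpers compose: inputs are split into batches of at most max_batch_size, and one request-data record is assembled per batch. A simplified sketch, with RequestData reduced to a dict and only payload-based image placement modeled:

```python
from typing import Any, Dict, List, Optional, Tuple

def make_batches(items: list, batch_size: int) -> List[list]:
    return [items[i : i + batch_size] for i in range(0, len(items), batch_size)]

def prepare_requests_data(
    url: str,
    encoded_inference_inputs: List[Tuple[str, Optional[float]]],
    payload: Optional[Dict[str, Any]],
    max_batch_size: int,
) -> List[Dict[str, Any]]:
    requests_data = []
    for batch in make_batches(encoded_inference_inputs, max_batch_size):
        batch_payload = dict(payload or {})
        # Place encoded images in the payload; record their scaling factors.
        batch_payload["image"] = [image for image, _ in batch]
        requests_data.append(
            {
                "url": url,
                "request_elements": len(batch),
                "payload": batch_payload,
                "image_scaling_factors": [factor for _, factor in batch],
            }
        )
    return requests_data

inputs = [("img0", None), ("img1", 0.5), ("img2", None)]
batches = prepare_requests_data(
    "http://localhost:9001/infer", inputs, {"confidence": 0.4}, 2
)
print(len(batches))  # 2
```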

inference_sdk.http.utils.requests

api_key_safe_raise_for_status(response: requests.models.Response) -> None

Raise an exception if the response indicates a failure, redacting any API key from the error details.

Args: response: The response of the request.

deduct_api_key(match: re.Match) -> str

Redact the API key captured by a regex match.

Args: match: The regex match containing the API key.

Returns: The matched text with the API key masked.

deduct_api_key_from_string(value: str) -> str

Redact any API key that appears in the string.

Args: value: The string to redact the API key from.

Returns: The string with the API key masked.
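Despite the "deduct" naming, these helpers mask the key rather than remove it. A sketch under the assumption that keys appear as api_key query parameters; the regex and masking scheme below are illustrative, not the SDK's exact ones:

```python
import re

# Illustrative pattern: capture the value following "api_key=".
API_KEY_PATTERN = re.compile(r"(api_key=)([^\s&]+)")

def deduct_api_key(match: re.Match) -> str:
    # Keep at most the first two characters; mask the rest.
    key = match.group(2)
    masked = key[:2] + "***" if len(key) > 2 else "***"
    return match.group(1) + masked

def deduct_api_key_from_string(value: str) -> str:
    return API_KEY_PATTERN.sub(deduct_api_key, value)

print(deduct_api_key_from_string("https://x.com/infer?api_key=abcd1234&confidence=0.5"))
```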

inject_images_into_payload(payload: dict, encoded_images: List[Tuple[str, float | None]], key: str = 'image') -> dict

Inject images into the payload.

Args: payload: The payload to inject the images into. encoded_images: The encoded images. key: The key of the images.

Returns: The payload with the images injected.
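A sketch of payload injection, assuming each base64 string is wrapped in the {"type": "base64", "value": ...} structure the inference server expects and that a single image is sent unwrapped rather than as a one-element list:

```python
from typing import List, Optional, Tuple

def inject_images_into_payload(
    payload: dict,
    encoded_images: List[Tuple[str, Optional[float]]],
    key: str = "image",
) -> dict:
    # Wrap each base64 string; scaling factors are carried separately
    # and are ignored here.
    images = [{"type": "base64", "value": image} for image, _ in encoded_images]
    payload[key] = images[0] if len(images) == 1 else images
    return payload

payload = inject_images_into_payload({"confidence": 0.4}, [("aGVsbG8=", None)])
print(payload["image"])  # {'type': 'base64', 'value': 'aGVsbG8='}
```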

inject_nested_batches_of_images_into_payload(payload: dict, encoded_images: list | Tuple[str, float | None], key: str = 'image') -> dict

Inject nested batches of images into the payload.

Args: payload: The payload to inject the images into. encoded_images: The encoded images. key: The key of the images.

Returns: The payload with the images injected.

utils

General-purpose helpers: lifecycle decorators, environment variable parsing, and SDK logging.

inference_sdk.utils.decorators

deprecated(reason: str)

Create a decorator that marks functions as deprecated.

This decorator will emit a warning when the decorated function is called, indicating that the function is deprecated and providing a reason.

Args: reason (str): The reason why the function is deprecated.

Returns: callable: A decorator function that can be applied to mark functions as deprecated.
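A sketch of the decorator factory, assuming the standard functools/warnings pattern; the exact warning message format is illustrative:

```python
import functools
import warnings

def deprecated(reason: str):
    # Returns a decorator that warns on each call to the wrapped function.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                category=DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator

@deprecated(reason="use new_api() instead")
def old_api() -> int:
    return 1
```

The experimental decorator follows the same shape with a different warning category and message.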

experimental(info: str)

Create a decorator that marks functions as experimental.

This decorator will emit a warning when the decorated function is called, indicating that the function is experimental and providing additional information.

Args: info (str): Information about the experimental status of the function.

Returns: callable: A decorator function that can be applied to mark functions as experimental.

inference_sdk.utils.environment

str2bool(value: str | bool) -> bool

Convert a string or boolean value to a boolean.

Args: value (Union[str, bool]): The value to convert. Can be either a string ('true'/'false') or a boolean value.

Returns: bool: The boolean value. Returns True for 'true' (case-insensitive) or True input, False for 'false' (case-insensitive) or False input.

Raises: ValueError: If the input string is not 'true' or 'false' (case-insensitive).
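A sketch matching the documented behavior (case-insensitive 'true'/'false', booleans passed through, anything else rejected):

```python
from typing import Union

def str2bool(value: Union[str, bool]) -> bool:
    # Booleans pass through unchanged.
    if isinstance(value, bool):
        return value
    normalized = value.strip().lower()
    if normalized == "true":
        return True
    if normalized == "false":
        return False
    raise ValueError(f"Expected 'true' or 'false', got: {value!r}")

print(str2bool("True"), str2bool(False))  # True False
```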

inference_sdk.utils.logging

Centralized logging configuration for the Inference SDK.

get_logger(module_name: str) -> logging.Logger

Get a logger for the specified module.

Automatically configures basic logging on first use if no handlers exist.

Args: module_name: Name of the module requesting the logger.

Returns: logging.Logger: Configured logger for the module.
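A sketch of the lazy-configuration behavior described above; the check against root handlers is an assumption about how "no handlers exist" is detected:

```python
import logging

def get_logger(module_name: str) -> logging.Logger:
    # Configure root logging once, only if nothing else has installed handlers.
    if not logging.getLogger().handlers:
        logging.basicConfig(level=logging.INFO)
    return logging.getLogger(module_name)

logger = get_logger("inference_sdk.demo")
logger.info("logger ready")
```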

webrtc

WebRTC streaming client for real-time video inference over peer connections.

inference_sdk.webrtc.client

Module could not be imported for documentation.

inference_sdk.webrtc.config

Module could not be imported for documentation.

inference_sdk.webrtc.datachannel

Module could not be imported for documentation.

inference_sdk.webrtc.session

Module could not be imported for documentation.

inference_sdk.webrtc.sources

Module could not be imported for documentation.