inference_sdk API Reference
Top-level
Top-level SDK configuration: API URLs, timeouts, environment variable loading, and remote execution settings.
inference_sdk.config
Module could not be imported for documentation.
http
Core HTTP client for making inference requests. InferenceHTTPClient supports object detection, classification, segmentation, keypoint detection, OCR, CLIP embeddings, and workflow execution.
inference_sdk.http.client
Module could not be imported for documentation.
inference_sdk.http.entities
Module could not be imported for documentation.
inference_sdk.http.errors
class APIKeyNotProvided
Raised when a request requires an API key but none was provided.
class EncodingError
Raised when an image or payload cannot be encoded or decoded.
class HTTPCallErrorError
Raised when an HTTP call to the inference server returns an error response.
Attributes: description: The description of the error. status_code: The status code of the error. api_message: The API message of the error.
__init__(self, description: str, status_code: int, api_message: str | None)
Initialize the error with a description, the HTTP status code, and an optional API error message.
class HTTPClientError
Base class for HTTP client errors.
class InvalidInputFormatError
Raised when input data is provided in an unsupported format.
class InvalidModelIdentifier
Raised when a model identifier is malformed or unrecognized.
class InvalidParameterError
Raised when a request parameter has an invalid value.
class ModelNotInitializedError
Raised when an operation requires a model that has not been initialized.
class ModelNotSelectedError
Raised when an operation requires a model but none has been selected.
class ModelTaskTypeNotSupportedError
Raised when the model does not support the requested task type.
class RetryError
Raised when a request still fails after exhausting its retry attempts.
__init__(self, description: str, status_code: int | None = None, inner_error: Exception | None = None)
Initialize the error with a description, an optional status code, and the underlying exception that triggered the retries.
class WrongClientModeError
Raised when the client is in a mode that does not support the requested operation.
http/utils
Internal utilities for request building, image encoding/decoding, response post-processing, retries, and API key handling.
inference_sdk.http.utils.aliases
resolve_ocr_path(model_name: str) -> str
Resolve an OCR model name to its corresponding endpoint path.
Args: model_name: The name of the OCR model.
Returns: The endpoint path for the OCR model.
resolve_roboflow_model_alias(model_id: str) -> str
Resolve a Roboflow model alias to a registered model ID.
Args: model_id: The model alias to resolve.
Returns: The registered model ID.
inference_sdk.http.utils.encoding
bytes_to_opencv_image(payload: bytes, array_type: numpy.number = numpy.uint8) -> numpy.ndarray
Decode a bytes object to an OpenCV image.
Args: payload: The bytes object to decode. array_type: The numpy dtype used for the decoded array.
Returns: The OpenCV image.
bytes_to_pillow_image(payload: bytes) -> PIL.Image.Image
Decode a bytes object to a PIL image.
Args: payload: The bytes object to decode.
Returns: The PIL image.
encode_base_64(payload: bytes) -> str
Encode a bytes object to a base64 string.
Args: payload: The bytes object to encode.
Returns: The base64 string.
numpy_array_to_base64_jpeg(image: numpy.ndarray) -> str
Encode a numpy array to a base64 JPEG string.
Args: image: The numpy array to encode.
Returns: The base64 JPEG string.
pillow_image_to_base64_jpeg(image: PIL.Image.Image) -> str
Encode a PIL image to a base64 JPEG string.
Args: image: The PIL image to encode.
Returns: The base64 JPEG string.
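As an illustration, encode_base_64 can be sketched with the standard library alone. This is a minimal sketch; the SDK's actual implementation may differ in details.

```python
import base64


def encode_base_64(payload: bytes) -> str:
    # Encode raw bytes as a UTF-8 base64 string, the form used when
    # embedding images in JSON payloads.
    return base64.b64encode(payload).decode("utf-8")


print(encode_base_64(b"hello"))  # aGVsbG8=
```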
inference_sdk.http.utils.executors
Module could not be imported for documentation.
inference_sdk.http.utils.iterables
make_batches(iterable: Iterable[T], batch_size: int) -> Generator[List[T], None, None]
Make batches from an iterable.
Args: iterable: The iterable to make batches from. batch_size: The size of the batches.
Returns: The batches.
remove_empty_values(dictionary: dict) -> dict
Remove empty values from a dictionary.
Args: dictionary: The dictionary to remove empty values from.
Returns: The dictionary with empty values removed.
unwrap_single_element_list(sequence: List[T]) -> T | List[T]
Unwrap a single-element list.
Args: sequence: The list to unwrap.
Returns: The single element if the list contains exactly one item; otherwise the original list.
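The three helpers above are small enough to sketch in full. This is an illustrative reimplementation, not the SDK's exact code; in particular, the real remove_empty_values may treat values other than None as empty.

```python
from itertools import islice
from typing import Generator, Iterable, List, TypeVar, Union

T = TypeVar("T")


def make_batches(iterable: Iterable[T], batch_size: int) -> Generator[List[T], None, None]:
    # Yield consecutive chunks of at most batch_size elements.
    iterator = iter(iterable)
    while batch := list(islice(iterator, batch_size)):
        yield batch


def remove_empty_values(dictionary: dict) -> dict:
    # Drop keys whose values are None.
    return {k: v for k, v in dictionary.items() if v is not None}


def unwrap_single_element_list(sequence: List[T]) -> Union[T, List[T]]:
    # Collapse [x] to x; leave lists of any other length unchanged.
    if len(sequence) == 1:
        return sequence[0]
    return sequence


print(list(make_batches(range(5), 2)))           # [[0, 1], [2, 3], [4]]
print(remove_empty_values({"a": 1, "b": None}))  # {'a': 1}
print(unwrap_single_element_list([42]))          # 42
```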
inference_sdk.http.utils.loaders
Module could not be imported for documentation.
inference_sdk.http.utils.post_processing
Module could not be imported for documentation.
inference_sdk.http.utils.pre_processing
determine_scaling_aspect_ratio(image_height: int, image_width: int, max_height: int, max_width: int) -> float | None
Determine the ratio by which an image must be scaled to fit within the given maximum dimensions.
Args: image_height: The height of the image. image_width: The width of the image. max_height: The maximum height of the image. max_width: The maximum width of the image.
Returns: The scaling ratio to apply, or None if the image already fits within the limits.
resize_opencv_image(image: numpy.ndarray, max_height: int | None, max_width: int | None) -> Tuple[numpy.ndarray, float | None]
Resize an OpenCV image.
Args: image: The image to resize. max_height: The maximum height of the image. max_width: The maximum width of the image.
Returns: The resized image and the scaling factor.
resize_pillow_image(image: PIL.Image.Image, max_height: int | None, max_width: int | None) -> Tuple[PIL.Image.Image, float | None]
Resize a Pillow image.
Args: image: The image to resize. max_height: The maximum height of the image. max_width: The maximum width of the image.
Returns: The resized image and the scaling factor.
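The scaling logic can be illustrated with a self-contained sketch of determine_scaling_aspect_ratio. This is an assumption about the behavior implied by the resize functions above; the SDK's exact implementation may differ.

```python
from typing import Optional


def determine_scaling_aspect_ratio(
    image_height: int, image_width: int, max_height: int, max_width: int
) -> Optional[float]:
    # Compute the down-scaling ratio needed to fit the image inside
    # (max_height, max_width); None means no scaling is required.
    height_ratio = max_height / image_height
    width_ratio = max_width / image_width
    ratio = min(height_ratio, width_ratio)
    if ratio >= 1.0:
        return None
    return ratio


print(determine_scaling_aspect_ratio(1000, 2000, 500, 500))  # 0.25
print(determine_scaling_aspect_ratio(100, 100, 500, 500))    # None
```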
inference_sdk.http.utils.profilling
save_workflows_profiler_trace(directory: str, profiler_trace: List[dict]) -> None
Save a workflow profiler trace.
Args: directory: The directory to save the profiler trace. profiler_trace: The profiler trace.
inference_sdk.http.utils.request_building
class ImagePlacement
Enumeration specifying where encoded images are placed in an outgoing request (e.g., in the raw request data or inside the JSON payload).
class RequestData
Data class describing a single HTTP request to the inference server.
Attributes: url: The URL of the request. request_elements: The number of inference inputs carried by the request. headers: The headers of the request. parameters: The query parameters of the request. data: The raw body data of the request. payload: The JSON payload of the request. image_scaling_factors: The scaling factors applied to the images in the request.
__init__(self, url: str, request_elements: int, headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, data: str | bytes | None, payload: Dict[str, Any] | None, image_scaling_factors: List[float | None]) -> None
Initialize the request data with its URL, element count, headers, parameters, raw data, payload, and per-image scaling factors.
assembly_request_data(url: str, batch_inference_inputs: List[Tuple[str, float | None]], headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, payload: Dict[str, Any] | None, image_placement: inference_sdk.http.utils.request_building.ImagePlacement) -> inference_sdk.http.utils.request_building.RequestData
Assemble request data.
Args: url: The URL of the request. batch_inference_inputs: The batch inference inputs. headers: The headers of the request. parameters: The parameters of the request. payload: The payload of the request. image_placement: The image placement.
Returns: The request data.
prepare_requests_data(url: str, encoded_inference_inputs: List[Tuple[str, float | None]], headers: Dict[str, str] | None, parameters: Dict[str, str | List[str]] | None, payload: Dict[str, Any] | None, max_batch_size: int, image_placement: inference_sdk.http.utils.request_building.ImagePlacement) -> List[inference_sdk.http.utils.request_building.RequestData]
Prepare requests data.
Args: url: The URL of the request. encoded_inference_inputs: The encoded inference inputs. headers: The headers of the request. parameters: The parameters of the request. payload: The payload of the request. max_batch_size: The maximum batch size. image_placement: The image placement.
Returns: The list of request data.
inference_sdk.http.utils.requests
api_key_safe_raise_for_status(response: requests.models.Response) -> None
Raise an exception if the response status indicates failure, with any API key redacted from the error message.
Args: response: The response of the request.
deduct_api_key(match: re.Match) -> str
Redact an API key captured by a regex match.
Args: match: The regex match containing the API key.
Returns: The replacement string with the API key redacted.
deduct_api_key_from_string(value: str) -> str
Redact any API key present in a string.
Args: value: The string that may contain an API key.
Returns: The string with the API key redacted.
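Despite the "deduct" naming, these helpers redact API keys from strings (typically URLs in error messages). The following is a hypothetical sketch: the regex pattern and masking rule are assumptions, not taken from the SDK's source.

```python
import re

# Assumed pattern: an api_key query parameter terminated by '&' or end of string.
API_KEY_PATTERN = re.compile(r"api_key=(.[^&]*)")


def deduct_api_key(match: re.Match) -> str:
    # Keep only the first and last two characters of the key visible.
    key = match.group(1)
    if len(key) < 4:
        return "api_key=***"
    return f"api_key={key[:2]}***{key[-2:]}"


def deduct_api_key_from_string(value: str) -> str:
    # Replace every api_key occurrence with its redacted form.
    return API_KEY_PATTERN.sub(deduct_api_key, value)


print(deduct_api_key_from_string("https://api.example.com/infer?api_key=secret123&size=2"))
# https://api.example.com/infer?api_key=se***23&size=2
```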
inject_images_into_payload(payload: dict, encoded_images: List[Tuple[str, float | None]], key: str = 'image') -> dict
Inject images into the payload.
Args: payload: The payload to inject the images into. encoded_images: The encoded images. key: The key of the images.
Returns: The payload with the images injected.
inject_nested_batches_of_images_into_payload(payload: dict, encoded_images: list | Tuple[str, float | None], key: str = 'image') -> dict
Inject nested batches of images into the payload.
Args: payload: The payload to inject the images into. encoded_images: The encoded images. key: The key of the images.
Returns: The payload with the images injected.
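inject_images_into_payload can be sketched as follows. The single-image versus list handling and the `{"type": "base64", "value": ...}` shape are assumptions about the payload format, not verbatim SDK code.

```python
from typing import List, Optional, Tuple


def inject_images_into_payload(
    payload: dict,
    encoded_images: List[Tuple[str, Optional[float]]],
    key: str = "image",
) -> dict:
    # Each element pairs a base64-encoded image with its optional
    # scaling factor; only the encoded image goes into the payload.
    if len(encoded_images) == 0:
        return payload
    if len(encoded_images) > 1:
        payload[key] = [{"type": "base64", "value": img} for img, _ in encoded_images]
    else:
        payload[key] = {"type": "base64", "value": encoded_images[0][0]}
    return payload


print(inject_images_into_payload({}, [("abc", None)]))
# {'image': {'type': 'base64', 'value': 'abc'}}
```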
utils
General-purpose helpers: lifecycle decorators, environment variable parsing, and SDK logging.
inference_sdk.utils.decorators
deprecated(reason: str)
Create a decorator that marks functions as deprecated.
This decorator will emit a warning when the decorated function is called, indicating that the function is deprecated and providing a reason.
Args: reason (str): The reason why the function is deprecated.
Returns: callable: A decorator function that can be applied to mark functions as deprecated.
experimental(info: str)
Create a decorator that marks functions as experimental.
This decorator will emit a warning when the decorated function is called, indicating that the function is experimental and providing additional information.
Args: info (str): Information about the experimental status of the function.
Returns: callable: A decorator function that can be applied to mark functions as experimental.
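A deprecation decorator of this shape can be sketched with the standard warnings module. This is a minimal sketch, and experimental(info) follows the same pattern with a different category and message.

```python
import functools
import warnings


def deprecated(reason: str):
    # Wrap the target so each call emits a DeprecationWarning carrying
    # the supplied reason, then delegates to the original function.
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn(
                f"{func.__name__} is deprecated: {reason}",
                category=DeprecationWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)
        return wrapper
    return decorator


@deprecated(reason="use new_add instead")
def old_add(a: int, b: int) -> int:
    return a + b


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = old_add(2, 3)

print(result)              # 5
print(caught[0].category)  # <class 'DeprecationWarning'>
```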
inference_sdk.utils.environment
str2bool(value: str | bool) -> bool
Convert a string or boolean value to a boolean.
Args: value (Union[str, bool]): The value to convert. Can be either a string ('true'/'false') or a boolean value.
Returns: bool: The boolean value. Returns True for 'true' (case-insensitive) or True input, False for 'false' (case-insensitive) or False input.
Raises: ValueError: If the input string is not 'true' or 'false' (case-insensitive).
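The documented behavior can be sketched directly; the exact error message below is an assumption.

```python
from typing import Union


def str2bool(value: Union[str, bool]) -> bool:
    # Pass booleans through; accept only 'true'/'false' (any casing) as strings.
    if isinstance(value, bool):
        return value
    lowered = value.lower()
    if lowered == "true":
        return True
    if lowered == "false":
        return False
    raise ValueError(f"Expected 'true' or 'false', got: {value!r}")


print(str2bool("TRUE"))  # True
print(str2bool(False))   # False
```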
inference_sdk.utils.logging
Centralized logging configuration for the Inference SDK.
get_logger(module_name: str) -> logging.Logger
Get a logger for the specified module.
Automatically configures basic logging on first use if no handlers exist.
Args: module_name: Name of the module requesting the logger.
Returns: logging.Logger: Configured logger for the module.
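A minimal sketch of this pattern: configure logging once if nothing else has, then return a named logger. The SDK's actual default level and format are assumptions here.

```python
import logging


def get_logger(module_name: str) -> logging.Logger:
    # Configure basic logging only if no handlers exist yet, so an
    # application's own logging setup is never overridden.
    if not logging.getLogger().hasHandlers():
        logging.basicConfig(level=logging.INFO)
    return logging.getLogger(module_name)


logger = get_logger("inference_sdk.demo")
logger.info("logger ready")
```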
webrtc
WebRTC streaming client for real-time video inference over peer connections.
inference_sdk.webrtc.client
Module could not be imported for documentation.
inference_sdk.webrtc.config
Module could not be imported for documentation.
inference_sdk.webrtc.datachannel
Module could not be imported for documentation.
inference_sdk.webrtc.session
Module could not be imported for documentation.
inference_sdk.webrtc.sources
Module could not be imported for documentation.