Job Operations
This guide explains how to perform job operations on a SubModel Endpoint: submitting jobs, checking their status, streaming results, canceling jobs, purging the job queue, and checking endpoint health. The examples below use cURL to interact with an Endpoint.
For more information on sending requests, refer to Send a Request.
Asynchronous Endpoints
Asynchronous endpoints are designed for long-running tasks. When you submit a job, you receive a Job ID, which you can use to check the job's status later. Your application can continue processing instead of blocking until the job completes, which suits tasks that require significant processing time and workloads that manage multiple jobs concurrently.
curl -X POST https://api.submodel.ai/v1/sl/{endpoint_id}/run \
-H 'Content-Type: application/json' \
-H "x-apikey: ${API_KEY}" \
-d '{"input": {"prompt": "Your prompt"}}'
Output:
{
"id": "eaebd6e7-6a92-4bb8-a911-f996ac5ea99d",
"status": "IN_QUEUE"
}
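The Job ID in this response is what the later status, stream, and cancel calls need, so a script usually captures it right away. The following is a minimal sketch, not an official client: the function names `extract_job_id` and `submit_job` and the `ENDPOINT_ID` environment variable are assumptions made for illustration, and the sed-based extraction only handles simple responses like the one above (a real JSON parser such as jq would be more robust).

```shell
# Hypothetical helper: pull the "id" value out of a /run response.
# Uses sed to avoid dependencies; it does not handle escaped quotes.
extract_job_id() {
  printf '%s\n' "$1" |
    sed -n 's/.*"id"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Hypothetical wrapper around the /run request shown above.
# Assumes ENDPOINT_ID and API_KEY are set in the environment.
submit_job() {
  curl -sS -X POST "https://api.submodel.ai/v1/sl/${ENDPOINT_ID}/run" \
    -H 'Content-Type: application/json' \
    -H "x-apikey: ${API_KEY}" \
    -d "$1"
}

# Usage (performs a real request, so it is left commented out):
# response=$(submit_job '{"input": {"prompt": "Your prompt"}}')
# job_id=$(extract_job_id "$response")
# echo "Submitted job ${job_id}"
```

The header uses double quotes so that the shell expands `${API_KEY}` before cURL sends the request.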
Synchronous Endpoints
Synchronous endpoints are suitable for short-lived tasks where immediate results are necessary. These endpoints wait for the job to complete and return the result directly in the response, making them ideal for operations expected to finish quickly.
curl -X POST https://api.submodel.ai/v1/sl/{endpoint_id}/runsync \
-H 'Content-Type: application/json' \
-H "x-apikey: ${API_KEY}" \
-d '{"input": {"prompt": "Your prompt"}}'
Output:
{
"delayTime": 824,
"executionTime": 3391,
"id": "sync-79164ff4-d212-44bc-9fe3-389e199a5c15",
"output": [
{
"image": "https://image.url",
"seed": 46578
}
],
"status": "COMPLETED"
}
Health Endpoint
The /health endpoint provides insights into the operational status of the endpoint, including the number of workers available and job statistics. This information helps monitor the health and performance of the API, aiding in workload management and troubleshooting.
curl --request GET \
--url https://api.submodel.ai/v1/sl/{endpoint_id}/health \
--header 'accept: application/json' \
--header "x-apikey: ${API_KEY}"
Output:
{
"jobs": {
"completed": 1,
"failed": 5,
"inProgress": 0,
"inQueue": 2,
"retried": 0
},
"workers": {
"idle": 0,
"running": 0
}
}
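One way to act on this response in a monitoring script is to flag the case where jobs are queued but no worker is idle or running, as in the sample output above. The sketch below is only an illustration: `health_field` and `queue_is_stalled` are hypothetical names, the "stalled" rule is an assumed heuristic rather than documented behavior, and a queue with no workers may simply mean workers are still starting.

```shell
# Hypothetical helper: read one numeric field (for example inQueue or idle)
# from a /health response. Assumes field names are unique in the response.
health_field() {
  printf '%s\n' "$1" |
    sed -n 's/.*"'"$2"'"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
    head -n 1
}

# Succeeds (exit status 0) when jobs are queued but no worker is idle or running.
# This rule is an assumption for illustration, not an official health criterion.
queue_is_stalled() {
  in_queue=$(health_field "$1" inQueue)
  idle=$(health_field "$1" idle)
  running=$(health_field "$1" running)
  [ "${in_queue:-0}" -gt 0 ] && [ $(( ${idle:-0} + ${running:-0} )) -eq 0 ]
}

# Usage (performs a real request, so it is left commented out):
# health=$(curl -sS --request GET \
#   --url "https://api.submodel.ai/v1/sl/${ENDPOINT_ID}/health" \
#   --header 'accept: application/json' \
#   --header "x-apikey: ${API_KEY}")
# if queue_is_stalled "$health"; then echo "Jobs queued but no workers available"; fi
```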
Cancel Job
To cancel a job in progress, send a request to the /cancel operation, specifying the endpoint ID and the job ID in the URL.
curl -X POST https://api.submodel.ai/v1/sl/{endpoint_id}/cancel/{job_id} \
-H 'Content-Type: application/json' \
-H "x-apikey: ${API_KEY}"
Output:
{
"id": "724907fe-7bcc-4e42-998d-52cb93e1421f-u1",
"status": "CANCELLED"
}
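Because the job-scoped operations share one URL pattern (endpoint ID, operation, then job ID), a script can build these URLs in a single place. The sketch below assumes the URL layout used in this guide's examples; `operation_url` and `cancel_job` are hypothetical helper names.

```shell
# Hypothetical helper: build the URL for an operation.
# Arguments: endpoint ID, operation name, optional job ID.
operation_url() {
  if [ -n "${3:-}" ]; then
    printf 'https://api.submodel.ai/v1/sl/%s/%s/%s\n' "$1" "$2" "$3"
  else
    printf 'https://api.submodel.ai/v1/sl/%s/%s\n' "$1" "$2"
  fi
}

# Hypothetical wrapper around the cancel request shown above.
# Arguments: endpoint ID, job ID. Assumes API_KEY is set in the environment.
cancel_job() {
  curl -sS -X POST "$(operation_url "$1" cancel "$2")" \
    -H 'Content-Type: application/json' \
    -H "x-apikey: ${API_KEY}"
}

# Usage (performs a real request, so it is left commented out):
# cancel_job "$ENDPOINT_ID" "$JOB_ID"
```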
Purge Queue Endpoint
The /purge-queue endpoint allows you to clear all jobs currently in the queue, without affecting jobs already in progress. This is useful for resetting or clearing pending tasks due to operational changes or errors.
curl -X POST https://api.submodel.ai/v1/sl/{endpoint_id}/purge-queue \
-H 'Content-Type: application/json' \
-H "x-apikey: ${API_KEY}"
Output:
{
"removed": 2,
"status": "completed"
}
Check Job Status
To track the progress or result of an asynchronous job, use the Job ID to check its status. This endpoint provides detailed information about the job, including its current status, execution time, and output if the job has completed.
curl -X POST https://api.submodel.ai/v1/sl/{endpoint_id}/status/{job_id} \
-H "x-apikey: ${API_KEY}"
Output:
{
"delayTime": 31618,
"executionTime": 1437,
"id": "60902e6c-08a1-426e-9cb9-9eaec90f5e2b-u1",
"output": {
"input_tokens": 22,
"output_tokens": 16,
"text": ["Hello! How can I assist you today?\nUSER: I'm having"]
},
"status": "COMPLETED"
}
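An asynchronous client typically repeats this status request until the job reaches a final state. The following polling loop is a sketch under stated assumptions: COMPLETED and CANCELLED appear in this guide, while FAILED is an assumed status name; the function names, the `POLL_SECONDS` and `MAX_POLLS` variables, and their defaults are hypothetical; and the sed-based status extraction assumes no nested "status" key in the job output.

```shell
# Succeeds (exit status 0) for statuses after which polling can stop.
# COMPLETED and CANCELLED appear in this guide; FAILED is an assumed name.
is_terminal_status() {
  case "$1" in
    COMPLETED|CANCELLED|FAILED) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: pull the "status" value out of a response.
extract_status() {
  printf '%s\n' "$1" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' |
    head -n 1
}

# Polls the status operation until the job is terminal, then prints the last response.
# Arguments: endpoint ID, job ID. Gives up after MAX_POLLS attempts (default 150).
wait_for_job() {
  attempts=0
  while [ "$attempts" -lt "${MAX_POLLS:-150}" ]; do
    response=$(curl -sS -X POST "https://api.submodel.ai/v1/sl/$1/status/$2" \
      -H "x-apikey: ${API_KEY}") || return 1
    if is_terminal_status "$(extract_status "$response")"; then
      printf '%s\n' "$response"
      return 0
    fi
    attempts=$((attempts + 1))
    sleep "${POLL_SECONDS:-2}"
  done
  return 1
}

# Usage (performs real requests, so it is left commented out):
# wait_for_job "$ENDPOINT_ID" "$JOB_ID"
```

A fixed polling interval keeps the sketch short; a production client might back off between attempts to stay well within the rate limits on the status operation.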
Stream Results
For jobs that produce output incrementally, the stream endpoint allows you to receive results as they are generated. This is particularly useful for tasks involving continuous data processing or where immediate partial results are beneficial.
curl -X POST https://api.submodel.ai/v1/sl/{endpoint_id}/stream/{job_id} \
-H 'Content-Type: application/json' \
-H "x-apikey: ${API_KEY}"
Output:
[
{
"metrics": {
"avg_gen_throughput": 0,
"avg_prompt_throughput": 0,
"cpu_kv_cache_usage": 0,
"gpu_kv_cache_usage": 0.0016722408026755853,
"input_tokens": 0,
"output_tokens": 1,
"pending": 0,
"running": 1,
"scenario": "stream",
"stream_index": 2,
"swapped": 0
},
"output": {
"input_tokens": 0,
"output_tokens": 1,
"text": [" How"]
}
}
// omitted for brevity
]
Note: The maximum size for a payload that can be sent using yield to stream results is 1 MB.
Rate Limits
SubModel's Endpoints facilitate submitting jobs and retrieving outputs. Access these endpoints at: https://api.submodel.ai/v1/sl/{endpoint_id}/{operation}
/run: 1000 requests per 10 seconds, 200 concurrent
/runsync: 2000 requests per 10 seconds, 400 concurrent
/status, /status-sync, /stream: 2000 requests per 10 seconds, 400 concurrent
/cancel: 100 requests per 10 seconds, 20 concurrent
/purge-queue: 2 requests per 10 seconds
/openai/*: 2000 requests per 10 seconds, 400 concurrent
/requests: 10 requests per 10 seconds, 2 concurrent
Note: Retrieve results from /status within 30 minutes for privacy protection.
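A simple way for a client to stay under a "requests per 10 seconds" limit is to space its requests evenly. The sketch below computes that spacing; `min_interval_ms` is a hypothetical helper, and even spacing is a conservative assumption, since this guide does not state how the limits are measured or enforced.

```shell
# Smallest delay, in milliseconds, between evenly spaced requests that keeps
# a client at or below "limit requests per window_seconds".
# Arguments: limit, window in seconds.
min_interval_ms() {
  # Round up so the resulting request rate stays at or below the limit.
  echo $(( ($2 * 1000 + $1 - 1) / $1 ))
}

# Examples based on the limits listed above:
# min_interval_ms 1000 10   # /run
# min_interval_ms 2 10      # /purge-queue
```

For /run this gives 10 ms between requests, and for /purge-queue it gives 5000 ms. The concurrency limits are separate and would need their own client-side cap on in-flight requests.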
For reference information on Endpoints, see Endpoint Operations.