Manage Endpoints

Learn how to create, delete, and edit Serverless Endpoints, and how to set GPU priorities for them.

Create an Endpoint

You can create an Endpoint using the Web interface.

  1. Click + New Endpoint and complete the following steps:

    1. Enter an Endpoint Name.

    2. Select your GPUs.

    3. Configure your workers.

    4. Add a container image.

    5. Click Deploy.
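The values entered in the form can be thought of as a small configuration record. The sketch below is only an illustration in Python: the class name `EndpointConfig`, its field names, and the validation rules are assumptions made for this example and do not correspond to a documented SubModel API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EndpointConfig:
    """Hypothetical local record of the values entered in the form."""

    name: str                # Endpoint Name
    gpu_priority: List[str]  # selected GPUs, most preferred first
    max_workers: int         # worker configuration
    container_image: str     # container image reference

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the form is complete."""
        problems = []
        if not self.name.strip():
            problems.append("Endpoint Name is empty")
        if not self.gpu_priority:
            problems.append("no GPU selected")
        if self.max_workers < 0:
            problems.append("Max Workers is negative")
        if not self.container_image.strip():
            problems.append("container image is empty")
        return problems


# Placeholder values only; GPU names and the image reference are made up.
config = EndpointConfig(
    name="demo-endpoint",
    gpu_priority=["GPU_A", "GPU_B"],
    max_workers=2,
    container_image="example/image:latest",
)
print(config.validate())
```

Checking the values before clicking Deploy is a convenience in this sketch, not a requirement stated by the platform.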

Delete an Endpoint

You can delete an Endpoint via the Web interface. Ensure all workers are removed before deleting an Endpoint.

  1. Choose the Endpoint you wish to delete.

  2. Click Edit Endpoint and set Max Workers to 0.

  3. Click Update, then click Delete Endpoint.
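The order of the steps above (scale to zero, update, then delete) can be sketched as a small in-memory model. `EndpointModel` and its methods are hypothetical names used only for illustration; they are not a SubModel API, and whether the platform itself rejects a delete while workers remain is not stated here.

```python
class EndpointModel:
    """Hypothetical in-memory stand-in for an Endpoint; not a SubModel API."""

    def __init__(self, name: str, max_workers: int) -> None:
        self.name = name
        self.max_workers = max_workers
        self.deleted = False

    def update(self, max_workers: int) -> None:
        """Mirror of 'Edit Endpoint' followed by 'Update'."""
        if max_workers < 0:
            raise ValueError("max_workers must be 0 or greater")
        self.max_workers = max_workers

    def delete(self) -> None:
        """Mirror of 'Delete Endpoint'; assumes workers must be at 0 first."""
        if self.max_workers != 0:
            raise RuntimeError("set Max Workers to 0 and update before deleting")
        self.deleted = True


endpoint = EndpointModel("demo-endpoint", max_workers=3)
endpoint.update(max_workers=0)  # step 2: set Max Workers to 0
endpoint.delete()               # step 3: delete once no workers remain
```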

Edit an Endpoint

You can modify a running Endpoint through the Web interface after deployment.

  1. Select the Endpoint you want to edit.

  2. Click Edit Endpoint and make the necessary changes.

  3. Click Update.

Set GPU Prioritization for an Endpoint

When creating or modifying a Worker Endpoint, list your preferred GPU models in order of priority.

SubModel first attempts to allocate your top choice. If that GPU is unavailable, the system automatically falls back to the next available GPU in your list.
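The fallback behavior described above amounts to picking the first entry in the priority list that is currently available. The following is a minimal sketch of that idea, not SubModel's actual allocator: the function name `pick_gpu`, the placeholder GPU names, and the choice to return `None` when nothing is available are all assumptions for illustration.

```python
from typing import Optional, Sequence, Set


def pick_gpu(priority: Sequence[str], available: Set[str]) -> Optional[str]:
    """Return the highest-priority GPU that is available, or None if none is."""
    for gpu in priority:
        if gpu in available:
            return gpu
    return None


# Placeholder GPU names; the first choice is unavailable in this example.
priority = ["GPU_A", "GPU_B", "GPU_C"]
available = {"GPU_B", "GPU_C"}
print(pick_gpu(priority, available))  # prints GPU_B
```

Because the list is scanned in order, putting the most preferred model first is what gives it precedence whenever it is available.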

  1. Choose the Endpoint you wish to update.

  2. Set the priority for the GPUs you prefer.

  3. Click Update.

To force a configuration update:

  • Set Max Workers to 0.

  • Click Update.

  • Adjust your Max Workers back to the desired value.
