Creating an Endpoint in Generative AI

Create an endpoint for a pretrained foundational model or for a custom model in Generative AI.

  1. In the navigation bar of the Console, select a region with Generative AI, for example, US Midwest (Chicago). If you don't know which region to select, see Regions with Generative AI.
  2. Open the navigation menu and click Analytics & AI. Under AI Services, click Generative AI.
  3. In the left navigation, choose the compartment that contains the custom model that you want to add an endpoint for.
  4. Perform one of the following actions:
    • To create an endpoint for a custom model with the model name and version pre-populated:
      1. Click Custom models.
      2. Click the name of the custom model that you want to add an endpoint for.
      3. Note the base model of the custom model, for example, cohere.command.light 15.6, so that you can match it to a cluster in a later step.
      4. Under Resources, click Endpoints.
      5. Click Create endpoint.
    • To create an endpoint for any out-of-the-box pretrained or custom model:
      1. Click Endpoints.
      2. Click Create endpoint.
  5. (Optional) Enter a name for the endpoint. Start the name with a letter or underscore, followed by letters, numbers, hyphens, or underscores. The length can be 1 to 255 characters. If you leave the name blank, the system generates a name that you can change later.

    The generated name has the format generativeaiendpoint<timestamp>, for example:

    generativeaiendpoint20240119172620

  6. (Optional) To moderate the model's generated responses, toggle on Content moderation. This option is off by default, and you can turn it on later by editing the endpoint. Learn about Content Moderation.
  7. If the model isn't already selected, choose the model name and version that you want to add an endpoint for.
    Tip

    • If the model is in a different compartment than the current compartment, click Change compartment and choose the compartment that hosts the model. We recommend that you create the endpoint in the same compartment as the model.
    • If the custom model that you're looking for isn't listed, click Cancel. Then under Generative AI, click Custom models and ensure that the custom model is in an active state.
  8. Choose a hosting dedicated AI cluster by performing one of the following actions:
    • If you already have a cluster, choose a Dedicated AI cluster from the drop-down list. If you just created a cluster, wait for that cluster to become active. Ensure that the base model that's associated with this cluster matches the base model of the custom model.
    • To create a cluster, in the Dedicated AI cluster drop-down list, click Create new dedicated AI cluster and perform the following steps:
      1. (Optional) Enter a name and description.
      2. Choose a Base model that matches the base model of the model that you want to host.
      3. Set the instance count to 1 for the endpoint. When you create a cluster, you need at least one unit for an endpoint. For an existing cluster, you can use that same unit to host new endpoints. Each instance hosts all the active endpoints. Going from 1 to 2 instances doubles the number of supported requests per minute (RPM) for all active endpoints hosted on the cluster.
      4. Read the commitment unit hours for the hosting dedicated AI cluster and click to agree with the commitment.
      5. Click Create and wait for the cluster to become active.
      6. From the Dedicated AI cluster drop-down list, click the dedicated AI cluster that you created.
  9. Click Create endpoint.
    You're directed to the endpoint details page where you can track the state of the endpoint.
  10. After the endpoint is active, click View in playground and start using the model from this endpoint.
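
The naming rule in step 5 can be checked before you fill in the form. The following is a minimal sketch in Python, not part of the service or its SDK. The helper names is_valid_endpoint_name and generated_endpoint_name are hypothetical, "letters" is assumed to mean ASCII letters, and the timestamp layout (YYYYMMDDHHMMSS, in UTC) is inferred from the example name generativeaiendpoint20240119172620 rather than from a documented format.

```python
import re
from datetime import datetime, timezone

# Assumption: "letters" means ASCII letters. The first character is a letter
# or underscore, followed by up to 254 letters, digits, hyphens, or
# underscores, for a total length of 1 to 255 characters.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_-]{0,254}")


def is_valid_endpoint_name(name: str) -> bool:
    """Return True if name follows the rule described in step 5."""
    return NAME_PATTERN.fullmatch(name) is not None


def generated_endpoint_name(now=None) -> str:
    """Build a name shaped like the system-generated default.

    Assumption: the timestamp is YYYYMMDDHHMMSS, inferred from the example
    name in step 5. The time zone the service uses is not stated; UTC is
    used here only as a placeholder.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return "generativeaiendpoint" + now.strftime("%Y%m%d%H%M%S")


if __name__ == "__main__":
    for candidate in ("my_endpoint-1", "_staging", "1st-endpoint", ""):
        print(repr(candidate), is_valid_endpoint_name(candidate))
    print(generated_endpoint_name())
```

A name such as my_endpoint-1 passes the check, while 1st-endpoint fails because it starts with a digit. The Console remains the authority on what it accepts, so treat this as a quick local check only.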