# How To

This section is designed to help you quickly find clear, actionable guidance on how to use the Model Hub tool effectively.

Refer to the following pages to learn more:

  • Add a New Model
  • Add AWS Bedrock Model
  • Add Amazon SageMaker Model
  • Add Azure AI Studio Model
  • Add Azure AI OpenAI Model
  • Add a Cohere Model
  • Add a Gemini Model
  • Enable and Disable Models

# Add a New Model

  1. Navigate to the Model Hub from the Admin module
  2. Browse the list of available models, grouped under PF managed and Custom Models
  3. Click ‘Add Model’ at the top right of the page
  4. A list of Model Providers currently supported by the platform will be displayed for you to select from
  5. Select the Model Provider for the model you want to add
  6. A list of prerequisites will appear
  7. Once you have provided all the prerequisites, save the model. Once saved, it becomes immediately available for use across the platform

# Add AWS Bedrock Model

To get started with AWS Bedrock APIs, you need to meet the following prerequisites:

  1. AWS Account: You must have an active AWS account. If you do not have one, you can create a new account on the AWS website
  2. IAM Permissions: Ensure that your AWS Identity and Access Management (IAM) user has the necessary permissions to access AWS Bedrock. You may need permissions for services like Amazon SageMaker and other related services
  3. AWS SDK or CLI: Familiarity with the AWS SDK (e.g., Boto3 for Python) or the AWS Command Line Interface (CLI) to interact with the Bedrock APIs
  4. Region Availability: Check that AWS Bedrock is available in your desired AWS region. As of the latest updates, Bedrock may not be available in all regions
  5. Understanding of APIs: Basic knowledge of how to make API calls and handle responses, as well as familiarity with RESTful APIs
For more detailed information, refer to the official AWS documentation on Bedrock: Getting Started with AWS Bedrock.

Once all prerequisites are met, you can proceed to configure. Fill out the following details to add an AWS Bedrock model:

LLMs / Embeddings
Select the type of model you want to configure.

  • LLMs: Large Language Models used for tasks within the Expert Agent Studio
  • Embeddings: Models that convert text into vector representations for embedding the documents in the Knowledge hub

Model ID
This dropdown lists the available model identifiers supported by Purple Fabric from AWS Bedrock (e.g., anthropic.claude-v2, amazon.titan-tg1-large). It helps the system route requests correctly to the chosen foundation model on Bedrock.

Display Name
A user-friendly name that will appear in the Model Hub or selection dropdowns (e.g., “Claude 2 - Bedrock”). This name is for display purposes only and does not affect runtime behavior.

Region
The AWS region where the Bedrock model is deployed (e.g., us-east-1). This ensures requests are directed to the correct Bedrock endpoint for model inference.

Access Key
The AWS access key associated with the IAM user or role that has permissions to invoke Bedrock APIs. This is required for authenticating API requests to AWS.

Secret Key
The secret key corresponding to the provided access key. Together with the access key, this enables secure access to the Bedrock service.

Deployment Type
A dropdown to specify how the model is deployed and consumed

  • On-Demand - This deployment offers a pay‑per‑call model where you only pay for the requests you make. It’s ideal for low to medium volume, bursty workloads. While it offers flexibility and no infrastructure reservation, latency may vary under heavy load
  • Provisioned - Provisioned deployments use reserved compute capacity called Provisioned Throughput Units (PTUs). You select a consistent capacity level upfront, which delivers predictable performance and lower latency for high-throughput use cases. This is suitable for workloads that require stable, large-scale inference
  • Batch - Batch deployments are optimized for asynchronous, high-volume processing. You submit large input files and receive results once processing completes, typically within 24 hours. This comes at a lower cost than real-time deployments but doesn’t support immediate response. Useful for document summarization, dataset processing, or bulk classification
  • Custom Model Import - This option supports deploying custom or open-source models using dedicated infrastructure
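For orientation, the fields above map directly onto a Bedrock API call. Below is a minimal Boto3 sketch; the model ID, region, and credential values are placeholders, and the request body assumes the anthropic.claude-v2 text-completion format:

```python
import json

# Placeholder values; substitute the Model ID, Region, Access Key, and
# Secret Key you enter in the Model Hub form.
MODEL_ID = "anthropic.claude-v2"
REGION = "us-east-1"
ACCESS_KEY = "<aws-access-key>"
SECRET_KEY = "<aws-secret-key>"

def build_claude_body(prompt: str, max_tokens: int = 256) -> str:
    """Build the JSON request body expected by Claude v2 on Bedrock."""
    return json.dumps({
        "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
        "max_tokens_to_sample": max_tokens,
    })

def invoke(prompt: str) -> str:
    """Send the prompt to Bedrock and return the completion text."""
    import boto3  # imported lazily; requires IAM permissions for Bedrock
    client = boto3.client(
        "bedrock-runtime",
        region_name=REGION,
        aws_access_key_id=ACCESS_KEY,      # the Access Key field
        aws_secret_access_key=SECRET_KEY,  # the Secret Key field
    )
    response = client.invoke_model(
        modelId=MODEL_ID,
        body=build_claude_body(prompt),
        contentType="application/json",
        accept="application/json",
    )
    return json.loads(response["body"].read())["completion"]
```

Other model families on Bedrock expect different body schemas, so check the body format for the Model ID you select.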

# Add Amazon SageMaker Model

Prerequisites
To get started with AWS SageMaker, you need to meet the following prerequisites:

  • AWS Account: You must have an active AWS account. If you do not have one, you can create a new account on the AWS website
  • IAM Permissions: Ensure that your AWS Identity and Access Management (IAM) user has the necessary permissions to access AWS SageMaker. You may need permissions for related services such as Amazon S3, AWS Lambda, and others
  • AWS SDK or CLI: Familiarity with the AWS SDK (e.g., Boto3 for Python) or the AWS Command Line Interface (CLI) to interact with SageMaker
  • Data Storage: Set up an Amazon S3 bucket to store your training data and model artifacts, as SageMaker uses S3 for data input and output
  • Understanding of Machine Learning: A basic understanding of machine learning concepts and workflows, as well as familiarity with the specific algorithms you plan to use
  • Programming Environment: Set up a programming environment where you can write and execute your code, such as Jupyter notebooks or a local development environment
For more detailed information, refer to the official AWS SageMaker documentation: AWS SageMaker Documentation.

Once the prerequisites are met, you can proceed to configure the model. Add the following details to add an Amazon SageMaker model:

Embeddings/ReRanker
Select the type of model you want to configure.

  • Embeddings: Select this if the SageMaker model generates vector embeddings
  • Reranker: Select this if the model is used to reorder a list of results based on relevance

Model ID
A dropdown field listing available SageMaker model IDs deployed in your AWS account.

Display Name
A user-friendly name that will appear in the Model Hub or selection dropdowns. This name is for display purposes only and does not affect runtime behavior.

Endpoint
The fully qualified SageMaker endpoint URL used to invoke the model. It must correspond to the chosen model ID. Note: This is critical for routing inference requests to SageMaker.
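To illustrate how the endpoint is used at inference time, here is a minimal Boto3 sketch for invoking a SageMaker endpoint. The endpoint name and the `{"inputs": ...}` payload shape are assumptions for a typical embedding container, not platform requirements:

```python
import json

# Placeholder values; use the endpoint deployed in your AWS account.
ENDPOINT_NAME = "my-embedding-endpoint"
REGION = "us-east-1"

def build_payload(texts: list[str]) -> str:
    """JSON payload for a typical text-embedding container (assumed shape)."""
    return json.dumps({"inputs": texts})

def embed(texts: list[str]) -> list:
    """Invoke the SageMaker endpoint and return the parsed response."""
    import boto3  # requires IAM permission sagemaker:InvokeEndpoint
    client = boto3.client("sagemaker-runtime", region_name=REGION)
    response = client.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=build_payload(texts),
    )
    return json.loads(response["Body"].read())
```

The exact payload and response schema depend on the container behind the endpoint, so match them to your deployed model.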

# Add Azure AI Studio Model

Prerequisites for Using Azure AI Foundry Portal and Hosting Serverless LLM, Embedding, or Reranker Models

General Prerequisites

  1. Azure Subscription - An active Azure subscription is required to access and use Azure AI Foundry services.
  2. User Permissions - Ensure that your Azure account has one of the following roles:
  • Contributor or Owner Role - Provides full access to projects, including the ability to assign permissions to other users.
  • Azure AI Developer Role - Allows users to perform most actions, such as creating deployments, but does not permit assigning permissions to project users.

For more details, refer to Microsoft's documentation on Role-based access control in Azure AI Foundry.

Resource Group Permissions

  • Your account must have appropriate permissions on the resource group containing the AI project
  • Deploying Models as Serverless APIs - Azure AI Foundry enables you to deploy models as serverless API endpoints, allowing you to consume models without managing infrastructure

Steps to Deploy a Model as a Serverless API:

  1. Model Subscription
  • Non-Microsoft models: Subscribe to the desired model via Azure Marketplace
  • Microsoft models (e.g., Phi-3 models): No subscription required
  2. Deployment Process
  • Navigate to the Azure AI Foundry portal and select the model to deploy
  • Click on 'Deploy' → Select 'Serverless API with Azure AI Content Safety (preview)'
  • Choose the appropriate project and region
  • Assign a unique name to your deployment (part of the API endpoint URL)
  • Configure content filtering settings if needed
  • Click 'Deploy' and wait for the process to complete
  3. Accessing the Deployment
    After deployment, access the Target URI and authentication keys via the Azure AI Foundry portal. Use these details to integrate the deployed model into your applications. For detailed guidance, refer to Deploy models as serverless APIs - Azure AI Foundry.

  4. Additional Considerations
    Region Availability - Ensure the selected Azure region supports your model and deployment type
    Content Filtering - Utilize Azure AI Content Safety to detect and manage harmful content
    Cost Management - Be aware of the billing structure based on token usage and minimal infrastructure costs

By fulfilling these prerequisites and following the outlined steps, you can effectively deploy serverless LLMs, embeddings, or reranker models using Azure AI Foundry.

Once the prerequisites are met, add the following details to add a model from Azure AI Studio:

LLMs/ReRanker
Select the type of model you want to configure.

  • LLMs: Large Language Models used for tasks within the Expert Agent Studio
  • Reranker: Select this if the model is used to reorder a list of results based on relevance

Model ID
A unique identifier assigned to the model within Azure AI Studio. It tells the platform exactly which model to route requests to

Display Name
A user-friendly name that will appear in the platform UI

API Key
A secure token provided by Azure AI Studio to authenticate API requests. Ensures only authorized applications can access the model

Endpoint
The full URL where the model can be accessed for inference. This is where the platform sends requests when invoking the model
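Putting the Endpoint and API Key fields together, a serverless deployment can be called over plain HTTPS. The sketch below uses only the standard library; the endpoint URL, chat-completions path, and key format are illustrative placeholders taken from a typical serverless deployment, not values defined by this document:

```python
import json
import urllib.request

# Placeholder values; use the Target URI and key from your deployment.
ENDPOINT = "https://my-deployment.eastus2.models.ai.azure.com/v1/chat/completions"
API_KEY = "<your-api-key>"

def build_request(prompt: str) -> urllib.request.Request:
    """Assemble an authenticated chat-completion request for the endpoint."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )

# Sending the request (requires network access and a valid key):
# with urllib.request.urlopen(build_request("Hello")) as resp:
#     print(json.load(resp))
```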

# Add Azure AI OpenAI Model

Prerequisites for using Azure OpenAI

  • An Azure subscription
  • Python: Version 3.8 or later
  • Python Libraries: The following Python library is required: os (included in the standard library)
  • Azure OpenAI Service Resource: An Azure OpenAI Service resource with either the gpt-35-turbo or the gpt-4 models deployed. For more information about model deployment, see the resource deployment guide

Once the requirements are met, you can proceed to configure. The following details need to be provided:

LLMs / Embeddings
Select the type of model you want to configure.

  • LLMs: Large Language Models used for tasks within the Expert Agent Studio
  • Embeddings: Models that convert text into vector representations for embedding the documents in the Knowledge hub

Model ID
A unique identifier assigned to the model within Azure OpenAI. It tells the platform exactly which model to route requests to

Deployment Name
The deployment name is a user-defined identifier assigned when deploying an OpenAI model (like GPT-3.5 Turbo or GPT-4) within your Azure OpenAI resource

Display Name
A user-friendly name that will appear in the platform UI

API Key
A secure token provided by Azure OpenAI to authenticate API requests. Ensures only authorized applications can access the model

Endpoint
The full URL where the model can be accessed for inference. This is where the platform sends requests when invoking the model

Deployment Type
A dropdown to specify how the model is deployed and consumed

  • On-Demand: This deployment offers a pay‑per‑call model where you only pay for the requests you make. It’s ideal for low to medium volume, bursty workloads. While it offers flexibility and no infrastructure reservation, latency may vary under heavy load
  • Provisioned: Provisioned deployments use reserved compute capacity called Provisioned Throughput Units (PTUs). You select a consistent capacity level upfront, which delivers predictable performance and lower latency for high-throughput use cases. This is suitable for workloads that require stable, large-scale inference
  • Batch: Batch deployments are optimized for asynchronous, high-volume processing. You submit large input files and receive results once processing completes, typically within 24 hours. This comes at a lower cost than real-time deployments but doesn’t support immediate response. Useful for document summarization, dataset processing, or bulk classification
  • Custom Model Import: This option supports deploying custom or open-source models using dedicated infrastructure
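The Endpoint, Deployment Name, and API Key fields combine into the Azure OpenAI request URL. A minimal standard-library sketch follows; the resource endpoint, deployment name, and API version below are hypothetical examples, not required values:

```python
import json
import urllib.request

# Placeholder values; substitute your resource endpoint, deployment name,
# and key from the Azure portal.
ENDPOINT = "https://my-resource.openai.azure.com"
DEPLOYMENT_NAME = "gpt-4-deployment"
API_KEY = "<your-api-key>"
API_VERSION = "2024-02-01"  # example API version

def build_request(prompt: str) -> urllib.request.Request:
    """Chat-completion request routed to a named Azure OpenAI deployment."""
    url = (f"{ENDPOINT}/openai/deployments/{DEPLOYMENT_NAME}"
           f"/chat/completions?api-version={API_VERSION}")
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode()
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json", "api-key": API_KEY},
    )
```

Note that Azure OpenAI routes by deployment name rather than by model name, which is why the Deployment Name field matters at runtime.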

# Add a Cohere Model

Prerequisites for using Cohere APIs
To get started with Cohere APIs, you need to meet the following prerequisites:

  1. Cohere Account: You must create an account on the Cohere website. This will provide you with access to the API and your API key
  2. API Key: After creating your account, you will receive an API key. This key is required to authenticate your requests to the Cohere API
  3. Programming Environment: Set up a programming environment where you can make HTTP requests. This can be done using various programming languages such as Python, JavaScript, etc.
  4. HTTP Client: Familiarity with an HTTP client or library (like requests in Python or axios in JavaScript) to make API calls
  5. Basic Understanding of APIs: A basic understanding of how to interact with RESTful APIs, including making requests and handling responses

For more detailed information, you can refer to the official Cohere API documentation

Once the prerequisites are met, you can proceed to configure. The following details need to be added to use a Cohere model:

Model ID
The specific identifier for the model you want to use within Cohere’s ecosystem

Display Name
A user-defined, human-readable name shown in the internal model catalog

API Key
A secret token provided by Cohere to authorize access to its API. You can find it in the Cohere dashboard under the API Keys section
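The Model ID and API Key fields are all that a direct Cohere call needs. A minimal standard-library sketch, assuming Cohere's v2 chat endpoint and an example model name:

```python
import json
import urllib.request

# Placeholder values; the model name is an example from Cohere's catalog.
API_KEY = "<your-cohere-api-key>"
MODEL_ID = "command-r"

def build_request(message: str) -> urllib.request.Request:
    """Chat request against Cohere's v2 chat endpoint (assumed API shape)."""
    body = json.dumps({
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": message}],
    }).encode()
    return urllib.request.Request(
        "https://api.cohere.com/v2/chat",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
    )
```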

# Add a Gemini Model
Prerequisites to Host Google Gemini Models

  1. Google Cloud Account: Ensure you have an active Google Cloud account.
  2. Project Setup: Create a new project in the Google Cloud Console or use an existing one.
  3. Enable the necessary APIs:
  • Vertex AI API
  • Vertex AI in Firebase API (if integrating with Firebase)
  4. Billing Configuration: Upgrade your project to the Blaze (pay-as-you-go) pricing plan to access Vertex AI services
  5. Authentication and Permission: Set up appropriate authentication methods:
    For server-side applications, configure a Google Cloud service account with the necessary permissions
    For mobile or web applications, utilize the Vertex AI in Firebase SDKs, which offer built-in security features like Firebase App Check
  6. SDK Integration: Depending on your development environment, integrate the relevant SDKs
  7. Model Deployment: Use Vertex AI to deploy and manage your Gemini models. Vertex AI offers a suite of MLOps tools to streamline usage, deployment, and monitoring, ensuring efficiency and reliability

Once the prerequisites are met, you can proceed to configure. You will need to add the following details to add a Gemini model:

Model ID
The exact identifier of the Gemini model variant you are using. Specifies which model your requests are sent to for inference tasks

Display Name
A custom, user-defined name shown within the platform's UI

API Key
A secure credential from Google Cloud used to authorize API requests
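Since the configuration above uses an API key, the request shape resembles Google's key-authenticated generateContent call rather than a Vertex AI service-account flow. A minimal standard-library sketch; the model variant and key are placeholders:

```python
import json
import urllib.request

# Placeholder values; the model variant is an example.
API_KEY = "your-google-api-key"
MODEL_ID = "gemini-1.5-flash"

def build_request(prompt: str) -> urllib.request.Request:
    """generateContent request for the Gemini API (assumed endpoint shape)."""
    url = (
        "https://generativelanguage.googleapis.com/v1beta/"
        f"models/{MODEL_ID}:generateContent?key={API_KEY}"
    )
    body = json.dumps({"contents": [{"parts": [{"text": prompt}]}]}).encode()
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
```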

# Enable and Disable Models

Tenant Admins can enable or disable any model visible in the catalog:

  • Enable Model: Makes it available across Expert Agent Studio, APIs, and Benchmarking
  • Disable Model: Removes the model from active use. Warning banners appear if any agents still depend on it