A Knowledge Base is a centralized, enterprise-specific repository of information that is structured for human comprehension and optimized for machine understanding. It uses artificial intelligence to draw inferences and to provide contextually relevant, personalized responses to user queries.
Enterprise-specific knowledge bases play a pivotal role in Gen AI solutions for enterprises by offering tailored insights and solutions that are highly relevant to the organization’s unique context and requirements.
For example, consider a business user tasked with analyzing portfolio performance for a group of clients. Instead of manually sifting through numerous documents or databases, the business user interacts with the firm’s AI-powered Agent. They input a query such as, “Retrieve recent portfolio returns for clients in the XYZ investment group.” Integrated with the Knowledge Base, the system comprehends the user’s request and swiftly retrieves the relevant portfolio performance data for clients within the specified investment group.
Users must have the Gen AI User policy to create a Knowledge Base.
This guide walks you through the steps to create a Knowledge Base:
- Upload documents
- Initiate knowledge base creation
- Import documents
- Configure a chunking strategy
- Choose an embedding model
- Metadata and tagging for chunks
- Experiment/test the knowledge base
- Publish the knowledge base
Step 1: Upload documents
We recommend uploading the required documents to the Document Library so that they are available during Knowledge Base creation. For more information on how to upload documents, see Upload documents.
Note: Skip this step if you have already uploaded the required documents in the Document Library.
Step 2: Initiate knowledge base creation
- Go to Asset Studio and choose Knowledge hub.
- In the Knowledge hub section, click Create Knowledge.
- In the Create Knowledge base window, enter a unique Knowledge Base Name and a Description.
- Click Create to initiate the creation of the Knowledge Base.
Step 3: Import documents
- On the Knowledge Base creation page, click the menu icon and select Import documents.
- In the Documents window that appears, select the required documents.
Filter
- Click the filter icon, choose the appropriate filters to narrow down the documents you are searching for, and then click Apply.
- You can view the applicable results based on the chosen filters.
- Select the required documents and click X (close) to import the documents.
Step 4: Configure a chunking strategy
Document chunking breaks large documents or data sets into smaller, more manageable chunks for processing. It improves the performance and cost efficiency of Gen AI platforms by enabling parallel processing, resource optimization, scalability, and fault tolerance.
A suitable chunking strategy helps the LLM retrieve information more accurately, process it faster, and understand it better.
- On the Import image message that appears, click Chunk Now.
- Alternate: In the menu tab, click Chunk viewer and then select Configuration.
- In the Configuration window that appears, choose any one of the following chunk strategies.
- Block: Choose this option if you wish to chunk the documents by blocks. Suitable for documents with diverse sections or topics where each block may represent a distinct segment requiring individual processing.
- Page: Choose this option if you wish to chunk the documents by pages. Appropriate for documents with consistent and uniform content, where dividing by page ensures even distribution and manageable sections.
- Word: Choose this option if you wish to chunk the documents by words. Beneficial for content where word-level context is paramount.
- Set word limit: Specify the maximum number of words per chunk so that the model processes a consistent amount of text, which helps accuracy and coherence.
Note: You can also choose an embedding model in this step. For more information, see Choose an embedding model.
- Click Update changes.
- You can now view the chunks for each document that you have selected.
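The three chunking strategies above can be pictured with a minimal Python sketch. This is only an illustration of the general idea, not the platform's implementation: the function names (`chunk_by_blocks`, `chunk_by_pages`, `chunk_by_words`) are hypothetical, and the sketch assumes that a "block" is a run of text separated by blank lines and that pages arrive as a list of strings.

```python
import re
from typing import List


def chunk_by_blocks(text: str) -> List[str]:
    """Block strategy (assumption: blocks are separated by blank lines)."""
    return [block.strip() for block in re.split(r"\n\s*\n", text) if block.strip()]


def chunk_by_pages(pages: List[str]) -> List[str]:
    """Page strategy: each non-empty page becomes one chunk."""
    return [page.strip() for page in pages if page.strip()]


def chunk_by_words(text: str, word_limit: int) -> List[str]:
    """Word strategy: consecutive chunks of at most `word_limit` words."""
    if word_limit <= 0:
        raise ValueError("word_limit must be a positive integer")
    words = text.split()
    return [
        " ".join(words[start:start + word_limit])
        for start in range(0, len(words), word_limit)
    ]
```

With a word limit of 2, the text "one two three four five" yields three chunks, and the last chunk holds the single leftover word. A smaller limit gives more, shorter chunks; a larger limit keeps more context together in each chunk.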
Step 5: Choose an embedding model
A vector embedding model is like a special tool that helps a system understand the meaning of words and sentences in a simpler way. It works by turning words and sentences into numbers, which the computer can easily work with.
The embedding model converts each chunk into a numerical vector representation of its content, which the system can compare and retrieve when the LLM generates output.
- After chunking, click RAG Viewer and then click Configuration.
- Alternate: In the menu tab, click Chunk viewer and then select Configuration.
- In the configuration window that appears, enable the Embedding option.
- Choose any one of the following embedding models to convert the Chunks into Numeric vector representation.
- Azure OpenAI Text Ada 002
- BGE Large
- Azure OpenAI Text Embedding 3 Small
- Click Update changes to initiate embedding.
- You can identify the embedded chunks with an Embedded label.
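To make the idea of "turning text into numbers" concrete, here is a toy Python sketch. It is not how the models listed above work: those produce learned dense vectors, whereas this stand-in simply hashes words into a fixed number of buckets. The names `toy_embed` and `dot_similarity` and the 16-dimension default are assumptions made for illustration only.

```python
import hashlib
import math
from typing import List


def toy_embed(text: str, dims: int = 16) -> List[float]:
    """Map text to a fixed-length vector by hashing each word into a bucket.

    A stand-in for a real embedding model: it ignores word order and meaning,
    and unrelated words can collide in the same bucket.
    """
    vector = [0.0] * dims
    for word in text.lower().split():
        digest = hashlib.md5(word.encode("utf-8")).hexdigest()
        vector[int(digest, 16) % dims] += 1.0
    # Scale to unit length so that the dot product equals cosine similarity.
    norm = math.sqrt(sum(value * value for value in vector))
    return [value / norm for value in vector] if norm else vector


def dot_similarity(a: List[float], b: List[float]) -> float:
    """Dot product of two vectors (cosine similarity for unit-length inputs)."""
    return sum(x * y for x, y in zip(a, b))
```

Every chunk, whatever its length, becomes a vector of the same size, so any two chunks (or a chunk and a query) can be compared with a single similarity score. A real embedding model follows the same pattern but places texts with similar meaning close together.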
Step 6: Metadata and tagging for chunks
Add metadata
Metadata is a systematic way to communicate information about content. It is significant because it facilitates the discovery, usage, and preservation of that content by establishing a consistent mechanism and terminology.
Metadata for documents could include information such as title, author, publication date, journal or conference name, abstract, keywords, citations, and more.
- On the Knowledge base creation page, select the chunks to which you wish to add metadata.
- Select the Add metadata option that appears against the selected chunk(s).
- In the Metadata window that appears, enter the Name and Value that you wish to add as metadata for the selected chunks.
- Optional: Use the add icon to add more metadata.
- Click Submit and view the added metadata details at the bottom of the chunk.
Add Tag
Tagging in document retrieval involves assigning descriptive keywords or labels to documents based on their content or user-defined categories.
Tagging helps to organize and retrieve information effectively, speeding up search and improving the efficiency and performance of the model.
- On the Knowledge base creation page, select the chunks to which you wish to add a tag.
- Select the Add tag option that appears against the selected chunk(s).
- In the Tags window that appears, enter the tag name, press Enter, and then click Submit.
- Click the icon to view the added tags at the bottom of the chunks.
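One way to picture what this step attaches to a chunk is a small record holding the text, its name/value metadata, and its tags. The sketch below is illustrative only: the `Chunk` class, its methods, and the sample values are hypothetical and do not reflect the platform's internal data model. Lower-casing tags is also an assumption made here to keep tag matching consistent.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class Chunk:
    """A chunk of text plus the metadata and tags attached to it."""
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)
    tags: Set[str] = field(default_factory=set)

    def add_metadata(self, name: str, value: str) -> None:
        # A later value for the same name replaces the earlier one.
        self.metadata[name] = value

    def add_tag(self, tag: str) -> None:
        # Assumption: tags are normalised to lower case; duplicates collapse.
        self.tags.add(tag.strip().lower())


# Made-up sample values, for illustration only.
chunk = Chunk(text="Sample chunk text about portfolio returns.")
chunk.add_metadata("author", "Jane Doe")
chunk.add_metadata("publication date", "2024-01-15")
chunk.add_tag("Returns")
chunk.add_tag("returns")
```

Metadata carries structured name/value facts about the chunk, while tags are free-form labels. Both can later be used to narrow down which chunks are considered for a query.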
Step 7: Experiment/test the knowledge base
- In the Knowledge base creation page, click RAG Viewer.
- In the Search bar, enter your query and press Enter.
- Based on the query you entered, you can view the respective chunks of information.
- Click the filter option and choose the metadata and tags associated with the chunks to narrow down the results.
- Click Apply to get the filtered chunks.
- You can add metadata or tags to the chunks to improve retrieval and speed up processing when this Knowledge Base is used in other use cases.
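The effect of filtering by metadata and tags before searching can be sketched as follows. This is a simplified illustration, not the RAG Viewer's actual logic: the `search` function and its parameters are hypothetical, the sample chunks are made up, and plain word overlap stands in for the embedding-based similarity a real retrieval system would use.

```python
from typing import Dict, List, Optional, Set


def search(
    chunks: List[dict],
    query: str,
    required_tags: Optional[Set[str]] = None,
    required_metadata: Optional[Dict[str, str]] = None,
    top_k: int = 3,
) -> List[str]:
    """Return the texts of the best-matching chunks that pass the filters."""
    query_words = set(query.lower().split())
    scored = []
    for chunk in chunks:
        # Filter step: skip chunks missing any required tag or metadata value.
        if required_tags and not required_tags <= set(chunk["tags"]):
            continue
        if required_metadata and any(
            chunk["metadata"].get(name) != value
            for name, value in required_metadata.items()
        ):
            continue
        # Scoring step: count shared words (stand-in for vector similarity).
        score = len(query_words & set(chunk["text"].lower().split()))
        if score > 0:
            scored.append((score, chunk["text"]))
    # Stable sort: chunks with equal scores keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]


# Made-up sample chunks, for illustration only.
sample_chunks = [
    {"text": "portfolio returns for growth funds", "tags": ["returns"], "metadata": {"year": "2023"}},
    {"text": "portfolio risk summary", "tags": ["risk"], "metadata": {"year": "2023"}},
    {"text": "portfolio returns for income funds", "tags": ["returns"], "metadata": {"year": "2022"}},
]
```

Calling `search(sample_chunks, "portfolio returns")` considers all three chunks, while adding `required_tags={"returns"}` and `required_metadata={"year": "2023"}` leaves only the first one. Because fewer chunks need to be scored, filtering gives both more focused results and less work per query.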
Step 8: Publish the knowledge base
- Once the Knowledge Base is set up as desired, click Publish.
- The published Knowledge Base can be accessed in the Knowledge hub.