# Limitations

The following are the current limitations of the Enterprise Knowledge Garden (EKG) and its capabilities.

# File Upload

  • File Size: Individual files must be under 20 MB. Images must be under 5 MB.

  • File Security: Files that are encrypted, zipped, or password-protected are not supported.

  • File Format: Supported file formats are PDF, PNG, JPEG, TIFF, and DOCX.

  • PDF and TIFF Files: These files have a memory limit of 500 MB and a page limit of 3,000 pages.

  • PDF Dimensions: The maximum height and width are each 40 inches (2,880 points).
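
The upload limits above can be checked before a file is submitted. The following is a minimal sketch, not part of the EKG itself: the names `check_upload` and `pdf_page_fits` are hypothetical, 1 MB is assumed to mean 1024 × 1024 bytes, and TIFF is assumed to count as an image for the 5 MB limit. It checks only the extension and size; encrypted or password-protected files cannot be detected this way.

```python
from __future__ import annotations

from pathlib import PurePath

MB = 1024 * 1024             # assumption: the limits are stated in binary megabytes
MAX_FILE_BYTES = 20 * MB     # general per-file limit
MAX_IMAGE_BYTES = 5 * MB     # per-image limit
MAX_PDF_SIDE_POINTS = 2880   # 40 inches at 72 points per inch

DOC_EXTS = {".pdf", ".docx"}
# Assumption: TIFF is treated as an image alongside PNG and JPEG.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems = []
    ext = PurePath(filename).suffix.lower()
    if ext not in DOC_EXTS | IMAGE_EXTS:
        # Zipped archives and any other formats land here.
        problems.append(f"unsupported format: {ext or '(none)'}")
        return problems
    limit = MAX_IMAGE_BYTES if ext in IMAGE_EXTS else MAX_FILE_BYTES
    if size_bytes >= limit:  # files must be strictly under the limit
        problems.append(f"{size_bytes} bytes is not under the {limit // MB} MB limit")
    return problems


def pdf_page_fits(width_pt: float, height_pt: float) -> bool:
    """Check a PDF page's width and height (in points) against the maximum."""
    return width_pt <= MAX_PDF_SIDE_POINTS and height_pt <= MAX_PDF_SIDE_POINTS
```

For example, `check_upload("archive.zip", 1000)` reports an unsupported format, while a US Letter page (612 × 792 points) passes `pdf_page_fits`.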

# Language

  • Supported Language: The EKG currently supports English-language files only.

# Data Ingestion

  • External Sources: You cannot ingest data directly from external sources such as Amazon S3 through the EKG interface. To ingest data from these sources, use the Data Library or build a flow in the Flow Designer.

  • Connectors: Some connectors are read-only: they can retrieve data from a source but cannot create, modify, or update records in the source system.

For example, the current ServiceNow and IMAP connectors only support fetching information; you cannot use them to create new records or update existing ones directly from the platform.

  • HTML: Web crawlers are currently the only way to ingest HTML files.

# Data Retrieval

  • Chunking: When large files are split into chunks, context can be lost when the chunks are processed by the LLM, which can reduce the agent's retrieval accuracy and overall performance. Adding metadata to all chunks is recommended to mitigate this.

  • Metadata: The EKG cannot add metadata to chunks automatically. You must add it manually or build a custom flow in the Flow Designer.

  • Top-K Retrieval: The agent is limited to retrieving a maximum of 100 chunks per query, which may cause it to miss relevant information in large datasets.

  • Top-N Re-ranking: The re-ranking process is limited to refining only the top 50 of those retrieved chunks, potentially excluding other valuable context from the final response.

  • Indexing Delay: After embedding is completed, RAG retrieval may not immediately return results. We recommend waiting a few minutes for the data to be fully indexed in the database, after which it will be available for retrieval.
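
To illustrate how the Top-K and Top-N limits interact, the following is a simplified sketch, not the EKG's actual retrieval code. The function `retrieve_and_rerank` and its `similarity` and `rerank_score` callables are hypothetical stand-ins for the vector search and the re-ranker. It shows that a chunk ranked below position 50 by similarity never reaches the re-ranker, even if the re-ranker would have scored it highly.

```python
from __future__ import annotations

from typing import Callable, Sequence, TypeVar

Chunk = TypeVar("Chunk")

MAX_TOP_K = 100  # documented cap on chunks retrieved per query
MAX_TOP_N = 50   # documented cap on chunks passed to re-ranking


def retrieve_and_rerank(
    chunks: Sequence[Chunk],
    similarity: Callable[[Chunk], float],    # hypothetical first-stage score
    rerank_score: Callable[[Chunk], float],  # hypothetical re-ranker score
    top_k: int = MAX_TOP_K,
    top_n: int = MAX_TOP_N,
) -> list[Chunk]:
    """Retrieve up to top_k chunks, then re-rank only the best top_n of them."""
    k = min(top_k, MAX_TOP_K)     # requests above the cap are clamped
    n = min(top_n, MAX_TOP_N, k)  # cannot re-rank more than was retrieved
    retrieved = sorted(chunks, key=similarity, reverse=True)[:k]
    return sorted(retrieved[:n], key=rerank_score, reverse=True)


# 200 toy chunks: chunk 0 is the most similar, but the re-ranker prefers higher ids.
chunks = list(range(200))
result = retrieve_and_rerank(
    chunks, similarity=lambda c: -c, rerank_score=lambda c: c, top_k=500, top_n=80
)
print(len(result), result[0])  # prints "50 49"
```

In this toy run, chunk 60 is among the 100 retrieved chunks but is absent from the final result, because only the 50 most similar chunks are re-ranked.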