# Benchmark Automation Agents
## Manual Setup
Manually enter the input and expected output. This is useful when only a small number of queries need to be tested.
- Click + Field to add a row to the benchmarking schema
- Enter the Input
- Provide the Expected Output (the ideal response) manually
- The input and output can have multiple fields, depending on the input and output schema you defined while building the agent
- For multi-occurrence fields, click 'Add Value' to add multiple values for the same field
Note: For each line item in the output schema, a separate Expected Output field is added.
If you have selected 'File' as the input data type, click Browse and upload the file.
## Bulk Import
- Click Import
- Download the 'Document Template' (it illustrates the JSON structure supported by the system)
- Prepare the JSON file according to the template (a sample structure is sketched after these steps)
- Upload the file by dragging and dropping or by browsing your system
- Click Submit
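The authoritative structure is whatever the downloaded Document Template shows, so treat the snippet below as an illustrative sketch only. The field names (`input`, `expected_output`, `customer_query`, `intent`, `order_ids`) are assumptions standing in for the fields of your own input and output schema, and the use of an array for a multi-occurrence field is likewise an assumption to be checked against the template.

```json
[
  {
    "input": {
      "customer_query": "What is the status of order 1234?"
    },
    "expected_output": {
      "intent": "order_status",
      "order_ids": ["1234"]
    }
  },
  {
    "input": {
      "customer_query": "Cancel my subscription and refund the last charge."
    },
    "expected_output": {
      "intent": "cancellation",
      "order_ids": []
    }
  }
]
```

Each object in the array corresponds to one row of the benchmark, mirroring what you would otherwise enter manually with + Field.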
If 'File' is one of the input data types, you need to provide the "doc id" and "doc name" for each input file so the system can add it as an input (a sample reference is sketched after the steps below). Click the Documents tab (next to the Import button).
- The Documents dialog box appears, and you can import files directly here
- After the import is complete, an option to download the file list appears
- Click the 'Download file list' button
- A CSV file is downloaded; it contains the 'doc id' and 'doc name' for each uploaded file, which you reference within the import file
OR
- In the Documents dialog box, choose 'Select files from Doc Library'; the Doc Library appears
- Select the folder and then the files for which you want to download the file list
- You can also import additional files into the existing folders to update the library and then download the document list
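As a sketch of how a File-type input might reference an uploaded document, the row below points at a file using the "doc id" and "doc name" values taken from the downloaded file list. The key names and nesting (`invoice_file`, `total_amount`) are assumptions; the exact format is defined by the Document Template, and the id and name shown here are placeholders copied from the CSV.

```json
{
  "input": {
    "invoice_file": {
      "doc id": "a1b2c3d4",
      "doc name": "invoice_march.pdf"
    }
  },
  "expected_output": {
    "total_amount": "1,250.00"
  }
}
```

The important point is that the import file does not contain the document itself, only the id and name pair that lets the system resolve the uploaded file.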
After adding the Input and Expected Output for all the rows you want to include in the benchmark, you can proceed to add different tracks (Model and Prompt Variations) and configure the benchmarking metrics to start the benchmarking run.
These actions are not dependent on any specific order; you can add tracks and set metrics in whichever sequence suits your workflow.