# Benchmark Automation Agents
## Manual Setup
Manually enter the input and expected output. This is useful when only a small number of queries need to be tested.
- Click + Field to add a row to the benchmarking schema
- Enter the Input
- Provide the Expected Output (the ideal response) manually
- The input and output can have multiple fields, depending on the input and output schema you defined while building the agent
- For multi-occurrence fields, click 'Add Value' to add multiple values for the same field
Note: For each line item in the output schema, a separate Expected Output field is added.
If you have selected 'File' as the input data type, click Browse and upload the file.
## Bulk Import
- Click Import
- Download the 'Document Template' (it illustrates the JSON structure supported by the system)
- Prepare the JSON file according to the template (a sample structure is sketched after these steps)
- Upload the file by dragging and dropping or by browsing your system
- Click Submit
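The authoritative structure is whatever the downloaded Document Template shows, so treat the snippet below as an illustrative sketch only. The field names (`input`, `expected_output`, `customer_query`, `intent`, `order_ids`) are assumptions standing in for the fields of your own input and output schema, and the use of an array for a multi-occurrence field is likewise an assumption to be checked against the template.

```json
[
  {
    "input": {
      "customer_query": "What is the status of order 1234?"
    },
    "expected_output": {
      "intent": "order_status",
      "order_ids": ["1234"]
    }
  },
  {
    "input": {
      "customer_query": "Cancel my subscription and refund the last charge."
    },
    "expected_output": {
      "intent": "cancellation",
      "order_ids": []
    }
  }
]
```

Each object in the array corresponds to one row of the benchmark, mirroring what you would otherwise enter manually with + Field.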
If 'File' is one of the input data types, you need to provide the "doc id" and "doc name" for each input file so the system can add it as an input (a sample reference is sketched after the steps below). Click the Documents tab (next to the Import button).
- The Documents dialog box appears, and you can import files directly here
- After the import is complete, an option to download the file list appears
- Click the 'Download file list' button
- A CSV file is downloaded; it contains the 'doc id' and 'doc name' for each uploaded file, which you reference within the import file
OR
- In the Documents dialog box, choose 'Select files from Doc Library'; the Doc Library appears
- Select the folder and then the files for which you want to download the file list
- You can also import additional files into the existing folders to update the library and then download the document list
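As a sketch of how a File-type input might reference an uploaded document, the row below points at a file using the "doc id" and "doc name" values taken from the downloaded file list. The key names and nesting (`invoice_file`, `total_amount`) are assumptions; the exact format is defined by the Document Template, and the id and name shown here are placeholders copied from the CSV.

```json
{
  "input": {
    "invoice_file": {
      "doc id": "a1b2c3d4",
      "doc name": "invoice_march.pdf"
    }
  },
  "expected_output": {
    "total_amount": "1,250.00"
  }
}
```

The important point is that the import file does not contain the document itself, only the id and name pair that lets the system resolve the uploaded file.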
After adding the Input and Expected Output for all the rows you want to include in the benchmark, you can proceed to add different tracks (Model and Prompt Variations) and configure the benchmarking metrics to start the benchmarking run.
These actions are not dependent on any specific order; you can add tracks and set metrics in whichever sequence suits your workflow.