# Building a Self-Improving Multi-Agent System on Purple Fabric

# A Real-World Walkthrough: Automating Company Intelligence Reports

- Tarun Shah | Data Engineer

Agentic AI is moving from one-shot text generation to self-evolving systems — AI that acts, observes, and improves with every loop.
This post walks through how we built a self-improving marketing-intelligence agent system on Purple Fabric, leveraging its no-code orchestration, tool integration, and multi-agent capabilities.

The use case:
Given a company name and website, automatically discover public data (like annual reports and leadership pages), extract structured insights, assess completeness, and iterate until a high-confidence marketing report is ready for delivery.


# Why We Built It This Way

Generating a one-off company summary with an LLM is easy.
But for enterprise use, accuracy, coverage, and traceability are non-negotiable.
We needed:

  • Verified facts (CEO, HQ, revenue, ESG metrics)
  • Evidence with source URLs and document provenance
  • Self-review and correction without human babysitting

So we designed a multi-agent architecture where each agent has a narrow, auditable responsibility — all orchestrated inside Purple Fabric’s visual builder.


# Why “self-improvement” matters for agents

Traditional single-shot LLM calls often produce useful but incomplete results. Agentic systems in Purple Fabric solve more complex tasks by chaining “digital experts” and tools. Self-improvement adds an iterative review and directed follow-up mechanism so your system can:

  • catch missing facts or low-confidence outputs
  • refine strategies (e.g., choose a different parsing tool)

In research, iterative self-feedback and small adjustments have been shown to meaningfully improve task success and robustness. A dedicated “think” step is best suited to use cases that involve complex tool chains: analyzing tool outputs carefully across long sequences of tool calls, navigating policy-heavy environments with detailed guidelines, or making sequential decisions where each step builds on previous ones and mistakes are costly.

# System Architecture in Purple Fabric

At a high level, the workflow runs as a loop of agents coordinated by a main Orchestrator Agent:

Web Discovery → Document Harvester → Evidence Extractor → Think & Audit
↳ if incomplete → Remediation Loop (back to Think & Audit)
↳ if ready → Report Synthesizer → Compliance & Finalizer

Each agent is a Digital Expert inside Purple Fabric, connected through the orchestration canvas.


# 🔹 Agent A: Orchestrator

The Orchestrator is the control center.
It:

  • Receives {company_name, website_url} as input
  • Invokes all agents in sequence
  • Evaluates Think & Audit’s feedback
  • Triggers remediation cycles (up to two) when coverage gaps remain
  • Packages final results for reporting or further review

In Purple Fabric, this is defined as the primary workflow agent, using condition-based routing in the canvas.


# 🔹 Agent B: Web Discovery

This agent finds all the starting material:

  • Runs search_web (a custom Python tool integrated into Purple Fabric) using queries like
    "{company_name} investor relations site-{domain}"
    (note the enforced “site-” syntax to prevent malformed search strings)
  • Crawls the main domain with crawl_site to collect additional IR, leadership, and ESG links
  • Returns a ranked list of URLs — PDFs, HTML pages, and news sources

Output:
    {
      "company_name": "TechNova Inc.",
      "ranked_urls": [
        "https://technova.com/investor/Annual-Report-2024.pdf",
        "https://technova.com/about/leadership",
        "https://technova.com/sustainability"
      ],
      "credible_count": 3
    }

These URLs flow directly into the Document Harvester agent.
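To make the query and ranking logic concrete, here is a minimal sketch of what Web Discovery could do internally. `search_web` and `crawl_site` are the real Purple Fabric tools; `build_queries` and `rank_urls` are hypothetical helper names, and the templates and ranking heuristic are assumptions, not the production logic:

```python
from urllib.parse import urlparse

def build_queries(company_name: str, website_url: str) -> list[str]:
    """Build domain-scoped query strings to pass to search_web (assumed templates)."""
    domain = urlparse(website_url).netloc.removeprefix("www.")
    templates = [
        "{name} investor relations site-{domain}",
        "{name} leadership team site-{domain}",
        "{name} sustainability report site-{domain}",
    ]
    return [t.format(name=company_name, domain=domain) for t in templates]

def rank_urls(urls: list[str]) -> list[str]:
    """Rank likely annual-report PDFs ahead of HTML pages (assumed heuristic)."""
    return sorted(urls, key=lambda u: (not u.lower().endswith(".pdf"), u))
```

Ranking PDFs first is one plausible choice because annual reports tend to carry the densest verified facts; any credibility scoring would layer on top of this.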


# 🔹 Agent C: Document Harvester

Document Harvester turns raw URLs into structured, machine-readable text.

It:

  • Checks file types:
    • PDFs → processed by pdf_to_markdown and extract_tables
    • HTML → processed by html_to_markdown
  • Assigns roles (annual_report, leadership_page, esg_page, etc.)
  • Produces Markdown-formatted text + extracted tables with metadata

Example output snippet:
    {
      "role": "leadership_page",
      "source_url": "https://technova.com/about/leadership",
      "markdown": "# Leadership Team\nJane Smith – CEO\nMark Davis – CFO\n...",
      "meta": {
        "content_type": "text/html",
        "status": 200,
        "extraction_time": "2025-10-26T12:34:56Z"
      }
    }

All documents are accumulated across iterations for downstream extraction.
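The routing described above can be sketched in a few lines. The keyword heuristics and the `classify_role` / `plan_extraction` names are illustrative assumptions; only the tool names (`pdf_to_markdown`, `extract_tables`, `html_to_markdown`) come from the actual build:

```python
def classify_role(url: str) -> str:
    """Assign a coarse document role from URL keywords (assumed heuristic)."""
    u = url.lower()
    if "annual-report" in u or "annual_report" in u:
        return "annual_report"
    if "leadership" in u:
        return "leadership_page"
    if "sustainability" in u or "esg" in u:
        return "esg_page"
    return "other"

def plan_extraction(content_type: str) -> list[str]:
    """Pick which Purple Fabric custom tools to run for a fetched document."""
    if content_type == "application/pdf":
        return ["pdf_to_markdown", "extract_tables"]
    return ["html_to_markdown"]
```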


# 🔹 Agent D: Evidence Extractor

The Evidence Extractor parses the harvested markdown and emits factual statements as “evidence atoms”:

  • Fact (text)
  • Fact type (financials, leadership, ESG, positioning)
  • Confidence score
  • Source URL and anchor

Example:
    {
      "fact": "Jane Smith is the Chief Executive Officer.",
      "fact_type": "leadership",
      "source": { "url": "https://technova.com/about/leadership" },
      "confidence": 0.95
    }

The extractor also supports "sanitize_output": true — added to handle JSON escaping issues (Invalid \escape) and ensure robust outputs even with malformed URLs or special characters.
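One plausible way to implement such sanitization is to escape any backslash that does not begin a valid JSON escape sequence before parsing. This is a guess at what sanitize_output does, not the production code:

```python
import json
import re

# Backslashes NOT followed by a valid JSON escape character (", \, /, b, f, n, r, t, u).
_BAD_ESCAPE = re.compile(r'\\(?!["\\/bfnrtu])')

def sanitize_json(text: str) -> dict:
    """Escape stray backslashes so raw paths or odd URLs no longer raise Invalid \\escape."""
    return json.loads(_BAD_ESCAPE.sub(r"\\\\", text))
```

Note the limits of this sketch: a `\u` not followed by four hex digits would still fail, and an intentional `\t` in a literal path is left as a tab escape.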


# 🔹 Agent E: Think & Audit

This is the self-improvement engine — the agent that “thinks.”

It:

  1. Evaluates whether all required sections are covered (Company Profile, Financials, Leadership, Market Positioning, ESG)
  2. Scores:
    • coverage_score
    • confidence_score
    • freshness_ok
  3. Identifies missing or low-confidence sections

  4. Generates a remediation_plan like:

    {
      "needs_remediation": true,
      "recommended_actions": [
        {
          "target_agent": "Document Harvester",
          "reason": "Leadership information missing",
          "needed_sections": ["leadership"],
          "hint_queries": ["{company_name} leadership team site-"]
        }
      ]
    }

  5. The Orchestrator reads this plan and re-invokes the right agents automatically
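A stripped-down version of the audit step might look like the following. The section set mirrors the fact types above, but the thresholds (0.8 coverage, 0.7 confidence) and the `audit` function itself are illustrative assumptions:

```python
REQUIRED = {"company_profile", "financials", "leadership", "positioning", "esg"}

def audit(evidence: list[dict], min_coverage: float = 0.8,
          min_confidence: float = 0.7) -> dict:
    """Score section coverage and draft a remediation plan (illustrative thresholds)."""
    covered = {e["fact_type"] for e in evidence
               if e.get("confidence", 0.0) >= min_confidence}
    missing = sorted(REQUIRED - covered)
    coverage = 1 - len(missing) / len(REQUIRED)
    plan = {
        "coverage_score": coverage,
        "needs_remediation": coverage < min_coverage,
        "recommended_actions": [],
    }
    if missing:
        plan["recommended_actions"].append({
            "target_agent": "Document Harvester",
            "needed_sections": missing,
        })
    return plan
```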

# Remediation Loop

This is where self-improvement comes to life.

  • The Orchestrator runs up to two remediation cycles
  • In each:
    • Unharvested URLs relevant to missing sections are prioritized
    • Evidence Extractor re-runs with a focused context and sanitize_output:true
    • Think & Audit evaluates again

If coverage becomes acceptable → move forward.
If not → stop after 2 loops and mark status = "NEEDS_REMEDIATION".

This closed feedback loop — implemented directly within Purple Fabric’s orchestration canvas — allows autonomous refinement without external code.
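In code terms, the closed loop the Orchestrator runs looks roughly like this. The four callables are stand-ins for the Purple Fabric Digital Expert invocations, and the `hint_urls` field name is an assumption:

```python
MAX_CYCLES = 2  # the Orchestrator runs at most two remediation cycles

def run_pipeline(discover, harvest, extract, audit) -> dict:
    """Closed-loop sketch: act, observe, think, re-act, validate."""
    evidence = extract(harvest(discover()))
    review = audit(evidence)
    cycles = 0
    while review["needs_remediation"] and cycles < MAX_CYCLES:
        cycles += 1
        # Re-harvest only the URLs the audit flagged for missing sections,
        # then merge the new evidence with what earlier cycles produced.
        evidence += extract(harvest(review["hint_urls"]))
        review = audit(evidence)
    status = "READY" if not review["needs_remediation"] else "NEEDS_REMEDIATION"
    return {"status": status, "cycles": cycles, "evidence": evidence}
```

The key property, mirrored from the build, is that evidence accumulates across cycles rather than being recomputed from scratch.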


# 🔹 Agent F: Report Synthesizer

Once Think & Audit signals readiness, the Report Synthesizer agent composes the structured marketing report:

  • Executive Summary
  • Company Overview
  • Financial Highlights
  • Market Positioning
  • Leadership & Organization
  • ESG & Risk
  • Notes on Data Quality

All statements include citations from evidence atoms.


# 🔹 Agent G: Compliance & Finalizer

The final agent ensures enterprise readiness:

  • Validates language and compliance (e.g., “the company expects” instead of “will”)
  • Attaches disclaimers
  • Flags confidential or unverifiable statements
  • Packages final JSON output with compliance status and metadata

Output structure:
    {
      "status": "READY_FOR_DELIVERY",
      "final_report": { ... },
      "citation_index": [...],
      "compliance_status": {
        "approved_for_distribution": true,
        "risk_notes": []
      },
      "packaging_metadata": {
        "coverage_score": 0.89,
        "confidence_score": 0.92
      }
    }
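As a flavor of the language-compliance pass, one simple approach is to flag (rather than silently rewrite) forward-looking terms so a reviewer can soften them. The term list and `compliance_check` name are assumptions for illustration:

```python
import re

# Assumed list of forward-looking terms; the real agent's policy is richer.
FORWARD_LOOKING = re.compile(r"\b(will|guarantee[sd]?)\b", re.IGNORECASE)

def compliance_check(report_text: str) -> dict:
    """Flag forward-looking language for softening, e.g. 'will' -> 'expects'."""
    hits = sorted({m.group(0).lower() for m in FORWARD_LOOKING.finditer(report_text)})
    return {
        "approved_for_distribution": not hits,
        "risk_notes": [f"forward-looking term: {t}" for t in hits],
    }
```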


# How Self-Improvement Works Inside Purple Fabric

The beauty of this implementation is that everything happens inside Purple Fabric; no external orchestration code is required.

| Mechanism | How It Works in Our Build |
| --- | --- |
| Introspective Loop | Think & Audit reviews extraction results and directly drives remediation actions |
| Targeted Re-Harvesting | The Orchestrator picks only unharvested URLs relevant to missing sections |
| Evidence Accumulation | All new evidence merges with previous cycles; nothing is lost |
| Auto-Sanitization | Evidence Extractor cleans malformed JSON with sanitize_output:true on re-runs |
| Governance & Auditability | Every agent run, tool call, and remediation decision is logged by Purple Fabric’s Enterprise Governance stack |

This pattern forms a genuine agentic feedback loop:

Act → Observe → Think → Re-Act → Validate → Deliver


# Results and Observations

  • Autonomy: The system can run end-to-end for most companies without manual intervention
  • Precision: By the second loop, coverage scores improved from ~0.6 → ~0.9 on average
  • Resilience: The Think & Audit + remediation design gracefully recovers from missing data or parser errors
  • Traceability: Every fact is tied back to a source document and extraction timestamp

# Why Purple Fabric Was Ideal

Purple Fabric enabled us to do all this without writing custom orchestrator code:

  • Each agent (Digital Expert) was designed in the no-code canvas
  • Each Python tool (search_web, crawl_site, pdf_to_markdown, html_to_markdown) was integrated as reusable “Custom Tools”
  • The feedback loop logic (audit → remediation → retry) was embedded directly in the Orchestrator’s system prompt — not hardcoded externally
  • Governance, observability, and approvals were included automatically

Essentially, we implemented a self-healing, knowledge-driven multi-agent system that stays compliant, explainable, and continuously improving — all within Purple Fabric.


# Summary

Agentic AI isn’t just about chaining tools.
It’s about closing the loop — teaching agents to reflect, plan, and act until confidence is high.

Using Purple Fabric, we built a fully orchestrated pipeline where:

A company name and website → trigger
Web discovery → document harvesting → evidence extraction → self-audit → remediation → synthesis → compliance → final delivery

Every phase improves the previous one.

This is self-improvement by design — agentic intelligence that learns, refines, and delivers enterprise-grade outputs reliably and transparently.