
Flow MCP — AI Agent Integration Test Report

Date: March 7, 2026
Endpoint: https://api.rynko.dev/api/flow/mcp (Streamable HTTP)
LLM: GLM-4.5 Air (via OpenRouter)
Protocol: MCP (Model Context Protocol) over JSON-RPC 2.0
Result: 4/4 scenarios passed


What Was Tested

A real LLM agent was connected to Rynko Flow's MCP server and asked to complete four increasingly complex workflows — entirely autonomously. The agent discovered available tools at runtime, decided which to call and in what order, and interpreted Flow's responses to drive its next action.

No tool calls were hardcoded. The LLM made every decision.
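For readers new to MCP, runtime discovery comes down to a JSON-RPC 2.0 `tools/list` request. The sketch below only builds the request body and filters a tool listing; it sends nothing over the network. The helper names are hypothetical, and `sample_tools` is an illustrative subset built from tool names mentioned in this report, not captured server output.

```python
import json

def build_tools_list_request(request_id: int) -> dict:
    """Build a JSON-RPC 2.0 request asking an MCP server to list its tools."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}

def gate_validation_tools(tools: list[dict]) -> list[str]:
    """Pick out the per-gate validate_* tools from a tool listing."""
    return [t["name"] for t in tools if t["name"].startswith("validate_")]

# Illustrative subset only; the real server exposed 9 tools.
sample_tools = [
    {"name": "list_flow_gates"},
    {"name": "get_flow_gate"},
    {"name": "validate_agent_test_invoice_gate"},
]

request_body = json.dumps(build_tools_list_request(1))
validate_tools = gate_validation_tools(sample_tools)
```

In the test itself the model, not a filter function, chose which tool to call; the filter is here only to show what the agent had to recognize in the listing.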


Test Setup

A Flow Gate was created with:

  • Schema: vendor (string, 1-255 chars), amount (number, >= 0), currency (enum: USD/EUR/GBP/INR), po_number (optional string)
  • Business Rule: "Amount must be positive" — expression: amount > 0
  • Approval Mode: Auto (no human review needed)

The LLM was given 9 MCP tools (discovered dynamically) and a simple system prompt. Everything else was up to the model.
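To make the gate definition concrete, here is a minimal local sketch of the same checks in Python. It is not Flow's implementation: the function name and the error wording are assumptions, and it does not claim to reproduce how Flow combines schema and business-rule failures.

```python
from numbers import Real

ALLOWED_CURRENCIES = ("USD", "EUR", "GBP", "INR")

def validate_invoice(payload: dict) -> list[str]:
    """Return a list of error strings; an empty list means the payload passes."""
    errors = []

    # Schema: vendor is a string of 1-255 characters.
    vendor = payload.get("vendor")
    if not isinstance(vendor, str) or not (1 <= len(vendor) <= 255):
        errors.append("vendor must be a string of 1-255 characters")

    # Schema: amount is a number >= 0 (bool is excluded on purpose).
    amount = payload.get("amount")
    is_number = isinstance(amount, Real) and not isinstance(amount, bool)
    if not is_number or amount < 0:
        errors.append("amount must be a number >= 0")

    # Schema: currency is one of the enum values.
    if payload.get("currency") not in ALLOWED_CURRENCIES:
        errors.append("currency must be one of: USD, EUR, GBP, INR")

    # Schema: po_number is optional, but must be a string when present.
    po_number = payload.get("po_number")
    if po_number is not None and not isinstance(po_number, str):
        errors.append("po_number must be a string when present")

    # Business rule "Amount must be positive": amount > 0.
    if is_number and not amount > 0:
        errors.append("business rule failed: Amount must be positive")

    return errors

zero_amount_errors = validate_invoice({"vendor": "Globex Corp", "amount": 0, "currency": "USD"})
```

The `amount=0` case shows why the rule exists alongside the schema: zero satisfies the schema minimum (`>= 0`) but fails the business rule (`amount > 0`).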


Scenario 1: Discover Gates and Submit Valid Invoice

Task: Find the test gate, confirm it exists, then submit a valid invoice.

What the LLM did:

| Step | Tool Called | Purpose |
|------|-------------|---------|
| 1 | list_flow_gates | Discovered all active gates, found the test gate |
| 2 | validate_agent_test_invoice_gate | Submitted: vendor="Globex Corp", amount=12500, currency="USD", po_number="PO-AGENT-001" |
| 3 | get_flow_run_status | Checked the run status to confirm completion |

Flow's response: Run created (frun_yysz2ctz), status: completed — schema validation passed, business rule passed, auto-approved.

LLM's final answer:

Gate confirmed: 'agent-test-invoice-gate' exists and is active.
Invoice submitted successfully!
Run ID: 00a9dab6-d004-46a2-aa27-affcc8d70a18
Status: completed

Verdict: The agent autonomously chose a 3-step workflow (discover, submit, verify) without being told the exact sequence.
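On the wire, the submit step is an MCP `tools/call` request carrying the tool name and its arguments. The following sketch builds that request body for the payload used in this scenario. It assumes the tool takes the schema fields as flat arguments, which the report does not show, and the helper name is hypothetical.

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Payload values taken from step 2 of this scenario.
submit_request = build_tool_call(
    2,
    "validate_agent_test_invoice_gate",
    {
        "vendor": "Globex Corp",
        "amount": 12500,
        "currency": "USD",
        "po_number": "PO-AGENT-001",
    },
)

print(json.dumps(submit_request, indent=2))
```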


Scenario 2: Submit Invalid Data and Explain Errors

Task: Submit intentionally bad data (vendor="", amount=-100, currency="JPY") and explain what went wrong.

What the LLM did:

| Step | Tool Called | Payload |
|------|-------------|---------|
| 1 | validate_agent_test_invoice_gate | vendor="", amount=-100, currency="JPY" |

Flow's response: validation_failed with 3 errors:

  • vendor must be at least 1 character
  • currency must be one of: USD, EUR, GBP, INR
  • Business rule amount > 0 failed: "Invoice amount must be greater than zero"

LLM's final answer:

The invoice submission failed validation. Here are the errors:

  1. vendor — must be at least 1 character (was empty)
  2. currency — "JPY" is not valid; must be USD, EUR, GBP, or INR
  3. amount — must be greater than zero (was -100)

Verdict: The agent correctly submitted the exact bad values, received Flow's structured error response, and translated it into a clear human-readable explanation.


Scenario 3: Multi-Step Reasoning — Inspect Gate and Analyze Runs

Task: Describe a gate's schema, business rules, and run history.

What the LLM did:

| Step | Tool Called | Purpose |
|------|-------------|---------|
| 1 | get_flow_gate | Retrieved full gate configuration |
| 2 | list_flow_runs | Fetched run history for the gate |

LLM's final answer:

Schema Fields Required

  • vendor (string, 1-255 characters) — required
  • amount (number, minimum 0) — required
  • currency (string, one of: USD/EUR/GBP/INR) — required
  • po_number (string) — optional

Business Rules

  • "Amount must be positive": expression amount > 0

Runs Processed

2 runs: 1 completed, 1 validation_failed

Verdict: The agent synthesized information from two different tools into a structured summary, demonstrating multi-source reasoning.
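The run-history part of that summary is a simple tally by status. A minimal sketch, assuming each run record exposes a `status` field (the actual shape of the `list_flow_runs` result is not shown in this report):

```python
from collections import Counter

def summarize_runs(runs: list[dict]) -> str:
    """Count runs per status and render a one-line summary."""
    counts = Counter(run["status"] for run in runs)
    parts = ", ".join(f"{n} {status}" for status, n in sorted(counts.items()))
    return f"{len(runs)} runs: {parts}"

# Hypothetical records mirroring the two runs created in Scenarios 1 and 2.
sample_runs = [{"status": "completed"}, {"status": "validation_failed"}]
summary = summarize_runs(sample_runs)
```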


Scenario 4: Self-Correction Loop

Task: Submit bad data, read Flow's error response, fix the problems, and resubmit until it passes.

This is the key scenario — it demonstrates the closed-loop agent workflow that Rynko Flow is designed for.

What the LLM did:

| Step | Tool Called | Payload | Result |
|------|-------------|---------|--------|
| 1 | validate_agent_test_invoice_gate | vendor="", amount=-500, currency="JPY" | validation_failed (3 errors) |
| 2 | validate_agent_test_invoice_gate | vendor="Acme Inc", amount=500, currency="USD" | completed |

Flow's error response on first attempt:

  • vendor must be at least 1 character
  • currency must be one of: USD, EUR, GBP, INR
  • Business rule failed: "Invoice amount must be greater than zero"

LLM's corrections:

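| Field | Before (invalid) | After (corrected) | Error that guided the fix |
|-------|------------------|-------------------|---------------------------|
| vendor | "" (empty) | "Acme Inc" | "must be at least 1 character" |
| amount | -500 | 500 | "must be greater than zero" |
| currency | "JPY" | "USD" | "must be one of: USD, EUR, GBP, INR" |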

LLM's final answer:

The invoice validation has succeeded! Here's what happened:

Initial submission failed:

  • vendor="" — "must be at least 1 characters"
  • amount=-500 — "must be greater than zero"
  • currency="JPY" — "not in allowed values"

Corrected submission passed:

  • vendor="Acme Inc", amount=500, currency="USD"
  • Run ID: completed successfully

Verdict: The agent read Flow's structured validation errors, reasoned about what each one meant, applied the correct fix for each field, and resubmitted — all without human intervention. This is the core value proposition: Flow gates act as guardrails that LLM agents can understand and respond to programmatically.
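The shape of that loop can be sketched in a few lines. Everything here is a stand-in: `stub_gate` imitates the gate with a hypothetical per-field error dictionary, and `stub_fixer` hardcodes the corrections that the LLM chose on its own in the real test. Only the control flow (submit, read errors, fix, resubmit, stop on success) is the point.

```python
from typing import Callable

ALLOWED = ("USD", "EUR", "GBP", "INR")

def stub_gate(payload: dict) -> dict:
    """Stand-in for the gate's validate tool (response shape is assumed)."""
    errors = {}
    if not payload.get("vendor"):
        errors["vendor"] = "must be at least 1 character"
    if payload.get("currency") not in ALLOWED:
        errors["currency"] = "must be one of: USD, EUR, GBP, INR"
    if not payload.get("amount", 0) > 0:
        errors["amount"] = "must be greater than zero"
    return {"status": "validation_failed" if errors else "completed", "errors": errors}

def stub_fixer(payload: dict, errors: dict) -> dict:
    """Stand-in for the LLM: patch only the fields that were reported."""
    fixed = dict(payload)
    if "vendor" in errors:
        fixed["vendor"] = "Acme Inc"
    if "amount" in errors:
        fixed["amount"] = abs(payload["amount"]) or 1
    if "currency" in errors:
        fixed["currency"] = "USD"
    return fixed

def submit_until_valid(payload: dict, gate: Callable, fixer: Callable,
                       max_attempts: int = 3) -> list[tuple[dict, str]]:
    """Submit, and on failure let the fixer revise the payload and retry."""
    attempts = []
    for _ in range(max_attempts):
        result = gate(payload)
        attempts.append((payload, result["status"]))
        if result["status"] == "completed":
            break
        payload = fixer(payload, result["errors"])
    return attempts

history = submit_until_valid(
    {"vendor": "", "amount": -500, "currency": "JPY"}, stub_gate, stub_fixer
)
```

The `max_attempts` cap is a design choice of this sketch, not something the report describes; it keeps a fixer that never converges from looping forever.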


What This Demonstrates

  1. MCP tool discovery works end-to-end. The LLM received 9 tools at runtime (including dynamically generated validate_* tools, one per gate) and correctly chose which ones to use for each task.

  2. Flow's validation errors are LLM-readable. Structured error responses with field names, constraint descriptions, and business rule messages gave the agent enough context to self-correct without any additional prompting.

  3. The self-correction loop is real. An agent submitting invalid data to a Flow gate can read the errors, fix its payload, and retry — creating a closed feedback loop between the LLM and the validation pipeline.

  4. No special integration code is needed. The LLM connected via the standard MCP protocol, discovered tools dynamically, and operated autonomously. Any MCP-compatible agent (Claude, Cursor, Windsurf, custom) should get the same experience.
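The per-gate tool name seen in this report, `validate_agent_test_invoice_gate`, lines up with the gate name `agent-test-invoice-gate`. The sketch below encodes that apparent convention. It is inferred from this single example, so treat the mapping as an assumption; an agent should rely on the names returned by tool discovery rather than compute them.

```python
def validate_tool_name(gate_slug: str) -> str:
    """Guess the validate_* tool name for a gate slug (assumed convention)."""
    return "validate_" + gate_slug.replace("-", "_")

tool_name = validate_tool_name("agent-test-invoice-gate")
```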


Test Infrastructure

  • Transport: MCP Streamable HTTP with JSON-RPC batching
  • Session management: Fresh session per call, with retry (up to 6 attempts) to cope with multi-instance load balancing
  • LLM provider: OpenRouter free tier (OpenAI-compatible API)
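The session strategy above can be sketched as a bounded retry around a callable that opens a new session and performs one call. The report does not say which errors the harness retried on or whether it waited between attempts, so the `ConnectionError` trigger, the `delay_seconds` parameter, and all names below are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def call_with_fresh_session(open_session_and_call: Callable[[], T],
                            max_attempts: int = 6,
                            delay_seconds: float = 0.0) -> T:
    """Run the callable, retrying with a new session each time it fails."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            # Each invocation is expected to create its own session.
            return open_session_and_call()
        except ConnectionError as exc:
            last_error = exc
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise RuntimeError(f"gave up after {max_attempts} attempts") from last_error

# Demo: a call that fails twice (as if routed to the wrong instance), then succeeds.
calls = {"count": 0}

def flaky_call() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("session not found on this instance")
    return "ok"

result = call_with_fresh_session(flaky_call)
```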

Try It Yourself

  1. Sign up free — 500 Flow runs/month included
  2. Create a gate with a schema and business rules
  3. Connect your AI agent via MCP
  4. Submit a bad payload and watch the agent self-correct