Flow MCP — AI Agent Integration Test Report
Date: March 7, 2026
Endpoint: https://api.rynko.dev/api/flow/mcp (Streamable HTTP)
LLM: GLM-4.5 Air (via OpenRouter)
Protocol: MCP (Model Context Protocol) over JSON-RPC 2.0
Result: 4/4 scenarios passed
What Was Tested
A real LLM agent was connected to Rynko Flow's MCP server and asked to complete four increasingly complex workflows — entirely autonomously. The agent discovered available tools at runtime, decided which to call and in what order, and interpreted Flow's responses to drive its next action.
No tool calls were hardcoded. The LLM made every decision.
Test Setup
A Flow Gate was created with:
- Schema: `vendor` (string, 1-255 chars), `amount` (number, >= 0), `currency` (enum: USD/EUR/GBP/INR), `po_number` (optional string)
- Business Rule: "Amount must be positive" — expression: `amount > 0`
- Approval Mode: Auto (no human review needed)
The LLM was given 9 MCP tools (discovered dynamically) and a simple system prompt. Everything else was up to the model.
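For context on what one of those tool invocations looks like on the wire, here is a minimal sketch of an MCP `tools/call` request as a JSON-RPC 2.0 message. The method name follows the MCP specification; the tool name and argument fields come from the test gate described above, and the exact response handling is omitted.

```python
import json

def mcp_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# The invoice payload from Scenario 1:
req = mcp_tool_call(1, "validate_agent_test_invoice_gate", {
    "vendor": "Globex Corp",
    "amount": 12500,
    "currency": "USD",
    "po_number": "PO-AGENT-001",
})
print(json.dumps(req, indent=2))
```

The agent never constructs these by hand, of course — the MCP client library does — but this is the shape of every tool call in the scenarios below.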
Scenario 1: Discover Gates and Submit Valid Invoice
Task: Find the test gate, confirm it exists, then submit a valid invoice.
What the LLM did:
| Step | Tool Called | Purpose |
|---|---|---|
| 1 | list_flow_gates | Discovered all active gates, found the test gate |
| 2 | validate_agent_test_invoice_gate | Submitted: vendor="Globex Corp", amount=12500, currency="USD", po_number="PO-AGENT-001" |
| 3 | get_flow_run_status | Checked the run status to confirm completion |
Flow's response: Run created (frun_yysz2ctz), status: completed — schema validation passed, business rule passed, auto-approved.
LLM's final answer:
Gate confirmed: 'agent-test-invoice-gate' exists and is active. Invoice submitted successfully! Run ID: `00a9dab6-d004-46a2-aa27-affcc8d70a18`, Status: `completed`
Verdict: The agent autonomously chose a 3-step workflow (discover, submit, verify) without being told the exact sequence.
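The discover-submit-verify sequence the agent chose can be sketched as follows. The tool names are the ones from the table above; the responses are local stubs whose payload shapes are illustrative assumptions, not the documented API schema.

```python
# Stubbed MCP responses keyed by tool name, mirroring Scenario 1;
# payload shapes here are assumptions for illustration only.
STUB = {
    "list_flow_gates": {"gates": [{"slug": "agent-test-invoice-gate", "active": True}]},
    "validate_agent_test_invoice_gate": {"run_id": "frun_yysz2ctz", "status": "pending"},
    "get_flow_run_status": {"run_id": "frun_yysz2ctz", "status": "completed"},
}

def call_tool(name, arguments=None):
    """Stand-in for an MCP tools/call round-trip (no network here)."""
    return STUB[name]

# Step 1: discover gates and confirm the target exists
gates = call_tool("list_flow_gates")["gates"]
assert any(g["slug"] == "agent-test-invoice-gate" and g["active"] for g in gates)

# Step 2: submit the invoice through the gate's validate_* tool
run = call_tool("validate_agent_test_invoice_gate", {
    "vendor": "Globex Corp", "amount": 12500,
    "currency": "USD", "po_number": "PO-AGENT-001",
})

# Step 3: verify the run reached a terminal state
status = call_tool("get_flow_run_status", {"run_id": run["run_id"]})
```

In the real test, the LLM composed this sequence itself from the tool descriptions alone.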
Scenario 2: Submit Invalid Data and Explain Errors
Task: Submit intentionally bad data (vendor="", amount=-100, currency="JPY") and explain what went wrong.
What the LLM did:
| Step | Tool Called | Payload |
|---|---|---|
| 1 | validate_agent_test_invoice_gate | vendor="", amount=-100, currency="JPY" |
Flow's response: validation_failed with 3 errors:
- `vendor` must be at least 1 character
- `currency` must be one of: USD, EUR, GBP, INR
- Business rule `amount > 0` failed: "Invoice amount must be greater than zero"
LLM's final answer:
The invoice submission failed validation. Here are the errors:
- vendor — must be at least 1 character (was empty)
- currency — "JPY" is not valid; must be USD, EUR, GBP, or INR
- amount — must be greater than zero (was -100)
Verdict: The agent correctly submitted the exact bad values, received Flow's structured error response, and translated it into a clear human-readable explanation.
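The translation step is trivial precisely because the errors are structured. A minimal sketch, assuming a hypothetical response shape (the field names below are illustrative, not the documented API schema):

```python
# Hypothetical shape of a validation_failed response; field names are
# illustrative assumptions, not Flow's documented schema.
response = {
    "status": "validation_failed",
    "errors": [
        {"field": "vendor", "message": "must be at least 1 character"},
        {"field": "currency", "message": "must be one of: USD, EUR, GBP, INR"},
        {"field": "amount", "message": "must be greater than zero"},
    ],
}

def explain(resp):
    """Render structured errors as the bullet list an agent might emit."""
    return [f"- {e['field']}: {e['message']}" for e in resp["errors"]]

for line in explain(response):
    print(line)
```

Because each error names its field and constraint, no prompt engineering is needed for the model to produce the explanation above.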
Scenario 3: Multi-Step Reasoning — Inspect Gate and Analyze Runs
Task: Describe a gate's schema, business rules, and run history.
What the LLM did:
| Step | Tool Called | Purpose |
|---|---|---|
| 1 | get_flow_gate | Retrieved full gate configuration |
| 2 | list_flow_runs | Fetched run history for the gate |
LLM's final answer:
Schema Fields Required
- vendor (string, 1-255 characters) — required
- amount (number, minimum 0) — required
- currency (string, one of: USD/EUR/GBP/INR) — required
- po_number (string) — optional
Business Rules
- "Amount must be positive": expression
amount > 0Runs Processed 2 runs — 1 completed, 1 validation_failed
Verdict: The agent synthesized information from two different tools into a structured summary, demonstrating multi-source reasoning.
Scenario 4: Self-Correction Loop
Task: Submit bad data, read Flow's error response, fix the problems, and resubmit until it passes.
This is the key scenario — it demonstrates the closed-loop agent workflow that Rynko Flow is designed for.
What the LLM did:
| Step | Tool Called | Payload | Result |
|---|---|---|---|
| 1 | validate_agent_test_invoice_gate | vendor="", amount=-500, currency="JPY" | validation_failed — 3 errors |
| 2 | validate_agent_test_invoice_gate | vendor="Acme Inc", amount=500, currency="USD" | completed |
Flow's error response on first attempt:
- `vendor` must be at least 1 character
- `currency` must be one of: USD, EUR, GBP, INR
- Business rule failed: "Invoice amount must be greater than zero"
LLM's corrections:
| Field | Before (invalid) | After (corrected) | Error that guided the fix |
|---|---|---|---|
| vendor | "" (empty) | "Acme Inc" | "must be at least 1 character" |
| amount | -500 | 500 | "must be greater than zero" |
| currency | "JPY" | "USD" | "must be one of: USD, EUR, GBP, INR" |
LLM's final answer:
The invoice validation has succeeded! Here's what happened:
Initial submission failed:
- vendor="" — "must be at least 1 characters"
- amount=-500 — "must be greater than zero"
- currency="JPY" — "not in allowed values"
Corrected submission passed:
- vendor="Acme Inc", amount=500, currency="USD"
- Run ID: completed successfully
Verdict: The agent read Flow's structured validation errors, reasoned about what each one meant, applied the correct fix for each field, and resubmitted — all without human intervention. This is the core value proposition: Flow gates act as guardrails that LLM agents can understand and respond to programmatically.
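The closed loop can be sketched as a retry function. The validator below is a local stub that mirrors the gate's rules as described above (it is not the real API), and the fix table is one illustrative way an agent might map each error to a correction:

```python
def validate_invoice(payload):
    """Local stand-in for the gate's validate_* tool, mirroring the
    schema and business rule described in Test Setup (not the real API)."""
    errors = []
    if len(payload.get("vendor", "")) < 1:
        errors.append({"field": "vendor", "message": "must be at least 1 character"})
    if payload.get("currency") not in {"USD", "EUR", "GBP", "INR"}:
        errors.append({"field": "currency", "message": "must be one of: USD, EUR, GBP, INR"})
    if payload.get("amount", 0) <= 0:
        errors.append({"field": "amount", "message": "must be greater than zero"})
    return {"status": "completed"} if not errors else {"status": "validation_failed", "errors": errors}

# Illustrative error-to-fix mapping; in the real test the LLM reasoned
# out each correction from the error message instead of a lookup table.
FIXES = {
    "vendor": lambda p: p.update(vendor="Acme Inc"),
    "currency": lambda p: p.update(currency="USD"),
    "amount": lambda p: p.update(amount=abs(p["amount"])),
}

def submit_with_self_correction(payload, max_attempts=3):
    """Submit, read structured errors, fix each flagged field, retry."""
    for attempt in range(1, max_attempts + 1):
        result = validate_invoice(payload)
        if result["status"] == "completed":
            return attempt, payload
        for err in result["errors"]:
            FIXES[err["field"]](payload)
    raise RuntimeError("still failing after max_attempts")

attempts, fixed = submit_with_self_correction(
    {"vendor": "", "amount": -500, "currency": "JPY"}
)
```

As in the live test, the bad payload converges in two attempts: one failed submission, one corrected resubmission.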
What This Demonstrates
- MCP tool discovery works end-to-end. The LLM received 9 tools at runtime (including dynamically generated `validate_*` tools per gate) and correctly chose which ones to use for each task.
- Flow's validation errors are LLM-readable. Structured error responses with field names, constraint descriptions, and business rule messages gave the agent enough context to self-correct without any additional prompting.
- The self-correction loop is real. An agent submitting invalid data to a Flow gate can read the errors, fix its payload, and retry — creating a closed feedback loop between the LLM and the validation pipeline.
- No special integration code needed. The LLM connected via standard MCP protocol, discovered tools dynamically, and operated autonomously. Any MCP-compatible agent (Claude, Cursor, Windsurf, custom) gets the same experience.
Test Infrastructure
- Transport: MCP Streamable HTTP with JSON-RPC batching
- Session management: Fresh session per call with retry (up to 6 attempts) for multi-instance load balancing
- LLM provider: OpenRouter free tier (OpenAI-compatible API)
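The fresh-session-per-call retry strategy can be sketched generically. Actual session establishment against the endpoint is elided; the helper below only shows the retry shape, demonstrated with a stand-in callable that fails twice before succeeding:

```python
import uuid

def with_fresh_session(call, max_attempts=6):
    """Retry helper mirroring the harness: each attempt uses a new
    session ID, so a retry can land on a different backend instance.
    (Sketch only — real MCP session setup is elided.)"""
    last_exc = None
    for _ in range(max_attempts):
        session_id = uuid.uuid4().hex  # fresh session per call
        try:
            return call(session_id)
        except ConnectionError as exc:
            last_exc = exc
    raise last_exc

# Stand-in call that succeeds on the third attempt:
state = {"n": 0}
def flaky(session_id):
    state["n"] += 1
    if state["n"] < 3:
        raise ConnectionError("instance not ready")
    return {"ok": True, "session": session_id}

result = with_fresh_session(flaky)
```

Opening a fresh session per call trades a little latency for robustness when the server runs behind a load balancer with no sticky sessions.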
Try It Yourself
- Sign up free — 500 Flow runs/month included
- Create a gate with a schema and business rules
- Connect your AI agent via MCP
- Submit a bad payload and watch the agent self-correct