MCP Server
FlutterProbe ships an MCP (Model Context Protocol) server as a standalone binary called probe-mcp that exposes your entire test suite as tools callable by AI agents. Once connected, Claude Desktop, Cursor, or any MCP-compatible client can manage devices, write and run tests, record interactions, generate reports, and inspect results — all without leaving the AI chat.
As of v0.6.0, the MCP server is a separate binary, probe-mcp. The legacy probe mcp-server subcommand still works but prints a deprecation notice on stderr. Update your MCP client configuration to point at probe-mcp directly.
What the MCP server exposes
Device lifecycle (5 tools)
| Tool | Description |
|---|---|
| list_devices | List booted/connected simulators, emulators, and physical devices (id, name, platform, state, OS version) |
| list_simulators | List all iOS simulators (booted + shutdown) so the agent can pick one to boot |
| list_avds | List Android Virtual Device names available to launch |
| start_device | Boot an Android emulator (by AVD name) or iOS simulator (by UDID); blocks until online |
| shutdown_device | Shut down an iOS simulator (udid) or Android emulator (serial) |
Authoring & execution (9 tools)
| Tool | Description |
|---|---|
| get_widget_tree | Dump the live Flutter widget tree from the running app |
| take_screenshot | Capture the current screen and return it as an image |
| read_test | Read the contents of a .probe file |
| write_test | Create or overwrite a .probe file; content is validated before writing, and syntax errors are returned without creating the file |
| run_script | Execute inline ProbeScript without creating a file |
| run_tests | Run .probe test files (supports composite multi-device tests via composite_devices) |
| list_files | List all .probe files in a directory |
| lint | Validate .probe file syntax without running |
| record | Record user interactions and generate a .probe test file (runs for timeout duration, default 30s) |
get_widget_tree, take_screenshot, run_script, and run_tests accept an optional device argument (serial or UDID) to pin a specific target.
Reporting & generation (3 tools)
| Tool | Description |
|---|---|
| get_report | Read the most recently modified JSON test run report |
| generate_report | Generate a standalone HTML report from a JSON results file |
| generate_test | AI-generate a .probe test from a natural language prompt |
Project management (1 tool)
| Tool | Description |
|---|---|
| init_project | Initialize a new FlutterProbe project (creates probe.yaml and tests/ scaffold) |
The full workflow this enables: list_devices → start_device (if needed) → get_widget_tree → write_test → run_tests → get_report → generate_report — a complete AI-driven test authoring loop, including device bring-up and HTML reporting.
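Each step in that loop is an MCP tools/call request sent to the server as one JSON-RPC 2.0 message. The following is a minimal sketch of how a client might encode the sequence, not the way any particular client does it. The tool names come from the tables above; the write_test argument names (path, content), the placeholder content, and passing paths as a single string are assumptions made for illustration.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per connection


def tool_call(name, arguments=None):
    """Encode one MCP tools/call request as a single JSON-RPC 2.0 line."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


# The authoring loop as an ordered list of requests.
# "path" and "content" are assumed argument names for write_test.
workflow = [
    tool_call("list_devices"),
    tool_call("get_widget_tree"),
    tool_call("write_test", {"path": "tests/login.probe",
                             "content": "<ProbeScript source goes here>"}),
    tool_call("run_tests", {"paths": "tests/login.probe"}),
    tool_call("get_report"),
]

for line in workflow:
    print(line)
```

In practice the agent waits for each response before sending the next request, since later steps depend on what earlier ones return.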
run_tests flags
The run_tests tool has named parameters for common options (paths, tag, device, composite_devices) and a flags string for everything else. Key flags an agent should know:
| Flag | What it does |
|---|---|
| --timeout 60s | Per-step timeout (default 30s) |
| -v | Verbose step output: prints → step before each step and ✓/✗ step (Xs) after; slow steps (>5s) emit ⏱ progress ticks and a ⚠ warning at 80% of timeout. Recommended when diagnosing failures. |
| --format json | Structured JSON results (pipe to get_report) |
| --format junit | JUnit XML for CI systems |
| --dry-run | Validate syntax without a device connection |
| --parallel | Distribute tests across all connected devices |
| --shard 1/3 | Run 1/3 of test files (for CI matrix builds) |
| --host <ip> --token <t> | WiFi mode for physical devices |
| --disable-animations | Set timeDilation=0 for faster tests |
| -y | Auto-approve destructive operations (CI mode) |
| --video | Record device screen during the run |
| --stream | Emit one ndjson line per test as it completes (requires --format json) |
Tip for agents: pass -v in flags when a test is failing and you need to understand which step is slow or stuck. The ⏱ progress ticks and ⚠ timeout warnings appear in run_tests output and tell you exactly where time is being spent before the failure.
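To make the split between named parameters and the flags string concrete, here is a small sketch of a helper that builds the arguments object for a run_tests call. The parameter names (paths, tag, device, flags) are the ones listed above. Treating paths as a single string and joining flags with shell-style quoting are assumptions, since this page does not say how the server parses either.

```python
import shlex


def run_tests_args(paths=None, tag=None, device=None, flags=()):
    """Build the arguments object for a run_tests call.

    Named parameters get their own keys; every other option is joined
    into the single `flags` string.
    """
    args = {}
    if paths:
        args["paths"] = paths
    if tag:
        args["tag"] = tag
    if device:
        args["device"] = device
    if flags:
        # Assumption: the server splits `flags` the way a shell would.
        args["flags"] = shlex.join(flags)
    return args


# Diagnose a slow smoke test with verbose output and a longer step timeout.
print(run_tests_args(paths="tests/", tag="smoke",
                     flags=["-v", "--timeout", "60s"]))
# prints: {'paths': 'tests/', 'tag': 'smoke', 'flags': '-v --timeout 60s'}
```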
Composite (multi-device) tests
write_test supports the full composite test syntax for coordinating multiple devices:

    composite test "alice sends bob a message"
      devices
        A: iPhone 15 Simulator
        B: Pixel 9 Emulator

      A:
        tap "New Message"
        type "Hello Bob" in "compose"
        tap "Send"

      sync "message sent"

      B:
        wait until "Hello Bob" appears
        see "Hello Bob"

Pass device connections to run_tests via composite_devices (space-separated ALIAS=SPEC pairs):
- WiFi: "A=192.168.1.10:48686/token"
- iOS simulator: "B=A1B2C3D4-E5F6-..."
- Android: "C=emulator-5554"
Or configure them in probe.yaml under composite.devices.
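If an agent tracks device connections as a mapping, the composite_devices string can be assembled mechanically. This is a sketch under the assumption that aliases and specs never contain whitespace, which the space-separated format implies but this page does not state outright; the helper name is hypothetical.

```python
def composite_devices(specs):
    """Join {alias: spec} into space-separated ALIAS=SPEC pairs."""
    for alias, spec in specs.items():
        if not alias or any(c.isspace() or c == "=" for c in alias):
            raise ValueError(f"invalid alias: {alias!r}")
        if not spec or any(c.isspace() for c in spec):
            raise ValueError(f"spec for {alias!r} must be non-empty with no whitespace")
    # dicts preserve insertion order, so aliases come out as given
    return " ".join(f"{alias}={spec}" for alias, spec in specs.items())


print(composite_devices({
    "A": "192.168.1.10:48686/token",  # WiFi: host:port/token
    "C": "emulator-5554",             # Android emulator serial
}))
# prints: A=192.168.1.10:48686/token C=emulator-5554
```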
Annotation-driven tests (flutter_probe_annotation + flutter_probe_gen)
When the user’s Flutter project uses the new annotation packages
(flutter_probe_annotation for the DSL and flutter_probe_gen for the
build_runner generator), .probe test files appear under
tests/generated/ after dart run build_runner build. The MCP run_tests
tool runs them like any other .probe file — pass tests/ as the paths
argument.
For agents authoring tests in this style:
- Don’t use write_test to write to tests/generated/ directly, because those files are regenerated. Use write_test to author the annotated Dart source instead (the screen class with @ProbeSuite).
- After the user runs dart run build_runner build, call run_tests as usual. The generated .probe files are picked up automatically.
- read_test works on either hand-written or generated files.
Requirements
- Flutter app running with --dart-define=PROBE_AGENT=true (for live tools like get_widget_tree, take_screenshot, run_tests)
- An MCP-compatible client (Claude Desktop, Cursor, etc.)
- For Option 2 below: probe-mcp binary v0.9.0+ installed and in PATH
Install — Option 1: one-click Claude Desktop Extension (recommended)
As of v0.9.4, every release includes a .mcpb Claude Desktop Extension that bundles the probe-mcp binary. Installation is one click: no brew, no JSON config, no PATH setup.
- Open the latest GitHub release and download the bundle for your platform:
  - flutter-probe-darwin-arm64.mcpb: Apple Silicon Macs
  - flutter-probe-darwin-amd64.mcpb: Intel Macs
  - flutter-probe-linux-amd64.mcpb: Linux x86_64
  - flutter-probe-win32-amd64.mcpb: Windows x86_64
- In Claude Desktop, open Settings → Extensions and click Install Extension.
- Pick the downloaded .mcpb file. When prompted, select your Flutter project directory (the folder containing probe.yaml and tests/).
- Done. All 18 tools are immediately available in any new Claude conversation.
Auto-updates and lifecycle are handled by Claude Desktop. To update, just install a newer .mcpb over the older one.
Install — Option 2: manual MCP server config
Use this path for Cursor, Claude Code, or any MCP-compatible client other than Claude Desktop, or if you prefer to keep the binary on $PATH.
Verify the binary is accessible:

    which probe-mcp   # should print /opt/homebrew/bin/probe-mcp or similar

Claude Desktop (manual config)
Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

    {
      "mcpServers": {
        "flutter-probe": {
          "command": "probe-mcp"
        }
      }
    }

If probe-mcp is not in your shell PATH (common when Claude Desktop doesn’t inherit your shell environment), use the absolute path:

    {
      "mcpServers": {
        "flutter-probe": {
          "command": "/opt/homebrew/bin/probe-mcp"
        }
      }
    }

Find the absolute path with which probe-mcp.
Windows
Edit %APPDATA%\Claude\claude_desktop_config.json:

    {
      "mcpServers": {
        "flutter-probe": {
          "command": "probe-mcp.exe"
        }
      }
    }

After editing the config, restart Claude Desktop. The flutter-probe tools appear in the tool picker (hammer icon) in any new conversation.
Cursor
In Cursor, open Settings → MCP (or edit .cursor/mcp.json at the repo root):

    {
      "mcpServers": {
        "flutter-probe": {
          "command": "probe-mcp"
        }
      }
    }

Restart Cursor after saving.
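The config snippets above show a file containing only the flutter-probe entry. When a client config already lists other servers, the entry needs to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the function name and the "other-server" entry are hypothetical, and only the mcpServers / flutter-probe / command keys come from this page.

```python
import json


def add_flutter_probe(config_text, command="probe-mcp"):
    """Return config JSON with a flutter-probe entry under mcpServers.

    Other servers and unrelated top-level keys are left untouched.
    """
    config = json.loads(config_text) if config_text.strip() else {}
    servers = config.setdefault("mcpServers", {})
    servers["flutter-probe"] = {"command": command}
    return json.dumps(config, indent=2)


# Hypothetical existing config with one unrelated server.
existing = '{"mcpServers": {"other": {"command": "other-server"}}}'
print(add_flutter_probe(existing))
```

Writing the result back to the config file is left out on purpose, so the merged JSON can be reviewed before anything on disk changes.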
Working directory
The MCP server inherits the working directory from the client. For tools like run_tests, list_files, and get_report to find your test files, the working directory must be your Flutter project root (the folder containing probe.yaml and tests/).
Claude Desktop launches the server with its own working directory, which may not be your project. Set it explicitly:

    {
      "mcpServers": {
        "flutter-probe": {
          "command": "probe-mcp",
          "cwd": "/Users/you/dev/my-flutter-app"
        }
      }
    }

Verifying the connection
Test the server manually in your terminal:

    echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' | probe-mcp

Expected response:

    {"jsonrpc":"2.0","id":1,"result":{"capabilities":{"tools":{}},"protocolVersion":"2024-11-05","serverInfo":{"name":"probe-mcp","version":"0.9.8"}}}

List all available tools:

    echo '{"jsonrpc":"2.0","id":2,"method":"tools/list","params":{}}' | probe-mcp

Example AI sessions
Single-device test authoring
“Look at the widget tree and write a test that verifies the login flow works.”
- get_widget_tree: inspect the running app’s UI
- take_screenshot: see what is on screen
- write_test: create tests/login.probe (content validated before writing)
- run_tests: execute it
- get_report: verify all steps passed
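At each step the agent has to decide whether the call succeeded before moving on. In MCP, a tools/call result carries a content array and an optional isError flag, while protocol-level failures arrive as a JSON-RPC error object. Here is a sketch of reading both cases. The sample response text is invented for illustration; the actual text probe-mcp returns for get_report is not documented on this page.

```python
import json


def tool_text(response_line):
    """Return (ok, text) for one JSON-RPC response to a tools/call request."""
    response = json.loads(response_line)
    if "error" in response:  # protocol-level failure (bad method, bad params)
        return False, response["error"].get("message", "")
    result = response.get("result", {})
    # Concatenate only the text items; image items (screenshots) are skipped.
    text = "\n".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    return not result.get("isError", False), text


# Invented sample response, shaped like an MCP tools/call result.
sample = json.dumps({
    "jsonrpc": "2.0",
    "id": 5,
    "result": {
        "content": [{"type": "text", "text": "3 passed, 0 failed"}],
        "isError": False,
    },
})
print(tool_text(sample))
# prints: (True, '3 passed, 0 failed')
```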
Device bring-up from chat
“Boot an iOS simulator and run the smoke tests on it.”
- list_simulators: discover available UDIDs
- start_device with {platform: "ios", udid: "<chosen>"}
- list_devices: confirm it came online
- run_tests with {tag: "smoke", device: "<udid>"}
Composite (multi-device) test authoring
“Write a test that verifies push notifications are delivered between two simulators.”
- list_simulators: discover two available iOS simulators
- start_device for each if needed
- get_widget_tree on both devices (using the device argument) to understand the UI
- write_test: create a composite test block with device aliases and sync barriers
- run_tests with composite_devices: "A=<udid1> B=<udid2>"
- get_report: inspect per-device pass/fail results
- generate_report: produce a shareable HTML report
Record then refine
“Record my interactions for 30 seconds and turn them into a reusable test.”
- record with {timeout: "30s", output: "tests/recorded.probe"}: returns the generated file
- lint on the recorded file to check for issues
- write_test to clean up and refine the generated steps
- run_tests to verify the refined test passes
HTML report from CI results
“The CI run produced JSON results. Generate a report I can share.”
- get_report: reads the most recently modified JSON in reports/
- generate_report: produces reports/report.html
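The "most recently modified JSON" rule is worth keeping in mind when CI drops several result files into reports/: whichever one was written last wins. The selection can be sketched as follows. This illustrates the rule as described here and is not the server's actual implementation.

```python
from pathlib import Path


def latest_json_report(directory="reports"):
    """Return the most recently modified *.json file in directory, or None."""
    candidates = list(Path(directory).glob("*.json"))
    if not candidates:
        return None
    # Compare by modification time; the newest file wins.
    return max(candidates, key=lambda p: p.stat().st_mtime)


print(latest_json_report("reports"))
```

If the wrong run shows up in a report, checking which file this returns is a quick way to see whether a stale or unrelated JSON file is newer than the one you expected.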
Migrating from probe mcp-server
Change:

    { "command": "probe", "args": ["mcp-server"] }

to:

    { "command": "probe-mcp" }

The tools, protocol, and behavior are identical. The legacy form continues to work but prints a deprecation notice.
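For configs with several entries, the same change can be applied programmatically. The sketch below rewrites only entries that match the legacy form exactly (command "probe" with args ["mcp-server"]). Entries that use an absolute path to probe are left alone, since their new location cannot be guessed. The function name is hypothetical.

```python
import json


def migrate_config(config_text):
    """Rewrite legacy `probe mcp-server` entries to use `probe-mcp`."""
    config = json.loads(config_text)
    for server in config.get("mcpServers", {}).values():
        if server.get("command") == "probe" and server.get("args") == ["mcp-server"]:
            server["command"] = "probe-mcp"
            del server["args"]  # the standalone binary takes no subcommand
    return json.dumps(config, indent=2)


legacy = '{"mcpServers": {"flutter-probe": {"command": "probe", "args": ["mcp-server"]}}}'
print(migrate_config(legacy))
```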
Troubleshooting
Tools don’t appear in Claude Desktop
- Restart Claude Desktop after editing the config
- Verify the JSON is valid (no trailing commas)
- Check Claude Desktop logs: ~/Library/Logs/Claude/
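The JSON validity check is easy to automate, because a strict parser rejects trailing commas and reports where the problem is. A small sketch follows. It assumes the server entry is named flutter-probe as in the examples on this page; adjust the name if yours differs.

```python
import json


def check_config(text):
    """Return a list of problems found in a client config; empty means OK."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return [f"invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}"]
    if not isinstance(config, dict):
        return ["top-level value is not an object"]
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or "flutter-probe" not in servers:
        return ['no "flutter-probe" entry under "mcpServers"']
    entry = servers["flutter-probe"]
    if not isinstance(entry, dict) or "command" not in entry:
        return ['"flutter-probe" entry has no "command"']
    return []


# A trailing comma after the server entry makes the whole file invalid.
bad = '{ "mcpServers": { "flutter-probe": { "command": "probe-mcp" }, } }'
print(check_config(bad))
```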
probe-mcp: command not found
Claude Desktop may not inherit your shell PATH. Use the absolute binary path:

    which probe-mcp   # copy this output

Then set "command": "/absolute/path/to/probe-mcp" in the config.
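The same lookup can be done from a script, for example when generating the config automatically. This sketch uses the standard library's shutil.which, which searches the PATH of the current process. Run it from the terminal where probe-mcp already works, because it is subject to the same PATH inheritance caveat described above.

```python
import json
import shutil


def resolve_command(name="probe-mcp"):
    """Return the absolute path of `name` if it is on PATH, else None."""
    return shutil.which(name)


path = resolve_command()
if path:
    # json.dumps escapes backslashes, which matters for Windows paths.
    print(f'"command": {json.dumps(path)}')
else:
    print("probe-mcp is not on PATH for this process")
```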
get_widget_tree / take_screenshot return errors
These tools require the Flutter app to be running and connected to the probe agent. Start the app first:

    flutter run --dart-define=PROBE_AGENT=true

Then confirm the probe CLI can parse your tests:

    probe test --dry-run

Because --dry-run validates syntax without a device connection, it does not by itself prove the agent is reachable; a successful get_widget_tree call does.
write_test returns a syntax error
The content you provided failed ProbeScript validation. The file was not created. Check the error message for the line and column of the problem. Common issues:
- Unterminated strings (an opening " with no closing ")
- Wrong indentation (use 2 spaces, not tabs)
- Unknown step keywords
Run lint on a corrected version to verify before calling write_test again.
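Two of the common issues above, tabs in indentation and unterminated strings, can be caught with a rough pre-check before spending a tool call. The sketch below is a heuristic only: it does not replace write_test or lint validation, it knows nothing about step keywords, and it would misreport lines that contain escaped quotes, if ProbeScript supports them.

```python
def precheck(content):
    """Flag tab indentation and odd double-quote counts, line by line."""
    problems = []
    for number, line in enumerate(content.splitlines(), start=1):
        # Leading whitespace is whatever lstrip() would remove.
        indent = line[: len(line) - len(line.lstrip())]
        if "\t" in indent:
            problems.append((number, "tab in indentation"))
        if line.count('"') % 2 == 1:
            problems.append((number, "possibly unterminated string"))
    return problems


sample = 'tap "Send"\n\tsee "Hello Bob\n'
print(precheck(sample))
# prints: [(2, 'tab in indentation'), (2, 'possibly unterminated string')]
```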
record returns “no output file”
Recording requires a WebSocket-connected device (simulators or emulators). It does not work over HTTP (physical-device WiFi mode). Ensure the device is a simulator or emulator and the app is running.