
UI Automata

Windows Desktop Control and Automation for AI Agents

A declarative workflow engine that gives AI agents structured,
observable, repeatable control over any Windows GUI app.

Declarative

YAML workflows that are robust by design: every step declares its expected outcome, bad states are recovered from automatically, and there are no fragile sleep timers.
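To make the idea concrete, here is a minimal Python sketch of the general pattern behind "expected outcome on every step" and "no sleep timers": each step runs an action, polls for a condition instead of sleeping a fixed time, and calls a recovery hook before retrying. This is not UI Automata's engine or its YAML schema; the names `wait_until`, `run_step`, and the simulated `state` dictionary are hypothetical and exist only for illustration.

```python
import time
from typing import Callable, Optional


def wait_until(condition: Callable[[], bool],
               timeout: float = 5.0,
               interval: float = 0.05) -> bool:
    """Poll `condition` until it is true or `timeout` seconds have elapsed."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Short polling interval, not a fixed "sleep and hope" delay.
        time.sleep(interval)


def run_step(action: Callable[[], None],
             expect: Callable[[], bool],
             recover: Optional[Callable[[], None]] = None,
             retries: int = 2,
             timeout: float = 1.0) -> bool:
    """Run `action`, verify `expect`, and call `recover` before each retry."""
    for attempt in range(retries + 1):
        action()
        if wait_until(expect, timeout=timeout):
            return True
        if recover is not None and attempt < retries:
            recover()
    return False


# Hypothetical stand-in for a GUI app: the first click is swallowed.
state = {"dialog_open": False, "clicks": 0}


def click_open() -> None:
    state["clicks"] += 1
    if state["clicks"] >= 2:
        state["dialog_open"] = True


ok = run_step(click_open, lambda: state["dialog_open"],
              recover=lambda: None, timeout=0.2)
print(ok, state["clicks"])
```

In this simulation the first attempt times out after 0.2 seconds, the recovery hook runs, and the second attempt meets its expected outcome, so the step succeeds without any hard-coded delay tuned to a particular machine.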

Cross-Framework Coverage

Works across all UI surfaces: Win32, MFC, WPF, UWP, WinUI 3, terminal windows, and web browsers. All access modes are supported: UIA, DOM, and pure vision.

Agent-Native

Explore live UI element trees interactively, author workflows with schema guidance and linting, and drive everything through rich MCP tools designed for AI agents.

Industrial-Grade Apps

Automate complex, professional-grade applications with deep, dense UI hierarchies. Reliable automation for the apps that matter in real workflows.

UI Automata vs Computer Use

| | UI Automata | Computer Use (Cowork) |
|---|---|---|
| Approach | UIA elements + DOM query + vision | Screenshot only |
| Reliability | Deterministic: same selector works across runs | May vary across runs |
| Speed | Sub-second per step | Round-trip to inference API per step |
| Cost | Low: runs locally, no per-step inference | High: every step consumes tokens |
| Vision | On-device, used as fallback | Cloud inference, primary approach |
| Platform | Windows (all frameworks) | macOS-first, limited Windows |
| Model dependency | Any agent, any model | Locked to Claude |
| Browser automation | CDP (structured page access) | Screenshot of browser |
| Trace | Structured log with detailed action per step | Sequence of screenshots |