Changelog

// Release notes and updates for SuperBased

v2.0.13

Latest April 2026

ffmpeg-static cross-arch fix for Intel macOS

Intel macOS dictation fixed — ffmpeg-static ships one binary per install selected by host arch. Cross-installed both Darwin x64 + arm64 binaries during CI build, and the runtime resolveFfmpegPath now picks the arched sibling at ${path}-${process.arch} with fallback. Spawn errors with exit=null on Intel Macs are gone.
asarUnpack expanded for **/ffmpeg-static/* so all three binaries (ffmpeg, ffmpeg-x64, ffmpeg-arm64) land outside the read-only archive where they're actually executable.
Each .dmg ships ~45MB heavier (bundled wrong-arch ffmpeg). Acceptable cost for not breaking ~20% of the Mac user base on first launch.

v2.0.0

Major release April 2026

GUI Automation Suite (S0–S9 + Track C) + Humanization v2

72 MCP tools (was 28) — full agent-hands surface across capture, dictation, and the entire GUI automation suite.
GUI primitives (13) — click, type, hotkey, scroll, drag, hover, pixel_color, mouse_position, wait, wait_for, locate, ui_dump, accessibility_tree. Reliability pyramid: AutomationId / AXIdentifier → role+name → OCR label.
Sequence orchestration — superbased_sequence bundles N steps under one approval, one activation. Per-step returnObservation; explicit screenshot steps return PNG inline.
Semantic + macros (5) — ax_invoke (UIA Invoke / Toggle / SelectionItem.Select / Value.SetValue / ExpandCollapse), form_fill, dialog_handle, context_menu_select, drag_file (Windows scaffold).
Window + browser control (8) — window_state, window_bounds, resize_window (with device presets), focus_window, launch_app (with sha256 hash pinning), open_url, tab_management, find_in_page.
Specialized (3) — tray_click (Shell_TrayWnd cross-process reads), virtual_desktop, doctor_gui_automation.
Cross-platform parity — full Windows + macOS via S9 + Track C. macOS routed through nut-js CGEventPost + osascript System Events AX + @superbased/macos-ax binding + /usr/bin/open.
Humanization v2 — per-call humanize: 'off' | 'light' | 'human' | 'paranoid'. Bezier-curved cursor approaches with sin-shaped velocity envelope, gaussian click-target jitter, gamma-distributed inter-key timing, click + key hold variation (50–110ms vs ~5ms atomic), pre-click settle dwell + tremor, optional typo+correct sequences. Per-process cross-session salt mixed into seeds. Defeats trajectory and timing classifiers used by reCAPTCHA, Cloudflare Turnstile, hCaptcha, GeeTest, KeyCAPTCHA.
Safety rails — master toggle guiAutomation.enabled (off by default); per-action group toggles (safe / write / destructive); confirm flag required; protected-apps blocklist; protected-process geofence (SELF_TARGET_REFUSED); kill switch (Ctrl+Shift+Esc).
Audit log + replay — every action recorded as NDJSON to ~/.superbased/audit.log. superbased_replay reproduces trajectories byte-for-byte within a process.
Stdio bridge co-existence — superbased mcp detects a running desktop instance via PID file and acts as a stdio↔HTTP bridge to its :47592 endpoint. Stdio MCP clients work whether the desktop, a headless serve, or nothing is running.
Plugin ecosystem — 7 platforms shipped: Claude Code, OpenAI Codex, Cursor, GitHub Copilot CLI, Cline (MCP marketplace), OpenCode (npm), OpenClaw (ClawHub).

v1.6.7

Stable April 2026

Dedicated multi-region capture hotkey + predictable re-press behavior

Multi-region hotkey — new Ctrl+Shift+D (Cmd+Shift+D on macOS) opens the selection canvas in forced multi-region mode. Every drag accumulates a region without needing to hold Ctrl. Fully customizable in Settings > General.
Done button & Space to confirm — floating "Done (N)" button bottom-right once you've selected at least one region, or press Enter or Space to commit all regions at once. Right-click pops the last region.
Appends to open overlay — if the overlay is already open, multi-region captures are added to the current session as separate thumbnails, matching the single-region "add to session" behavior.
Predictable re-press — pressing any capture hotkey (Ctrl+Shift+S / D / A) while the selection canvas is already open now cancels it instead of reopening a second window. Double-tap to dismiss.
Stale state fixes — scroll-capture mode and "adding to overlay" flags now reset on every selection close path (complete, Esc, crash), eliminating mode leaks between captures.
Flicker fix — right-clicking to pop a region no longer flashes the canvas. Canvas sizing is now decoupled from region state.

v1.5.0

Stable April 2026

Headless mode, MCP improvements, dictation performance

Window targeting — capture any window by name, even minimized (ShowWindow + SetForegroundWindow via Win32 P/Invoke)
28 MCP tools + 9 resources — expanded tool descriptions, skill docs rewritten for better agent discovery
Dictation cleanup pipeline — full 4-stage processing: deduplication, technical term normalization, filler removal, short phrase handling
100+ brand name normalizations — "Cloud Code" → Claude Code, "Super Bass" → SuperBased, "deep seek" → DeepSeek, and more
955 Azure Speech phrases — 170 new AI/ML terms (models, frameworks, tools, vector DBs)
Dictation performance — clipboard pre-save moved to recording start (~200-300ms faster stop path)
MCP tool descriptions — window targeting, dictation mic recording, clipboard readImage documented
superbased mcp command — unified API server + MCP stdio bridge in one process
Published to npm as superbased@1.5.0

v1.4.0

Stable April 2026

Headless CLI + MCP server, push-to-talk dictation

Headless mode — npm global package, no Electron dependency, full API server + MCP bridge
superbased mcp — single command for all AI editor integrations
Push-to-talk dictation — Cmd+Opt (macOS) / Ctrl+Alt (Windows) hotkey, warm mic, standby Azure recognizer
Window enumeration — Win32 EnumWindows via C# for multi-window apps
Capture hotkey — Ctrl+Shift for instant fullscreen screenshots
Free tier — 50 AI credits/day for first 30 days
MCP configs for Claude Code, Cursor, Windsurf, Cline, OpenCode, Zed, Codex

v1.0.0

Stable March 2026

Initial public release

Region, window, fullscreen, scroll, and timed captures
Image editor with annotations, blur, redact, numbered steps
AI analysis with OpenAI, Anthropic, and Google AI (BYOK)
Workflow recording with visual regression testing
Gallery with search, tags, notes, and statistics
OCR text extraction
Pin screenshots to desktop
Color picker tool
Auto-updates via GitHub Releases