Changelog
// Release notes and updates for SuperBased
v2.0.13
Latest · April 2026 · ffmpeg-static cross-arch fix for Intel macOS
- Intel macOS dictation fixed — ffmpeg-static ships one binary per install, selected by host arch. The CI build now cross-installs both Darwin binaries (x64 and arm64), and the runtime resolveFfmpegPath now picks the arch-suffixed sibling at ${path}-${process.arch}, with fallback. Spawn errors with exit=null on Intel Macs are gone.
- asarUnpack expanded for **/ffmpeg-static/* so all three binaries (ffmpeg, ffmpeg-x64, ffmpeg-arm64) land outside the read-only archive, where they are actually executable.
- Each .dmg ships ~45 MB heavier because it also bundles the wrong-arch ffmpeg. Acceptable cost for not breaking ~20% of the Mac user base on first launch.
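The arch-aware lookup described above can be sketched roughly as follows. Only the name resolveFfmpegPath and the `${path}-${process.arch}` pattern come from these notes; the signature, the injected `exists` check, and the choice to fall back to the unsuffixed path are assumptions made for illustration, not the shipped implementation.

```typescript
// Hypothetical sketch of the lookup; the signature is an assumption.
type ExistsFn = (candidate: string) => boolean;

function resolveFfmpegPath(
  basePath: string, // path reported by ffmpeg-static for the install-time arch
  arch: string,     // e.g. "x64" or "arm64"
  exists: ExistsFn, // injected so the logic can be exercised without a filesystem
): string {
  // Prefer the arch-suffixed sibling, e.g. ".../ffmpeg-x64" on Intel Macs.
  const archSibling = `${basePath}-${arch}`;
  if (exists(archSibling)) {
    return archSibling;
  }
  // Assumed fallback: the unsuffixed binary chosen at install time.
  return basePath;
}

// In-memory stand-in for the unpacked directory (paths are illustrative).
const unpackedFiles = new Set<string>([
  "/unpacked/ffmpeg-static/ffmpeg",
  "/unpacked/ffmpeg-static/ffmpeg-x64",
  "/unpacked/ffmpeg-static/ffmpeg-arm64",
]);

const resolvedForIntel = resolveFfmpegPath(
  "/unpacked/ffmpeg-static/ffmpeg",
  "x64",
  (candidate) => unpackedFiles.has(candidate),
);
console.log(resolvedForIntel); // /unpacked/ffmpeg-static/ffmpeg-x64
```

In a Node or Electron process, the `arch` argument would typically be `process.arch` and the `exists` check `fs.existsSync`.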
v2.0.0
Major release · April 2026 · GUI Automation Suite (S0–S9 + Track C) + Humanization v2
- 72 MCP tools (was 28) — full agent-hands surface across capture, dictation, and the entire GUI automation suite.
- GUI primitives (13) — click, type, hotkey, scroll, drag, hover, pixel_color, mouse_position, wait, wait_for, locate, ui_dump, accessibility_tree. Reliability pyramid: AutomationId / AXIdentifier → role+name → OCR label.
- Sequence orchestration — superbased_sequence bundles N steps under one approval, one activation. Per-step returnObservation; explicit screenshot steps return PNG inline.
- Semantic + macros (5) — ax_invoke (UIA Invoke / Toggle / SelectionItem.Select / Value.SetValue / ExpandCollapse), form_fill, dialog_handle, context_menu_select, drag_file (Windows scaffold).
- Window + browser control (8) — window_state, window_bounds, resize_window (with device presets), focus_window, launch_app (with sha256 hash pinning), open_url, tab_management, find_in_page.
- Specialized (3) — tray_click (Shell_TrayWnd cross-process reads), virtual_desktop, doctor_gui_automation.
- Cross-platform parity — full Windows + macOS via S9 + Track C. macOS routed through nut-js CGEventPost + osascript System Events AX + @superbased/macos-ax binding + /usr/bin/open.
- Humanization v2 — per-call humanize: 'off' | 'light' | 'human' | 'paranoid'. Bezier-curved cursor approaches with sin-shaped velocity envelope, gaussian click-target jitter, gamma-distributed inter-key timing, click + key hold variation (50–110ms vs ~5ms atomic), pre-click settle dwell + tremor, optional typo+correct sequences. Per-process cross-session salt mixed into seeds. Defeats trajectory and timing classifiers used by reCAPTCHA, Cloudflare Turnstile, hCaptcha, GeeTest, KeyCAPTCHA.
- Safety rails — master toggle guiAutomation.enabled (off by default); per-action group toggles (safe / write / destructive); confirm flag required; protected-apps blocklist; protected-process geofence (SELF_TARGET_REFUSED); kill switch (Ctrl+Shift+Esc).
- Audit log + replay — every action recorded as NDJSON to ~/.superbased/audit.log. superbased_replay reproduces trajectories byte-for-byte within a process.
- Stdio bridge co-existence — superbased mcp detects a running desktop instance via PID file and acts as a stdio↔HTTP bridge to its :47592 endpoint. Stdio MCP clients work whether the desktop, a headless serve, or nothing is running.
- Plugin ecosystem — 7 platforms shipped: Claude Code, OpenAI Codex, Cursor, GitHub Copilot CLI, Cline (MCP marketplace), OpenCode (npm), OpenClaw (ClawHub).
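To show how the safety rails listed above might compose, here is a minimal sketch of a pre-flight check. It is not the product's code: the type names, the field names other than the `enabled` toggle, the order of the checks, and every reason code except SELF_TARGET_REFUSED are assumptions. It also assumes the confirm flag is required for every action, which the notes do not spell out.

```typescript
// Hypothetical shapes; only guiAutomation.enabled, the three groups, and
// SELF_TARGET_REFUSED are named in the release notes.
type ActionGroup = "safe" | "write" | "destructive";

interface GuiAutomationSettings {
  enabled: boolean;                     // master toggle, off by default
  groups: Record<ActionGroup, boolean>; // per-action group toggles
  protectedApps: string[];              // blocklist of app names
}

interface ActionRequest {
  tool: string;
  group: ActionGroup;
  targetApp: string;
  targetPid: number;
  confirm: boolean;
}

type Decision = { allowed: true } | { allowed: false; reason: string };

function deny(reason: string): Decision {
  return { allowed: false, reason };
}

function checkSafetyRails(
  settings: GuiAutomationSettings,
  request: ActionRequest,
  ownPid: number,
): Decision {
  if (!settings.enabled) return deny("GUI_AUTOMATION_DISABLED");
  if (!settings.groups[request.group]) return deny("GROUP_DISABLED");
  if (!request.confirm) return deny("CONFIRM_REQUIRED");
  if (settings.protectedApps.includes(request.targetApp)) return deny("PROTECTED_APP");
  // Geofence: refuse to drive the automation host's own process.
  if (request.targetPid === ownPid) return deny("SELF_TARGET_REFUSED");
  return { allowed: true };
}

const defaultSettings: GuiAutomationSettings = {
  enabled: false,
  groups: { safe: true, write: false, destructive: false },
  protectedApps: ["password-manager"],
};

const sampleClick: ActionRequest = {
  tool: "click",
  group: "safe",
  targetApp: "notepad",
  targetPid: 4321,
  confirm: true,
};

// With the master toggle off, every request is refused.
console.log(checkSafetyRails(defaultSettings, sampleClick, 1000));
```

Returning a reason code rather than a bare boolean would let the caller report why an action was refused, which seems to fit the SELF_TARGET_REFUSED style named in the notes.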
v1.6.7
Stable · April 2026 · Dedicated multi-region capture hotkey + predictable re-press behavior
- Multi-region hotkey — new Ctrl+Shift+D (Cmd+Shift+D on macOS) opens the selection canvas in forced multi-region mode. Every drag accumulates a region without needing to hold Ctrl. Fully customizable in Settings > General.
- Done button & Space to confirm — floating "Done (N)" button bottom-right once you've selected at least one region, or press Enter or Space to commit all regions at once. Right-click pops the last region.
- Appends to open overlay — if the overlay is already open, multi-region captures are added to the current session as separate thumbnails, matching the single-region "add to session" behavior.
- Predictable re-press — pressing any capture hotkey (Ctrl+Shift+S / D / A) while the selection canvas is already open now cancels it instead of reopening a second window. Double-tap to dismiss.
- Stale state fixes — scroll-capture mode and "adding to overlay" flags now reset on every selection close path (complete, Esc, crash), eliminating mode leaks between captures.
- Flicker fix — right-clicking to pop a region no longer flashes the canvas. Canvas sizing is now decoupled from region state.
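The re-press, pop, and reset behavior described in this release can be modeled as a small state holder. The sketch below is illustrative only: the class and method names are hypothetical, and it simply assumes that every close path funnels through one reset routine, which is one way to get the "no mode leaks" guarantee.

```typescript
interface Region {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Hypothetical model of the selection canvas; not the app's actual class.
class SelectionCanvas {
  private open = false;
  private regions: Region[] = [];
  private scrollCaptureMode = false;
  private addingToOverlay = false;

  get isOpen(): boolean {
    return this.open;
  }

  get hasStaleFlags(): boolean {
    return this.scrollCaptureMode || this.addingToOverlay;
  }

  // Pressing a capture hotkey while the canvas is open cancels it
  // instead of opening a second window.
  pressCaptureHotkey(options: { scroll?: boolean; overlayOpen?: boolean } = {}): "opened" | "cancelled" {
    if (this.open) {
      this.close();
      return "cancelled";
    }
    this.open = true;
    this.scrollCaptureMode = options.scroll ?? false;
    this.addingToOverlay = options.overlayOpen ?? false;
    return "opened";
  }

  // In forced multi-region mode every drag accumulates a region.
  addRegion(region: Region): void {
    if (!this.open) throw new Error("canvas is not open");
    this.regions.push(region);
  }

  // Right-click pops the last region.
  popRegion(): Region | undefined {
    return this.regions.pop();
  }

  // Done button, Enter, or Space commits all regions at once.
  commit(): Region[] {
    const committed = [...this.regions];
    this.close();
    return committed;
  }

  // Single reset point shared by complete, cancel, and error paths.
  private close(): void {
    this.open = false;
    this.regions = [];
    this.scrollCaptureMode = false;
    this.addingToOverlay = false;
  }
}

const canvas = new SelectionCanvas();
console.log(canvas.pressCaptureHotkey()); // opened
console.log(canvas.pressCaptureHotkey()); // cancelled
```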
v1.5.0
Stable · April 2026 · Headless mode, MCP improvements, dictation performance
- Window targeting — capture any window by name, even minimized (ShowWindow + SetForegroundWindow via Win32 P/Invoke)
- 28 MCP tools + 9 resources — expanded tool descriptions, skill docs rewritten for better agent discovery
- Dictation cleanup pipeline — full 4-stage processing: deduplication, technical term normalization, filler removal, short phrase handling
- 100+ brand name normalizations — "Cloud Code" → Claude Code, "Super Bass" → SuperBased, "deep seek" → DeepSeek, and more
- 955 Azure Speech phrases — 170 new AI/ML terms (models, frameworks, tools, vector DBs)
- Dictation performance — clipboard pre-save moved to recording start (~200-300ms faster stop path)
- MCP tool descriptions — window targeting, dictation mic recording, clipboard readImage documented
- superbased mcp command — unified API server + MCP stdio bridge in one process
- Published to npm as superbased@1.5.0
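As a rough illustration of the four-stage dictation cleanup listed above, the stages can be chained as plain string transforms. Everything here is a sketch under assumptions: the function names, the filler list, and the exact behavior of each stage (especially short phrase handling, assumed here to mean dropping the trailing period from utterances of one or two words) are not taken from the product. The three brand fixes are the examples quoted in these notes.

```typescript
// Brand fixes quoted in the release notes; the real table has 100+ entries.
const TERM_FIXES: Array<[RegExp, string]> = [
  [/\bcloud code\b/gi, "Claude Code"],
  [/\bsuper bass\b/gi, "SuperBased"],
  [/\bdeep seek\b/gi, "DeepSeek"],
];

// Assumed filler list, with an optional trailing comma and whitespace.
const FILLERS = /\b(?:um+|uh+|erm)\b,?\s*/gi;

// Stage 1 (assumed): collapse immediately repeated words ("open open").
function dedupeWords(text: string): string {
  return text.replace(/\b(\w+)(?:\s+\1\b)+/gi, "$1");
}

// Stage 2: technical term normalization.
function normalizeTerms(text: string): string {
  return TERM_FIXES.reduce((acc, [pattern, fixed]) => acc.replace(pattern, fixed), text);
}

// Stage 3: filler removal, then whitespace tidy-up.
function removeFillers(text: string): string {
  return text.replace(FILLERS, "").replace(/\s{2,}/g, " ").trim();
}

// Stage 4 (assumed): very short utterances lose their trailing period.
function handleShortPhrase(text: string): string {
  const wordCount = text.split(/\s+/).filter(Boolean).length;
  return wordCount <= 2 ? text.replace(/\.+$/, "") : text;
}

function cleanDictation(raw: string): string {
  const stages = [dedupeWords, normalizeTerms, removeFillers, handleShortPhrase];
  return stages.reduce((text, stage) => stage(text), raw);
}

console.log(cleanDictation("um open open cloud code and ask deep seek"));
// open Claude Code and ask DeepSeek
console.log(cleanDictation("Super bass."));
// SuperBased
```

Keeping each stage a pure function makes the order easy to change and each step easy to test on its own.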
v1.4.0
Stable · April 2026 · Headless CLI + MCP server, push-to-talk dictation
- Headless mode — npm global package, no Electron dependency, full API server + MCP bridge
- superbased mcp — single command for all AI editor integrations
- Push-to-talk dictation — Cmd+Opt (macOS) / Ctrl+Alt (Windows) hotkey, warm mic, standby Azure recognizer
- Window enumeration — Win32 EnumWindows via C# for multi-window apps
- Capture hotkey — Ctrl+Shift for instant fullscreen screenshots
- Free tier — 50 AI credits/day for first 30 days
- MCP configs for Claude Code, Cursor, Windsurf, Cline, OpenCode, Zed, Codex
v1.0.0
Stable · March 2026 · Initial public release
- Region, window, fullscreen, scroll, and timed captures
- Image editor with annotations, blur, redact, numbered steps
- AI analysis with OpenAI, Anthropic, and Google AI (BYOK)
- Workflow recording with visual regression testing
- Gallery with search, tags, notes, and statistics
- OCR text extraction
- Pin screenshots to desktop
- Color picker tool
- Auto-updates via GitHub Releases