
tui-use - drive interactive REPLs from agents

Lets agents interact with programs that expect a human at the keyboard - REPLs, debuggers, TUI apps - things bash pipes cannot reach. Fills the gap between shell and full computer-use.


The pitch in one line: like BrowserUse, but for the terminal. Agents are great at running shell commands and calling APIs. They're stuck the moment a REPL waits for input, a debugger hits a breakpoint, or a TUI app renders a menu. tui-use fills that gap by spawning programs in a PTY, letting agents read the rendered screen as plain text, and sending keystrokes back.

The problem this solves better than tmux is the wait condition. tmux send-keys has no way to signal when a program is done responding - you sleep 2 and hope, or poll capture-pane in a loop. tui-use observes every PTY render event directly, so wait blocks until the screen has been stable for a configurable idle window. wait --text ">>>" goes further: wait for a semantic signal, not just silence.

What it actually unlocks

Four use cases the README leans into, all real:

  • Scientific computing with large in-memory state - when your variables are arrays with millions of elements that took an hour to compute, you can't dump them to a log file and restart. Drop an agent into a live Python interpreter or pdb session to debug, inspect, and optimise without losing the running process.
  • Debugger sessions - drive GDB, PDB, or any interactive debugger. Set breakpoints, step through code, inspect variables, all from the agent.
  • REPL sessions - run code in Python, Node, or any interactive interpreter, inspect output, keep going. No more one-shot scripts when you actually wanted an interactive session.
  • TUI applications - vim, lazygit, htop, fzf, and other full-screen programs that were never designed to be scripted.

That last one is the underrated case. Plenty of useful CLI tools are full-screen and assume a human at the keyboard; now an agent can drive them.

Quick start

npm install -g tui-use

Or from source:

git clone https://github.com/onesuper/tui-use.git
cd tui-use && npm install && npm run build && npm link

A minimal session:

tui-use start python3 examples/ask.py
tui-use wait                       # block until the screen settles
tui-use type "Alice"
tui-use press enter
tui-use wait
tui-use snapshot                   # read the rendered screen as plain text
tui-use kill

A daemon manages PTY sessions across CLI invocations, so the program keeps running between commands.

Plugins for Claude Code and Codex

Both follow the same pattern: install the CLI first (the plugin only provides the skill definitions), then add the marketplace.

Claude Code:

/plugin marketplace add onesuper/tui-use
/plugin install tui-use@tui-use
/reload-plugins

Codex: open the repo in Codex, run /plugins, choose tui-use local plugins, install. Restart the thread and the agent can use it.

Why the rendering matters

PTY output is full of ANSI escape sequences, cursor movement, and screen-clearing codes. Naive screen-scraping approaches end up with garbage text. tui-use processes everything through a headless xterm emulator in real time, so the screen field on every snapshot is clean plain text. Cursor positioning is handled correctly, and screen clears reset the buffer. The agent sees what a human at a terminal would see.
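The difference is easy to demonstrate. Here is a minimal sketch of why stripping escape codes isn't enough - this is not tui-use's implementation (it uses a full headless xterm emulator), and naive_strip / render_line are illustrative names; the toy emulator handles only carriage returns and erase-to-end-of-line:

```python
import re

ANSI = re.compile(r"\x1b\[[0-9;]*[A-Za-z]")

def naive_strip(raw: str) -> str:
    # Strip escape sequences and hope. Carriage-return overwrites are
    # left in place, so stale text survives in the result.
    return ANSI.sub("", raw)

def render_line(raw: str) -> str:
    # Tiny single-line "emulator": honour carriage returns and CSI K
    # (erase to end of line), the way a real terminal would.
    buf, col, i = [], 0, 0
    while i < len(raw):
        m = ANSI.match(raw, i)
        if m:
            if m.group().endswith("K"):   # erase from cursor to end of line
                del buf[col:]
            i = m.end()
        elif raw[i] == "\r":              # carriage return: back to column 0
            col, i = 0, i + 1
        else:
            if col < len(buf):
                buf[col] = raw[i]         # overwrite in place
            else:
                buf.append(raw[i])
            col, i = col + 1, i + 1
    return "".join(buf)

# A progress line that redraws itself in place:
raw = "loading 10%\r\x1b[Kloading 100%"
naive_strip(raw)   # 'loading 10%\rloading 100%' - the stale text survives
render_line(raw)   # 'loading 100%' - what a human actually sees
```

The same problem compounds with cursor addressing and screen clears, which is why a real emulator sits in the pipeline rather than a regex.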

The other underrated detail in the snapshot: a highlights field listing inverse-video spans on screen. That's the standard way TUI programs indicate selected items in a menu, the active tab, or the current button. Agents can read which option is currently selected without parsing layout or guessing from cursor position.
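The underlying idea is simple to sketch. This is not tui-use's actual highlights format - just an illustration of detecting inverse-video spans by tracking SGR 7 (inverse on) and SGR 0/27 (off) across one line of output; inverse_spans is an invented name:

```python
import re

SGR = re.compile(r"\x1b\[([0-9;]*)m")

def inverse_spans(line: str):
    # Walk one line of terminal output and return (start_col, end_col)
    # spans that were rendered in inverse video - how TUI menus
    # typically mark the selected item.
    spans, col, start, inverse, i = [], 0, None, False, 0
    while i < len(line):
        m = SGR.match(line, i)
        if m:
            # An empty parameter list means SGR 0 (reset).
            codes = {c or "0" for c in m.group(1).split(";")}
            if "7" in codes and not inverse:
                inverse, start = True, col
            elif codes & {"0", "27"} and inverse:
                inverse = False
                spans.append((start, col))
            i = m.end()
        else:
            col, i = col + 1, i + 1
    if inverse:                       # span ran to end of line
        spans.append((start, col))
    return spans

inverse_spans("\x1b[7m> item two\x1b[0m  item three")  # [(0, 10)]
```

Note the offsets index into the rendered plain text, not the raw byte stream - which is exactly why having the emulator surface them for you beats parsing escape codes in the agent.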

The full command surface

tui-use start <cmd>                        # start a program
tui-use start --label <name> <cmd>         # start with a label
tui-use start --cols <n> --rows <n> <cmd>  # custom terminal size (default 120x30)
tui-use use <session_id>                   # switch to a session
tui-use type <text>                        # type text
tui-use type "<text>\n"                    # type with Enter
tui-use type "<text>\t"                    # type with Tab
tui-use paste "<text>\n<text>\n"           # multi-line paste
tui-use press <key>                        # press a key
tui-use snapshot                           # current screen
tui-use snapshot --format json             # JSON output
tui-use scrollup <n>                       # scroll up
tui-use scrolldown <n>                     # scroll down
tui-use find <pattern>                     # regex search in screen
tui-use wait                               # wait for screen change (3000ms timeout)
tui-use wait <ms>                          # custom timeout
tui-use wait --text <pattern>              # wait until screen contains pattern
tui-use wait --debounce <ms>               # idle time after last change (default 100ms)
tui-use list / info / kill / rename        # session lifecycle
tui-use daemon status / stop / restart     # daemon control

The --debounce knob on wait is the key tuning parameter for fast-rendering programs. The default 100ms works for most cases; chatty REPLs sometimes need more.

How the wait actually works

program outputs -> PTY -> xterm emulator -> render event
                                          -> debounce timer resets on each change
                                          -> 100ms of silence -> wait resolves

wait --text <pattern> short-circuits when a known prompt appears, giving agents a semantic readiness signal instead of just "the screen has stopped changing." For tools that print partial progress before settling, the semantic version is the right call.
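Both behaviours can be modelled in a few lines. A sketch under stated assumptions - the function names are invented, and tui-use reacts to render events rather than polling, but the control flow is the same:

```python
import re
import time

def wait_for_idle(changed, debounce=0.1, timeout=3.0):
    # Resolve once `changed()` has reported no new output for `debounce`
    # seconds; give up after `timeout`. `changed` stands in for the
    # render events the PTY emulator emits.
    deadline = time.monotonic() + timeout
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        if changed():
            last_change = time.monotonic()   # output arrived: reset the timer
        elif time.monotonic() - last_change >= debounce:
            return True                      # screen has settled
        time.sleep(0.01)
    return False

def wait_for_text(snapshot, pattern, timeout=3.0):
    # Short-circuit as soon as the rendered screen matches `pattern`,
    # whether or not output has gone quiet - the semantic readiness signal.
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if re.search(pattern, snapshot()):
            return True
        time.sleep(0.01)
    return False
```

Something like wait_for_text(snapshot, r">>> ") resolves on a Python prompt even while a spinner is still repainting elsewhere on screen - the case where pure idle detection never fires.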

When to reach for it

  • Anywhere your agent has been getting stuck on prompt: and you've been working around it with timeouts.
  • Long-running interactive sessions where restarting the program (a Jupyter kernel, a slow Python load) is the part that hurts.
  • Driving developer tooling that's TUI-shaped: lazygit, fzf, gh's interactive prompts, vim macros.
  • Test harnesses that exercise CLI applications with interactive prompts.

When not to

  • Pure non-interactive command pipelines. bash plus stdout capture is simpler.
  • GUI applications. tui-use is terminal-only.
  • Agent runs where you specifically want to capture colour and style. The screen field is plain text; colours are stripped (highlights are surfaced separately via inverse-video detection, but full styling isn't preserved).

Trade-offs and limits

The colour/style story is the explicit limit: screen is plain text, highlights covers inverse-video spans (the most semantically useful styling signal), and title plus is_fullscreen are captured. Anything beyond that - foreground colours, bold runs, italics - is dropped. For most agent use cases this is the right trade-off; if you're scripting something that genuinely cares about syntax highlighting, you'll need a different approach.

Prebuilt binaries ship for the common platforms; the installer falls back to building from source if no prebuilt is available, which needs xcode-select --install on macOS or build-essential, python3, and g++ on Linux.

MIT licensed. The repo includes a /tui-use-integration-test Claude Code skill that runs the full integration suite end-to-end - useful both as a smoke test on install and as a working example of how the tool composes with an agent.
