Principles for agent-native CLIs

(twitter.com)

51 points | by blumpy22 7 hours ago

29 comments

  • pseudosavant an hour ago

    I'm all in on agent-first CLIs. The CLIs I've been building have still been easier to use for me as a human than the average CLI tool. It isn't like CLI tools have famously simple or consistent arguments from tool to tool anyway.

    I find it so much more successful to have an agent interact with a CLI than an API or MCP. I can just ask: query my dev DB for an ideal URL to test a new page. It'll find the right users, resources, etc and create an excellent test URL to quickly validate the behavior of my changes. I can have it get the latest spec from Confluence, or find the latest PR build for a workitem.

    If you have an API, you should really look at providing a CLI for it too.

    Plugging my tools/examples:

    - https://github.com/pseudosavant/confluence-fetch

    - https://github.com/pseudosavant/azwi

    - https://github.com/pseudosavant/sql-agent-cli

  • wolttam 5 hours ago

    Getting agents used to using `--force` to bypass prompts seems like a bad idea. `--force` is for when the action failed (or would fail) for some reason and you want it to definitely happen this time.

    I think `--yes` or `--yes-do-the-dangerous-thing` is leagues better.
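    For illustration, a tiny sketch of that convention (the tool and function names are hypothetical, not from any real CLI): the destructive path refuses by default and only proceeds when `--yes` is passed explicitly, so neither an agent nor a script ever hits an interactive prompt.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical destructive command for an imaginary "mytool".
    parser = argparse.ArgumentParser(prog="mytool-wipe")
    parser.add_argument("--yes", action="store_true",
                        help="confirm the destructive action non-interactively")
    return parser

def wipe(args: argparse.Namespace) -> str:
    if not args.yes:
        # Refuse rather than prompt: callers must opt in explicitly.
        return "refusing to wipe: pass --yes to confirm"
    return "wiped"

if __name__ == "__main__":
    print(wipe(build_parser().parse_args()))
```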

    • tekacs 5 hours ago

      In the case of an LLM, it can also bias the model towards reaching for that sort of flag more often, which is less than ideal when it then runs a more ordinary Unix command where `--force` means something dangerous.

    • dimes 2 hours ago

      CLIs should check isatty and, if it returns false, disable any interactive functionality because it won’t work.
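      A minimal sketch of that check, assuming a Python CLI (the function and names are illustrative): prompt only when stdin is a real terminal, otherwise fall back to a non-interactive default.

```python
import sys

def confirm(question: str, assume_yes: bool = False) -> bool:
    # Explicit --yes-style override: never block, always proceed.
    if assume_yes:
        return True
    if not sys.stdin.isatty():
        # No TTY (piped input, agent, CI): a prompt would hang or fail,
        # so disable interactivity and refuse by default.
        return False
    return input(f"{question} [y/N] ").strip().lower() == "y"
```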

    • ihsw 5 hours ago

      `--non-interactive` has precedent too.

  • tfrancisl 5 hours ago

    I don't want "agent-native CLIs" to proliferate because I'd rather we design CLIs for human use and programmatic (automation) use first. Agents are good at vomiting JSON between tool calls; I am not, and never will be.

    Too many tools stray so wildly from UNIX principles. If we design for agents first we will likely see more and more of this.

    • theshrike79 4 hours ago

      The point IMO in "agent-native CLIs" is to make them match the statistical average.

      Let the Agent use the CLI and if it guesses the wrong option, you make that the RIGHT option.

      Every time it doesn't guess something right, you change it.

      • pmontra 4 hours ago

        I would naively suppose that the agent is able to read the man page or run the help command of the tool. They usually contain plenty of information. But bending the tool to suit the agent has some value. The GNU-AI suite of userland tools? Unfortunately it's possible that every model will settle on a different average. If that's the case we can't bend to every model. Models will have to bend to whatever we want to use.

        • riknos314 2 hours ago

          If parameter names mostly standardize across tools because the models learn to predict those names, then humans will also learn to predict those flag names, so this actually has the potential to make tools more human-friendly and easier to learn.

        • theshrike79 3 hours ago

          Of course it can read the man page and run cmd --help.

          Now you've wasted context on, what? Learning how to use the tool. And it will waste context on it every single time. (You can write skills to mitigate this a bit, but still).

          The alternative is to make the tool work as the user (an LLM in this case) expects it to work, without having to resort to the manual.

      • tfrancisl 4 hours ago

        > Let the Agent use the CLI and if it guesses the wrong option, you make that the RIGHT option

        This sounds backwards and presumes that the statistics machines which are LLMs are getting it right when they "average" out to the wrong command. No, fix the agent's behavior, don't change the CLI to accommodate it.

        • alchemist1e9 4 hours ago

          I don't remember the specific examples off the top of my head (some are definitely ffmpeg commands), but I do know that when LLMs keep hallucinating command-line flags that don't exist for a specific command, their "suggestion" is often actually very reasonable, and so many developers are adding support to their tools for common hallucinations.

          • tfrancisl 3 hours ago

            Not to belabor my point, but I think "adding support to tools for common hallucinations" is a bad idea. Sounds like something a vibecoded project being spammed with issues by agents might do. Not so much a serious, mature project, though.

            • alchemist1e9 3 hours ago

              Well, we'll have to agree to disagree, because my understanding of what has generally been the case is this: the LLMs might vibe-coding-spam a project, that's true, but the interesting difference is that, generally speaking, their "suggestions" are very reasonable and represent, in hindsight, useful changes that make the commands more useful for everyone, humans included.

    • alchemist1e9 4 hours ago

      It's also likely that agents would be better off if they didn't have to deal with JSON vomit either. I'm optimistic that agent frameworks will eventually come full circle and realize that concise, teletype-linear CLIs, aka old-school UNIX, are actually very effective and efficient for agents as well as humans!

  • zbentley 2 hours ago

    For reasons other commenters have expressed better than I could, the idea of "agent-native CLIs" seems like a poor one.

    Why not just do the "mycli skill-path" idea from the article, and skip the rest? Basically:

    1. Add regular, for-humans-or-programs flags and modes to your CLI as single-purpose, composable features (otherwise known as "how we've always added lots of features to a CLI without legislating a particular use-case"). Doing this in a messy way makes a messy CLI, same as it ever was. Don't do it in a messy way.

    2. When requested, have the CLI itself, or its manual/website, puke out a skill file which directs agents in how to compose those things for likely LLM uses of the CLI. Talking hardcoded, static text here. Nothing crazy.

    In other words, a "manpage for LLMs" or "manpage-as-skill" option. That's a lot more flexible and easier to change and update than an entire made-for-LLMs behavior layer. So you'd have "man mytool" and "skill mytool" available as separate documents, emphasizing separate capabilities of the same underlying CLI. "skill mytool" would be for use by LLMs or for piping "skill mytool > SKILL.md" or whatever.
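    As a sketch of the idea, with hypothetical tool/subcommand names and made-up skill text, the `skill` subcommand can be nothing more than a hardcoded string sitting beside the normal help:

```python
import argparse
import sys

# Static, hand-written agent guidance: the "manpage-as-skill" idea.
SKILL_TEXT = """\
# mytool skill
Use `mytool list --json` for machine-readable output.
Pass `--yes` to confirm destructive actions non-interactively.
"""

def main(argv: list[str]) -> str:
    parser = argparse.ArgumentParser(prog="mytool")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("skill", help="print agent-oriented usage notes")
    args = parser.parse_args(argv)
    if args.command == "skill":
        # Hardcoded text: no dynamic generation needed.
        return SKILL_TEXT
    return parser.format_help()

if __name__ == "__main__":
    print(main(sys.argv[1:]))
```

    Piping `mytool skill > SKILL.md` then hands agents the static document without them burning tokens rediscovering `--help`.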

    This is a little bit analogous to Git's notion of "porcelain" and "plumbing" (not that Git's a particularly sterling example of composable, friendly UX). The composable or special-case-only APIs still exist for direct use, are dogfooded internally for the human-user-intended paths, and a pre-baked document exists directing LLMs/users in how to use those lower-level details effectively.

    Sure, LLMs can read your manpage/helpdoc, or website, or source code, and figure things out, but that's slow and costs tokens and command-approval loops. This is a marginal efficiency proposal at best, but hopefully one that discourages people from writing bimodal, tortured CLIs just for the sake of LLM-friendliness.

    Is that nuts?

  • qudat 2 hours ago

    The entire concept that we need to cater CLIs to agents at all should tell us how far away they are from being "junior devs" or "an intern", and I reject the premise.

    A lack of structured output has never been a blocker for agents to work; that's a traditional coding problem.

    "Write good help text and error messages" is just good design, which is self-evident.

  • rahimnathwani 4 hours ago

    This guy took inspiration from gog cli (steipete's cli for Google Workspace, which predates gws cli and is apparently more agent-friendly and token-efficient):

    https://github.com/mvanhorn/cli-printing-press

    He made a whole bunch of agent-friendly CLIs: https://printingpress.dev/

    https://github.com/mvanhorn/printing-press-library/tree/main...

  • debarshri 5 hours ago

    I think every CLI is agent-native when invoked from Claude or any coding agent.

    I was really surprised today. Adaptive [1] is an access management platform for accessing psql, mysql, VMs, k8s, etc. When you use `adaptive connect <db-name>`, it creates a just-in-time tunnel and connects the user to the database. You cannot do traditional psql operations, etc. That design is by choice.

    Today I was trying to invoke it via Claude, and, god damn, it found a way to connect. It created a pseudo-shell in Python, passed the queries through it, and treated our CLI like a tool. A human would likely never have done this: you would think about the risks and good practice/bad practice, and would be scared to write and execute code like that. It just did it and achieved the goal.

    [1] https://adaptive.live

  • peterldowns 2 hours ago

    This is really good, particularly the async tasks part. Hadn't thought about that. We'll be thinking about these lessons for the next version of our agent CLI.

  • isityettime 3 hours ago

    I broadly agree with the article. But I think it's wrong about the failures of past command-line interface design. The author writes:

    > There's a deeper assumption underneath all of it. The classic Command Line Interface Guidelines treat a human at a terminal as the primary user, with agents as a tolerated secondary audience. That's no longer the right default. Cloudflare puts it directly in their post: "Increasingly, agents are the primary customer of our APIs." Their whole schema approach is built around that. HeyGen launched their CLI with "agent" in the marketing copy. Design for agents first, and humans benefit. Designing for humans first and bolting on agent support is what produces the inconsistent, prompt-prone, stdout-only CLIs the first five principles exist to correct.

    I don't think that's true at all. If you're someone who has lived in the terminal for a few years, you will have a sense of taste that naturally leads you to do the right thing. If you've used Git and systemctl and you know why p7zip feels alien on Unix and you have cursed a command where `-h` doesn't mean help, nobody needs to tell you basically any of this. If you've ever met jq, you don't need anyone to tell you that `--json` is a very valuable thing to have. You also don't need anyone to tell you what a uniform hierarchy of flags and options with different scopes should look like; if you've used a program that uses subcommands, even a shitty one, you know what a good one should look like.

    When command-line tools (or inconsistent collections thereof) are difficult for AI in the ways the article describes, it's because they're shit. When command-line tools are shit, it's because nobody is taking the design of those interfaces seriously at all, typically some combination of:

      - the interface isn't "designed" at all, it's just naively evolved.
      - you're leaving writing a CLI tool to someone who tolerates the command-line but doesn't live in it
      - the object is treated as only a human/interactive interface or only a programming interface when in fact it's always both
      - your suite of tools has diffuse ownership and nobody thinks command-line interfaces are important enough to have standards for
    
    If you treat a GUI as unseriously as that, it invariably turns to a pile of shit, too!

    Anybody who ought to be writing one has already internalized all the right norms. Most of it comes for free from living in the shell. Put one person in charge and it'll be uniform. If you can't, writing a style guide and enforcing it with linters and tests is a great idea. But this is just taking command-line interfaces seriously as interfaces. It has pretty much nothing to do with AI except at the edges (e.g., json-flavored companion to --help).
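    To illustrate the `--json` point: the flag is just a second serializer over the same data model (the record fields here are made up), which is why tools that take their interfaces seriously get it almost for free.

```python
import json

def render(records: list[dict], as_json: bool) -> str:
    # One data model, two presentations: machine JSON for jq and
    # friends, or tab-separated columns for a human at a terminal.
    if as_json:
        return json.dumps(records)
    return "\n".join(f"{r['name']}\t{r['status']}" for r in records)
```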

  • jiehong 4 hours ago

    This reminds me that agents sometimes really like heredocs in shells, and waste tokens retrying with a file when they fail.

  • sandermvanvliet 5 hours ago

    Is it me or are all these articles about using AI effectively and building for AI just, you know, things that we should have been doing all along?

    It feels like most of the “rules” are “don’t be an ass to your consumer”.

    • tom_ 3 hours ago

      Doing stuff for other people: generally low-status work, to one degree or another

      Doing stuff for the machine: the behaviour of a pragmatic, nuanced builder. A forward-thinking agentic AI pioneer, executing and shipping at the unexplored boundary of modes of human creativity #building #shipping #executing

    • bensyverson 5 hours ago

      Partially, but I think if you design for agents, their needs are different enough from a human's that you end up making different choices.

      I found myself nodding along to the linked tweet/article. Recently I did many rounds of iterative user-centered design with an agent to improve the CLI interface in Jobs [0], a task manager for LLMs. The resulting CLI follows most of these principles.

      One great idea from the tweet that I will be adding: a `feedback` subcommand, for the agent to capture feedback while they work.

      [0]: https://github.com/bensyverson/jobs

  • walski 4 hours ago

    Definitely superhuman ultra-intelligence by the end of Q4!!!!11 Also not able to use tools that are not explicitly built for machine consumption.

  • ChrisArchitect 4 hours ago