3 comments

  • ikidd 3 hours ago

    I remember the idea of "swear at the LLM to get better results" and I even think it somewhat worked, at least for a while.

    This is probably how we'll end up with a HAL9000 burning the world to the ground.

  • polotics 8 hours ago

    Pretty please ask Claude to start with benchmarks that measure your approach against other approaches. I read it all and only found:

    Selected numbers from live system runs:

    Scenario                 | Naive shell approach | Hollow API         | Savings
    Code search              | 21,636 tokens        | 987 tokens         | 95%
    Agent drift (cons. rate) | 35% (cold start)     | 70% (with handoff) | 2x

    That is a lot less than enough to justify a git clone.

    • ninjahawk1 4 minutes ago

      Still very much a work in progress; I don't have those numbers yet, but I'll be getting them soon. It's built on agentOS, which is meant to let the agents themselves add improvements that cut costs even further. If someone finds a new way to cut costs, the idea is that these agents find it on GitHub (or are told about it), implement it, and start using it. That's the self-modification loop part of it. So specific numbers are difficult to pin down at the moment. Cutting costs is definitely important, but I wouldn't say that's what I'm trying to accomplish with this.

      The eventual goal is a self-modifying system that humans don't have to touch. Like ants building an anthill, no single agent needs the whole picture; each just needs to know its immediate job. Being able to throw a project at it and save on tokens is more of a consumer bonus: a big bonus, but a bonus nonetheless.

      I've been making steady improvements, and I'm hoping that by the end of the summer it's much more robust than it is now.