"Tests are how you turn an unreliable agent into a reliable system" <— 100%.
Once you're running 5+ agents in parallel, review-as-trust stops being physically possible. There isn't time, and there are too many concurrent diffs. The trust mechanism has to move into the gates: precise specs, deterministic test suites, exit codes as ground truth.
It's roughly the same shift teams went through when they moved from "senior dev reads every PR" to "CI is the source of truth." Mechanical, unromantic, and the only thing that truly scales.
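The "exit codes as ground truth" idea above can be sketched as a tiny merge gate: run each check as a subprocess and only allow the merge if every one exits 0. The commands here are placeholders, not a real suite; substitute your own lint/test/type-check invocations.

```python
import subprocess

# Placeholder checks; each stands in for a real command like
# ["ruff", "check", "."] or ["pytest", "-q"].
CHECKS = [
    ["python", "-c", "print('lint ok')"],   # stand-in for a linter
    ["python", "-c", "print('tests ok')"],  # stand-in for a test suite
]

def gate(checks=CHECKS) -> bool:
    """Return True only if every check exits 0 (exit code is the truth)."""
    for cmd in checks:
        if subprocess.run(cmd).returncode != 0:
            return False  # any red check blocks the merge
    return True

if __name__ == "__main__":
    print("merge allowed" if gate() else "merge blocked")
```

The point isn't the script itself but the contract: the agent's work lands or doesn't based on deterministic exit codes, not on a human reading the diff.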
On the skill atrophy point, I think the analysis is directionally right. The senior engineering skill that agents reward most won't be review. It's having the system design fundamentals and writing the quality gates that ensure functional needs are met and non-functional fundamentals stay intact.
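A quality gate covering both kinds of requirement can be as small as two tests: one asserting behavior, one asserting a budget. This is a hedged sketch; `normalize` is a hypothetical function standing in for agent-written code, and the latency budget is invented for illustration.

```python
import time

def normalize(name: str) -> str:
    """Hypothetical agent-written code: collapse whitespace, title-case."""
    return " ".join(name.split()).title()

def test_functional():
    # Functional need: correct output for messy input.
    assert normalize("  ada   lovelace ") == "Ada Lovelace"

def test_non_functional_latency():
    # Non-functional need: 10k calls must stay under a one-second budget.
    start = time.perf_counter()
    for _ in range(10_000):
        normalize("ada lovelace")
    assert time.perf_counter() - start < 1.0
```

If the agent "fixes" the function in a way that blows the budget or breaks the contract, the gate goes red without anyone reading the diff.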
Why do so many people talk as if we have to increase the rate of code production without limit? In “It’s the only thing that scales”, you assume that scaling is good. Clearly LLM code generation has sped up a lot of things and that’s very useful, but there has to be a limit to this, otherwise humans gradually lose our claim to be doing anything like “engineering”. But I know, capital’s hunger can’t be sated, we’re all in service to the feedback loop.
If you're not reviewing, you're vibe coding imo.
You can trust CI without doing code reviews when whoever is shipping the PR is trustworthy.
You don't trust a junior's PR just because CI is green. In my experience, AI is not 100% trustworthy yet.
Before AI, copy-pasting from the Internet was seen as a noob thing to do.
Immediately after ChatGPT, copy-pasting from ChatGPT was seen with the exact same sentiment.
Maybe the code is more tailored towards you, but it's average, context-less, thoughtless.
With agents reading the codebase and writing code in what appears to be a context that approximates a developer's, we've recently crossed the threshold where a serious software developer can say they don't write their own code any more and not get chastised.
When you claim reviewing is the golden threshold that must not be crossed: Sure, but we very recently started accepting that code can be written entirely by agents. I don't yet automate code review, but I've been reading about some code review agents called Hickey and Löwy: https://kolu.dev/blog/hickey-lowy/#what-the-two-lenses-are --
> A Hickey reviewer reads code the way a lockpicker reads a tumbler — looking for concepts that shouldn’t be in the same position. The output is always “split these apart.”
> A Löwy reviewer reads code the way an actuary reads a portfolio — looking for things coupled to unrelated schedules. The output is always “draw a boundary that encapsulates this volatility.”
So the art of reviewing can, perhaps, be synthesized. Perhaps not for complete human replacement.
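At least the coupling-hunting part of that second lens is mechanical enough to sketch: files that keep changing in the same commits are riding the same schedule of change, so a reviewer-agent might flag the pair for a boundary. The commit history below is invented for the example.

```python
from collections import Counter
from itertools import combinations

# Toy co-change history: each set is the files touched by one commit.
commits = [
    {"billing.py", "invoice.py"},
    {"billing.py", "invoice.py", "ui.py"},
    {"ui.py"},
    {"billing.py", "invoice.py"},
]

def co_change_pairs(history, threshold=3):
    """Return file pairs that changed together at least `threshold` times."""
    pairs = Counter()
    for files in history:
        for pair in combinations(sorted(files), 2):
            pairs[pair] += 1
    return [pair for pair, n in pairs.items() if n >= threshold]

print(co_change_pairs(commits))  # → [('billing.py', 'invoice.py')]
```

This is a crude heuristic, not the blog post's method, but it shows why people suspect a chunk of review judgment can be extracted into deterministic tooling.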
I believe that a lot of wetware is still highly mechanical and can be extracted as agentic workflows.
The goal is to move the human towards "software architect" and "product owner".
The real danger is cutting off the bridge that qualifies current seniors to be hands-off.
I don't see the skill atrophy of being an AI manager paying back what took me 20 years of pain to qualify.
"Agentic Engineering" (not vibe-coding, or vibe-engineering) sounds like a far better fit for what the future is going to be (or already is).
You are still responsible for the code that the agent is producing. You still need to understand the outputs and which tests make sense to create; otherwise you are creating tests for the sake of testing, which is useless.
The best comparison of all of this is autopilot and advanced driver-assistance systems (ADAS) in cars. You still need to watch the road at all times and keep your hands on the wheel just in case you need to intervene.
The "vibe-coding" equivalent in cars means you are not looking at the road, nor do you have your hands on the wheel: you are letting the car drive itself with no intervention and no teleoperation. Great for short journeys but problematic for longer ones, and absolutely unacceptable when the AI goes down or is offline and you are stuck in the middle of the highway.
So "Agentic Engineering" is the much better framing, without giving up the responsibility.
I could not read the whole piece. The phrasing smells too much like AI.
Indeed.
> As one engineer put it, “This isn’t engineering, it’s hoping.”
As one LLM put it, rather.
Exactly.
"Failure mode" ... bye bye