If you don't check the code, you can't navigate the UI correctly. The AI is non-deterministic, and you can never be sure that it will produce the same quality and use the same approach every time. Even with all the rules, lessons learned, and documentation, a code review afterwards will find something. With that in mind, in order to write an assembler program, you need someone to teach the model how to do it. And in order to teach it, you need to be able to read what's generated ;)
The premise assumes nobody reads the code, but that's not true. I review every AI-generated diff, and the model gets things wrong constantly: subtle stuff like changing a function signature in a way that breaks another module, or creating circular dependencies. If that were assembler, I'd have zero chance of catching it.
Also, you still need to maintain it. When something breaks in production, you need to understand what the code does.
The real bottleneck with AI coding isn't the language, it's context. The model needs to understand your conventions, patterns, and business logic. That gets much worse with lower-level languages, not better.
That would be quite expensive in terms of time and token use. It would need to be tested, and you'd have so many repetitive tests that you might as well encode the behavior they expect in generators of blocks of assembly, i.e. higher-level languages and compilers.
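To make the "generators of blocks of assembly" point concrete, here is a minimal sketch in Python. It assumes a toy stack-machine style and x86-64-like mnemonics; the names `emit` and `compile_expr` are made up for illustration, and the instruction selection is deliberately naive. Once a generator like this is tested, every expression it handles gets the same known-good block, which is essentially what a compiler provides.

```python
import ast

# Hypothetical mini "compiler": arithmetic expression -> x86-64-style text.
# Illustrative only: no register allocation, no optimization.
_OPS = {ast.Add: "add", ast.Sub: "sub", ast.Mult: "imul"}

def emit(node, out):
    """Append assembly lines that leave the value of `node` on the stack."""
    if isinstance(node, ast.Constant) and type(node.value) is int:
        out.append(f"    push {node.value}")
    elif isinstance(node, ast.BinOp) and type(node.op) in _OPS:
        emit(node.left, out)    # left operand ends up deeper in the stack
        emit(node.right, out)   # right operand ends up on top
        out.append("    pop rbx")   # rbx = right
        out.append("    pop rax")   # rax = left
        out.append(f"    {_OPS[type(node.op)]} rax, rbx")
        out.append("    push rax")
    else:
        raise ValueError(f"unsupported expression: {ast.dump(node)}")

def compile_expr(source):
    """Return assembly text that computes `source` and leaves it in rax."""
    tree = ast.parse(source, mode="eval")
    out = []
    emit(tree.body, out)
    out.append("    pop rax")
    return "\n".join(out)

if __name__ == "__main__":
    print(compile_expr("2 + 3 * 4"))
```

The one-line input `2 + 3 * 4` expands to twelve lines of pushes and pops, which gives a rough feel for how many more tokens a model would have to produce and a reviewer would have to read.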
Two reasons.
First reason: LLMs are modeled on what humans have been doing, and humans have been writing software that way recently, so it's easier to mimic that to get straight to results. This reason might fade away in the future.
Second reason: something related to impedance (mis)match, a signal-processing notion (when the interface between two media is poorly matched, it is difficult for a signal to pass through).
Going through intermediate levels creates a structured workflow where each step follows the previous one "cheaply". By contrast, directly generating something many layers away requires juggling all the levels at once, which is more costly. So "cheaply" above means both "better use of an LLM's context" and using regular tools where they are good, instead of paying the high price (hardware + computation + environment) of doing it via an LLM.
Interestingly, AIs are used to generate sample-level audio and some video, which may look like it contradicts the point. Still, they are costly (especially video).
Dave Plummer claims to have successfully generated working executable PE binaries using ChatGPT.
https://x.com/davepl1968/status/2044482592620351955
Layers of abstraction remain effective and valuable. Why reinvent state management, for example, with each application?
Runtime also matters; you can’t run assembly on the web.
Security mechanisms can also preclude assembly.
Etc.
FWIW, your question stopped short of the bottom turtle in the stack. Below assembly is machine code. So your question could rather be: why not emit machine code? Assembly is made for humans because we can understand it, but machine code is not really tractable for humans to engage with in a meaningful way.
Programming languages are not just for ergonomics. They are valuable abstractions that help us reason. And they also help LLMs reason in the same manner.
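As a rough illustration of the assembly vs. machine code gap mentioned above, here is a small Python sketch that translates two x86 instructions into their raw bytes. The `assemble` helper is hypothetical and supports only `mov eax, imm32` (opcode B8 followed by a little-endian 32-bit immediate) and `ret` (opcode C3); a real assembler handles vastly more.

```python
import struct

def assemble(lines):
    """Translate a tiny, hypothetical subset of x86 assembly to bytes."""
    code = bytearray()
    for line in lines:
        parts = line.replace(",", " ").split()
        if parts == ["ret"]:
            code += b"\xC3"                      # ret
        elif len(parts) == 3 and parts[:2] == ["mov", "eax"]:
            imm = int(parts[2], 0) & 0xFFFFFFFF  # accept decimal or 0x... input
            code += b"\xB8" + struct.pack("<I", imm)  # mov eax, imm32
        else:
            raise ValueError(f"unsupported instruction: {line!r}")
    return bytes(code)

program = ["mov eax, 42", "ret"]
print(assemble(program).hex(" "))  # b8 2a 00 00 00 c3
```

The text form `mov eax, 42` is something a reviewer can read at a glance; `b8 2a 00 00 00 c3` is the same program, but few people could review a diff written that way.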
AI doesn't actually know anything, it just predicts, and most of its training data is in high-level languages.
Because there's not enough learning material? Most of the code LLMs have stolen for training is high-level code, not assembly.
AIs are not smart enough for that; it's not real AI, either.
I wish I could test that and mass-port from C++ to plain and simple C.
Is there any 'public' (rate-limited) web API, usable with curl, from current AI inference services?