2 comments

  • D2OQZG8l5BI1S06 2 hours ago

    The post is AI-written, so I did not read it. But based on the title and abstract, I'll have to disagree.

    The native content LLMs understand is text. They were literally trained on it. They much prefer it to any arbitrary structure you could come up with.

    We're used to thinking that computers prefer content that is structured, binary, etc.; but with LLMs that changed.

    • tardedmeme 17 minutes ago

      Their native content is semantic vectors. They had to be trained for a long time to convert between text and semantic vectors, and the conversion is very lossy. The seahorse emoji demonstrates this nicely: the LLM internally holds a semantic vector for seahorse+emoji, but the output translation layer can't match it to a token.
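      A toy sketch of that lossiness (all vectors and names here are invented, not from any real model): the output layer maps a hidden semantic vector to the nearest token embedding, so an internal concept that blends two tokens can only decode to one of them.

      ```python
      import numpy as np

      # Toy data: a tiny vocabulary with random unit-length embeddings.
      rng = np.random.default_rng(0)
      vocab = ["seahorse", "emoji", "horse", "fish", "cat"]
      E = rng.normal(size=(len(vocab), 8))           # token embedding matrix
      E /= np.linalg.norm(E, axis=1, keepdims=True)  # unit-normalize rows

      # Internal concept: a blend of the "seahorse" and "emoji" directions,
      # weighted toward "seahorse". No single token matches this vector.
      concept = 0.6 * E[0] + 0.4 * E[1]
      concept /= np.linalg.norm(concept)

      scores = E @ concept                     # similarity to each token
      decoded = vocab[int(np.argmax(scores))]  # greedy "translation" to text
      print(decoded)                           # the "emoji" component is lost
      ```

      Greedy nearest-token decoding here drops the weaker half of the blend entirely, which is the rough shape of the failure: the vector space holds the concept, but the discrete text layer can't express it.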