Llamafile Returns

(blog.mozilla.ai)

134 points | by aittalam 7 days ago

23 comments

jart 6 days ago

Really exciting to see Mozilla AI starting up and I can't wait to see where the next generation takes the project!

- bsenftner 5 days ago
  
  People are so uninformed, they don't know you are Mozilla AI's star employee.
  
  - setheron 4 days ago
    
    I agree that the work they put out is A+. Whenever I see content produced by jart, it's always amazing.
    
    - dingnuts 4 days ago
      
      sorry, I'm out of the loop. is this thread glazing a celebrity member commenting on an announcement from his own team to create hype?
      what the fuck is wrong with this website
  - rvz 5 days ago
    
    s/are/was
    I don't know if you were informed, but you realize that jart is no longer at Mozilla and is now at Google?
    
    - setheron 4 days ago
      
      jart, you are back at Google?
      
      - jart 4 days ago
        
        Yeah Google liked llamafile so much that they asked me to help them improve the LLM on their website too.
benatkin 4 days ago

> As the local and open LLM ecosystem has evolved over the years, time has come for llamafile to evolve too. It needs refactoring and upgrades to incorporate newer features available in llama.cpp and develop a refined understanding of the most valuable features for its users.
It seems people have moved on from Llamafile. I doubt Mozilla AI is going to bring it back.
This announcement didn't even come with a new code commit, just a wish. https://github.com/mozilla-ai/llamafile/commits/main/
swyx 5 days ago

justine tunney gave a great intro to Llamafile at AIE last year if it helps anyone: https://www.youtube.com/watch?v=-mRi-B3t6fA

thangalin 4 days ago

Tips:

    # Avoid issues when wine is installed.
    sudo su -c 'echo 0 > /proc/sys/fs/binfmt_misc/status'

And:

    # Capture the entirety of the instructions to obtain the input length.
    # NOTE: "join" is presumably a helper defined elsewhere in the script
    # (coreutils join(1) accepts only two files).
    readonly INSTRUCT=$(
      join "${PATH_PREFIX_SYSTEM}" "${PATH_PROMPT_SYSTEM}" "${PATH_PREFIX_SYSTEM}"
      join "${PATH_SUFFIX_USER}" "${PATH_PROMPT_USER}" "${PATH_SUFFIX_USER}"
      join "${PATH_SUFFIX_ASSIST}" "/dev/null" "${PATH_SUFFIX_ASSIST}"
    )

    # Quote the expansion so whitespace and newlines in the prompt survive.
    echo "${INSTRUCT}" | ./llamafile \
      -m "${LINK_MODEL}" \
      -e \
      -f /dev/stdin \
      -n 1000 \
      -c "${#INSTRUCT}" \
      --repeat-penalty 1.0 \
      --temp 1.5 \
      --silent-prompt > output.txt

chrismorgan 4 days ago

> # Avoid issues when wine is installed.
> sudo su -c 'echo 0 > /proc/sys/fs/binfmt_misc/status'
Please don’t recommend this. If binfmt_misc is enabled, it’s probably for a reason, and disabling it will break things. I have a .NET/Mono app installed that it would break, for example—it’s definitely not just Wine.
If binfmt_misc is causing problems, the proper solution is to register the executable type. https://github.com/mozilla-ai/llamafile#linux describes steps.
I made myself a package containing /usr/bin/ape and the following /usr/lib/binfmt.d/ape.conf:
```
  :APE:M::MZqFpD::/usr/bin/ape:
  :APE-jart:M::jartsr::/usr/bin/ape:
```
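For anyone wanting to check whether a particular file would be picked up by those two rules, here is a minimal sketch. It assumes the rules match the six-byte magic strings `MZqFpD` and `jartsr` at offset 0, as in the conf above; `is_ape` is a hypothetical helper name, not part of llamafile or Cosmopolitan.

```shell
# is_ape FILE: succeed if FILE starts with one of the two magic strings
# used in the binfmt rules above (assumed to sit at offset 0).
is_ape() {
  # Read the first six bytes; fail if the file cannot be read.
  magic=$(head -c 6 "$1" 2>/dev/null) || return 1
  case "$magic" in
    MZqFpD|jartsr) return 0 ;;
    *) return 1 ;;
  esac
}
```

Usage would be something like `is_ape ./llamafile && echo "matches the APE rules"`.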

michaelgiba 4 days ago

I’m glad to see llamafile being resurrected. A few things I hope for:
1. Curate a continuously extended inventory of prebuilt llamafiles for models as they are released
2. Create both flexible builds (with dynamic backend loading for CPU and CUDA) and slim minimalist builds
3. Upstream as much as they can into llama.cpp and partner with the project

- michaelgiba 4 days ago
  
  Crazier ideas would be: extending the concept to also have some sort of “agent mode”, where the llamafiles can launch with their own minimal file system or isolated context; and detailed profiling of the main supported models to ensure deterministic outputs.
  
  - njbrake 4 days ago
    
    Love the idea!
dolmen 4 days ago

Cosmocc and Cosmopolitan are remarkable technical achievements, and llamafile made me discover them.
The llamafile UX (CLI and web server with chat to quickly interact with the model) is great and makes it easy to download and play with a local LLM.
However, I fail to see use cases where I would build a solution on a llamafile. If I want to play with multiple models, I don't need to have the binary attached to the model data. If I want to play with a model on multiple operating systems, I'm fine downloading the llamafile tool binary for the platform separately from the model data (in fact, on Windows one has to download llamafile.exe separately anyway because of an OS limit on executable file size).
So Cosmopolitan is great tech, the llamafile command (the "UX for a model" part) is great, but I'm not convinced by the value of Cosmopolitan applied here.
romperstomper 4 days ago

While this is very cool and llamafiles are quite universal, there is still a caveat for Windows systems: a Windows executable is limited to 4 GB. Since LLM models tend to be quite large, this limit is reached pretty fast. In such cases llamafile.exe is required (which is also universal and runs everywhere). In the end, it could just as well be the llama.cpp tools, which are released for all platforms, plus the LLM model file itself.
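A rough way to decide up front which route a given model needs is to compare its size against that limit. The sketch below assumes the limit is 4 GiB (2^32 bytes) and that a combined llamafile is roughly the size of the weights; `exceeds_exe_limit` and its optional second argument are hypothetical names for illustration.

```shell
# Assumed limit for a Windows executable: 4 GiB.
WINDOWS_EXE_LIMIT=$((4 * 1024 * 1024 * 1024))

# exceeds_exe_limit FILE [LIMIT_BYTES]
# Returns 0 if FILE is larger than the limit, 1 if it fits,
# and 2 if FILE is not a regular file.
exceeds_exe_limit() {
  [ -f "$1" ] || return 2
  limit=${2:-$WINDOWS_EXE_LIMIT}
  # Arithmetic expansion strips the padding some wc implementations emit.
  size=$(( $(wc -c < "$1") ))
  [ "$size" -gt "$limit" ]
}
```

If it returns 0 for the weights file, the model would have to ship separately and be passed to llamafile.exe with `-m`; otherwise a single combined file should fit.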
synergy20 5 days ago

how is this different from ollama? for me the more/open the merrier.

- ricardobeat 4 days ago
  
  Ollama is a model manager and pretty interface for llama.cpp; llamafile is a cross-platform packaging tool to distribute and run individual models, also based on llama.cpp.
apitman 5 days ago

This is great news. Given the proliferation of solid local models, it would be cool if llamafile had a way to build your own custom versions with the model of your choice.
FragenAntworten 4 days ago

The Discord link is broken, in that it links to the server directly rather than to an invitation to join the server, which prevents new members from joining.

- njbrake 4 days ago
  
  Fixed, thank you!
behindsight 5 days ago

great stuff, working on something around agentic tooling and hope to collab with Mozilla AI as well in the future as they share the same values I have
throawayonthe 5 days ago

go get that investor money i guess?