Davide Eynard 7fca8b2a27
Updated to v0.10.3 (#991)
2026-06-02 19:03:37 +01:00
.github Docs Updates (#949) 2026-04-30 12:01:13 +01:00
.llamafile_plugin Update llama.cpp submodule to dbe9c0c (+ embed real web UI) (#983) 2026-05-29 00:36:43 +01:00
build llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
diffusionfile Fix uncaught SIGSEGV when GPU init fails, restore CPU fallback (#988) (#989) 2026-06-02 13:16:31 +01:00
docs Fix uncaught SIGSEGV when GPU init fails, restore CPU fallback (#988) (#989) 2026-06-02 13:16:31 +01:00
llama.cpp@dbe9c0c8ce Update llama.cpp submodule to dbe9c0c (+ embed real web UI) (#983) 2026-05-29 00:36:43 +01:00
llama.cpp.patches Fix uncaught SIGSEGV when GPU init fails, restore CPU fallback (#988) (#989) 2026-06-02 13:16:31 +01:00
llamafile Updated to v0.10.3 (#991) 2026-06-02 19:03:37 +01:00
localscore disable for ascii 0 2025-04-09 16:11:42 -07:00
models github: add ci (#454) 2024-05-29 00:24:34 -07:00
scripts Migrate docs from MkDocs/GitHub Pages to GitBook (#946) 2026-04-16 10:32:30 -05:00
stable-diffusion.cpp@baf7eda1e4 Modernise Diffusionfile Support (#970) 2026-05-28 12:12:53 +01:00
stable-diffusion.cpp.patches Modernise Diffusionfile Support (#970) 2026-05-28 12:12:53 +01:00
tests Fix uncaught SIGSEGV when GPU init fails, restore CPU fallback (#988) (#989) 2026-06-02 13:16:31 +01:00
third_party llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
tools llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
whisper.cpp@2eeeba56e9 llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
whisper.cpp.patches llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
whisperfile Fix uncaught SIGSEGV when GPU init fails, restore CPU fallback (#988) (#989) 2026-06-02 13:16:31 +01:00
.gitbook-branch-readme.md Migrate docs from MkDocs/GitHub Pages to GitBook (#946) 2026-04-16 10:32:30 -05:00
.gitbook.yaml docs: rename example_llamfiles to pre-built-llamafiles for better seo (#972) 2026-05-20 17:26:44 +01:00
.gitignore Migrate docs from MkDocs/GitHub Pages to GitBook (#946) 2026-04-16 10:32:30 -05:00
.gitmodules llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
CONTRIBUTING.md Docs Updates (#949) 2026-04-30 12:01:13 +01:00
cosmocc-override.cmake llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00
LICENSE Add known issue for Windows 2023-11-19 17:57:43 -08:00
Makefile Update release scripts (#990) 2026-06-02 13:04:44 +01:00
README.md docs: rename example_llamfiles to pre-built-llamafiles for better seo (#972) 2026-05-20 17:26:44 +01:00
README_0.10.0.md docs: rename example_llamfiles to pre-built-llamafiles for better seo (#972) 2026-05-20 17:26:44 +01:00
RELEASE.md llamafile reloaded (v0.10.0) (#867) 2026-03-19 11:13:53 +00:00

llamafile

[line drawing of llama animal head in front of slightly open manilla folder filled with files]


llamafile lets you distribute and run LLMs with a single file.

llamafile is a Mozilla Builders project (see its announcement blog post), now revamped by Mozilla.ai.

Our goal is to make open LLMs much more accessible to both developers and end users. We're doing that by combining llama.cpp with Cosmopolitan Libc into one framework that collapses all the complexity of LLMs down to a single-file executable (called a "llamafile") that runs locally on most operating systems and CPU architectures, with no installation.

llamafile also includes whisperfile, a single-file speech-to-text tool built on whisper.cpp and the same Cosmopolitan packaging. It supports transcription and translation of audio files across all the same platforms, with no installation required.

v0.10.*

Starting with 0.10.0, llamafile uses a new build system designed to keep our code aligned with the latest versions of llama.cpp. As a result, these versions support more recent models and features, but they might be missing some of the features you were accustomed to (check out this doc for a high-level description of what has been done). If you prefer the "classic experience", you can always get the previous versions from our releases page. Our pre-built llamafiles always show which version of the server they are bundled with (0.9.* example, 0.10.* example), so you always know which version of the software you are downloading.

We want to hear from you! Whether you are a new user or a long-time fan, please share what you find most valuable about llamafile and what would make it more useful for you. Read more via the blog and add your voice to the discussion here.

Quick Start

Download and run your first llamafile in minutes:

# Download an example model (Qwen3.5 0.8B)
curl -LO https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-0.8B-Q8_0.llamafile

# Make it executable (macOS/Linux/BSD)
chmod +x Qwen3.5-0.8B-Q8_0.llamafile

# Run it
./Qwen3.5-0.8B-Q8_0.llamafile

We chose this model because it is the smallest one we have built a llamafile for, so it is the most likely to work out of the box for you. If you have powerful hardware and/or GPUs, feel free to choose larger, more capable models, which should provide more accurate responses.
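
If you run the Quick Start steps more than once, you may want to avoid downloading the model again. The following is a minimal sketch under stated assumptions, not part of llamafile itself: `fetch_llamafile` is a hypothetical helper name, and it assumes the last path component of the URL is the file name you want locally. It only uses the same `curl -LO` and `chmod +x` commands shown above.

```shell
#!/bin/sh
# Sketch only: fetch_llamafile is a hypothetical helper, not shipped with llamafile.
# It repeats the Quick Start steps but skips the download when the file
# already exists in the current directory.
fetch_llamafile() {
  url=$1
  file=${url##*/}              # last path component of the URL
  if [ ! -f "$file" ]; then
    curl -LO "$url" || return 1
  fi
  chmod +x "$file" || return 1  # same step as in the Quick Start
  printf '%s\n' "$file"         # print the local file name for the caller
}
```

A possible way to use it from the same shell session: `model=$(fetch_llamafile https://huggingface.co/mozilla-ai/llamafile_0.10/resolve/main/Qwen3.5-0.8B-Q8_0.llamafile) && "./$model"`.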

Windows users: Rename the file to add the .exe extension before running it.

Note: Windows can only run executables under 4GB, so any llamafile above 4GB won't work there. In that case, download the llamafile binary and run it with external weights in GGUF format.
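
To find out in advance whether a downloaded llamafile falls under the Windows size limit, you can compare its size against the threshold. This is a rough sketch, not part of llamafile: `windows_can_run` is a hypothetical helper name, and it assumes "4GB" means 4 GiB (4294967296 bytes), which the note above does not spell out.

```shell
#!/bin/sh
# Sketch only: windows_can_run is a hypothetical helper, not shipped with llamafile.
# Assumption: the 4GB limit is read as 4 GiB = 4294967296 bytes.
# Returns 0 if the file is under the limit, 1 if it is not, 2 if it is missing.
windows_can_run() {
  [ -f "$1" ] || return 2
  limit=4294967296
  size=$(wc -c < "$1" | tr -d ' ')   # byte count, padding spaces stripped
  [ "$size" -lt "$limit" ]
}
```

For example, `windows_can_run Qwen3.5-0.8B-Q8_0.llamafile && echo "rename to .exe and run"` prints the hint only when the file is small enough. When the check fails, use the llamafile binary with external GGUF weights as described in the note; see the documentation for the exact invocation.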

Documentation

Check the full documentation at docs.mozilla.ai/llamafile.

Licensing

While the llamafile project is Apache 2.0-licensed, our changes to llama.cpp and whisper.cpp are licensed under MIT (just like the projects themselves) so as to remain compatible and upstreamable in the future, should that be desired.

The llamafile logo on this page was generated with the assistance of DALL·E 3.
