Releases · ollama/ollama
v0.0.15
📍 Ollama model list is now available
Ollama now supports a list of models published on ollama.ai/library. We are working on ways to allow anyone to push models to Ollama. Expect more news on this in the future.
Please join the community on Discord if you have any questions or concerns, or just want to hang out.
What's Changed
- Target remote Ollama hosts with `OLLAMA_HOST=<host> ollama run llama2`
- Fixed issue where `PARAMETER` values weren't correctly set in Modelfiles
- Fixed issue where a warning would show when parsing a Modelfile comment
- Ollama will now parse data from ggml format models and use it to make sure your system has enough memory to run a model with GPU support
- Experimental support for creating fine-tuned models via `ollama create` with Modelfiles: use the `ADAPTER` Modelfile instruction (see the sketch after this list)
- Added documentation for the `num_gqa` parameter
- Added tutorials and examples for using LangChain with Ollama
- Ollama will now log embedding eval timing
- Update llama.cpp to the latest version
- Add `context` to the API documentation for `/api/generate` (a usage sketch follows this list)
- Fixed issue with resuming downloads via `ollama pull`
- Using `EMBED` in Modelfiles will now skip regenerating embeddings if the input files have not changed
- Ollama will now use an already loaded model for `/api/embeddings` if it is available
- New example: `dockerit` – a tool to help you build and run your application in a Docker container
- Retry downloads on network errors
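A hedged sketch of the experimental `ADAPTER` instruction mentioned above; the adapter filename is a made-up example, and the adapter must have been trained against the base model named in `FROM`:

```
FROM llama2
# Path (relative to the Modelfile) to a LoRA adapter; filename is hypothetical
ADAPTER ./my-lora-adapter.bin
```

And a sketch of the `context` field on `/api/generate`, assuming the default `localhost:11434` address; the prompts are illustrative, and the `context` array shown is a placeholder for the value returned in the previous response:

```
# First request: the final streamed response includes a "context" array
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?"
}'

# Follow-up request: send that array back to continue the conversation
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Explain that more simply.",
  "context": [1, 2, 3]
}'
```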
New Contributors
- @asarturas made their first contribution in #326
- @gusanmaz made their first contribution in #340
- @bmizerany made their first contribution in #262
Full Changelog: v0.0.14...v0.0.15
v0.0.14
What's Changed
- Ollama 🦜️🔗 LangChain Integration! https://python.langchain.com/docs/integrations/llms/ollama
- API docs for Ollama: https://github.com/jmorganca/ollama/blob/main/docs/api.md
- Llama 2 70B model with Metal support (recommend at least 64GB of memory): `ollama run llama2:70b`
- Uncensored Llama 2 70B model with Metal support: `ollama run llama2-uncensored:70b`
- New models available! For a list of models you can directly pull from Ollama, please see https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f737a1fc5
- Embeddings can now be generated for a model with `/api/embeddings` (see the sketch after this list)
- Experimental `EMBED` instruction in the Modelfile
- Configurable rope frequency parameters
- `OLLAMA_HOST` can now specify the entire address to serve on with `ollama serve` (an example follows this list)
- Fixed issue where context was truncated incorrectly, leading to poor output
- `ollama pull` can now be run in different terminal windows for the same model concurrently
- Add an example on multiline input
- Fixed error not being checked on `ollama pull`
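A minimal sketch of the new embeddings endpoint and the expanded `OLLAMA_HOST` behavior, assuming the default port 11434; the prompt text and bind address are made-up examples:

```
# Generate an embedding; the response contains an "embedding" array of floats
curl http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is an article about llamas..."
}'

# Serve on an explicit address and port instead of the default
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```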
New Contributors
- @cmiller01 made their first contribution in #301
- @soroushj made their first contribution in #316
- @findmyway made their first contribution in #311
Full Changelog: v0.0.13...v0.0.14
v0.0.13
New improvements
- Using Ollama CLI without Ollama running will now start Ollama
- Changed the buffer limit so that a conversation continues until it is complete
- Models now stay loaded in memory automatically between messages, so a series of prompts is extra fast!
- The white fluffy Ollama icon is back when using dark mode
- Ollama will now run on Intel Macs. Compatibility & performance improvements to come
- When running `ollama run`, the `/show` command can be used to inspect the current model
- `ollama run` can now take in multi-line strings:

```
% ollama run llama2
>>> """
Is this a multi-line string?
"""
Thank you for asking! Yes, the input you provided is a multi-line string. It contains multiple lines of text separated by line breaks.
```

- More seamless updates: Ollama will now show a subtle hint that an update is ready in the tray menu, instead of a dialog window
- `ollama run --verbose` will now show load duration times
Bug fixes
- Fixed crashes on Macs with 8GB of shared memory
- Fixed issues in scanning multi-line strings in a `Modelfile`
v0.0.12
New improvements
- You can now rename models you've pulled or created with `ollama cp` (see the example after this list)
- Added support for running k-quant models
- Performance improvements from enabling Accelerate
- Ollama's API can now be accessed by websites hosted on `localhost`
- `ollama create` will now automatically pull models in the `FROM` instruction that you don't have locally
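For instance, copying a pulled model to a new name with `ollama cp` (the destination name here is hypothetical):

```
# Copy the local llama2 model to a new name you can modify independently
ollama cp llama2 my-llama2
```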
Bug fixes
- `ollama pull` will now show a better error when pulling a model that doesn't exist
- Fixed an issue where cancelling and resuming downloads with `ollama pull` would cause an error
- Fixed formatting of different errors so they are readable when running `ollama` commands
- Fixed an issue where prompt templates defined with the `TEMPLATE` instruction wouldn't be parsed correctly (a sketch follows this list)
- Fixed error when a model isn't found
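A hedged sketch of a `TEMPLATE` definition in a Modelfile, using the Go-template variables `{{ .System }}` and `{{ .Prompt }}`; the system prompt and template layout here are illustrative, not Ollama's built-in llama2 template:

```
FROM llama2
# Made-up system prompt for illustration
SYSTEM You are a concise assistant.
# Prompt template; {{ .System }} and {{ .Prompt }} are filled in per request
TEMPLATE """[INST] <<SYS>>{{ .System }}<</SYS>> {{ .Prompt }} [/INST]"""
```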
v0.0.11
- `ollama list`: stay organized: see which models you have and their size

```
% ollama list
NAME            SIZE    MODIFIED
llama2:13b      7.3 GB  28 hours ago
llama2:latest   3.8 GB  4 hours ago
orca:latest     1.9 GB  35 minutes ago
vicuna:latest   3.8 GB  35 minutes ago
```

- `ollama rm`: have a model you don't want anymore? Delete it with `ollama rm`
- `ollama pull` will now check the integrity of the model you've downloaded against its checksum
- Errors will now correctly print, instead of showing another error
- Performance updates: run models faster!
v0.0.10
v0.0.9
v0.0.8
v0.0.7
- Performance improvements with `ollama create`: it now uses less memory and will create custom models in less time
- Fixed an issue where running `ollama create name -f` required an absolute file path to the model file; relative paths are now supported (see the example after this list)
- Fixed an issue where running `ollama pull` for a model that is already downloaded would show `0B`
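For example, creating a model from a Modelfile by relative path now works (the model name and path are illustrative):

```
# Build a custom model from a Modelfile in the current directory
ollama create my-model -f ./Modelfile
```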