Commit

feat: enable flash attention if supported
sammcj committed May 16, 2024
1 parent 6ff2dcc commit 5ab0d7b
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion llm/llama.cpp
Submodule llama.cpp updated 50 files
+77 −29 .github/workflows/build.yml
+18 −0 CMakeLists.txt
+45 −0 CMakePresets.json
+4 −1 README.md
+16 −0 cmake/arm64-windows-llvm.cmake
+6 −0 cmake/arm64-windows-msvc.cmake
+10 −0 common/common.cpp
+1 −0 common/common.h
+1 −1 common/grammar-parser.cpp
+6 −6 common/json-schema-to-grammar.cpp
+5 −5 common/log.h
+3 −3 convert-hf-to-gguf-update.py
+31 −47 convert-hf-to-gguf.py
+155 −25 convert.py
+3 −0 examples/CMakeLists.txt
+1 −0 examples/embedding/embedding.cpp
+21 −6 examples/llava/llava-cli.cpp
+0 −15 examples/llava/llava.cpp
+59 −1 examples/perplexity/README.md
+3 −1 examples/quantize/README.md
+2 −0 examples/rpc/CMakeLists.txt
+74 −0 examples/rpc/README.md
+130 −0 examples/rpc/rpc-server.cpp
+1 −1 examples/server/README.md
+7 −0 examples/server/server.cpp
+5 −2 examples/server/tests/features/steps/steps.py
+1 −1 examples/server/utils.hpp
+0 −1 ggml-backend.c
+1 −1 ggml-cuda.cu
+33 −30 ggml-cuda/upscale.cu
+7 −0 ggml-impl.h
+48 −35 ggml-metal.m
+33 −41 ggml-metal.metal
+2,195 −27 ggml-quants.c
+1,023 −0 ggml-rpc.cpp
+24 −0 ggml-rpc.h
+5 −24 ggml-sycl.cpp
+306 −161 ggml.c
+16 −2 ggml.h
+1 −0 gguf-py/gguf/__init__.py
+11 −5 gguf-py/gguf/gguf_writer.py
+20 −9 gguf-py/gguf/lazy.py
+109 −0 gguf-py/gguf/quants.py
+230 −108 llama.cpp
+3 −0 llama.h
+4 −0 scripts/sync-ggml-am.sh
+1 −1 scripts/sync-ggml.last
+2 −0 scripts/sync-ggml.sh
+44 −13 tests/test-backend-ops.cpp
+46 −0 tests/test-grammar-integration.cpp