Pull requests: ggerganov/llama.cpp
update HIP_UMA #7399
enhancement
New feature or request
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
performance
Speed related topics
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7414
opened May 20, 2024 by
Djip007
perplexity: update README FP16 results [no ci]
documentation
Improvements or additions to documentation
examples
#7413
opened May 20, 2024 by
JohannesGaessler
CUDA: quantized KV cache demo
demo
Demonstrate some concept or idea, not intended to be merged
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
research 🔬
review complexity : high
Generally require indepth knowledge of LLMs or GPUs
#7412
opened May 20, 2024 by
JohannesGaessler
Draft
rpc : track allocated buffers
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
#7411
opened May 20, 2024 by
rgerganov
llama : remove Persimmon
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7408
opened May 20, 2024 by
ggerganov
Add Smaug 70B support to conversion
enhancement
New feature or request
model
Model specific
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7402
opened May 20, 2024 by
bartowski1182
CUDA: deduplicate mmq code
ggml
changes relating to the ggml tensor library for machine learning
Nvidia GPU
Issues specific to Nvidia GPUs
refactoring
Refactoring
review complexity : high
Generally require indepth knowledge of LLMs or GPUs
#7397
opened May 19, 2024 by
JohannesGaessler
gguf : embed files to gguf model file
help wanted
Extra attention is needed
python
python script changes
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7392
opened May 19, 2024 by
katsu560
Add alpaca chat template
enhancement
New feature or request
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
testing
Everything test related
#7383
opened May 19, 2024 by
jukofyork
Automate vocab support and model conversion
enhancement
New feature or request
python
python script changes
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7379
opened May 19, 2024 by
teleprint-me
Draft
7 tasks
Tokenizer SPM fixes for phi-3 and llama-spm
examples
python
python script changes
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
testing
Everything test related
#7375
opened May 18, 2024 by
jaime-m-p
Add minimal python client example for the server, streaming callback
examples
python
python script changes
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
#7373
opened May 18, 2024 by
chrismrutherford
grammars: early exit when no next_candidates to reject
performance
Speed related topics
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7370
opened May 18, 2024 by
ochafik
OpenELM support
model
Model specific
python
python script changes
review complexity : high
Generally require indepth knowledge of LLMs or GPUs
#7359
opened May 18, 2024 by
icecream95
Draft
examples: cache hf model when --model not provided
enhancement
New feature or request
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7353
opened May 17, 2024 by
amirzia
SimpleChat: a simple and dumb web front end for testing /chat/completions and /completions end points and try chat
enhancement
New feature or request
examples
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
server
testing
Everything test related
#7350
opened May 17, 2024 by
hanishkvc
Another threadpool: Avoid creating hundreds of threads in GGML
build
Compilation issues
performance
Speed related topics
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7342
opened May 17, 2024 by
besnardjb
add Viking tokenizer support
model
Model specific
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7329
opened May 16, 2024 by
jonabur
Viking-7B tokenizer support
model
Model specific
python
python script changes
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
Fixed painfully slow single process builds.
build
Compilation issues
need feedback
Testing and feedback with results are needed
performance
Speed related topics
#7326
opened May 16, 2024 by
jboero
sched : support async weight copy
performance
Speed related topics
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
Add phi-2 tokenizer
model
Model specific
review complexity : medium
Generally require more time to grok but manageable by beginner to medium expertise level
#7300
opened May 15, 2024 by
BramVanroy
avoid to get prompt in infill mode and embedding mode
examples
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
server
common: free ctx_gguf when exiting llama_control_vector_load_one
bugfix
fixes an issue or bug
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7285
opened May 14, 2024 by
stevegrubb
common, ngram_cache: added const reference for std::pair<> and std::tuple<> more 16 bytes:
refactoring
Refactoring
review complexity : low
Trivial changes to code that most beginner devs (or those who want a break) can tackle. e.g. UI fix
#7270
opened May 14, 2024 by
GermanAizek
Draft