Pull requests: ggerganov/llama.cpp

Pull requests list

update HIP_UMA #7399
Labels: enhancement, ggml, Nvidia GPU, performance, review complexity : medium
#7414 opened May 20, 2024 by Djip007
perplexity: update README FP16 results [no ci]
Labels: documentation, examples
#7413 opened May 20, 2024 by JohannesGaessler
CUDA: quantized KV cache demo
Labels: demo, ggml, Nvidia GPU, research 🔬, review complexity : high
#7412 opened May 20, 2024 by JohannesGaessler Draft
rpc : track allocated buffers
Labels: review complexity : medium, server
#7411 opened May 20, 2024 by rgerganov
llama : remove Persimmon
Labels: python, review complexity : low
#7408 opened May 20, 2024 by ggerganov
Add Smaug 70B support to conversion
Labels: enhancement, model, python, review complexity : low
#7402 opened May 20, 2024 by bartowski1182
CUDA: deduplicate mmq code
Labels: ggml, Nvidia GPU, refactoring, review complexity : high
#7397 opened May 19, 2024 by JohannesGaessler
gguf : embed files to gguf model file
Labels: help wanted, python, review complexity : medium
#7392 opened May 19, 2024 by katsu560
Add alpaca chat template
Labels: enhancement, review complexity : low, testing
#7383 opened May 19, 2024 by jukofyork
Automate vocab support and model conversion
Labels: enhancement, python, review complexity : medium
#7379 opened May 19, 2024 by teleprint-me Draft
7 tasks
Tokenizer SPM fixes for phi-3 and llama-spm
Labels: examples, python, review complexity : medium, server, testing
#7375 opened May 18, 2024 by jaime-m-p
Add minimal python client example for the server, streaming callback
Labels: examples, python, review complexity : medium, server
#7373 opened May 18, 2024 by chrismrutherford
grammars: early exit when no next_candidates to reject
Labels: performance, review complexity : low
#7370 opened May 18, 2024 by ochafik
OpenELM support
Labels: model, python, review complexity : high
#7359 opened May 18, 2024 by icecream95 Draft
examples: cache hf model when --model not provided
Labels: enhancement, review complexity : low
#7353 opened May 17, 2024 by amirzia
SimpleChat: a simple and dumb web front end for testing /chat/completions and /completions end points and try chat
Labels: enhancement, examples, review complexity : medium, server, testing
#7350 opened May 17, 2024 by hanishkvc
Another threadpool: Avoid creating hundreds of threads in GGML
Labels: build, performance, review complexity : medium
#7342 opened May 17, 2024 by besnardjb
add Viking tokenizer support
Labels: model, python, review complexity : low
#7329 opened May 16, 2024 by jonabur
Viking-7B tokenizer support
Labels: model, python, review complexity : low
#7328 opened May 16, 2024 by akx Draft
Fixed painfully slow single process builds.
Labels: build, need feedback, performance
#7326 opened May 16, 2024 by jboero
sched : support async weight copy
Labels: performance, review complexity : medium
#7315 opened May 15, 2024 by slaren Draft
Add phi-2 tokenizer
Labels: model, review complexity : medium
#7300 opened May 15, 2024 by BramVanroy
avoid to get prompt in infill mode and embedding mode
Labels: examples, review complexity : low, server
#7286 opened May 14, 2024 by woodx9 Draft
common: free ctx_gguf when exiting llama_control_vector_load_one
Labels: bugfix, review complexity : low
#7285 opened May 14, 2024 by stevegrubb
common, ngram_cache: added const reference for std::pair<> and std::tuple<> more 16 bytes:
Labels: refactoring, review complexity : low
#7270 opened May 14, 2024 by GermanAizek Draft