Commit Graph

18 Commits (436e56193199a1625f8c561069f702e8840a9e08)

Each entry below lists author, short SHA1, commit message, and relative date.
jp-x-g f732695cd5
Clarify console output in convert-pth-to-ggml.py (#512)
"Processing part 1 of 3" instead of "Processing part 0"
1 year ago
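
A minimal sketch of the change above, assuming a loop shaped like the script's (variable names here are illustrative, not taken from convert-pth-to-ggml.py):

```python
# Illustrative only: report 1-based progress ("part 1 of 3") rather than
# a bare 0-based index ("part 0"). n_parts and the loop are assumptions.
n_parts = 3
for p in range(n_parts):
    print(f"Processing part {p + 1} of {n_parts}")  # was: f"Processing part {p}"
```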
Georgi Gerganov f5a77a629b
Introduce C-style API (#370)
* Major refactoring - introduce C-style API

* Clean up

* Add <cassert>

* Add <iterator>

* Add <algorithm> ....

* Fix timing reporting and accumulation

* Measure eval time only for single-token calls

* Change llama_tokenize return meaning
1 year ago
Georgi Gerganov 3bfa3b43b7
Fix convert script, warnings, alpaca instructions, default params
1 year ago
Mack Straight c98ae02668
fix typo in comment (#318)
1 year ago
Georgi Gerganov eb34620aec
Add tokenizer test + revert to C++11 (#355)
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
1 year ago
Qingyou Meng 6b6d5b5024
Fixed tokenizer.model not found error when model dir is a symlink (#325)
1 year ago
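
A hedged sketch of the failure mode, not the literal diff from #325: opening `dir_model + "/../tokenizer.model"` follows the symlink before resolving `..`, so a tokenizer.model stored next to the symlink is missed; taking the lexical parent is one way around that.

```python
import os

# Assumption about the fix's intent: derive the sibling path lexically so a
# symlinked model directory still finds tokenizer.model placed beside it.
dir_model = "./models/7B"  # may be a symlink to another location
fname_tokenizer = os.path.join(os.path.dirname(dir_model), "tokenizer.model")
```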
Mack Straight 074bea2eb1
sentencepiece bpe compatible tokenizer (#252)
* potential out of bounds read

* fix quantize

* style

* Update convert-pth-to-ggml.py

* mild cleanup

* don't need the space-prefixing here right now since main.cpp already does it

* new file magic + version header field

* readme notice

* missing newlines

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
1 year ago
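
The "new file magic + version header field" item means converted model files now begin with a magic number followed by a format version, so loaders can reject files written by older converters. A minimal reader sketch; the layout and path are assumptions, not the authoritative format definition:

```python
import struct

# Assumed layout: two little-endian 32-bit values at the start of the file.
with open("./models/7B/ggml-model-f16.bin", "rb") as f:
    magic, version = struct.unpack("<II", f.read(8))
print(f"magic=0x{magic:08x} version={version}")
```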
Georgi Gerganov c1c7026b47
Fix python stuff (#109)
1 year ago
qunash 467b149761
Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)
* Refactor get_n_parts function to simplify code and improve readability

* Use f-strings instead of concatenation

* Refactoring: more concise and readable

* modularize

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Bernat Vadell 2af23d3043
🚀 Dockerize llamacpp (#132)
* feat: dockerize llamacpp

* feat: split build & runtime stages

* split dockerfile into main & tools

* add quantize into tool docker image

* Update .devops/tools.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add docker action pipeline

* change CI to publish at github docker registry

* fix runs-on name: macOS-latest should be macos-latest (lowercase)

* include docker versioned images

* fix github action docker

* fix docker.yml

* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Ronsor 956dfda8ad
Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)
Special tokens or other new tokens can be added to the tokenizer, so it is best not to assume the vocabulary holds exactly 32000 tokens (see the sketch below).
1 year ago
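
A sketch of the change using the sentencepiece Python API (the model path is illustrative):

```python
import sentencepiece as spm

# Ask the tokenizer for its actual vocabulary size instead of hardcoding
# 32000, so any added special tokens are counted.
sp = spm.SentencePieceProcessor()
sp.Load("./models/tokenizer.model")  # illustrative path
n_vocab = sp.vocab_size()            # 32000 for stock LLaMA, but not guaranteed
```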
Val Kharitonov 2a20f48efa
Fix UTF-8 handling (including colors) (#79)
1 year ago
Georgi Gerganov 7c9e54e55e
Revert "weights_only" arg - it was causing more trouble than help
1 year ago
Oleksandr Nikitin b9bd1d0141
python/pytorch compat notes (#44)
1 year ago
deepdiffuser a93120236f
use weights_only in conversion script (#32)
This prevents malicious weights from executing arbitrary code by restricting the unpickler to loading only tensors, primitive types, and dictionaries (sketched below).
1 year ago
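
The hardening comes down to one argument to torch.load, available in PyTorch 1.13 and later; note the revert listed further up in this log. The checkpoint path is illustrative:

```python
import torch

# weights_only=True restricts the unpickler to tensors, primitive types,
# and dicts, so a tampered checkpoint cannot run arbitrary code on load.
part = torch.load("./models/7B/consolidated.00.pth",
                  map_location="cpu", weights_only=True)
```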
Georgi Gerganov 007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage
1 year ago
Georgi Gerganov 70bc0b8b15
Fix a bug in the rope calculation
1 year ago
Georgi Gerganov 26c0846629
Initial release
1 year ago