Commit Graph

18 Commits (436e56193199a1625f8c561069f702e8840a9e08)

Each entry below lists author, short SHA1, commit message, and relative date.
jp-x-g f732695cd5
Clarify console output in convert-pth-to-ggml.py (#512)
"Processing part 1 of 3" instead of "Processing part 0"
1 year ago
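
A minimal sketch of the change above, assuming a loop shaped like the script's (variable names here are illustrative, not taken from convert-pth-to-ggml.py):

```python
# Illustrative only: report 1-based progress ("part 1 of 3") rather than
# a bare 0-based index ("part 0"). n_parts and the loop are assumptions.
n_parts = 3
for p in range(n_parts):
    print(f"Processing part {p + 1} of {n_parts}")  # was: f"Processing part {p}"
```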
Georgi Gerganov f5a77a629b
Introduce C-style API (#370)
* Major refactoring - introduce C-style API

* Clean up

* Add <cassert>

* Add <iterator>

* Add <algorithm> ....

* Fix timing reporting and accumulation

* Measure eval time only for single-token calls

* Change llama_tokenize return meaning
1 year ago
Georgi Gerganov 3bfa3b43b7
Fix convert script, warnings, alpaca instructions, default params
1 year ago
Mack Straight c98ae02668
fix typo in comment (#318)
1 year ago
Georgi Gerganov eb34620aec
Add tokenizer test + revert to C++11 (#355)
* Add test-tokenizer-0 to do a few tokenizations - feel free to expand
* Added option to convert-pth-to-ggml.py script to dump just the vocabulary
* Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)
* Added utility to load vocabulary file from previous point (temporary implementation)
* Avoid using std::string_view and drop back to C++11 (hope I didn't break something)
* Rename gpt_vocab -> llama_vocab
* All CMake binaries go into ./bin/ now
1 year ago
Qingyou Meng 6b6d5b5024
Fixed tokenizer.model not found error when model dir is a symlink (#325)
1 year ago
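
A hedged sketch of the failure mode, not the literal diff from #325: opening `dir_model + "/../tokenizer.model"` follows the symlink before resolving `..`, so a tokenizer.model stored next to the symlink is missed; taking the lexical parent is one way around that.

```python
import os

# Assumption about the fix's intent: derive the sibling path lexically so a
# symlinked model directory still finds tokenizer.model placed beside it.
dir_model = "./models/7B"  # may be a symlink to another location
fname_tokenizer = os.path.join(os.path.dirname(dir_model), "tokenizer.model")
```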
Mack Straight 074bea2eb1
sentencepiece bpe compatible tokenizer (#252)
* potential out of bounds read

* fix quantize

* style

* Update convert-pth-to-ggml.py

* mild cleanup

* don't need the space-prefixing here right now since main.cpp already does it

* new file magic + version header field

* readme notice

* missing newlines

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
1 year ago
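
The "new file magic + version header field" item means converted model files now begin with a magic number followed by a format version, so loaders can reject files written by older converters. A minimal reader sketch; the layout and path are assumptions, not the authoritative format definition:

```python
import struct

# Assumed layout: two little-endian 32-bit values at the start of the file.
with open("./models/7B/ggml-model-f16.bin", "rb") as f:
    magic, version = struct.unpack("<II", f.read(8))
print(f"magic=0x{magic:08x} version={version}")
```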
Georgi Gerganov c1c7026b47
Fix python stuff (#109)
1 year ago
qunash 467b149761
Refactoring `convert-pth-to-ggml.py`: more concise and readable (#109)
* Refactor get_n_parts function to simplify code and improve readability

* Use f-strings instead of concatenation

* Refactoring: more concise and readable

* modularize

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Bernat Vadell 2af23d3043
🚀 Dockerize llamacpp (#132)
* feat: dockerize llamacpp

* feat: split build & runtime stages

* split dockerfile into main & tools

* add quantize into tool docker image

* Update .devops/tools.sh

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>

* add docker action pipeline

* change CI to publish at github docker registry

* fix runs-on name: macOS-latest should be macos-latest (lowercase)

* include docker versioned images

* fix github action docker

* fix docker.yml

* feat: include all-in-one command tool & update readme.md

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Ronsor 956dfda8ad
Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)
Special tokens or other new tokens can be added to the tokenizer, so it is best not to assume the vocabulary holds exactly 32000 tokens (see the sketch below).
1 year ago
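
A sketch of the change using the sentencepiece Python API (the model path is illustrative):

```python
import sentencepiece as spm

# Ask the tokenizer for its actual vocabulary size instead of hardcoding
# 32000, so any added special tokens are counted.
sp = spm.SentencePieceProcessor()
sp.Load("./models/tokenizer.model")  # illustrative path
n_vocab = sp.vocab_size()            # 32000 for stock LLaMA, but not guaranteed
```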
Val Kharitonov 2a20f48efa
Fix UTF-8 handling (including colors) (#79)
1 year ago
Georgi Gerganov 7c9e54e55e
Revert "weights_only" arg - it was causing more trouble than help
1 year ago
Oleksandr Nikitin b9bd1d0141
python/pytorch compat notes (#44)
1 year ago
deepdiffuser a93120236f
use weights_only in conversion script (#32)
This prevents malicious weights from executing arbitrary code by restricting the unpickler to loading only tensors, primitive types, and dictionaries (sketched below).
1 year ago
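
The hardening comes down to one argument to torch.load, available in PyTorch 1.13 and later; note the revert listed further up in this log. The checkpoint path is illustrative:

```python
import torch

# weights_only=True restricts the unpickler to tensors, primitive types,
# and dicts, so a tampered checkpoint cannot run arbitrary code on load.
part = torch.load("./models/7B/consolidated.00.pth",
                  map_location="cpu", weights_only=True)
```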
Georgi Gerganov 007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage
1 year ago
Georgi Gerganov 70bc0b8b15
Fix a bug in the rope calculation
1 year ago
Georgi Gerganov 26c0846629
Initial release
1 year ago