Commit Graph

92 Commits (859fee6dfb00fab7ce6bc215b4adae78d82f4759)

Author SHA1 Message Date
Pavol Rusnak 859fee6dfb
quantize : use `map` to assign quantization type from `string` (#1191)
instead of `int` (while the `int` option is still supported)

This allows the following usage:

`./quantize ggml-model-f16.bin ggml-model-q4_0.bin q4_0`

instead of:

`./quantize ggml-model-f16.bin ggml-model-q4_0.bin 2`
1 year ago
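
For illustration, a minimal sketch of the string-to-type lookup described in the commit above. The real tool does this in C++ with a map; the mock-up below is Python, and the only pairing taken from the commit itself is `q4_0` ↔ `2` — the `q4_1` entry is an assumption.

```python
# Hypothetical illustration of the name -> quantization-type lookup.
QUANT_TYPES = {
    "q4_0": 2,   # shown in the commit message
    "q4_1": 3,   # assumed, may not match the real enum value
}

def parse_quant_arg(arg: str) -> int:
    """Accept either a type name ("q4_0") or the legacy integer form ("2")."""
    if arg in QUANT_TYPES:
        return QUANT_TYPES[arg]
    return int(arg)  # old behaviour: a bare integer is still accepted

print(parse_quant_arg("q4_0"))  # 2
print(parse_quant_arg("2"))     # 2
```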
mgroeber9110 9b0a4d4214
examples/main README improvements and some light refactoring (#1131) 1 year ago
Pavol Rusnak c6524f46eb
readme : update gpt4all instructions (#980) 1 year ago
CRD716 834695fe3a
Minor: Readme fixed grammar, spelling, and misc updates (#1071) 1 year ago
Georgi Gerganov 7cd5c4a3e9
readme : add warning about Q4_2 and Q4_3 1 year ago
Georgi Gerganov 7faa7460f0
readme : update hot topics about new LoRA functionality 1 year ago
Atsushi Tatsuma e9298af389
readme : add Ruby bindings (#1029) 1 year ago
comex 723dac55fa
py : new conversion script (#545)
Current status: Working, except for the latest GPTQ-for-LLaMa format
  that includes `g_idx`.  This turns out to require changes to GGML, so
  for now it only works if you use the `--outtype` option to dequantize it
  back to f16 (which is pointless except for debugging).

  I also included some cleanup for the C++ code.

  This script is meant to replace all the existing conversion scripts
  (including the ones that convert from older GGML formats), while also
  adding support for some new formats.  Specifically, I've tested with:

  - [x] `LLaMA` (original)
  - [x] `llama-65b-4bit`
  - [x] `alpaca-native`
  - [x] `alpaca-native-4bit`
  - [x] LLaMA converted to 'transformers' format using
        `convert_llama_weights_to_hf.py`
  - [x] `alpaca-native` quantized with `--true-sequential --act-order
        --groupsize 128` (dequantized only)
  - [x] same as above plus `--save_safetensors`
  - [x] GPT4All
  - [x] stock unversioned ggml
  - [x] ggmh

  There's enough overlap in the logic needed to handle these different
  cases that it seemed best to move to a single script.

  I haven't tried this with Alpaca-LoRA because I don't know where to find
  it.

  Useful features:

  - Uses multiple threads for a speedup in some cases (though the Python
    GIL limits the gain, and sometimes it's disk-bound anyway).

  - Combines split models into a single file (both the intra-tensor split
    of the original and the inter-tensor split of 'transformers' format
    files).  Single files are more convenient to work with and more
    friendly to future changes to use memory mapping on the C++ side.  To
    accomplish this without increasing memory requirements, it has some
    custom loading code which avoids loading whole input files into memory
    at once.

  - Because of the custom loading code, it no longer depends on PyTorch,
    which might make installing dependencies slightly easier or faster...
    although it still depends on NumPy and sentencepiece, so I don't know
    if there's any meaningful difference.  In any case, I also added a
    requirements.txt file to lock the dependency versions in case of any
    future breaking changes.

  - Type annotations checked with mypy.

  - Some attempts to be extra user-friendly:

      - The script tries to be forgiving with arguments, e.g. you can
        specify either the model file itself or the directory containing
        it.

      - The script doesn't depend on config.json / params.json, just in
        case the user downloaded files individually and doesn't have those
        handy.  But you still need tokenizer.model and, for Alpaca,
        added_tokens.json.

      - The script tries to give a helpful error message if
        added_tokens.json is missing.
1 year ago
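
A rough Python sketch of the "forgiving arguments" and helpful-error behaviour described in the commit above. The file names `tokenizer.model` and `added_tokens.json` come from the commit text; the function names and exact logic are guesses, not the real script's interface.

```python
from pathlib import Path

def resolve_model_path(arg: str) -> Path:
    """Accept either the model file itself or the directory containing it
    (hypothetical helper; the real script's behaviour may differ)."""
    p = Path(arg)
    if p.is_dir():
        candidates = sorted(p.glob("*.pth")) + sorted(p.glob("*.bin"))
        if not candidates:
            raise SystemExit(f"no model file found in directory {p}")
        return candidates[0]
    return p

def check_added_tokens(model_path: Path) -> None:
    """Alpaca models need added_tokens.json next to the model; warn helpfully if missing."""
    if not (model_path.parent / "added_tokens.json").exists():
        print("note: added_tokens.json not found; it is required when converting Alpaca models")
```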
CRD716 ec29272175
readme : remove python 3.10 warning (#929) 1 year ago
Genkagaku.GPT 7e941b95eb
readme : llama node binding (#911)
* chore: add nodejs binding

* chore: add nodejs binding
1 year ago
Judd 4579af95e8
zig : update build.zig (#872)
* update

* update readme

* minimize the changes.

---------

Co-authored-by: zjli2019 <zhengji.li@ingchips.com>
1 year ago
Georgi Gerganov f76cb3a34d
readme : change "GPU support" link to discussion 1 year ago
Georgi Gerganov 782438070f
readme : update hot topics with link to "GPU support" issue 1 year ago
Nicolai Weitkemper 4dbbd40750
readme: link to sha256sums file (#902)
This is to emphasize that these do not need to be obtained from elsewhere.
1 year ago
Pavol Rusnak 8b679987cd
Fix whitespace, add .editorconfig, add GitHub workflow (#883) 1 year ago
qouoq a0caa34b16
Add BAIR's Koala to supported models (#877) 1 year ago
Pavol Rusnak d2beca95dc
Make docker instructions more explicit (#785) 1 year ago
Georgi Gerganov 3416298929
Update README.md 1 year ago
Georgi Gerganov 8d10406d6e
readme : change logo + add bindings + add uis + add wiki 1 year ago
Adithya Balaji 594cc95fab
readme : update with CMake and windows example (#748)
* README: Update with CMake and windows example

* README: update with code-review for cmake build
1 year ago
Thatcher Chamberlin d8d4e865cd
Add a missing step to the gpt4all instructions (#690)
`migrate-ggml-2023-03-30-pr613.py` is needed to get gpt4all running.
1 year ago
rimoliga d0a7f742e7
readme: replace termux links with homepage, play store is deprecated (#680) 1 year ago
Pavol Rusnak 9733104be5 drop quantize.py (now that models are using a single file) 1 year ago
Georgi Gerganov 3df890aef4
readme : update supported models 1 year ago
Georgi Gerganov b467702b87
readme : fix typos 1 year ago
Georgi Gerganov 516d88e75c
readme : add GPT4All instructions (close #588) 1 year ago
Stephan Walter b391579db9
Update README and comments for standalone perplexity tool (#525) 1 year ago
Georgi Gerganov 348d6926ee
Add logo to README.md 1 year ago
Georgi Gerganov 55ad42af84
Move chat scripts into "./examples" 1 year ago
Georgi Gerganov 4a7129acd2
Remove obsolete information from README 1 year ago
Gary Mulder f4f5362edb
Update README.md (#444)
Added explicit **bolded** instructions clarifying that people need to request access to models from Facebook and never through this repo.
1 year ago
Georgi Gerganov b6b268d441
Add link to Roadmap discussion 1 year ago
Stephan Walter a50e39c6fe
Revert "Delete SHA256SUMS for now" (#429)
* Revert "Delete SHA256SUMS for now (#416)"

This reverts commit 8eea5ae0e5.

* Remove ggml files until they can be verified
* Remove alpaca json
* Add also model/tokenizer.model to SHA256SUMS + update README

---------

Co-authored-by: Pavol Rusnak <pavol@rusnak.io>
1 year ago
Gary Mulder 8a3e5ef801
Move model section from issue template to README.md (#421)
* Update custom.md

* Removed Model section as it is better placed in README.md

* Updates to README.md model section

* Inserted text that was removed from the issue template about obtaining models from FB and links to papers describing the various models

* Removed IPFS download links for the Alpaca 7B models as these look to be in the old data format and probably shouldn't be directly linked to, anyway

* Updated the perplexity section to point at Perplexity scores #406 discussion
1 year ago
Georgi Gerganov 93208cfb92
Adjust repetition penalty .. 1 year ago
Georgi Gerganov 03ace14cfd
Add link to recent podcast about whisper.cpp and llama.cpp 1 year ago
Gary Linscott 40ea807a97
Add details on perplexity to README.md (#395) 1 year ago
Georgi Gerganov 56817b1f88
Remove temporary notice and update hot topics 1 year ago
Gary Mulder da0e9fe90c Add SHA256SUMS file and instructions to the README on how to obtain and verify the downloads
Hashes created using:

`sha256sum models/*B/*.pth models/*[7136]B/ggml-model-f16.bin* models/*[7136]B/ggml-model-q4_0.bin* > SHA256SUMS`
1 year ago
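
The `sha256sum` invocation above shows how the manifest was generated; verifying downloads amounts to recomputing each hash and comparing. A small illustrative Python equivalent (the README's actual instructions may simply use `sha256sum -c`), which skips files that were not downloaded:

```python
import hashlib
from pathlib import Path

def verify(sums_file: str = "SHA256SUMS") -> None:
    """Compare downloaded files against a sha256sum-style manifest (illustrative sketch)."""
    for line in Path(sums_file).read_text().splitlines():
        expected, name = line.split(maxsplit=1)
        path = Path(name.strip().lstrip("*"))
        if not path.exists():
            continue  # that model was not downloaded; skip it
        # read_bytes() keeps the sketch short; a real check would stream large files
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        print(f"{path}: {'OK' if actual == expected else 'FAILED'}")

if __name__ == "__main__":
    verify()
```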
Georgi Gerganov 3366853e41
Add notice about pending change 1 year ago
Georgi Gerganov 1daf4dd712
Minor style changes 1 year ago
Georgi Gerganov dc6a845b85
Add chat.sh script 1 year ago
Georgi Gerganov 3bfa3b43b7
Fix convert script, warnings, alpaca instructions, default params 1 year ago
Kevin Kwok e0ffc861fa
Update IPFS links to quantized alpaca with new tokenizer format (#352) 1 year ago
Mack Straight 074bea2eb1
sentencepiece bpe compatible tokenizer (#252)
* potential out of bounds read

* fix quantize

* style

* Update convert-pth-to-ggml.py

* mild cleanup

* don't need the space-prefixing here right now since main.cpp already does it

* new file magic + version header field

* readme notice

* missing newlines

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
1 year ago
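
One of the bullets above mentions a new file magic plus a version header field. A minimal sketch of reading such a header follows; the two little-endian uint32 layout is a generic pattern, and the magic constant used to detect old unversioned files is an assumption, not taken from the commit.

```python
from __future__ import annotations
import struct

# Assumed magic for old unversioned ggml files; the real constants live in the C++ sources.
OLD_GGML_MAGIC = 0x67676d6c

def read_header(path: str) -> tuple[int, int | None]:
    """Return (magic, version); version is None for an old unversioned file."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
        if magic == OLD_GGML_MAGIC:
            return magic, None
        (version,) = struct.unpack("<I", f.read(4))
        return magic, version
```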
Suaj Carrot 7392f1cd2c
Improved quantize script (#222)
* Improved quantize script

I improved the quantize script by adding error handling and by allowing multiple models to be selected for quantization at once on the command line. I also converted it to Python for generalization and extensibility.

* Fixes and improvements based on Matt's observations

Fixed and improved many things in the script based on the review by @mattsta. The parallelization suggestion still needs to be revisited, but code for it was added anyway (commented out).

* Small fixes to the previous commit

* Corrected to use the original glob pattern

The original Bash script uses a glob pattern to match files that have endings such as ...bin.0, ...bin.1, etc. That has been translated correctly to Python now.

* Added support for Windows and updated README to use this script

New code to set the name of the quantize script binary depending on the platform has been added (quantize.exe if working on Windows) and the README.md file has been updated to use this script instead of the Bash one.

* Fixed a typo and removed shell=True in the subprocess.run call

Fixed a typo regarding the new filenames of the quantized models and removed the shell=True parameter in the subprocess.run call as it was conflicting with the list of parameters.

* Corrected previous commit

* Small tweak: changed the name of the program in argparse

This was causing the automatic help message to suggest the program's usage as literally "$ Quantization Script [arguments]". It should now be something like "$ python3 quantize.py [arguments]".
1 year ago
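
Pulling the commit descriptions above together, a condensed sketch of the behaviour they describe: a platform-dependent binary name, a glob that also matches split `.bin.0`/`.bin.1` files, and `subprocess.run` called with an argument list rather than `shell=True`. Paths, argument names, and the `prog=` string are guesses for illustration, not the script's real interface.

```python
import argparse
import glob
import platform
import subprocess
import sys

def main() -> None:
    # prog= set so --help suggests a realistic invocation (see the last commit above)
    parser = argparse.ArgumentParser(prog="python3 quantize.py")
    parser.add_argument("models", nargs="+", help="model sizes to quantize, e.g. 7B 13B")
    args = parser.parse_args()

    # quantize.exe on Windows, ./quantize elsewhere (per the Windows-support commit)
    binary = "quantize.exe" if platform.system() == "Windows" else "./quantize"

    for model in args.models:
        # the glob also matches split files such as ggml-model-f16.bin.0, .bin.1, ...
        parts = sorted(glob.glob(f"models/{model}/ggml-model-f16.bin*"))
        if not parts:
            sys.exit(f"no f16 model files found for {model}")
        for part in parts:
            out = part.replace("f16", "q4_0")
            # argument list, no shell=True (see the subprocess.run fix described above)
            subprocess.run([binary, part, out, "2"], check=True)

if __name__ == "__main__":
    main()
```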
Georgi Gerganov 160bfb217d
Update hot topics to mention Alpaca support 1 year ago
Georgi Gerganov a4e63b73df
Add instruction for using Alpaca (#240) 1 year ago
Pavol Rusnak 6f61c18ec9 Fix typo in readme 1 year ago
Pavol Rusnak 1e5a6d088d Add note about Python 3.11 to readme 1 year ago