Commit Graph

425 Commits (q4_0-q4_2-range-fix)
 

Author SHA1 Message Date
Georgi Gerganov 721311070e
Update README.md 1 year ago
Georgi Gerganov ac15de7895
Expand "Contributing" section 1 year ago
Georgi Gerganov 273abc47ff
Update hot topics - RMSnorm 1 year ago
Nebula 9b4a15b17d
Fix RMS norm in GGML (#191) 1 year ago
hoangmit 6eac39ba95
Add RMS norm and use it (#187)
* add ggml_rms_norm

* update op num
1 year ago
moritzbrantner 27944c4206
fixed typo (#178) 1 year ago
Rickey Bowers Jr 2d15d6c9a9
add SIGINT support for _WIN32 environments (#120)
* add SIGINT support for _WIN32 environments

* perhaps more consistent
1 year ago
Justin Suess 2d64715ad4
added ctx_size parameter (#148)
* added ctx_size parameter

* added it in more places

* Apply suggestions from code review

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Justin Suess 16b2c61a22
fixed color reset on exit (#149)
* fixed color reset on exit

* added sigint handler for ansi_color_reset

* Update main.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Musab Gultekin 977295c700
Fix potential licensing issue (#126)
* Update README.md

* Update README.md

remove facebook
1 year ago
Ronsor 956dfda8ad
Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142)
There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.
1 year ago
hoangmit 113e685d18
inline -> static inline for "bytesFromNibbles" (#161)
Without "static" prefix, it fails to compile in clang
1 year ago
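A note on why `static` matters here: in C99, a function declared only `inline` provides an inline definition but no external one, so when the compiler declines to inline a call (clang at `-O0`, for example) the build can end with an undefined symbol. Adding `static` gives each translation unit its own definition. The sketch below illustrates the pattern with a hypothetical scalar helper, `unpack_nibbles`; it is not the `bytesFromNibbles` routine from the commit, and the low-nibble-first ordering is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: split each packed byte into two 4-bit values.
 * "static inline" keeps a usable definition in this translation unit
 * even when the compiler chooses not to inline the call. */
static inline void unpack_nibbles(const uint8_t *packed, size_t n_packed, uint8_t *out) {
    assert(packed != NULL && out != NULL);
    for (size_t i = 0; i < n_packed; ++i) {
        out[2 * i]     = packed[i] & 0x0F; /* low nibble first (assumed order) */
        out[2 * i + 1] = packed[i] >> 4;   /* then the high nibble */
    }
}
```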
Ronsor 47857e564c
Don't use vdotq_s32 if it's not available (#139)
* Don't use vdotq_s32 if it's not available

`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.

Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.

* Update ggml.c

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
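The availability check described in this commit can be sketched as a compile-time guard. This is a minimal illustration, not the code in `ggml.c`: the function name `dot_i8` is hypothetical, the fallback is a plain scalar loop rather than the NEON path the commit reintroduces, and the extra `__aarch64__` test is an assumption added so that `vaddvq_s32` is available.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical helper: dot product of two int8 vectors. */
int32_t dot_i8(const int8_t *a, const int8_t *b, size_t n) {
    assert(a != NULL && b != NULL);
    int32_t sum = 0;
    size_t i = 0;
#if defined(__aarch64__) && defined(__ARM_NEON) && defined(__ARM_FEATURE_DOTPROD)
    /* Only compiled when the target advertises the dotprod extension. */
    int32x4_t acc = vdupq_n_s32(0);
    for (; i + 16 <= n; i += 16) {
        acc = vdotq_s32(acc, vld1q_s8(a + i), vld1q_s8(b + i));
    }
    sum = vaddvq_s32(acc);
#endif
    /* Scalar path: handles the tail, or everything when dotprod is absent. */
    for (; i < n; ++i) {
        sum += (int32_t)a[i] * (int32_t)b[i];
    }
    return sum;
}
```

Because the guard is evaluated at compile time, a binary built for a CPU without `dotprod` (such as the Raspberry Pi 4 mentioned above) never contains the instruction at all.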
Radoslav Gerganov 60f819a2b1
Add section to README on how to run the project on Android (#130) 1 year ago
Georgi Gerganov 97ab2b2578
Add Misc section + update hot topics + minor fixes 1 year ago
Sebastián A 2f700a2738
Add windows to the CI (#98) 1 year ago
Georgi Gerganov c09a9cfb06
CMake build in Release by default (#75) 1 year ago
Georgi Gerganov 7ec903d3c1
Update contribution section, hot topics, limitations, etc. 1 year ago
Georgi Gerganov 4497ad819c
Print system information 1 year ago
Sebastián A ed6849cc07
Initial support for CMake (#75) 1 year ago
Thomas Klausner 41be0a3b3d
Add NetBSD support. (#90) 1 year ago
Pavol Rusnak 671d5cac15
Use fprintf for diagnostic output (#48)
keep printf only for printing model output

one can now use ./main ... 2>/dev/null to suppress any diagnostic output
1 year ago
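The stream split this commit describes can be sketched as follows. The two function names are hypothetical and chosen only for illustration; the point is that diagnostics go to `stderr` while model output stays on `stdout`, so a shell redirect such as `2>/dev/null` hides the former without touching the latter.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: diagnostics (load progress, timings) go to stderr. */
int emit_diagnostic(const char *msg) {
    assert(msg != NULL);
    return fprintf(stderr, "%s\n", msg);
}

/* Hypothetical helper: generated text goes to stdout so it can be piped. */
int emit_output(const char *text) {
    assert(text != NULL);
    int written = printf("%s", text);
    fflush(stdout); /* keep streamed tokens visible as they are produced */
    return written;
}
```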
Georgi Gerganov 84d9015c4a
Use vdotq_s32 to improve performance (#67)
* 10% performance boost on ARM

* Back to original change
1 year ago
uint256_t 63fd76fbb0
Reduce model loading time (#43)
* Use buffering

* Use vector

* Minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Val Kharitonov 2a20f48efa
Fix UTF-8 handling (including colors) (#79) 1 year ago
Pavol Rusnak d1f224712d
Add quantize script for batch quantization (#92)
* Add quantize script for batch quantization

* Indentation

* README for new quantize.sh

* Fix script name

* Fix file list on Mac OS

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Georgi Gerganov 1808ee0500
Add initial contribution guidelines 1 year ago
Matvey Soloviev a169bb889c
Gate signal support on being on a unixoid system. (#74) 1 year ago
Matvey Soloviev 460c482540
Fix token count accounting 1 year ago
Georgi Gerganov c80e2a8f2a
Revert "10% performance boost on ARM"
This reverts commit 113a9e83eb.

There are some reports for illegal instruction.
Moved this stuff to vdotq_s32 branch until resolved
1 year ago
Georgi Gerganov 54a0e66ea0
Check for vdotq_s32 availability 1 year ago
Georgi Gerganov 543c57e991
Amend to previous commit - forgot to update non-QRDMX branch 1 year ago
Georgi Gerganov 113a9e83eb
10% performance boost on ARM 1 year ago
Matvey Soloviev 404fac0d62
Fix color getting reset before prompt output done (#65)
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
1 year ago
Georgi Gerganov 1a0a74300f
Update README.md 1 year ago
Matvey Soloviev 96ea727f47
Add interactive mode (#61)
* Initial work on interactive mode.

* Improve interactive mode. Make rev. prompt optional.

* Update README to explain interactive mode.

* Fix OS X build
1 year ago
Marc Köhlbrugge 9661954835
Fix typo in README (#45) 1 year ago
Ben Garney f385f8dee8
Allow using prompt files (#59) 1 year ago
beiller 02f0c6fe7f
Add back top_k (#56)
* Add back top_k

* Update utils.cpp

* Update utils.h

---------

Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Sebastián A eb062bb012
Windows fixes (#31)
* Apply fixes suggested to build on windows

Issue: https://github.com/ggerganov/llama.cpp/issues/22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Change the use of vector for stack allocations.
1 year ago
Georgi Gerganov 7027a97837
Update README.md 1 year ago
Georgi Gerganov 2d555e5b42
Add CI (#60) 1 year ago
Georgi Gerganov 7c9e54e55e
Revert "weights_only" arg - this is causing more trouble than help 1 year ago
Oleksandr Nikitin b9bd1d0141
python/pytorch compat notes (#44) 1 year ago
beiller 129c7d1ea8
Add repetition penalty (#20)
* Adding repeat penalization

* Update utils.h

* Update utils.cpp

* Numeric fix

Should probably still scale by temp even if penalized

* Update comments, more proper application

I see that numbers can go negative so a fix from a referenced commit

* Minor formatting

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
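The two notes in this commit (still scale by temperature when penalized, and handle negative values) suggest a rule along the following lines. This is a sketch of one common formulation, not necessarily the exact code merged in `utils.cpp`; the function and parameter names are hypothetical. A recently seen token has its scaled logit divided by the penalty when positive and multiplied by it when negative, so the penalty always lowers the token's score.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: is token id among the recently generated tokens? */
static bool seen_recently(int id, const int *recent, size_t n_recent) {
    for (size_t i = 0; i < n_recent; ++i) {
        if (recent[i] == id) return true;
    }
    return false;
}

/* Scale every logit by 1/temp, then penalize tokens seen recently.
 * Each token is penalized at most once, however often it repeats. */
void scale_logits_with_repeat_penalty(float *logits, size_t n_vocab,
                                      const int *recent, size_t n_recent,
                                      float temp, float penalty) {
    assert(logits != NULL && temp > 0.0f && penalty > 0.0f);
    const float scale = 1.0f / temp;
    for (size_t id = 0; id < n_vocab; ++id) {
        float v = logits[id] * scale; /* temperature applies in every case */
        if (seen_recently((int)id, recent, n_recent)) {
            /* Dividing a negative logit would raise it, so multiply instead. */
            v = (v < 0.0f) ? v * penalty : v / penalty;
        }
        logits[id] = v;
    }
}
```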
Georgi Gerganov 702fddf5c5
Clarify meaning of hacking 1 year ago
Georgi Gerganov 7d86e25bf6
README: add "Supported platforms" + update hot topics 1 year ago
deepdiffuser a93120236f
use weights_only in conversion script (#32)
this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries
1 year ago
Pavol Rusnak 6a9a67f0be
Add LICENSE (#21) 1 year ago
Georgi Gerganov da1a4ff01f
Update README.md 1 year ago