Commit Graph

17 Commits (5cb63e2493c49bc2c3b9b355696e8dc26cdd0380)

Author SHA1 Message Date
Georgi Gerganov 22213a17b5
Change RMSNorm eps to 1e-6 (#173)
I think this is what is used in the Python code
1 year ago
Stephan Walter 367946c668
Don't tell users to use a bad number of threads (#243)
The readme tells people to use the command line option "-t 8", causing 8
threads to be started. On systems with fewer than 8 cores, this causes a
significant slowdown. Remove the option from the example command lines
and use /proc/cpuinfo on Linux to determine a sensible default.
1 year ago
Matvey Soloviev 904d2a8d6a
Q4_1 quantization (#193)
* Add AVX2 version of ggml_vec_dot_q4_1

* Small optimisations to q4_1 dot product (@Const-me)

* Rearrange Q4_1 quantization to work for multipart models. (Fix #152)

* Fix ggml_vec_mad_q4_1 too

* Fix non-vectorised q4_1 vec mul
1 year ago
Nebula 9b4a15b17d
Fix RMS norm in GGML (#191) 1 year ago
hoangmit 6eac39ba95
Add RMS norm and use it (#187)
* add ggml_rms_norm

* update op num
1 year ago
hoangmit 113e685d18
inline -> static inline for "bytesFromNibbles" (#161)
Without "static" prefix, it fails to compile in clang
1 year ago
Ronsor 47857e564c
Don't use vdotq_s32 if it's not available (#139)
* Don't use vdotq_s32 if it's not available

`dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available.

Reintroduces the code removed in 84d9015 if `__ARM_FEATURE_DOTPROD` isn't defined.

* Update ggml.c

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Thomas Klausner 41be0a3b3d
Add NetBSD support. (#90) 1 year ago
Georgi Gerganov 84d9015c4a
Use vdotq_s32 to improve performance (#67)
* 10% performance boost on ARM

* Back to original change
1 year ago
Georgi Gerganov c80e2a8f2a
Revert "10% performance boost on ARM"
This reverts commit 113a9e83eb.

There are some reports for illegal instruction.
Moved this stuff to vdotq_s32 branch until resolve
1 year ago
Georgi Gerganov 54a0e66ea0
Check for vdotq_s32 availability 1 year ago
Georgi Gerganov 543c57e991
Ammend to previous commit - forgot to update non-QRDMX branch 1 year ago
Georgi Gerganov 113a9e83eb
10% performance boost on ARM 1 year ago
Sebastián A eb062bb012
Windows fixes (#31)
* Apply fixes suggested to build on windows

Issue: https://github.com/ggerganov/llama.cpp/issues/22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Change the use of vector for stack allocations.
1 year ago
Georgi Gerganov f1eaff4721 Add AVX2 support for x86 architectures thanks to @Const-me ! 1 year ago
Georgi Gerganov 007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage 1 year ago
Georgi Gerganov 26c0846629
Initial release 1 year ago