llama.cpp

Commit Graph

Author	SHA1	Message	Date
Matvey Soloviev	904d2a8d6a	Q4_1 quantization (#193) * Add AVX2 version of ggml_vec_dot_q4_1 * Small optimisations to q4_1 dot product (@Const-me) * Rearrange Q4_1 quantization to work for multipart models. (Fix #152) * Fix ggml_vec_mad_q4_1 too * Fix non-vectorised q4_1 vec mul	1 year ago
Nebula	9b4a15b17d	Fix RMS norm in GGML (#191)	1 year ago
hoangmit	6eac39ba95	Add RMS norm and use it (#187) * add ggml_rms_norm * update op num	1 year ago
hoangmit	113e685d18	inline -> static inline for "bytesFromNibbles" (#161) Without "static" prefix, it fails to compile in clang	1 year ago
Ronsor	47857e564c	Don't use vdotq_s32 if it's not available (#139) * Don't use vdotq_s32 if it's not available `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in `84d9015` if `__ARM_FEATURE_DOTPROD` isn't defined. * Update ggml.c --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Thomas Klausner	41be0a3b3d	Add NetBSD support. (#90)	1 year ago
Georgi Gerganov	84d9015c4a	Use vdotq_s32 to improve performance (#67) * 10% performance boost on ARM * Back to original change	1 year ago
Georgi Gerganov	c80e2a8f2a	Revert "10% performance boost on ARM" This reverts commit `113a9e83eb`. There are some reports for illegal instruction. Moved this stuff to vdotq_s32 branch until resolved	1 year ago
Georgi Gerganov	54a0e66ea0	Check for vdotq_s32 availability	1 year ago
Georgi Gerganov	543c57e991	Amend to previous commit - forgot to update non-QRDMX branch	1 year ago
Georgi Gerganov	113a9e83eb	10% performance boost on ARM	1 year ago
Sebastián A	eb062bb012	Windows fixes (#31) * Apply fixes suggested to build on windows Issue: https://github.com/ggerganov/llama.cpp/issues/22 * Remove unsupported VLAs * MSVC: Remove features that are only available on MSVC C++20. * Fix zero initialization of the other fields. * Change the use of vector for stack allocations.	1 year ago
Georgi Gerganov	f1eaff4721	Add AVX2 support for x86 architectures thanks to @Const-me !	1 year ago
Georgi Gerganov	007a8f6f45	Support all LLaMA models + change Q4_0 quantization storage	1 year ago
Georgi Gerganov	26c0846629	Initial release	1 year ago

15 Commits (6b0df5ccf360fe5c015f6607f0375bfc6849005e)