llama.cpp

Commit Graph

Author	SHA1	Message	Date
Georgi Gerganov	721311070e	Update README.md	1 year ago
Georgi Gerganov	ac15de7895	Expand "Contributing" section	1 year ago
Georgi Gerganov	273abc47ff	Update hot topics - RMSnorm	1 year ago
Nebula	9b4a15b17d	Fix RMS norm in GGML (#191)	1 year ago
hoangmit	6eac39ba95	Add RMS norm and use it (#187) * add ggml_rms_norm * update op num	1 year ago
moritzbrantner	27944c4206	fixed typo (#178)	1 year ago
Rickey Bowers Jr	2d15d6c9a9	add SIGINT support for _WIN32 environments (#120) * add SIGINT support for _WIN32 environments * perhaps more consistent	1 year ago
Justin Suess	2d64715ad4	added ctx_size parameter (#148) * added ctx_size parameter * added it in more places * Apply suggestions from code review --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Justin Suess	16b2c61a22	fixed color reset on exit (#149) * fixed color reset on exit * added sigint handler for ansi_color_reset * Update main.cpp --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Musab Gultekin	977295c700	Fix potential licensing issue (#126) * Update README.md * Update README.md remove facebook	1 year ago
Ronsor	956dfda8ad	Use `tokenizer.vocab_size()` instead of hardcoding 32000 in convert-pth-to-ggml.py (#142) There are ways that special tokens or other new tokens could be added to the tokenizer; therefore it's probably best not to assume the vocabulary is only 32000 tokens.	1 year ago
hoangmit	113e685d18	inline -> static inline for "bytesFromNibbles" (#161) Without "static" prefix, it fails to compile in clang	1 year ago
Ronsor	47857e564c	Don't use vdotq_s32 if it's not available (#139) * Don't use vdotq_s32 if it's not available `dotprod` extensions aren't available on some ARM CPUs (e.g. Raspberry Pi 4), so check for them and only use them if they're available. Reintroduces the code removed in `84d9015` if `__ARM_FEATURE_DOTPROD` isn't defined. * Update ggml.c --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Radoslav Gerganov	60f819a2b1	Add section to README on how to run the project on Android (#130)	1 year ago
Georgi Gerganov	97ab2b2578	Add Misc section + update hot topics + minor fixes	1 year ago
Sebastián A	2f700a2738	Add windows to the CI (#98)	1 year ago
Georgi Gerganov	c09a9cfb06	CMake build in Release by default (#75)	1 year ago
Georgi Gerganov	7ec903d3c1	Update contribution section, hot topics, limitations, etc.	1 year ago
Georgi Gerganov	4497ad819c	Print system information	1 year ago
Sebastián A	ed6849cc07	Initial support for CMake (#75)	1 year ago
Thomas Klausner	41be0a3b3d	Add NetBSD support. (#90)	1 year ago
Pavol Rusnak	671d5cac15	Use fprintf for diagnostic output (#48) keep printf only for printing model output one can now use ./main ... 2>dev/null to suppress any diagnostic output	1 year ago
Georgi Gerganov	84d9015c4a	Use vdotq_s32 to improve performance (#67) * 10% performance boost on ARM * Back to original change	1 year ago
uint256_t	63fd76fbb0	Reduce model loading time (#43) * Use buffering * Use vector * Minor --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Val Kharitonov	2a20f48efa	Fix UTF-8 handling (including colors) (#79)	1 year ago
Pavol Rusnak	d1f224712d	Add quantize script for batch quantization (#92) * Add quantize script for batch quantization * Indentation * README for new quantize.sh * Fix script name * Fix file list on Mac OS --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Georgi Gerganov	1808ee0500	Add initial contribution guidelines	1 year ago
Matvey Soloviev	a169bb889c	Gate signal support on being on a unixoid system. (#74)	1 year ago
Matvey Soloviev	460c482540	Fix token count accounting	1 year ago
Georgi Gerganov	c80e2a8f2a	Revert "10% performance boost on ARM" This reverts commit `113a9e83eb`. There are some reports for illegal instruction. Moved this stuff to vdotq_s32 branch until resolve	1 year ago
Georgi Gerganov	54a0e66ea0	Check for vdotq_s32 availability	1 year ago
Georgi Gerganov	543c57e991	Ammend to previous commit - forgot to update non-QRDMX branch	1 year ago
Georgi Gerganov	113a9e83eb	10% performance boost on ARM	1 year ago
Matvey Soloviev	404fac0d62	Fix color getting reset before prompt output done (#65) (cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)	1 year ago
Georgi Gerganov	1a0a74300f	Update README.md	1 year ago
Matvey Soloviev	96ea727f47	Add interactive mode (#61) * Initial work on interactive mode. * Improve interactive mode. Make rev. prompt optional. * Update README to explain interactive mode. * Fix OS X build	1 year ago
Marc Köhlbrugge	9661954835	Fix typo in README (#45)	1 year ago
Ben Garney	f385f8dee8	Allow using prompt files (#59)	1 year ago
beiller	02f0c6fe7f	Add back top_k (#56) * Add back top_k * Update utils.cpp * Update utils.h --------- Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com> Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Sebastián A	eb062bb012	Windows fixes (#31) * Apply fixes suggested to build on windows Issue: https://github.com/ggerganov/llama.cpp/issues/22 * Remove unsupported VLAs * MSVC: Remove features that are only available on MSVC C++20. * Fix zero initialization of the other fields. * Change the use of vector for stack allocations.	1 year ago
Georgi Gerganov	7027a97837	Update README.md	1 year ago
Georgi Gerganov	2d555e5b42	Add CI (#60)	1 year ago
Georgi Gerganov	7c9e54e55e	Revert "weights_only" arg - this causing more trouble than help	1 year ago
Oleksandr Nikitin	b9bd1d0141	python/pytorch compat notes (#44)	1 year ago
beiller	129c7d1ea8	Add repetition penalty (#20) * Adding repeat penalization * Update utils.h * Update utils.cpp * Numeric fix Should probably still scale by temp even if penalized * Update comments, more proper application I see that numbers can go negative so a fix from a referenced commit * Minor formatting --------- Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>	1 year ago
Georgi Gerganov	702fddf5c5	Clarify meaning of hacking	1 year ago
Georgi Gerganov	7d86e25bf6	README: add "Supported platforms" + update hot topics	1 year ago
deepdiffuser	a93120236f	use weights_only in conversion script (#32) this restricts malicious weights from executing arbitrary code by restricting the unpickler to only loading tensors, primitive types, and dictionaries	1 year ago
Pavol Rusnak	6a9a67f0be	Add LICENSE (#21)	1 year ago
Georgi Gerganov	da1a4ff01f	Update README.md	1 year ago

425 Commits (q4_0-q4_2-range-fix)
