Commit Graph

35 Commits (5cb63e2493c49bc2c3b9b355696e8dc26cdd0380)

Author SHA1 Message Date
cocktailpeanut da5303c1ea
bugfix: default should not be interactive (#304) 1 year ago
Rickey Bowers Jr 5c19c70ba6
fix coloring of last `n_batch` of prompt, and refactor line input (#221)
* fix coloring of last `n_batch` of prompt, and refactor line input
* forgot the newline that needs to be sent to the model
* (per #283) try to force flush of color reset in SIGINT handler
1 year ago
tjohnman 24568371ae
Support for multiple reverse prompts. (#299)
Co-authored-by: Johnman <>
Co-authored-by: Johnman <tjohnman@github>
1 year ago
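The commit above generalizes the single reverse prompt to a list. A minimal sketch of the resulting check, with hypothetical names (the real code in main.cpp may differ): generation pauses for user input as soon as the output so far ends with any of the configured reverse prompts.

```cpp
#include <string>
#include <vector>

// Illustrative sketch only: pause generation when the output ends with any of
// the reverse prompts. The identifiers are assumptions, not main.cpp's exact names.
bool ends_with_any_antiprompt(const std::string & output,
                              const std::vector<std::string> & antiprompts) {
    for (const std::string & ap : antiprompts) {
        if (!ap.empty() && output.size() >= ap.size() &&
            output.compare(output.size() - ap.size(), ap.size(), ap) == 0) {
            return true;
        }
    }
    return false;
}
```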
tjohnman ad5fd5b60c
Make prompt randomization optional. (#300)
Co-authored-by: Johnman <>
1 year ago
tjohnman 368d0c8a9e
Respect the maximum number of tokens in interactive. (#298)
Co-authored-by: Johnman <johnman@github>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
slaren 50fae10d03
Add --ignore-eos parameter (#181)
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
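A hedged sketch of what an --ignore-eos switch can do: mask the end-of-stream logit before sampling so generation never terminates on its own. The function name and the flat logits vector are assumptions for illustration, not the exact code of #181.

```cpp
#include <limits>
#include <vector>

// Sketch: with --ignore-eos set, push the end-of-stream logit to -infinity so
// the sampler can never pick it. token_eos and the logits layout are assumed.
void maybe_suppress_eos(std::vector<float> & logits, int token_eos, bool ignore_eos) {
    if (ignore_eos && token_eos >= 0 && token_eos < (int) logits.size()) {
        logits[token_eos] = -std::numeric_limits<float>::infinity();
    }
}
```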
Qingyou Meng 084e2f0ec0
interactive mode: print '\n' in sigint_handler; this flushes stdout and thus ensures the color reset. (#283) 1 year ago
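The fix above works because stdout is typically line buffered when attached to a terminal: emitting the color reset followed by '\n' flushes the escape sequence before the process exits. A minimal sketch, not the repository's exact handler:

```cpp
#include <csignal>
#include <cstdio>
#include <cstdlib>

#define ANSI_COLOR_RESET "\x1b[0m"

// Sketch of a SIGINT handler that restores the terminal color and emits a
// newline so line-buffered stdout is flushed before exit. (printf is not
// formally async-signal-safe; this mirrors the pragmatic approach of an
// interactive CLI tool rather than a hardened daemon.)
static void sigint_handler(int /*signo*/) {
    printf(ANSI_COLOR_RESET "\n");
    fflush(stdout);
    std::_Exit(130);  // 128 + SIGINT
}

int main() {
    signal(SIGINT, sigint_handler);
    // ... interactive generation loop would run here ...
    return 0;
}
```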
Erik Scholz 0b366e7357
Command line switch to use F16 for memory_k and memory_v (refactor of #154) (#294)
* Use F16 for memory_k and memory_v

* add command line switch to use f16 instead of f32 for memory k+v

---------

Co-authored-by: Ty Everett <ty@tyweb.us>
1 year ago
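A sketch of the idea behind the switch, assuming the ggml API of that period (ggml_new_tensor_1d, GGML_TYPE_F16/GGML_TYPE_F32); the real code lives in the model setup and may differ in detail. Storing the KV cache in F16 roughly halves its memory footprint relative to F32.

```cpp
#include "ggml.h"

// Hedged sketch: pick the element type of the KV cache tensors from a
// command-line flag. Struct and function names here are illustrative.
struct kv_cache_sketch {
    struct ggml_tensor * k;
    struct ggml_tensor * v;
};

kv_cache_sketch make_kv_cache(struct ggml_context * ctx,
                              int64_t n_elements, bool memory_f16) {
    const enum ggml_type wtype = memory_f16 ? GGML_TYPE_F16 : GGML_TYPE_F32;
    kv_cache_sketch cache;
    cache.k = ggml_new_tensor_1d(ctx, wtype, n_elements);  // keys
    cache.v = ggml_new_tensor_1d(ctx, wtype, n_elements);  // values
    return cache;
}
```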
Georgi Gerganov c494ed5b94
Fix off-by-one bug (#115) 1 year ago
Georgi Gerganov 70f01cb863
Drop trailing new line from file prompts (#80) 1 year ago
Georgi Gerganov 9e1707218a
Add "--instruct" argument for usage with Alpaca (#240)
Also start adding prompts in "./prompts"
1 year ago
Ronsor d7def1a752
Warn user if a context size greater than 2048 tokens is specified (#274)
LLaMA doesn't support more than 2048 token context sizes, and going above that produces terrible results.
1 year ago
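The check itself is small; a sketch, assuming the parameter is named n_ctx as elsewhere in the project:

```cpp
#include <cstdio>

// Sketch of the warning added here: the original LLaMA models were trained
// with a 2048-token context, so larger values degrade output quality.
void warn_large_context(int n_ctx) {
    if (n_ctx > 2048) {
        fprintf(stderr,
                "warning: model does not support context sizes greater than 2048 tokens "
                "(%d specified); expect poor results\n", n_ctx);
    }
}
```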
Alex Nguyen d3f202d57b
Remove unused code since n_vocab is model.hparams.n_vocab (#262) 1 year ago
Justin Suess e03e359730
fixed warning with std::ignore about unused function result (#151)
1 year ago
thement c9f670a177
Implement non-greedy tokenizer that tries to maximize token lengths (#242)
* Implement non-greedy tokenizer that tries to maximize token lengths

* Insert single space in front of the prompt

- this is to match original llama tokenizer behavior

---------

Co-authored-by: Jakub Horak <jakub.horak@ibawizard.net>
1 year ago
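The commit replaces first-match tokenization with a search for longer pieces (and prepends a single space to match the original LLaMA tokenizer). A hedged illustration of the idea using a small dynamic program that minimizes the number of tokens, which favors long pieces; it is not the repository's exact algorithm:

```cpp
#include <climits>
#include <string>
#include <unordered_map>
#include <vector>

// Sketch: choose the segmentation of `text` into vocabulary pieces that uses
// the fewest tokens. `vocab` maps a piece of text to its token id; names and
// limits are illustrative.
std::vector<int> tokenize_dp(const std::string & text,
                             const std::unordered_map<std::string, int> & vocab,
                             size_t max_piece_len = 64) {
    const size_t n = text.size();
    std::vector<int>    best_cost(n + 1, INT_MAX);  // fewest tokens covering text[0..i)
    std::vector<size_t> prev(n + 1, 0);             // start of the last piece
    std::vector<int>    token(n + 1, -1);           // id of the last piece
    best_cost[0] = 0;

    for (size_t i = 0; i < n; ++i) {
        if (best_cost[i] == INT_MAX) continue;
        for (size_t len = 1; len <= max_piece_len && i + len <= n; ++len) {
            auto it = vocab.find(text.substr(i, len));
            if (it == vocab.end()) continue;
            if (best_cost[i] + 1 < best_cost[i + len]) {
                best_cost[i + len] = best_cost[i] + 1;
                prev[i + len]  = i;
                token[i + len] = it->second;
            }
        }
    }

    std::vector<int> out;
    if (best_cost[n] == INT_MAX) return out;        // text not coverable by vocab
    for (size_t i = n; i > 0; i = prev[i]) out.push_back(token[i]);
    return std::vector<int>(out.rbegin(), out.rend());
}
```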
hoangmit 6eac39ba95
Add RMS norm and use it (#187)
* add ggml_rms_norm

* update op num
1 year ago
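RMS norm scales each element of a row by the reciprocal root-mean-square of that row, i.e. x_i / sqrt(mean(x^2) + eps), without the mean-centering of standard layer norm; ggml_rms_norm applies this over tensor rows. A reference sketch (the epsilon value is an assumption):

```cpp
#include <cmath>
#include <vector>

// Sketch of RMS normalization over a single row.
std::vector<float> rms_norm(const std::vector<float> & x, float eps = 1e-6f) {
    if (x.empty()) return {};

    double mean_sq = 0.0;
    for (float v : x) mean_sq += (double) v * v;
    mean_sq /= x.size();

    const float scale = 1.0f / std::sqrt((float) mean_sq + eps);
    std::vector<float> out(x.size());
    for (size_t i = 0; i < x.size(); ++i) out[i] = x[i] * scale;
    return out;
}
```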
Rickey Bowers Jr 2d15d6c9a9
add SIGINT support for _WIN32 environments (#120)
* add SIGINT support for _WIN32 environments

* perhaps more consistent
1 year ago
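On Windows, Ctrl+C can also be caught through the console API rather than signal(). A hedged sketch using SetConsoleCtrlHandler; the commit itself may have taken a different route:

```cpp
#if defined(_WIN32)
#include <windows.h>
#include <cstdio>

// Sketch: reset the terminal color and flush output when Ctrl+C arrives.
static BOOL WINAPI console_ctrl_handler(DWORD ctrl_type) {
    if (ctrl_type == CTRL_C_EVENT) {
        printf("\x1b[0m\n");   // reset color, newline flushes line-buffered stdout
        fflush(stdout);
        ExitProcess(130);
    }
    return FALSE;              // let the default handler deal with other events
}

void install_win32_sigint() {
    SetConsoleCtrlHandler(console_ctrl_handler, TRUE);
}
#endif
```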
Justin Suess 2d64715ad4
added ctx_size parameter (#148)
* added ctx_size parameter

* added it in more places

* Apply suggestions from code review

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Justin Suess 16b2c61a22
fixed color reset on exit (#149)
* fixed color reset on exit

* added sigint handler for ansi_color_reset

* Update main.cpp

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
Georgi Gerganov 4497ad819c
Print system information 1 year ago
Pavol Rusnak 671d5cac15
Use fprintf for diagnostic output (#48)
keep printf only for printing model output

one can now use ./main ... 2>/dev/null to suppress any diagnostic output
1 year ago
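A sketch of the stdout/stderr split this commit describes: diagnostics become suppressible with a shell redirect while generated text stays on stdout. Function names are illustrative:

```cpp
#include <cstdio>

// Diagnostics go to stderr, so `./main ... 2>/dev/null` hides them.
void report_progress(const char * msg) {
    fprintf(stderr, "%s\n", msg);
}

// Model output stays on stdout.
void emit_token_text(const char * piece) {
    printf("%s", piece);
    fflush(stdout);
}
```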
uint256_t 63fd76fbb0
Reduce model loading time (#43)
* Use buffering

* Use vector

* Minor

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
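A hedged sketch of the buffering idea: give the input stream a large buffer and read the file into a std::vector instead of issuing many small reads. The buffer size and function name are illustrative, not the code from #43:

```cpp
#include <fstream>
#include <string>
#include <vector>

// Sketch: read a file through a 1 MiB stream buffer into a vector.
std::vector<char> read_model_file(const std::string & path) {
    std::vector<char> io_buf(1024 * 1024);                 // stream buffer
    std::ifstream fin;
    fin.rdbuf()->pubsetbuf(io_buf.data(), io_buf.size());  // must precede open()
    fin.open(path, std::ios::binary);
    if (!fin) return {};

    fin.seekg(0, std::ios::end);
    const std::streamsize size = fin.tellg();
    fin.seekg(0, std::ios::beg);

    std::vector<char> data(size > 0 ? (size_t) size : 0);
    if (!data.empty()) {
        fin.read(data.data(), (std::streamsize) data.size());
    }
    return data;
}
```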
Val Kharitonov 2a20f48efa
Fix UTF-8 handling (including colors) (#79) 1 year ago
Matvey Soloviev a169bb889c
Gate signal support on being on a unixoid system. (#74) 1 year ago
Matvey Soloviev 460c482540
Fix token count accounting 1 year ago
Matvey Soloviev 404fac0d62
Fix color getting reset before prompt output done (#65)
(cherry picked from commit 7eb2987619feee04c40eff69b604017d09919cb6)
1 year ago
Matvey Soloviev 96ea727f47
Add interactive mode (#61)
* Initial work on interactive mode.

* Improve interactive mode. Make rev. prompt optional.

* Update README to explain interactive mode.

* Fix OS X build
1 year ago
beiller 02f0c6fe7f
Add back top_k (#56)
* Add back top_k

* Update utils.cpp

* Update utils.h

---------

Co-authored-by: Bill Hamilton <bill.hamilton@shopify.com>
Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
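A hedged sketch of top-k filtering as a sampling step: keep only the k highest logits and sample among them. The helper below is illustrative, not utils.cpp's exact interface:

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Return the k most likely (logit, token id) candidates, highest first.
std::vector<std::pair<float, int>> top_k_filter(const std::vector<float> & logits, int k) {
    std::vector<std::pair<float, int>> items;
    items.reserve(logits.size());
    for (int id = 0; id < (int) logits.size(); ++id) {
        items.emplace_back(logits[id], id);
    }
    k = std::min<int>(k, (int) items.size());
    std::partial_sort(items.begin(), items.begin() + k, items.end(),
                      [](const auto & a, const auto & b) { return a.first > b.first; });
    items.resize(k);   // only the k best candidates remain for sampling
    return items;
}
```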
Sebastián A eb062bb012
Windows fixes (#31)
* Apply fixes suggested to build on windows

Issue: https://github.com/ggerganov/llama.cpp/issues/22

* Remove unsupported VLAs

* MSVC: Remove features that are only available on MSVC C++20.

* Fix zero initialization of the other fields.

* Change the use of vector for stack allocations.
1 year ago
beiller 129c7d1ea8
Add repetition penalty (#20)
* Adding repeat penalization

* Update utils.h

* Update utils.cpp

* Numeric fix

Should probably still scale by temp even if penalized

* Update comments, more proper application

I see that numbers can go negative, so this applies a fix from a referenced commit

* Minor formatting

---------

Co-authored-by: Georgi Gerganov <ggerganov@gmail.com>
1 year ago
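A hedged sketch of a repetition penalty consistent with the notes above: tokens already present in the recent context get their logits pushed toward "less likely", and because logits can be negative, the sign decides whether to divide or multiply by the penalty. Names and the example value are illustrative:

```cpp
#include <unordered_set>
#include <vector>

// Penalize tokens that already appeared in the recent context. Dividing a
// negative logit by the penalty would make it *more* likely, so negative
// logits are multiplied instead.
void apply_repetition_penalty(std::vector<float> & logits,
                              const std::unordered_set<int> & recent_tokens,
                              float penalty /* e.g. 1.3f */) {
    for (int id : recent_tokens) {
        if (id < 0 || id >= (int) logits.size()) continue;
        if (logits[id] > 0.0f) {
            logits[id] /= penalty;
        } else {
            logits[id] *= penalty;
        }
    }
}
```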
Georgi Gerganov 7d9ed7b25f
Bump memory buffer 1 year ago
Georgi Gerganov 007a8f6f45
Support all LLaMA models + change Q4_0 quantization storage 1 year ago
Georgi Gerganov 70bc0b8b15
Fix a bug in the rope calculation 1 year ago
Georgi Gerganov 319cdb3e1f
Final touches 1 year ago
Georgi Gerganov 26c0846629
Initial release 1 year ago