Default Branch

master

a90e96b266 · Convert.py @staticmethod (#1327) · Updated 12 months ago

Branches

remove-vzip

b639b45cfd · ggml : 2x faster scalar implementations · Updated 12 months ago

11 behind · 6 ahead
readme

47bbd631f2 · readme: add missing info · Updated 12 months ago

2 behind · 1 ahead
ci_cublas

31ff9e2e83 · ci : add cublas to windows release · Updated 1 year ago

11 behind · 1 ahead
fix-eval-bos

cad6ff5d36 · scripts : add ppl-run-all.sh · Updated 1 year ago

12 behind · 2 ahead
q4_3-range-fix

102cd98074 · ggml : Q4_3c using 2x "Full range" approach · Updated 1 year ago

92 behind · 8 ahead
ik/rmse_quantization

6fd49ed050 · Minor, plus rebase on master · Updated 1 year ago

92 behind · 3 ahead
q4_0-q4_2-range-fix

71e6ae3779 · ggml : continue from #729 (wip) · Updated 1 year ago

92 behind · 7 ahead
gg/rmse_quantization

a0242a833c · Minor, plus rebase on master · Updated 1 year ago

92 behind · 2 ahead
quant-attn

4b8d5e3890 · llama : quantize attention results · Updated 1 year ago

97 behind · 1 ahead
mmap-pages-stats

1506737499 · Add mmap pages stats (disabled by default) · Updated 1 year ago

147 behind · 1 ahead
flash-attn

36ddd12924 · llama : add flash attention (demo) · Updated 1 year ago

213 behind · 1 ahead
mmap

c9c820ff36 · Added support for _POSIX_MAPPED_FILES if defined in source (#564) · Updated 1 year ago

447 behind · 8 ahead
q4_1_more_accel

4aeee216fd · Regroup q4_1 dot addition for better numerics. · Updated 1 year ago

328 behind · 2 ahead
q4_1_more_accel_kahan

66ea164e1d · Kahan summation on Q4_1 · Updated 1 year ago

355 behind · 2 ahead
q4_1_more_accel_loopsplit

711224708d · Break up loop for numeric stability · Updated 1 year ago

355 behind · 2 ahead
tcp_server

3a0dcb3920 · Implement server mode. · Updated 1 year ago

356 behind · 5 ahead
dev

a169bb889c · Gate signal support on being on a unixoid system. (#74) · Updated 1 year ago

462 behind · 0 ahead · Included in master