Commit Graph

4 Commits (e4881686b4160c74087ecc9d96df4ed0db6d70ef)

Author SHA1 Message Date
oKatanaaa e4881686b4
Make WIN32 mmap() improvements (#341)
Still not fully working yet.

Closes #341
1 year ago
Justine Tunney 5b8023d935
Implement prototype for instant mmap() loading
This change uses a custom malloc() implementation to transactionally
capture to a file dynamic memory created during the loading process.
That includes (1) the malloc() allocation for mem_buffer and (2) all
the C++ STL objects. On my $1000 personal computer, this change lets
me run ./main to generate a single token (-n 1) using the float16 7B
model (~12gb size) in one second. In order to do that, there's a one
time cost where a 13gb file needs to be generated. This change rocks
but it shouldn't be necessary to do something this heroic. We should
instead change the file format, so that tensors don't need reshaping
and realignment in order to be loaded.
1 year ago
Georgi Gerganov f60fa9e50a
.gitignore models/
1 year ago
Georgi Gerganov 26c0846629
Initial release
1 year ago
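
The message for commit 5b8023d935 closes by suggesting a file format whose tensors can be loaded without reshaping or realignment. The following is a minimal sketch of that idea only, not the custom malloc() capture the commit implements and not the project's actual file format. The header layout, the 32-byte alignment, and the names `write_aligned` and `map_tensor` are all hypothetical, chosen for illustration. It assumes float32 data in the machine's native byte order.

```python
import mmap
import os
import struct
import tempfile
from array import array

ALIGNMENT = 32  # hypothetical data offset; not taken from the repository


def write_aligned(path, values):
    """One-time cost: store float32 values at a fixed, aligned offset."""
    payload = array("f", values).tobytes()
    # Hypothetical 16-byte header: data offset and element count.
    header = struct.pack("<QQ", ALIGNMENT, len(values))
    with open(path, "wb") as f:
        f.write(header)
        f.write(b"\0" * (ALIGNMENT - len(header)))  # pad up to the data offset
        f.write(payload)


def map_tensor(path):
    """Map the file and return (mapping, float32 view) without copying the data."""
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        mapping = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    offset, count = struct.unpack_from("<QQ", mapping, 0)
    # Slice and reinterpret the mapped bytes in place as float32.
    view = memoryview(mapping)[offset:offset + 4 * count].cast("f")
    return mapping, view


path = os.path.join(tempfile.mkdtemp(), "tensor.bin")
write_aligned(path, [1.0, 2.5, -3.0])

mapping, view = map_tensor(path)
loaded = list(view)  # copy out only for the demonstration
view.release()       # release the view before closing the mapping
mapping.close()
```

Because the data already sits at a known offset in its final layout, loading reduces to mapping the file and reading the header. The operating system pages the contents in on demand, which is what makes a repeat start nearly instant once the file exists.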