diff --git a/README.md b/README.md
index 86dd949..5675a92 100644
--- a/README.md
+++ b/README.md
@@ -248,7 +248,7 @@ cadaver, cauliflower, cabbage (vegetable), catalpa (tree) and Cailleach.
 
 ### Perplexity (Measuring model quality)
 
-You can pass `--perplexity` as a command line option to measure perplexity over the given prompt. For more background,
+You can use the `perplexity` example to measure perplexity over the given prompt. For more background,
 see https://huggingface.co/docs/transformers/perplexity. However, in general, lower perplexity is better for LLMs.
 
 #### Latest measurements
@@ -271,10 +271,10 @@ Perplexity - model options
 #### How to run
 
 1. Download/extract: https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip?ref=salesforce-research
-2. Run `./main --perplexity -m models/7B/ggml-model-q4_0.bin -f wiki.test.raw`
+2. Run `./perplexity -m models/7B/ggml-model-q4_0.bin -f wiki.test.raw`
 3. Output:
 ```
-Calculating perplexity over 655 chunks
+perplexity : calculating perplexity over 655 chunks
 24.43 seconds per pass - ETA 4.45 hours
 [1]4.5970,[2]5.1807,[3]6.0382,...
 ```
diff --git a/examples/perplexity/perplexity.cpp b/examples/perplexity/perplexity.cpp
index f617ba3..75d526d 100644
--- a/examples/perplexity/perplexity.cpp
+++ b/examples/perplexity/perplexity.cpp
@@ -19,7 +19,7 @@ std::vector softmax(const std::vector& logits) {
 
 void perplexity(llama_context * ctx, const gpt_params & params) {
     // Download: https://s3.amazonaws.com/research.metamind.io/wikitext/wikitext-2-raw-v1.zip?ref=salesforce-research
-    // Run `./main --perplexity -m models/7B/ggml-model-q4_0.bin -f wiki.test.raw`
+    // Run `./perplexity -m models/7B/ggml-model-q4_0.bin -f wiki.test.raw`
     // Output: `perplexity: 13.5106 [114/114]`
     auto tokens = ::llama_tokenize(ctx, params.prompt, true);
 
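
Reviewer note: for readers unfamiliar with the metric the `perplexity` example reports, here is a minimal sketch of the usual definition, perplexity as the exponential of the mean negative log-likelihood of the observed tokens. It is not the code in `examples/perplexity/perplexity.cpp`; the function names `softmax_sketch` and `perplexity_from_probs` are hypothetical and chosen only for illustration, and the sketch assumes the caller has already collected the probability the model assigned to each actual token.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not the function from the patch): numerically stable
// softmax that subtracts the largest logit before exponentiating.
std::vector<double> softmax_sketch(const std::vector<float> & logits) {
    std::vector<double> probs(logits.size());
    if (logits.empty()) {
        return probs;
    }
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    double sum = 0.0;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(static_cast<double>(logits[i] - max_logit));
        sum += probs[i];
    }
    for (double & p : probs) {
        p /= sum;
    }
    return probs;
}

// Hypothetical helper: perplexity = exp(mean negative log-likelihood).
// token_probs[i] is the probability the model gave to the i-th observed token.
// Assumes token_probs is non-empty and every entry is in (0, 1].
double perplexity_from_probs(const std::vector<double> & token_probs) {
    double nll = 0.0;
    for (double p : token_probs) {
        nll -= std::log(p);
    }
    return std::exp(nll / static_cast<double>(token_probs.size()));
}
```

Under this definition, a model that assigns probability 0.25 to every observed token has perplexity 4, which matches the intuition that lower values mean the model is less "surprised" by the text.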