Description
Hello! When a prompt is passed via --file, a single trailing LF at the very end of the file appears to be ignored: generation continues on the same line instead of starting a new one.
Expected Behavior
Given the following prompt:
A: One banana
B: Two apples
C: Three oranges
D: Four
Hex dump:
00000000  41 3a 20 4f 6e 65 20 62  61 6e 61 6e 61 0a 42 3a  |A: One banana.B:|
00000010  20 54 77 6f 20 61 70 70  6c 65 73 0a 43 3a 20 54  | Two apples.C: T|
00000020  68 72 65 65 20 6f 72 61  6e 67 65 73 0a 44 3a 20  |hree oranges.D: |
00000030  46 6f 75 72 0a                                    |Four.|
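The dump above matches the hexdump -C format, so assuming that tool, a byte-identical promptfile.txt can be recreated and checked like this (printf makes the exact trailing byte explicit):

# Recreate promptfile.txt with exactly one trailing LF (0x0a) at end of file
printf 'A: One banana\nB: Two apples\nC: Three oranges\nD: Four\n' > promptfile.txt

# Verify the bytes, including the single final 0a
hexdump -C promptfile.txt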
With a single trailing LF at the end of the file - terminating it just like every other line - you would expect generation to continue with "E:" on a new line, something like:
A: One banana
B: Two apples
C: Three oranges
D: Four
E: Five grapes
Current Behavior
Instead, the generation ignores the LF and continues the last line as if it had not been terminated:
A: One banana
B: Two apples
C: Three oranges
D: Four plums. [end of text]
To get the expected output beginning on a new line, you have to append a second LF (i.e. a blank line) to the end of the prompt file, as in the snippet below.
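A minimal sketch of that workaround in a POSIX shell (the tail check just confirms the file now ends in 0a 0a):

# Append one more LF so the file ends with a blank line
printf '\n' >> promptfile.txt

# Confirm the last two bytes are now 0a 0a
tail -c 2 promptfile.txt | hexdump -C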
(Also of note: the very first character of the displayed prompt/output is always a space. This happens regardless of whether the prompt is passed on the command line, via a file, or not at all. I don't know whether this is a model thing or a bug.)
Environment and Context
AMD Ryzen 5 5600G
128GB RAM
Ubuntu 22.04.2 LTS (jammy)
GNU Make 4.3
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04)
Steps to Reproduce
Sample command using LLaMA 7B; the same behavior occurs with LLaMA 30B. promptfile.txt contains the prompt text shown above.
./main -m ~/llama/models/7B/ggml-model-f16.bin --temp 0.40 --file promptfile.txt