
llama.cpp acts too dumb while running on phone!! #802


Closed
Shreyas-ITB opened this issue Apr 6, 2023 · 13 comments
Labels
generation quality Quality of model output

Comments

@Shreyas-ITB

I was trying llama.cpp on my phone with Termux installed, but look at this image:
Screenshot_20230406-120404

Specifications
The phone has 8 GB of RAM (about 7 GB free) and an 8-core CPU, so RAM and CPU are not the issue.
Model used: alpaca-7B-lora
llama.cpp version: latest
prompt: chat-with-bob.txt

I really don't know what is causing the issue here. When I ask it a question, it either answers in a very dumb way or just repeats the question back without answering. With the same model, prompt, and llama.cpp version, my PC with 4 GB of RAM works as expected and answers nearly every question correctly (almost 98% accuracy). Can any of you help me out with this, or update llama.cpp and fix the mobile issues, please?

Thank you.

@BarfingLemurs
Contributor

Try playing with the settings, like increasing the temperature.
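
For reference, here is a rough sketch of where those settings go when invoking the main example (the model path is a placeholder; this assumes a build whose main binary accepts these common sampling flags):

```sh
# Hypothetical model path; adjust to your setup.
# --temp raises the sampling temperature, --top_k/--top_p control top-k and
# nucleus sampling, and --repeat_penalty discourages echoing the prompt back.
./main -m ./models/ggml-alpaca-7b-q4.bin \
  -f ./prompts/chat-with-bob.txt \
  --temp 0.8 --top_k 40 --top_p 0.9 \
  --repeat_penalty 1.1 \
  -n 256 -t 8
```

On an 8-core phone, `-t 8` keeps the thread count in line with the hardware; the sampling values above are just starting points to experiment with.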

@Shreyas-ITB
Author

Okay, I'll try it and see, then I'll let you know.

@gjmulder added the generation quality (Quality of model output) label Apr 6, 2023
@Shreyas-ITB
Author

@BarfingLemurs Nope, it doesn't work; it stays the same (acts way dumber and still repeats the question).
@gjmulder Please add a bug or similar label to this issue, please. It needs some more development.

@gjmulder
Collaborator

gjmulder commented Apr 6, 2023

You may want to log an issue with Stanford Alpaca. It is the training set for the Alpaca model you are using.

@Shreyas-ITB changed the title from "llama.cpp acts why dumber on phone!!" to "llama.cpp acts too dumb while running on phone!!" Apr 7, 2023
@Shreyas-ITB
Author

@gjmulder I'm using Alpaca LoRA, maybe that's the issue? I mean, it's not a problem with the Alpaca model though; it worked fine on my laptop and for many users on their PCs, it only malfunctions on mobile.

@FNsi
Contributor

FNsi commented Apr 7, 2023

Maybe it's because of Termux. I don't even know how to use Firefox in Termux to watch YouTube at 720p without it crashing 😅😂
I guess it's the RAM restrictions your phone places on the Termux app.

@Shreyas-ITB
Author

Shreyas-ITB commented Apr 7, 2023

@FNsi lol, it's not due to Termux. The same issue happens with UserLAnd (an app that's intended to run Ubuntu on Android without root).

@FNsi
Contributor

FNsi commented Apr 7, 2023

> @FNsi lol, it's not due to Termux. The same issue happens with UserLAnd (an app that's intended to run Ubuntu on Android without root).

I think it's almost the same, since they're all emulated terminals? I saw an Android fork on the Google Play store, maybe you can try it?

#750

@gjmulder
Collaborator

gjmulder commented Apr 7, 2023

The same model should work similarly with llama.cpp on any platform, if the same temp, top_k, etc. parameters are being passed to llama.cpp. The random number generator is different, so you will never get exactly the same output, but the outputs should be similar in quality. The only other difference I could imagine would be performance.
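
To make that comparison concrete, one option (a sketch, with a placeholder model path) is to run the exact same command on both the PC and the phone with a fixed seed and identical sampling parameters; the outputs still won't match token-for-token across platforms, but their quality should be comparable:

```sh
# Same command on both machines: a fixed seed (-s) plus identical
# temp/top_k/top_p/repeat_penalty values. Differences in RNG and
# floating-point behaviour across platforms can still change the exact
# tokens, but not the overall quality.
./main -m ./models/ggml-alpaca-7b-q4.bin \
  -f ./prompts/chat-with-bob.txt \
  -s 42 --temp 0.7 --top_k 40 --top_p 0.9 \
  --repeat_penalty 1.1 -n 128
```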

@unbounded
Contributor

There are various optimized code paths that are only enabled for certain platforms and feature sets; there could be differences in the implementations of those.

Could you post the initial output with the system_info and model parameters?
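
As a quick way to gather that on the phone (a sketch assuming an AArch64 Android device under Termux; the model path is a placeholder), you can compare the CPU features the kernel reports with the system_info line the binary prints at startup:

```sh
# On AArch64, NEON is reported as "asimd" in the kernel's feature list.
grep -i -m1 'features' /proc/cpuinfo

# llama.cpp prints a system_info line at startup showing which SIMD
# paths (AVX, AVX2, NEON, ...) the build has enabled; capture it here.
./main -m ./models/ggml-alpaca-7b-q4.bin -p "test" -n 1 2>&1 | grep -i system_info
```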

@Shreyas-ITB
Author

@unbounded the output I get is at the start of the issue (there is a screenshot of what the model is saying).
Model parameters are the same as in the chat.sh file in the repository's examples directory.

System Info
ARM Cortex-A53 octa-core processor
8 GB RAM
Android 12, and there are no AVX or AVX2 flags in the CPU as it's an ARM processor.

@unbounded
Contributor

Could be related to #876, which was fixed in 684da25.

@unbounded
Contributor

Closing as assumed fixed by 684da25; feel free to reopen if this still happens with the latest version.
