
Auto-unload models on idle #13

@nathanlesage

I recently discovered this project through the llama.cpp changelog, and I love it: minimal space requirements and pretty direct access to llama.cpp. I'm a fan! One thing I would absolutely love is the ability to automatically unload the model when idle. I imagine the following:

  • A new setting that lets users define an auto-unload threshold (maybe 5 minutes or so), after which the model gets unloaded again (essentially the same as clicking the model in the dropdown menu).
  • In a perfect world, maybe even a listener in the web UI that reloads the model when the user sends a new message.

The first one would be pretty nifty because it frees up memory even if you forget to unload the model. The second feature is probably a bit more difficult (since it will require stopping the llama-server, if I'm not mistaken?) and not as mission-critical, since restarting the server with a simple click is relatively straightforward. It's really mostly about unloading the model to free up memory.
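
To make the idea a bit more concrete, here is a rough sketch of the logic I have in mind, written in Python purely for illustration. Everything in it is an assumption on my part: `IdleUnloader`, `on_message`, `tick`, and the `load`/`unload` callbacks are hypothetical names, not anything from this project or from llama.cpp. The callbacks stand in for whatever the app already does when you click the model in the dropdown menu.

```python
import time
from typing import Callable


class IdleUnloader:
    """Hypothetical helper: unloads a model after a period of inactivity."""

    def __init__(
        self,
        threshold_s: float,
        unload: Callable[[], None],
        load: Callable[[], None],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.threshold_s = threshold_s  # e.g. 300 for "5 minutes or so"
        self._unload = unload           # assumed hook: stops the server / frees memory
        self._load = load               # assumed hook: starts the server / loads the model
        self._clock = clock             # injectable so the logic can be tested
        self._last_activity = clock()
        self.loaded = False

    def on_message(self) -> None:
        """Call when the user sends a message: reload if needed, reset the timer."""
        if not self.loaded:
            self._load()
            self.loaded = True
        self._last_activity = self._clock()

    def tick(self) -> bool:
        """Call periodically; returns True if the model was unloaded on this call."""
        idle_for = self._clock() - self._last_activity
        if self.loaded and idle_for >= self.threshold_s:
            self._unload()
            self.loaded = False
            return True
        return False
```

The first bullet only needs `tick` on a timer; the second bullet is the `on_message` path. Passing the clock in keeps the threshold check separate from real time, so it can be tried out without waiting five minutes.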

Thanks for your consideration!
