- Supports offline, on-device inference of text models in GGUF format.
- Easy to use.
- Apps using HLlama can target HarmonyOS Next (3.0.0.13) or later (API 12).
- Supports the ARM64 and x86_64 architectures.
- Requires DevEco Studio NEXT Developer Beta1 (5.0.3.100) or later.
This is the fastest and recommended way to add HLlama to your project:

```bash
ohpm install hllama
```
Or, you can add it to your project manually:

- Add the following lines to the `oh-package.json5` of your app module:

  ```json5
  "dependencies": {
    "hllama": "^0.0.2",
  }
  ```

- Then run `ohpm install`.
- Initialize Llama asynchronously:

```ts
import { HLlama } from 'hllama';

HLlama.initLlamaContext(filesDir + '/llama.gguf').then((initId) => {
  // initId > 0 indicates that initialization succeeded.
});
```
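For reference, the same call can be wrapped with async/await. This is a minimal sketch, not part of the HLlama API; the `loadModel` helper and its error handling are illustrative:

```ts
import { HLlama } from 'hllama';

// Hypothetical helper: resolve the model path yourself (e.g. from the app's
// sandbox files directory) and pass it in.
async function loadModel(modelPath: string): Promise<number> {
  const initId = await HLlama.initLlamaContext(modelPath);
  if (initId <= 0) {
    // Per the docs above, only initId > 0 indicates success.
    throw new Error('HLlama initialization failed');
  }
  return initId;
}
```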
- Initialize Llama synchronously:

```ts
import { HLlama } from 'hllama';

HLlama.initLlamaContextSync(filesDir + '/llama.gguf', (initId) => {
  // initId > 0 indicates that initialization succeeded.
});
```
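If you use the callback variant, you will typically keep the returned id for the later completion and free calls; a sketch, where the `llamaId` variable and `initModel` helper are illustrative, not part of HLlama:

```ts
import { HLlama } from 'hllama';

let llamaId: number = -1; // hypothetical app state holding the context id

function initModel(modelPath: string): void {
  HLlama.initLlamaContextSync(modelPath, (initId) => {
    if (initId > 0) {
      llamaId = initId; // keep for startCompletionSync / freeLlamaContextSync
    } else {
      console.error('HLlama initialization failed');
    }
  });
}
```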
- Get model details synchronously:

```ts
import { HLlama } from 'hllama';

let model = HLlama.getModelDetailSync(initId);

// The returned object has the following shape:
// interface ModelDetailInterface {
//   desc: string;
//   size: number;
//   nParams: number;
//   isChatTemplateSupported: boolean;
// }
```
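As a usage sketch, the returned fields can be inspected before building a prompt; the unit of `size` is not specified above, so treat the logging below as illustrative:

```ts
import { HLlama } from 'hllama';

const model = HLlama.getModelDetailSync(initId);
console.info(`model: ${model.desc}`);
console.info(`size: ${model.size}`); // unit not documented; likely bytes
console.info(`parameters: ${model.nParams}`);
if (!model.isChatTemplateSupported) {
  // Assumption: without a chat template you must format prompts yourself.
  console.warn('model has no chat template; using raw prompts');
}
```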
- Start model inference synchronously:

```ts
import { HLlama } from 'hllama';

HLlama.startCompletionSync(initId.toString(), prompt, (output) => {
  // This callback receives the final result.
}, (realTimeOutput) => {
  // This callback receives real-time inference results.
});
```
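A sketch of using both callbacks together, assuming `realTimeOutput` delivers incremental text chunks while `output` carries the complete result (the exact payload semantics are not spelled out above):

```ts
import { HLlama } from 'hllama';

let streamed = ''; // accumulate chunks for display as they arrive

HLlama.startCompletionSync(initId.toString(), prompt, (output) => {
  // Final callback: trust the complete output over the accumulated chunks.
  streamed = output;
  console.info('completion finished: ' + streamed);
}, (realTimeOutput) => {
  // Real-time callback: append each chunk (assumed incremental) to the text.
  streamed += realTimeOutput;
});
```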
- Stop model inference synchronously:

```ts
import { HLlama } from 'hllama';

HLlama.stopCompletionSync(initId.toString());
```
- Release the Llama context synchronously:

```ts
import { HLlama } from 'hllama';

HLlama.freeLlamaContextSync(initId.toString());
```
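Putting the pieces together, here is a hedged end-to-end sketch of the lifecycle (initialize, inspect, run a completion, release); the control flow around the callbacks, in particular freeing the context from the final callback, is an assumption rather than something the API prescribes:

```ts
import { HLlama } from 'hllama';

async function runOnce(modelPath: string, prompt: string): Promise<void> {
  const initId = await HLlama.initLlamaContext(modelPath);
  if (initId <= 0) {
    console.error('HLlama initialization failed');
    return;
  }

  const model = HLlama.getModelDetailSync(initId);
  console.info(`loaded ${model.desc} (${model.nParams} params)`);

  HLlama.startCompletionSync(initId.toString(), prompt, (output) => {
    console.info('final: ' + output);
    // Assumption: the context can be freed once the final callback fires.
    HLlama.freeLlamaContextSync(initId.toString());
  }, (realTimeOutput) => {
    console.info('chunk: ' + realTimeOutput);
  });
}
```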
HLlama is published under the MIT license. For details, see the LICENSE file.
See CHANGELOG.md for the change history.
If you have any questions or suggestions, please contact me by email at [email protected].