
# mnn-llm


Chinese

## Example Projects

- cli: compile from the command line; for Android, refer to android_build.sh
- web: compile from the command line; at runtime, the path to the web resources must be specified
- android: open with Android Studio and compile;
- ios: open with Xcode and compile; 🚀🚀🚀 This sample code is 100% generated by ChatGPT 🚀🚀🚀
- python: the mnn-llm Python API, mnnllm
- other: adds text embedding, vector querying, document parsing, memory bank, and knowledge base capabilities 🔥

## Model Export and Download

To export an LLM model to ONNX or MNN, please use llm-export.

Download models from ModelScope:

- qwen
- glm
- llama
- others
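As one possible download route, ModelScope model repositories can typically be cloned with git. A minimal sketch, using the Qwen2-1.5B-Instruct-MNN model referenced later in this README; the repository URL and namespace shown here are assumptions and should be checked against the actual ModelScope model page:

```shell
# Hypothetical ModelScope repo URL -- verify the namespace/model name
# on modelscope.cn before cloning. Large weight files need git-lfs.
git lfs install
git clone https://www.modelscope.cn/zhaode/Qwen2-1.5B-Instruct-MNN.git
ls Qwen2-1.5B-Instruct-MNN/config.json   # the config passed to the demos below
```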

## Building


### Local Compilation

```bash
# clone
git clone --recurse-submodules https://github.com/wangzhaode/mnn-llm.git
cd mnn-llm

# linux
./script/build.sh

# windows msvc
./script/build.ps1

# python wheel
./script/py_build.sh

# android
./script/android_build.sh

# android apk
./script/android_app_build.sh

# ios
./script/ios_build.sh
```

The default backend is CPU. To use a different backend, add an MNN compilation macro inside the build script:

  • cuda: -DMNN_CUDA=ON
  • opencl: -DMNN_OPENCL=ON
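For example, a minimal sketch of a manual build with the CUDA backend enabled. This assumes the build scripts ultimately invoke CMake; the exact invocation inside `script/build.sh` may differ, so treat this as an illustration of where the macro goes rather than the script's actual contents:

```shell
# Manual out-of-source build with a backend macro enabled.
mkdir -p build && cd build
cmake .. -DMNN_CUDA=ON      # or -DMNN_OPENCL=ON for the OpenCL backend
make -j"$(nproc)"
```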

## Execution

```bash
# linux/macos
./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json # cli demo
./web_demo ./Qwen2-1.5B-Instruct-MNN/config.json ../web # web ui demo

# windows
.\Debug\cli_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json
.\Debug\web_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json ../web

# android
adb push libs/*.so build/libllm.so build/cli_demo /data/local/tmp
adb push model_dir /data/local/tmp
adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json"
```
