- cli: compile and run from the command line; for Android compilation refer to `android_build.sh`
- web: compile and run from the command line; the path to the web resources must be specified at runtime
- android: open with Android Studio to compile
- ios: open with Xcode to compile; 🚀🚀🚀 This sample code is 100% generated by ChatGPT 🚀🚀🚀
- python: the mnn-llm Python API, `mnnllm`
- other: added capabilities for text embedding, vector querying, document parsing, memory bank, and knowledge base 🔥
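The vector-querying capability boils down to ranking stored text embeddings by similarity to a query embedding. A minimal self-contained sketch of that idea, using toy vectors and cosine similarity (this is an illustration of the technique, not mnn-llm's actual implementation):

```python
import math

def cosine(a, b):
    # cosine similarity between two equal-length vectors
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def query(store, qvec, top_k=2):
    # store: list of (text, embedding) pairs; return top_k most similar texts
    ranked = sorted(store, key=lambda te: cosine(te[1], qvec), reverse=True)
    return [text for text, _ in ranked[:top_k]]

store = [
    ("cats are mammals",  [1.0, 0.0, 0.1]),
    ("stocks fell today", [0.0, 1.0, 0.0]),
    ("dogs are loyal",    [0.9, 0.1, 0.2]),
]
print(query(store, [1.0, 0.0, 0.0], top_k=2))
# -> ['cats are mammals', 'dogs are loyal']
```

In a real knowledge base the embeddings would come from the text-embedding model (e.g. a bge-style encoder) rather than hand-written toy vectors.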
To export an LLM model to ONNX or MNN, use `llm-export`.

Download the converted models from modelscope:
qwen
glm
llama
```bash
# clone
git clone --recurse-submodules https://github.com/wangzhaode/mnn-llm.git
cd mnn-llm

# linux
./script/build.sh

# windows msvc
./script/build.ps1

# python wheel
./script/py_build.sh

# android
./script/android_build.sh

# android apk
./script/android_app_build.sh

# ios
./script/ios_build.sh
```
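After building the Python wheel, usage might look like the sketch below. The entry points (`create`, `response`) are assumptions about the `mnnllm` bindings; consult the project's Python demo for the real API.

```python
# Hypothetical usage of the mnnllm wheel built by ./script/py_build.sh.
def run_demo(config_path):
    try:
        import mnnllm  # only available after building/installing the wheel
    except ImportError:
        return "mnnllm not installed; build it with ./script/py_build.sh"
    model = mnnllm.create(config_path)  # assumed: factory taking a config.json path
    return model.response("Hello")      # assumed: single-turn chat call

print(run_demo("./Qwen2-1.5B-Instruct-MNN/config.json"))
```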
The default backend is CPU. To use a different backend, add the corresponding MNN compilation macro inside the build script:
- cuda: `-DMNN_CUDA=ON`
- opencl: `-DMNN_OPENCL=ON`
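For instance, the macro is appended to the cmake invocation inside the build script, roughly as below (the `-DCMAKE_BUILD_TYPE=Release` flag is an assumption about the script's contents; check your copy of `script/build.sh`):

```shell
# Illustrative only: build up the cmake command line with a backend macro.
CMAKE_CMD="cmake .. -DCMAKE_BUILD_TYPE=Release"
CMAKE_CMD="$CMAKE_CMD -DMNN_CUDA=ON"   # enable the CUDA backend
echo "$CMAKE_CMD"
```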
```bash
# linux/macos
./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json        # cli demo
./web_demo ./Qwen2-1.5B-Instruct-MNN/config.json ../web # web ui demo

# windows
.\Debug\cli_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json
.\Debug\web_demo.exe ./Qwen2-1.5B-Instruct-MNN/config.json ../web

# android
adb push libs/*.so build/libllm.so build/cli_demo /data/local/tmp
adb push model_dir /data/local/tmp
adb shell "cd /data/local/tmp && export LD_LIBRARY_PATH=. && ./cli_demo ./Qwen2-1.5B-Instruct-MNN/config.json"
```
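The demos above all take a `config.json` describing the exported model. A hedged sketch of writing one programmatically follows; every field name here is an assumption, so treat the `config.json` bundled with the downloaded model as authoritative:

```python
import json

# All field names below are assumptions about a typical mnn-llm config.json;
# prefer the file that ships alongside the exported model.
config = {
    "llm_model": "llm.mnn",          # assumed: serialized model graph
    "llm_weight": "llm.mnn.weight",  # assumed: external weight file
    "backend_type": "cpu",           # assumed: "opencl"/"cuda" require the macros above
    "thread_num": 4,                 # assumed: CPU threads used for inference
}

with open("config.json", "w") as f:
    json.dump(config, f, indent=2)

print(open("config.json").read())
```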
Reference
- chatglm-6b
- chatglm2-6b
- chatglm3-6b
- codegeex2-6b
- Baichuan2-7B-Chat
- Qwen-7B-Chat
- Qwen-VL-Chat
- Qwen-1.8B-Chat
- Llama-2-7b-chat-ms
- internlm-chat-7b
- phi-2
- bge-large-zh
- TinyLlama-1.1B-Chat-v0.6
- Yi-6B-Chat
- Qwen1.5-0.5B-Chat
- Qwen1.5-1.8B-Chat
- Qwen1.5-4B-Chat
- Qwen1.5-7B-Chat
- cpp-httplib
- chatgpt-web
- ChatViewDemo
- nlohmann/json