Questions tagged [llama3]
The llama3 tag has no usage guidance.
llama3
33
questions
-5
votes
0
answers
24
views
I want to build a RAG application using llama3 and the Qdrant vector DB [closed]
I am looking to develop a private chatbot using LLAMA3 and Qdrant as the vector data store. Here are my current resources: 16GB RAM, 1TB SSD, and ...
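The core of such a RAG pipeline is embedding the documents, retrieving the nearest ones for a query, and passing them to the model. A minimal pure-Python stand-in for the retrieval step (the character-frequency `embed` is a toy; a real app would use a sentence-embedding model and Qdrant's search API):

```python
import math

# Toy embedding: 26-dim character-frequency vector (stand-in for a real model).
def embed(text):
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, documents, top_k=1):
    # Rank documents by similarity to the query, as a vector store would.
    q = embed(query)
    scored = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return scored[:top_k]

docs = ["qdrant stores vectors", "llama3 generates text", "ssd stores files"]
top = retrieve("vector storage with qdrant", docs)
```

The retrieved chunks would then be prepended to the prompt sent to llama3.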
0
votes
0
answers
24
views
ConnectError: All connection attempts failed when connecting to a Neo4j database using PropertyGraphIndex with llama3
I am working on a knowledge graph, and every connection to the Neo4j browser succeeds (using Neo4j Desktop on Windows, not a Docker deployment). However, with llama3 I am running the same notebooks as in property ...
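When the browser connects but a client library cannot, the usual culprit is the connection URI rather than the indexing code. A quick sanity check on the URI's scheme, host, and port (Neo4j Desktop's local DBMS typically listens on `bolt://localhost:7687`, but verify the port in the Desktop UI):

```python
from urllib.parse import urlparse

# The URI handed to the driver / PropertyGraphIndex; value here is the
# common Neo4j Desktop default, not necessarily yours.
uri = "bolt://localhost:7687"

parsed = urlparse(uri)
# A ConnectError often means a wrong scheme (e.g. http:// instead of bolt://)
# or the browser port (7474) instead of the bolt port (7687).
scheme, host, port = parsed.scheme, parsed.hostname, parsed.port
```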
0
votes
1
answer
23
views
Does langchain with llama-cpp-python fail to work with very long prompts?
I'm trying to create a service using the llama3-70b model by combining langchain and llama-cpp-python on a server workstation. While the model works well with short prompts (question1, question2), it ...
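With llama-cpp-python, prompts longer than the context window (`n_ctx`) silently fail or truncate, so long-prompt bugs often come down to token budgeting. A stand-in sketch that checks the budget before calling the model (whitespace splitting is a crude approximation of real tokenization):

```python
# Check whether a prompt plus the requested completion fits the context
# window before invoking the model. n_ctx mirrors llama-cpp-python's
# context-size parameter; the default values here are illustrative.
def fits_context(prompt, n_ctx=8192, max_new_tokens=512):
    prompt_tokens = len(prompt.split())  # crude stand-in for real tokenization
    return prompt_tokens + max_new_tokens <= n_ctx

ok_short = fits_context("What is a vector database?")
ok_long = fits_context("word " * 9000)  # oversized prompt exceeds n_ctx
```

If the check fails, either raise `n_ctx` when constructing the model or shorten/chunk the prompt.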
0
votes
0
answers
22
views
'LlamaForCausalLM' object has no attribute 'max_seq_length'
I'm fine-tuning llama3 using Unsloth. I trained my model and saved it successfully, but when I tried loading it with AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
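The AttributeError suggests the reloaded model object simply does not carry the `max_seq_length` attribute that the training wrapper attached. A defensive pattern is to read it with a fallback instead of accessing it directly (`DummyModel` stands in for the reloaded model, and the default value is hypothetical):

```python
# Stand-in for a model reloaded via from_pretrained, which may lack
# attributes that the original training wrapper added.
class DummyModel:
    pass  # no max_seq_length attribute, like the model in the question

model = DummyModel()

# Read the attribute with a fallback instead of crashing on AttributeError.
max_seq_length = getattr(model, "max_seq_length", 2048)
```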
-1
votes
0
answers
14
views
Got `disk_offload` error while trying to get the Llama3 model from Hugging Face
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from llama_index.llms.huggingface import HuggingFaceLLM
from accelerate import disk_offload
tokenizer = AutoTokenizer....
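This error usually means the model does not fit in RAM/VRAM and accelerate needs to be told where to offload the remainder. A sketch of the `from_pretrained` keyword arguments that commonly resolve it (the folder name is illustrative):

```python
# Keyword arguments for AutoModelForCausalLM.from_pretrained that let
# accelerate place layers across GPU/CPU/disk instead of raising
# a disk_offload error.
load_kwargs = {
    "device_map": "auto",         # let accelerate decide layer placement
    "offload_folder": "offload",  # where weights that don't fit are written
    "low_cpu_mem_usage": True,    # stream weights instead of double-buffering
}
# model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)
```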
0
votes
0
answers
32
views
How should I use Llama-3 properly?
I downloaded the Meta-Llama-3-70B-Instruct model using the download.sh and the url provided by Meta email, and this is all the files in the folder.
And when I tried to use ...
0
votes
0
answers
33
views
How to merge multiple (at least two) existing LlamaIndex VectorStoreIndex instances?
I'm working with LlamaIndex and have created two separate VectorStoreIndex instances, each from different documents. Now, I want to merge these two indexes into a single index. Here's my current setup:...
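A common route (an assumption; check the current LlamaIndex API) is to gather the nodes from both indexes' docstores and build one new `VectorStoreIndex` over the union. A pure-Python stand-in for that merge, with dicts mimicking each docstore:

```python
# Each dict stands in for one index's docstore: node id -> node content.
index_a_nodes = {"n1": "doc A chunk 1", "n2": "doc A chunk 2"}
index_b_nodes = {"n3": "doc B chunk 1"}

# Union the node sets; unique ids mean nothing is overwritten. With
# LlamaIndex, the analogous step is constructing a new VectorStoreIndex
# over the combined node list.
merged_nodes = {**index_a_nodes, **index_b_nodes}
```

Note this re-embeds nothing in the stand-in; a real merge pays the embedding cost again unless the vector store is shared.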
0
votes
0
answers
22
views
How to increase maximum limit for completion_tokens in AWS Sagemaker invoke endpoint
I have deployed the meta-llama/Meta-Llama-3-8B-Instruct model using HuggingFaceModel. The model responds with the full output when I make a call using HuggingFaceModel's predictor method. Here is the ...
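For a Hugging Face TGI endpoint on SageMaker, the completion length is governed by `parameters.max_new_tokens` in the request body rather than by the predictor itself. A sketch of the payload (values are illustrative):

```python
import json

# Request body for a TGI-style endpoint: generation settings ride along
# in the "parameters" object of the payload.
payload = {
    "inputs": "Explain vector databases in one paragraph.",
    "parameters": {
        "max_new_tokens": 1024,  # raise this to allow longer completions
        "temperature": 0.7,
    },
}
body = json.dumps(payload)  # what you'd pass as Body= to invoke_endpoint
```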
0
votes
0
answers
38
views
Llama-3-70B with pipeline cannot generate new tokens (texts)
I have successfully downloaded Llama-3-70B, but when I test its "text-generation" ability, it always outputs my prompt and nothing more.
Here is my demo code (copied from ...
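When a text-generation pipeline echoes only the prompt, the usual fixes are to request new tokens explicitly and to drop the prompt from the returned text. Both keys below are standard transformers pipeline arguments:

```python
# Generation kwargs that typically fix "pipeline only echoes my prompt".
gen_kwargs = {
    "max_new_tokens": 256,      # without this, generation may stop immediately
    "return_full_text": False,  # return only the newly generated text
    "do_sample": True,
}
# out = pipe(prompt, **gen_kwargs)
```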
0
votes
0
answers
23
views
How to build the Ollama setup exe from its source code
I was required to build the Ollama setup exe from source code on Windows.
I found the steps as follows:
Note: The Windows build for Ollama is still under development.
First, install required tools:
MSVC ...
0
votes
0
answers
45
views
"Sizes of tensors must match" error in Meta-Llama-3-8B-Instruct
I'm trying to use the pre-trained Meta-Llama-3-8B-Instruct LLM from Hugging Face for fine-tuning on my own data. As a very first step, I'm just trying to interact with the model as is.
My system specs:...
0
votes
0
answers
6
views
How Can I Use Run Manager to Stream Response on RetrievalQA?
I'm working with the langchain library and transformers to build a language model application. I want to integrate a CallbackManagerForLLMRun to stream responses in my RetrievalQA chain. Below is the ...
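LangChain's streaming goes through callback handlers whose `on_llm_new_token` hook receives each token as it is generated. A pure-Python stand-in that mimics that shape without importing langchain:

```python
# Minimal handler mimicking LangChain's streaming-callback interface:
# on_llm_new_token is invoked once per generated token.
class StreamPrinter:
    def __init__(self):
        self.buffer = []

    def on_llm_new_token(self, token):
        self.buffer.append(token)  # a real handler might flush to a socket

handler = StreamPrinter()
for tok in ["Hel", "lo", "!"]:  # simulated token stream from the LLM
    handler.on_llm_new_token(tok)
streamed = "".join(handler.buffer)
```

In a real chain, an instance like this would be passed in the callbacks list so tokens arrive incrementally instead of after the full completion.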
0
votes
0
answers
65
views
In PyTorch and Hugging Face transformers, why does loading Llama3 to CPU and then calling .to() use so much more memory than loading with device_map?
I've tried loading Huggingface transformers models to MPS in two different ways:
llm = AutoModelForCausalLM.from_pretrained(
"meta-llama/Meta-Llama-3-8B-Instruct",
torch_dtype=torch....
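The memory gap comes from the loading path: loading to CPU first materializes the full model there, and `.to(...)` then copies it, so peak usage is roughly both copies at once; `device_map` dispatches shards to the target device as they are read, avoiding the duplicate. A sketch of the two argument sets (dtype shown as a string placeholder):

```python
# Path A: load fully on CPU, then model.to("mps") afterwards -> two copies
# alive at the peak.
path_a = {"torch_dtype": "bfloat16"}

# Path B: dispatch shards to the device while loading -> roughly one copy.
path_b = {"torch_dtype": "bfloat16", "device_map": "mps"}
```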
0
votes
0
answers
42
views
Impossible to get replies out of LLama3
As the question says, I am trying to run Llama3 but to no avail. Most people recommend using Ollama, which I cannot use due to personal circumstances (I am running the code on a cluster and I cannot ...
0
votes
1
answer
297
views
How to set eos_token_id in llama3 in HuggingFaceLLM?
I want to set my eos_token_id and pad_token_id. I googled a lot, and most suggestions are to use e.g. tokenizer.pad_token_id (like from here https://huggingface.co/meta-llama/Meta-Llama-3-8B/discussions/...
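Llama 3 tokenizers commonly ship without a pad token, so the usual workaround is reusing the eos id as pad and passing both into the generation kwargs. A stand-in sketch (`DummyTokenizer` and its id value mimic a Hugging Face tokenizer; check your tokenizer's actual ids):

```python
# Stand-in for a Hugging Face tokenizer whose pad token is unset.
class DummyTokenizer:
    eos_token_id = 128001  # illustrative id, not authoritative for Llama 3
    pad_token_id = None

tokenizer = DummyTokenizer()
if tokenizer.pad_token_id is None:
    tokenizer.pad_token_id = tokenizer.eos_token_id  # common workaround

# These kwargs would be forwarded to model.generate / HuggingFaceLLM.
generate_kwargs = {
    "eos_token_id": tokenizer.eos_token_id,
    "pad_token_id": tokenizer.pad_token_id,
}
```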