
Questions tagged [llama3]

The tag has no usage guidance.

-5 votes
0 answers
24 views

I want to build a RAG application using llama3 and the qdrant vector db [closed]

I am looking to develop a private chatbot using Llama3 and Qdrant as the vector data store. Here are my current resources: 16GB RAM, 1TB SSD, and ...
deep learning
0 votes
0 answers
24 views

ConnectError: All connection attempts failed when connecting indexing to neo4j database using PropertyGraphIndex from llama3

I am working on a knowledge graph, and every connection to the neo4j browser succeeds (using Neo4j Desktop on Windows, not a Docker deployment). However, with llama3 I am running the same notebooks as in property ...
Kcndze • 21
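A frequent cause of `ConnectError: All connection attempts failed` is that the environment running the notebook cannot reach the bolt port Neo4j Desktop exposes (default 7687), even when the browser connects fine. A minimal stdlib reachability check, assuming localhost and the default bolt port:

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Neo4j Desktop's default bolt endpoint:
# port_open("localhost", 7687)
```

If this returns False where the notebook runs, the problem is network reachability (or a non-default bolt port in the connection URI), not PropertyGraphIndex itself.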
0 votes
1 answer
23 views

Does langchain with llama-cpp-python fail to work with very long prompts?

I'm trying to create a service using the llama3-70b model by combining langchain and llama-cpp-python on a server workstation. While the model works well with short prompts (question1, question2), it ...
bibiibibin
0 votes
0 answers
22 views

'LlamaForCausalLM' object has no attribute 'max_seq_length'

I'm fine-tuning llama3 using unsloth. I trained my model and saved it successfully, but when I tried loading it using AutoPeftModelForCausalLM.from_pretrained and then used TextStreamer from transformers ...
Sarra Ben Messaoud
-1 votes
0 answers
14 views

Got `disk_offload` error while trying to load the Llama3 model from Hugging Face

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from llama_index.llms.huggingface import HuggingFaceLLM
from accelerate import disk_offload
tokenizer = AutoTokenizer....
Vins Shaji
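For the `disk_offload` error above, a commonly suggested fix (a sketch, not verified against the asker's exact traceback) is to let `from_pretrained` manage offloading itself via `device_map="auto"` plus an `offload_folder`, rather than calling `accelerate.disk_offload` by hand:

```python
# Assumed fix: these are real from_pretrained keyword arguments; accelerate
# offloads to disk automatically when device_map="auto" and the weights do
# not fit in RAM/VRAM.
load_kwargs = {
    "device_map": "auto",         # let accelerate place layers across devices
    "offload_folder": "offload",  # layers that fit nowhere are spilled here
    "torch_dtype": "auto",        # keep the checkpoint's dtype
}
# model = AutoModelForCausalLM.from_pretrained(model_id, **load_kwargs)
```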
0 votes
0 answers
32 views

How should I use Llama-3 properly?

I downloaded the Meta-Llama-3-70B-Instruct model using download.sh and the URL provided in Meta's email, and these are all the files in the folder. And when I tried to use ...
Joey1205
0 votes
0 answers
33 views

How to merge multiple (at least two) existing LlamaIndex VectorStoreIndex instances?

I'm working with LlamaIndex and have created two separate VectorStoreIndex instances, each from different documents. Now, I want to merge these two indexes into a single index. Here's my current setup:...
林抿均
0 votes
0 answers
22 views

How to increase maximum limit for completion_tokens in AWS Sagemaker invoke endpoint

I have deployed the meta-llama/Meta-Llama-3-8B-Instruct model using HuggingFaceModel. The model responds with the full output when I make a call using HuggingFaceModel's predictor method. Here is the ...
keerti4p
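For a TGI-backed SageMaker endpoint, the per-request generation length is controlled by `parameters.max_new_tokens` in the request body, while the hard ceiling is fixed at deploy time by the container's `MAX_TOTAL_TOKENS` / `MAX_INPUT_LENGTH` environment variables. A hedged sketch of the payload shape (prompt text is illustrative):

```python
import json

# Payload shape accepted by the Hugging Face TGI container behind
# invoke_endpoint / predictor.predict:
payload = {
    "inputs": "Explain vector databases in one paragraph.",
    "parameters": {"max_new_tokens": 1024, "temperature": 0.6},
}
body = json.dumps(payload)
# predictor.predict(payload)
# or: runtime.invoke_endpoint(EndpointName=..., ContentType="application/json", Body=body)
```

If 1024 still gets truncated, the deploy-time `MAX_TOTAL_TOKENS` is the limit to raise (which requires redeploying the endpoint).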
0 votes
0 answers
38 views

Llama-3-70B with pipeline cannot generate new tokens (texts)

I have successfully downloaded Llama-3-70B, and when I want to test its "text-generation" ability, it always outputs my prompt and nothing more. Here is my demo code (copied from ...
Martin • 11
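When a `text-generation` pipeline returns only the prompt, two settings are the usual suspects: `max_new_tokens` left unset and `return_full_text` defaulting to True (so the echoed prompt dominates the output). A sketch of the kwargs, assuming `pipe` is the asker's pipeline object:

```python
# Assumed fix: both keys are real transformers generation/pipeline options.
gen_kwargs = {
    "max_new_tokens": 256,      # without this, little or no new text is generated
    "do_sample": True,
    "return_full_text": False,  # drop the echoed prompt from the output
}
# out = pipe(prompt, **gen_kwargs)
```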
0 votes
0 answers
23 views

How to configure ollama setup exe from its source code

I was required to build the Ollama setup exe from its source code on Windows. The steps I found are as follows. Note: The Windows build for Ollama is still under development. First, install required tools: MSVC ...
Diksha Gupta
0 votes
0 answers
45 views

"Sizes of tensors must match" error in Meta-Llama-3-8B-Instruct

I'm trying to use the pre-trained Meta-Llama-3-8B-Instruct LLM from Hugging Face for fine-tuning on my own data. As a very first step, I'm just trying to interact with the model as is. My system specs:...
Raul Marquez • 1,108
0 votes
0 answers
6 views

How Can I Use Run Manager to Stream Response on RetrievalQA?

I'm working with the langchain library and transformers to build a language model application. I want to integrate a CallbackManagerForLLMRun to stream responses in my RetrievalQA chain. Below is the ...
rahul raj
0 votes
0 answers
65 views

In PyTorch and Hugging Face transformers, why does loading Llama3 to CPU and then using .to() use so much more memory than loading with device_map?

I've tried loading Hugging Face transformers models to MPS in two different ways:
llm = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch....
Owen D • 55
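The usual explanation for the memory gap: loading on CPU and then calling `.to("mps")` holds two full copies of the weights at the peak (the CPU tensors plus the in-flight device copies), whereas `device_map` implies `low_cpu_mem_usage=True` and materializes each shard directly on the target device. Back-of-envelope arithmetic for an ~8B-parameter model in fp16/bf16:

```python
# Rough sizing only; parameter count is approximate for Llama-3-8B.
params = 8_030_000_000
bytes_per_param = 2  # fp16/bf16
one_copy_gb = params * bytes_per_param / 1024**3   # weights resident once
peak_two_copies_gb = 2 * one_copy_gb               # CPU copy + device copy during .to()
```

So the `.to()` path briefly needs roughly double the ~15 GB footprint, which matches the observed difference.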
0 votes
0 answers
42 views

Impossible to get replies out of LLama3

As the question says, I am trying to run Llama3 but to no avail. Most people recommend using Ollama, which I cannot use due to personal circumstances (I am running the code on a cluster and I cannot ...
Anonymous
0 votes
1 answer
297 views

How to set eos_token_id in llama3 in HuggingFaceLLM?

I want to set my eos_token_id and pad_token_id. I googled a lot, and most answers suggest using e.g. tokenizer.pad_token_id (like from here https://huggingface.co/meta-llama/Meta-Llama-3-8B/discussions/...
yts61 • 1,509
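For the eos/pad question above: LlamaIndex's `HuggingFaceLLM` accepts a `generate_kwargs` dict that is forwarded to `model.generate`, so both ids can be set there. A sketch using the terminator ids from Llama-3's tokenizer config (128001 = `<|end_of_text|>`, 128009 = `<|eot_id|>`; Llama-3 ships no pad token, so eos is commonly reused):

```python
# Sketch: ids below are the documented Llama-3 special-token ids.
generate_kwargs = {
    "eos_token_id": [128001, 128009],  # stop on either terminator
    "pad_token_id": 128001,            # reuse <|end_of_text|> as pad
}
# llm = HuggingFaceLLM(
#     model_name="meta-llama/Meta-Llama-3-8B",
#     tokenizer_name="meta-llama/Meta-Llama-3-8B",
#     generate_kwargs=generate_kwargs,
# )
```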
