Newest 'large-language-model' Questions

0 votes

0 answers

14 views

How to prune a fine-tuned Mistral-7B model?

So I have already fine-tuned a Mistral-7B model with my data and have the model within my local files. Now I'm looking to reduce the generation time by pruning it. However, I have no idea what ...

aureum

1

asked 1 hour ago

-4 votes

0 answers

19 views

Developing an AI assistant [closed]

I want to develop an AI assistant that is highly scalable, meaning every time I need to build a model I just retrain my AI assistant on new data. It can generate text and content, it has a feedback ...

Khoubaib Bourbia

1

asked 3 hours ago

-2 votes

0 answers

17 views

How does AI understand tree diagrams in text format?

How does an AI/LLM understand and interpret tree diagrams represented in text format, such as the following example? (Ex: prompt given to ChatGPT) Is the AI trained on specific datasets that explicitly ...

Jayanth

183

asked 5 hours ago

-4 votes

0 answers

13 views

I want the model to generate an exact number of tokens, no more, no less [closed]

Are there any tips or best practices to achieve this? I have tried few-shot prompting. Are there any open-source models that can do this? I have tried few-shot prompting, but it was not giving the best ...

Rohit Behera

1

asked 10 hours ago

-1 votes

0 answers

9 views

How to Modify and Replace Embeddings in a Large Language Model (LLM)? [closed]

I am a beginner in large language models (LLMs) and I am working on a project. I have a question regarding embeddings in an LLM. How can I modify the embeddings of an LLM? Are they stored in a ...

Steven Thorn

1

asked 12 hours ago

-6 votes

0 answers

39 views

Using LLM to convert a word document that contains tables to an excel spreadsheet (could be csv too) [closed]

I am successfully able to use ChatGPT and upload a word document that is a course script and then upload a spreadsheet with a few exact examples showing the LLM how I want the word doc script to be ...

James Cochrane

1

asked 23 hours ago

1 vote

0 answers

14 views

TRL SFTTrainer clarification on truncation

I am currently fine-tuning Llama models using SFTTrainer in Hugging Face. However, I have a question that I cannot answer from the documentation (at least, it is a bit ambiguous). My dataset ...

iiiiiiiiiiiiiiiiiiii

335

asked yesterday

-2 votes

0 answers

25 views

What is the best language model for fine-tuning with a Persian-language dataset? [closed]

I am trying to fine-tune the Llama 2 language model with a dataset that I created in Persian. But when I tokenized this dataset, I noticed that the Llama 2 tokenizer tokenized it at the character level, not the word ...

user23446017

1

asked 2 days ago

-1 votes

0 answers

19 views

How to Estimate GPU Memory for training and inference, Data Requirements, and Training Time for Large Language Models?

This is a very concrete and well-defined computer engineering question. I don't understand why someone would want to close it. Today, I faced this question during an interview for an ML Engineer ...

maplemaple

1,435

asked 2 days ago

0 votes

0 answers

25 views

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

I am trying to output the response of the Llama 2 model that I installed locally, but when I try to execute the following lines: output = model.generate(**inputs, streamer=streamer, use_cache=True, ...

noureddine

3

asked Jul 19 at 10:55

0 votes

0 answers

15 views

Training LLM uses unexpected amount of GPU memory

I'm training a model with a self-implemented training loop. A 1.5B Qwen2 model occupies 40G of GPU memory. When I did the same training using LLaMA Factory, it only took about 24G. I tried to delete some ...

StaEx_G

13

asked Jul 19 at 10:02

0 votes

0 answers

20 views

How to evaluate LLM response [closed]

I am retrieving responses using the Qwen 72B model. I want to validate my responses but don't have ground-truth answers. How can I evaluate them without ground-truth answers? I want to use ...

Prashanth Kolaneru

15

asked Jul 19 at 9:32

-1 votes

0 answers

13 views

How to resolve the ``` backticks error that occurs while generating SQL queries with the Gemini LLM for an NL2SQL chatbot

I am using an LLM to fetch data from my Postgres DB table. This is the output that is being generated, even though I have mentioned in the prompt not to add backticks while generating SQL queries. This is ...

Lad99

1

asked Jul 19 at 6:27

0 votes

0 answers

16 views

Unable to import SentenceTransformer

I am using Colab and trying to import SentenceTransformer: from sentence_transformers import SentenceTransformer However, I got this error: AttributeError Traceback (most ...

A1iMansour

11

asked Jul 18 at 22:24

-2 votes

0 answers

19 views

training help hybrid based model that integrates contextual and numerical features for a classification problem [closed]

I am working on a critical production risk analysis problem. Based on each record, I want to assign a risk rank from 0 to 5. The training set is fairly imbalanced. > "0.0 964 > 1.0 393 &...

wayne halks

5

asked Jul 18 at 21:51
