
Questions tagged [large-language-model]

A general tag for large language model (LLM)-related subjects. Please ALWAYS use the more specific tags if available (GPT variants, PaLM, LLaMa, BLOOM, Claude, etc.).

large-language-model
0 votes
0 answers
14 views

How to prune a fine-tuned Mistral-7B model?

So I have already fine-tuned a Mistral-7B model with my data and have the model within my local files. Now I'm looking to reduce the generation time by pruning it. However, I have no idea what ...
aureum's user avatar
  • 1
-4 votes
0 answers
19 views

Developing an AI assistant [closed]

I want to develop an AI assistant that is highly scalable, meaning that every time I need to build a model I just retrain my AI assistant on new data. It can generate text and content, it has a feedback ...
Khoubaib Bourbia's user avatar
-2 votes
0 answers
17 views

How does AI understand tree diagrams in text format?

How does an AI/LLM understand and interpret tree diagrams represented in text format, such as the following example? (E.g., a prompt given to ChatGPT.) Is the AI trained on specific datasets that explicitly ...
Jayanth's user avatar
  • 183
-4 votes
0 answers
13 views

I want the model to generate an exact number of tokens, no more, no less [closed]

Are there any tips or best practices to achieve this? I have tried few-shot prompting. Are there any open-source models that can do this? I have tried few-shot prompting, but it was not giving the best ...
Rohit Behera's user avatar
-1 votes
0 answers
9 views

How to Modify and Replace Embeddings in a Large Language Model (LLM)? [closed]

I am a beginner in large language models (LLMs) and I am working on a project. I have a question regarding embeddings in an LLM. How can I modify the embeddings of an LLM? Are they stored in a ...
Steven Thorn's user avatar
-6 votes
0 answers
39 views

Using an LLM to convert a Word document that contains tables to an Excel spreadsheet (could be CSV too) [closed]

I am able to use ChatGPT to upload a Word document that is a course script and then upload a spreadsheet with a few exact examples showing the LLM how I want the Word doc script to be ...
James Cochrane's user avatar
1 vote
0 answers
14 views

TRL SFTTrainer clarification on truncation

I am currently fine-tuning Llama models using SFTTrainer in Hugging Face. However, I have a question that I cannot answer from the documentation (at least, it is a bit ambiguous). My dataset ...
iiiiiiiiiiiiiiiiiiii's user avatar
-2 votes
0 answers
25 views

What is the best language model for fine-tuning with a dataset in the Persian language? [closed]

I am trying to fine-tune the llama2 language model with a dataset that I created in Persian. But when I tokenized this dataset, I noticed that the llama2 tokenizer tokenized the dataset at the character level, not the word ...
user23446017's user avatar
-1 votes
0 answers
19 views

How to Estimate GPU Memory for training and inference, Data Requirements, and Training Time for Large Language Models?

This is a very concrete and well-defined computer engineering question. I don't understand why someone would want to close it. Today, I faced this question during an interview for an ML Engineer ...
maplemaple's user avatar
  • 1,435
0 votes
0 answers
25 views

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

I am trying to output the response of the llama2 model that I installed locally, but when I try to execute the following lines: output = model.generate(**inputs, streamer=streamer, use_cache=True, ...
noureddine's user avatar
0 votes
0 answers
15 views

Training LLM uses unexpected amount of GPU memory

I'm training a model with a self-implemented training loop. A 1.5B Qwen2 occupies 40G of GPU memory. When I did the same training using llama factory, it only took about 24G. I tried to delete some ...
StaEx_G's user avatar
  • 13
0 votes
0 answers
20 views

How to evaluate LLM response [closed]

I am retrieving responses using the QWEN 72B model. I want to validate my responses but don't have ground-truth answers. How can I evaluate my responses without ground-truth answers? I want to use ...
Prashanth Kolaneru's user avatar
-1 votes
0 answers
13 views

How to resolve the ``` backticks error that occurs while generating SQL queries with the Gemini LLM to build an NL2SQL chatbot

I am using an LLM to fetch data from my Postgres DB table. This is the output that is being generated, even though I have mentioned in the prompt not to add backticks while generating SQL queries. This is ...
Lad99's user avatar
  • 1
0 votes
0 answers
16 views

Unable to import SentenceTransformer

I am using Colab and trying to import SentenceTransformer: from sentence_transformers import SentenceTransformer However, I got this error: AttributeError Traceback (most ...
A1iMansour's user avatar
-2 votes
0 answers
19 views

Help training a hybrid model that integrates contextual and numerical features for a classification problem [closed]

I have a critical production risk analysis problem. Based on a record, I want to risk-rank each record from 0 to 5. The training set is fairly imbalanced. > "0.0 964 > 1.0 393 &...
wayne halks's user avatar
