Questions tagged [large-language-model]
A general tag for large language model (LLM)-related subjects. Please ALWAYS use the more specific tags if available (GPT variants, PaLM, LLaMA, BLOOM, Claude, etc.).
1,563
questions
0
votes
0
answers
14
views
How to prune a fine-tuned Mistral-7B model?
So I have already fine-tuned a Mistral-7B model with my data and have the model within my local files. Now I'm looking to reduce the generation time by pruning it. However, I have no idea what ...
-4
votes
0
answers
19
views
Developing an AI assistant [closed]
I want to develop an AI assistant that is highly scalable, meaning that every time I need to build a model I just retrain the assistant on new data. It can generate text and content, it has a feedback ...
-2
votes
0
answers
17
views
How does AI understand tree diagrams in text format?
How does an AI/LLM understand and interpret tree diagrams represented in text format, such as the following example? (Ex: prompt given to ChatGPT)
Is the AI trained on specific datasets that explicitly ...
-4
votes
0
answers
13
views
I want the model to generate an exact number of tokens, no more, no less [closed]
Are there any tips or best practices to achieve this? I have tried few-shot prompting.
Are there any open-source models that can do this?
I have tried few-shot prompting, but it was not giving the best ...
-1
votes
0
answers
9
views
How to Modify and Replace Embeddings in a Large Language Model (LLM)? [closed]
I am a beginner in large language models (LLMs) and I am working on a project. I have a question regarding embeddings in an LLM. How can I modify the embeddings of an LLM? Are they stored in a ...
-6
votes
0
answers
39
views
Using an LLM to convert a Word document that contains tables to an Excel spreadsheet (could be CSV too) [closed]
I am successfully able to use ChatGPT and upload a Word document that is a course script and then upload a spreadsheet with a few exact examples showing the LLM how I want the Word doc script to be ...
1
vote
0
answers
14
views
TRL SFTTrainer clarification on truncation
I am currently fine-tuning Llama models using SFTTrainer in Hugging Face. However, I have a question that I cannot answer from the documentation (at least, it is a bit ambiguous).
My dataset ...
-2
votes
0
answers
25
views
What is the best language model for fine-tuning with a dataset in the Persian language? [closed]
I am trying to fine-tune the Llama 2 language model with a dataset that I created in Persian. But when I tokenized this dataset, I noticed that the Llama 2 tokenizer tokenized it at the character level, not word ...
-1
votes
0
answers
19
views
How to Estimate GPU Memory for training and inference, Data Requirements, and Training Time for Large Language Models?
This is a very concrete and well-defined computer engineering question. I don't understand why someone would want to close it.
Today, I faced this question during an interview for an ML Engineer ...
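For the memory part of this question, a back-of-the-envelope estimate is a common starting point. The sketch below is only a rough rule of thumb, not a definitive method: it assumes 2 bytes per parameter for fp16/bf16 weights at inference, and about 16 bytes per parameter for mixed-precision Adam training (fp16 weights and gradients plus fp32 master weights, momentum, and variance). It deliberately ignores activations, the KV cache, and framework overhead, which can be substantial. The function names are hypothetical.

```python
GIB = 2 ** 30  # bytes in one GiB


def weights_gib(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the model weights alone (default: fp16/bf16, 2 bytes each)."""
    return n_params * bytes_per_param / GIB


def adam_training_gib(n_params: float) -> float:
    """Rough model-state memory for mixed-precision Adam training.

    Assumed per-parameter cost: 2 (fp16 weights) + 2 (fp16 gradients)
    + 4 + 4 + 4 (fp32 master weights, momentum, variance) = 16 bytes.
    Activations and temporary buffers are NOT included.
    """
    return n_params * 16 / GIB


if __name__ == "__main__":
    for name, n in [("1.5B", 1.5e9), ("7B", 7e9)]:
        print(f"{name}: weights ~{weights_gib(n):.1f} GiB, "
              f"Adam training state ~{adam_training_gib(n):.1f} GiB")
```

Actual usage depends heavily on batch size, sequence length, and whether techniques such as gradient checkpointing or optimizer-state sharding are used, so treat these numbers as a lower bound.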
0
votes
0
answers
25
views
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
I am trying to output the response of the Llama 2 model that I installed locally, but when I try to execute the following lines:
output = model.generate(**inputs, streamer=streamer,
use_cache=True, ...
0
votes
0
answers
15
views
Training LLM uses unexpected amount of GPU memory
I'm training a model with a self-implemented training loop. A 1.5B Qwen2 model occupies 40G of GPU memory. When I ran the same training using LLaMA-Factory, it took only about 24G.
I tried to delete some ...
0
votes
0
answers
20
views
How to evaluate an LLM response [closed]
I am retrieving responses using the Qwen 72B model. I want to validate my responses but don't have ground truth answers. How can I evaluate them without ground truth answers? I want to use ...
-1
votes
0
answers
13
views
How to resolve the ``` backticks error that occurs while generating SQL queries with the Gemini LLM when building an NL2SQL chatbot
I am using an LLM to fetch data from my Postgres DB table.
This is the output that is being generated, even though I have mentioned in the prompt not to add backticks when generating SQL queries.
This is ...
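Since prompt instructions alone do not always stop a model from wrapping its output in a Markdown fence, one common workaround is to strip the fence in post-processing before the query is executed. The following is a minimal sketch under that assumption; `strip_code_fences` is a hypothetical helper, not part of any Gemini or database library, and it only handles the first fenced block in the response.

```python
import re

FENCE = "`" * 3  # the three-backtick Markdown fence

# Opening fence, an optional language tag that ends the line, the body, closing fence.
_FENCE_RE = re.compile(
    re.escape(FENCE) + r"(?:[\w+-]*[ \t]*\n)?(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def strip_code_fences(text: str) -> str:
    """Return the body of the first fenced block, or the text itself if none is found."""
    match = _FENCE_RE.search(text)
    body = match.group(1) if match else text
    return body.strip()


if __name__ == "__main__":
    raw = f"Here is the query:\n{FENCE}sql\nSELECT * FROM users;\n{FENCE}"
    print(strip_code_fences(raw))
```

The cleaned string can then be passed to the database driver. Validating the query (for example, allowing only `SELECT` statements) is still advisable before running anything a model generated.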
0
votes
0
answers
16
views
Unable to import SentenceTransformer
I am using Colab and trying to import SentenceTransformer:
from sentence_transformers import SentenceTransformer
However, I got this error:
AttributeError Traceback (most ...
-2
votes
0
answers
19
views
Training help: hybrid model that integrates contextual and numerical features for a classification problem [closed]
I am working on a critical production risk analysis problem. Based on each record, I want to assign it a risk rank from 0 to 5. The training set is fairly imbalanced.
> "0.0 964
> 1.0 393
...