Questions tagged [language-model]
The language-model tag has no usage guidance.
258 questions
0 votes · 0 answers · 35 views
DSPy: How to get the number of tokens available for the input fields?
This is a cross-post from Issue #1245 of the DSPy GitHub repo. There have been no responses in the past week, and I am working on a project with a tight schedule.
When running a DSPy module with a given ...
-1 votes · 0 answers · 39 views
Language Model providing incomplete responses
I am trying to use the language model API in my VS Code extension to refactor code. However, the LLM sometimes provides incomplete responses. I could really use some help figuring out why this is ...
0 votes · 0 answers · 20 views
What are the key quality metrics for large language model releases?
I am a first-year PhD student working on improving the release practices of machine learning models, especially pre-trained large language models. I want to understand the above concept for a ...
0 votes · 1 answer · 81 views
'SymbolicTensor' object cannot be interpreted as an integer
I have been trying to implement a peephole LSTM using TensorFlow, and I am getting the error below.
Error
Below is my model, and I am not sure why I can't get the input layer in my model summary.
Model
and ...
1 vote · 0 answers · 171 views
Using Language Model Phi-3-Mini quantized version in Jupyter Notebook
I am trying to use a small language model in my Jupyter notebook and am not able to find a working solution. I want to use the quantized version of Phi-3-mini, as that is small enough to fit on my GPU ...
0 votes · 0 answers · 18 views
ValueError: The model did not return a loss from the inputs: During Further Pretraining ARBERT Model
I am applying further pretraining for the ARBERT model for both the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) tasks. However, I am receiving the below error. I know this error ...
0 votes · 2 answers · 210 views
Issues with Generating Text from Fine-Tuned Mistral 7B Model on Georgian Dataset
I've fine-tuned the Mistral 7B model using a Georgian dataset with approximately 100,000 articles, including custom tokenizer fine-tuning. The fine-tuning process took about 9 hours. However, when I ...
0 votes · 0 answers · 232 views
What are the differences between 'fairseq' and 'fairseq2'?
What are the differences between fairseq and fairseq2?
Quotes from the GitHub pages are not very clear:
Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train ...
0 votes · 0 answers · 38 views
Adding Conversation Memory to Xenova/LaMini-T5-61M Browser-based Model in JS
I'm currently working with the browser-based model in JavaScript, specifically 'text2text-generation' by Xenova/LaMini-T5-61M. My goal is to implement conversation memory functionality using Langchain....
2 votes · 1 answer · 425 views
specify task_type for embeddings in Vertex AI
Has anyone tried the latest update of GCP's TextEmbeddingInput, which allows you to specify the task_type of your application? In theory it should allow you to use different fine-tuned models to generate ...
0 votes · 0 answers · 37 views
Why do unmasked tokens of a sequence change when passed through a language model?
Why does passing a sequence of tokens, say ["A", "B", "C", "D"], through a masked language model without any masking not result in the same sequence being output ...
1 vote · 1 answer · 400 views
Why do we add |V| in the denominator in the Add-One smoothing for n-gram language models?
In NLP, when we use the Laplace (Add-One) smoothing technique, we assume that every word is seen one more time than its actual count, and the formula is as follows,
where V is the size of the vocabulary. ...
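For reference, the Add-One smoothed bigram estimate is (count(w_{i-1}, w_i) + 1) / (count(w_{i-1}) + |V|); the |V| in the denominator is what keeps the probabilities summing to 1, since each of the |V| vocabulary words receives one extra count. A minimal sketch in Python, using a toy corpus chosen here for illustration:

```python
from collections import Counter

# Toy corpus: unigram and bigram counts plus a vocabulary.
tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens))
V = len(vocab)  # |V| = 5

unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))

def add_one_prob(prev, word):
    # (count(prev, word) + 1) / (count(prev) + |V|)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

# Summing over the whole vocabulary for any history gives 1,
# which is exactly why |V| appears in the denominator.
total = sum(add_one_prob("the", w) for w in vocab)
print(round(total, 10))  # 1.0
```

Without the |V| term, unseen bigrams would add mass (one extra count each) that the denominator never accounts for, and the distribution would sum to more than 1.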
0 votes · 0 answers · 239 views
How to vectorize text data in a Pandas DataFrame and then one-hot encode it "inside" the model
I am trying to implement a sequence model (trained to predict the next word) built on one-hot encoded vector sequences. My custom one-hot encoder works well, but just as an exercise I want to do everything with ...
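For context, the usual pipeline behind this question is: build a vocabulary, map tokens to integer indices, then expand each index into a |V|-dimensional one-hot vector. A minimal pure-Python sketch, with toy sentences chosen here for illustration:

```python
# Toy corpus (hypothetical example data).
sentences = [["i", "like", "cats"], ["i", "like", "dogs"]]

# 1. Build a vocabulary: token -> integer index.
vocab = {tok: i for i, tok in
         enumerate(sorted({t for s in sentences for t in s}))}

# 2. Vectorize: replace each token with its index.
indexed = [[vocab[t] for t in s] for s in sentences]

# 3. One-hot encode each index into a |V|-dimensional vector.
def one_hot(index, size):
    vec = [0] * size
    vec[index] = 1
    return vec

encoded = [[one_hot(i, len(vocab)) for i in s] for s in indexed]

print(vocab)          # {'cats': 0, 'dogs': 1, 'i': 2, 'like': 3}
print(encoded[0][0])  # 'i' -> [0, 0, 1, 0]
```

Doing step 3 "inside" the model (rather than pre-encoding the data) just means the model receives the integer indices and performs the expansion itself, which avoids materializing large sparse arrays up front.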
0 votes · 1 answer · 362 views
With a HuggingFace trainer, how do I show the training loss versus the eval data set?
I'm running:
# original training script
trainer = transformers.Trainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=test_dataset,  # turn on the eval dataset for comparisons
    ...
2 votes · 1 answer · 511 views
GPT4All Metal Library Conflict during Embedding on M1 Mac
I am trying to run GPT4All's embedding model on my M1 MacBook with the following code:
import json
import numpy as np
from gpt4all import GPT4All, Embed4All
# Load the cleaned JSON data
with open('...