
Questions tagged [language-model]

The tag has no usage guidance.

language-model
0 votes
0 answers
35 views

DSPy: How to get the number of tokens available for the input fields?

This is a cross-post from Issue #1245 of the DSPy GitHub repo. There have been no responses in the past week, and I am working on a project with a tight schedule. When running a DSPy module with a given ...
Tom Lin • 74
-1 votes
0 answers
39 views

Language Model providing incomplete responses

I am trying to use the language model API in my vscode extension to refactor code. However, the LLM sometimes provides incomplete responses. I could really use some help in figuring out why this is ...
Advay Balakrishnan
0 votes
0 answers
20 views

What are the key quality metrics for large language model releases?

I am a first year PhD student working on improving the release practices of Machine Learning Models, especially pre-trained large language models. I want to understand the above concept for a ...
Eyinlojuoluwa
0 votes
1 answer
81 views

'SymbolicTensor' object cannot be interpreted as an integer

I have been trying to implement a peephole LSTM using TensorFlow, and I am getting the error below. Below is my model; I am not sure why I can't get the input layer in my model summary. Model and ...
Ramin sh
1 vote
0 answers
171 views

Using Language Model Phi-3-Mini quantized version in Jupyter Notebook

I am trying to use a small language model in my Jupyter notebook and am not able to find a working solution. I want to use the quantized version of Phi-3-mini, as that is small enough to fit on my GPU ...
Christoph
0 votes
0 answers
18 views

ValueError: The model did not return a loss from the inputs: During Further Pretraining ARBERT Model

I am applying further pretraining for the ARBERT model for both the Masked Language Modeling (MLM) and Next Sentence Prediction (NSP) tasks. However, I am receiving the below error. I know this error ...
Ghada Mansour
0 votes
2 answers
210 views

Issues with Generating Text from Fine-Tuned Mistral 7B Model on Georgian Dataset

I've fine-tuned the Mistral 7B model using a Georgian dataset with approximately 100,000 articles, including custom tokenizer fine-tuning. The fine-tuning process took about 9 hours. However, when I ...
SabaKhupenia
0 votes
0 answers
232 views

What are the differences between 'fairseq' and 'fairseq2'?

What are the differences between fairseq and fairseq2? The quotes from the GitHub pages are not very clear: Fairseq(-py) is a sequence modeling toolkit that allows researchers and developers to train ...
Long • 1,709
0 votes
0 answers
38 views

Adding Conversation Memory to Xenova/LaMini-T5-61M Browser-based Model in JS

I'm currently working with the browser-based model in JavaScript, specifically 'text2text-generation' by Xenova/LaMini-T5-61M. My goal is to implement conversation memory functionality using Langchain....
Zeenath • 339
2 votes
1 answer
425 views

specify task_type for embeddings in Vertex AI

Has anyone tried the latest update of GCP's TextEmbeddingInput that allows you to specify the task_type for your application? In theory, it should allow you to use different fine-tuned models to generate ...
Asia Salpa
0 votes
0 answers
37 views

Why do unmasked tokens of a sequence change when passed through a language model?

Why does passing a sequence of tokens, say ["A", "B", "C", "D"], through a masked language model without any masking not result in the same sequence being output ...
Anshul • 71
1 vote
1 answer
400 views

Why do we add |V| in the denominator in the Add-One smoothing for n-gram language models?

In NLP, when we use the Laplace (Add-One) smoothing technique, we assume that every word is seen one more time than its actual count, and the formula is as shown, where V is the size of the vocabulary. ...
hxdshell
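Since the Add-One smoothing question above turns on why |V| appears in the denominator, a minimal sketch may help: we add 1 to the count of every one of the |V| possible next words, so the denominator must also grow by |V| for the conditional probabilities to still sum to 1. (The tiny corpus and function name below are illustrative assumptions, not taken from the question.)

```python
from collections import Counter

def add_one_bigram_prob(bigram_counts, unigram_counts, vocab_size, prev_word, word):
    """Laplace (Add-One) smoothed bigram probability:
    P(word | prev_word) = (count(prev_word, word) + 1) / (count(prev_word) + |V|).
    Adding 1 to each of the |V| possible continuations is exactly why |V| joins
    the denominator: it keeps the |V| smoothed probabilities summing to 1."""
    return (bigram_counts[(prev_word, word)] + 1) / (unigram_counts[prev_word] + vocab_size)

# Tiny illustrative corpus: "a b a b a c"
tokens = ["a", "b", "a", "b", "a", "c"]
unigrams = Counter(tokens)
bigrams = Counter(zip(tokens, tokens[1:]))
V = len(unigrams)  # |V| = 3

p_seen = add_one_bigram_prob(bigrams, unigrams, V, "a", "b")    # (2 + 1) / (3 + 3) = 0.5
p_unseen = add_one_bigram_prob(bigrams, unigrams, V, "c", "a")  # (0 + 1) / (1 + 3) = 0.25
```

Note that the unseen bigram ("c", "a") still gets nonzero probability, which is the point of the smoothing; Counter conveniently returns 0 for missing keys.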
0 votes
0 answers
239 views

How to vectorize text data in Pandas.DataFrame and then one-hot encode it "inside" the model

I am trying to implement a sequence model (trained to predict the next word) built on one-hot encoded vector sequences. My custom one-hot encoder works well, but just as an exercise I want to do everything with ...
x3mEr • 33
0 votes
1 answer
362 views

With a HuggingFace trainer, how do I show the training loss versus the eval data set?

I'm running: #original training script trainer = transformers.Trainer( model=model, train_dataset=train_dataset, eval_dataset=test_dataset, #turn on the eval dataset for comparisons ...
Ronan McGovern
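For the Trainer question above, one common approach (a hedged configuration sketch, not a verified answer to the asker's exact setup) is to align `logging_steps` with `eval_steps` in `TrainingArguments`, so training loss and eval loss appear at the same intervals in the log history. The variables `model`, `train_dataset`, and `test_dataset` are assumed to come from the question's own script; recent transformers versions rename `evaluation_strategy` to `eval_strategy`.

```python
import transformers

# Config fragment: report training loss and eval loss on the same schedule.
args = transformers.TrainingArguments(
    output_dir="out",
    evaluation_strategy="steps",  # run evaluation every eval_steps
    eval_steps=50,
    logging_strategy="steps",     # log training loss every logging_steps
    logging_steps=50,
)

trainer = transformers.Trainer(
    model=model,                  # assumed defined in the question's script
    args=args,
    train_dataset=train_dataset,  # assumed defined in the question's script
    eval_dataset=test_dataset,    # assumed defined in the question's script
)
# After trainer.train(), trainer.state.log_history interleaves "loss"
# (training) and "eval_loss" entries at matching 50-step intervals.
```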
2 votes
1 answer
511 views

GPT4All Metal Library Conflict during Embedding on M1 Mac

I am trying to run GPT4All's embedding model on my M1 Macbook with the following code: import json import numpy as np from gpt4all import GPT4All, Embed4All # Load the cleaned JSON data with open('...
user20140267
