All Questions
Tagged with machine-learning nlp
753 questions
0 votes · 0 answers · 11 views
NER with custom tags and no training data, zero-shot approach help
I am building a "field tagger" for documents. Basically, a document, in my case something like a proposal or sales quote, would have a bunch of entities scattered throughout it, and we want ...
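A minimal sketch of one zero-shot approach, assuming the Hugging Face zero-shot-classification pipeline; the field labels and the sample span below are hypothetical, not taken from the question:
```python
# Zero-shot field tagging sketch: score a candidate span against custom field labels.
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

candidate_labels = ["client name", "quote amount", "delivery date", "product"]  # hypothetical tags
span = "Total price: $42,000, valid until March 31"                             # hypothetical span

result = classifier(span, candidate_labels)
print(result["labels"][0], result["scores"][0])  # best-matching field tag and its score
```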
0 votes · 0 answers · 20 views
NLP: how to handle bad tokenization
I get nonsense when trying to translate the following German sentence to Swedish using google/madlad400-3b-mt:
a. Natürliche Personen: BundID mit ELSTER-Zertifikat oder nPA/eID/eAT-Authentifizierung (natural persons: BundID with ELSTER certificate or nPA/eID/eAT authentication)
...
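Garbage output with this model often comes down to the prompt format rather than tokenization itself. A minimal sketch, assuming the `<2xx>` target-language prefix convention used by MADLAD-400 checkpoints (`<2sv>` for Swedish) and the stock T5 tokenizer:
```python
# Minimal MADLAD-400 translation sketch; "<2sv>" selects Swedish as the target language.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("google/madlad400-3b-mt")
model = T5ForConditionalGeneration.from_pretrained("google/madlad400-3b-mt")

text = "<2sv> Natürliche Personen: BundID mit ELSTER-Zertifikat oder nPA/eID/eAT-Authentifizierung"
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```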
0 votes · 0 answers · 16 views
Latest tree-based models
What are the latest tree-based models used in machine learning, beyond established ones such as decision trees, Random Forest, Gradient Boosting, LightGBM, XGBoost, and ...
1 vote · 1 answer · 62 views
Improving GPU Utilization in LLM Inference System
I'm trying to build a distributed LLM inference platform with Hugging Face support. The implementation uses Python for model processing and Java for interfacing with external systems. ...
0 votes · 0 answers · 4 views
Leveraging Extra Data to Enhance Text Clustering
I want to cluster thousands of text documents (called corpus A) and find a label for each cluster. Clustering accuracy is critically important, because I want to use the texts and their labels for ...
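A minimal clustering sketch, assuming sentence-transformers embeddings and k-means; the model name, documents, and cluster count are placeholders:
```python
# Embed corpus A and cluster it; model name and n_clusters are illustrative only.
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

corpus_a = ["text one ...", "text two ...", "text three ..."]  # placeholder documents

embedder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = embedder.encode(corpus_a)

kmeans = KMeans(n_clusters=2, random_state=0, n_init="auto").fit(embeddings)
print(kmeans.labels_)  # cluster id per document
```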
0 votes · 0 answers · 11 views
JAR files downloading very slowly in Jupyter notebook on a MacBook (M2 Pro)
Required JAR files downloading from the Maven repository in a Jupyter notebook are very slow on a MacBook (M2 Pro). How can I increase the download speed?
0 votes · 0 answers · 31 views
Multilabel Classification - Flat Binary Classifiers vs Hierarchical Binary Classifiers
I was researching multi-label classification to solve the problem of tagging news articles with topics and countries, where tags follow the syntax <topic>-<country>, and would like to ...
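A minimal sketch of the flat approach, assuming scikit-learn's one-vs-rest wrapper over TF-IDF features; the documents and <topic>-<country> tags are placeholders:
```python
# Flat multi-label setup: one independent binary classifier per tag.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

docs = ["election results in france", "trade deal between us and china"]  # placeholder articles
tags = [["politics-france"], ["economy-us", "economy-china"]]              # placeholder tags

mlb = MultiLabelBinarizer()
y = mlb.fit_transform(tags)

clf = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LogisticRegression(max_iter=1000)))
clf.fit(docs, y)

pred = clf.predict(["french election coverage"])
print(mlb.inverse_transform(pred))  # predicted tag set per document
```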
0 votes · 1 answer · 22 views
Question about contextual embeddings?
How do BERT and RoBERTa generate contextual embeddings? The articles I've read keep saying that transformer encoders work bidirectionally. Because of self-attention, they can look at every token, ...
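A minimal sketch of what "contextual" means in practice, assuming the bert-base-uncased checkpoint: the same word receives a different vector depending on the sentence it appears in.
```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def token_vector(sentence, word):
    # Return the last-hidden-state vector at the position of `word`.
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]   # (seq_len, hidden_size)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return hidden[tokens.index(word)]

v1 = token_vector("i sat by the river bank.", "bank")
v2 = token_vector("i deposited cash at the bank.", "bank")
print(torch.cosine_similarity(v1, v2, dim=0))  # below 1.0: same word, different contextual vectors
```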
0 votes · 0 answers · 71 views
Stream response from custom RASA actions to the chatbot
I am using Rasa Pro with CALM.
I was thinking of calling the OpenAI API within a custom action and streaming the response coming from OpenAI to my chatbot. OpenAI is giving me a streaming response and ...
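A minimal custom-action sketch, assuming the rasa_sdk Action interface and the OpenAI Python client's streaming mode; note that `dispatcher.utter_message` sends whole messages, so the chunks here are accumulated rather than truly streamed to the channel. The action name and model are placeholders.
```python
# Hypothetical custom action that consumes an OpenAI streaming response.
from openai import OpenAI
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher

class ActionStreamedAnswer(Action):
    def name(self) -> str:
        return "action_streamed_answer"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: dict):
        client = OpenAI()  # reads OPENAI_API_KEY from the environment
        stream = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[{"role": "user", "content": tracker.latest_message.get("text", "")}],
            stream=True,
        )
        chunks = []
        for chunk in stream:
            delta = chunk.choices[0].delta.content
            if delta:
                chunks.append(delta)
        dispatcher.utter_message(text="".join(chunks))  # sent as one message once complete
        return []
```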
0 votes · 1 answer · 37 views
What's the purpose of using MLM when pretraining?
If BERT is a stack of transformer encoders, and the encoder already operates bidirectionally, understanding both left and right contexts and generating contextual embeddings, what is the purpose of ...
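A minimal illustration of what the MLM objective adds, assuming the Hugging Face fill-mask pipeline: bidirectional attention is the architecture, but masked-token prediction is the training signal that teaches the encoder to exploit both contexts.
```python
from transformers import pipeline

# The MLM head predicts the held-out token from its left and right context.
fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of France is [MASK]."):
    print(pred["token_str"], round(pred["score"], 3))  # candidate tokens with MLM scores
```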
0 votes · 1 answer · 37 views
How do transformer-based architectures generate contextual embeddings?
How do transformer-based architectures, such as RoBERTa, generate contextual embeddings? The issue is that I haven't found any articles that explain this process.
0 votes · 1 answer · 59 views
Fine-tuning, feature extraction, or both with RoBERTa?
I'm reading a program that uses the pre-trained RoBERTa model (roberta-base). The code first extracts word embeddings from each caption in the batch, using the last hidden state of the RoBERTa model. ...
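A minimal sketch of the distinction, assuming roberta-base: feature extraction freezes the encoder and only trains a head on top of its last hidden state, while fine-tuning leaves the encoder parameters trainable. The head and its output size are hypothetical.
```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

# Feature extraction: freeze RoBERTa and use its last hidden state as fixed features.
for p in encoder.parameters():
    p.requires_grad = False

head = torch.nn.Linear(encoder.config.hidden_size, 10)  # hypothetical task head with 10 classes

inputs = tokenizer(["a caption about a dog"], return_tensors="pt", padding=True)
features = encoder(**inputs).last_hidden_state[:, 0]    # first-token (<s>) embedding per caption
logits = head(features)

# Fine-tuning instead: skip the freezing loop so the encoder weights receive gradients too.
```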
0 votes · 0 answers · 9 views
Reducing language bias for text classification, transformer model
I am working on a text classification model that predicts classes for text. We have languages from many parts of the world, and some of our classes are dominated by specific languages. The model we are ...
0 votes · 0 answers · 128 views
RAG - how to deal with numerical data
I have data from car maker companies. I am creating chunks for different car models in LlamaIndex and using a vector store index, and it gives decent outputs when asked questions. It fails badly ...
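One common workaround is to keep numeric fields as structured metadata rather than burying them inside chunk text. A minimal sketch, assuming the llama_index (>= 0.10) core API; the field names, values, and query are made up:
```python
# Sketch: attach numeric fields as document metadata so they survive chunking and retrieval.
from llama_index.core import Document, VectorStoreIndex

docs = [
    Document(
        text="The Model X sedan has a 2.0L engine and seats five.",
        metadata={"maker": "ExampleCars", "model": "Model X", "price_usd": 32000, "horsepower": 190},
    ),
]

index = VectorStoreIndex.from_documents(docs)
query_engine = index.as_query_engine()
print(query_engine.query("How much horsepower does the Model X have?"))
```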
1 vote · 0 answers · 39 views
Training Models Directly with Transformer Attention Weights: A Viable Strategy?
I'm currently using pre-trained transformers to extract embeddings for sequence analysis, which are then used in downstream tasks. My process involves using the extracted embeddings as features for ...
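A minimal sketch of pulling attention weights alongside the embeddings, assuming a Hugging Face encoder called with `output_attentions=True`; whether such features help downstream is exactly the open question. The pooling at the end is one arbitrary choice, not a recommendation.
```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("a short example sequence", return_tensors="pt")
with torch.no_grad():
    out = model(**inputs, output_attentions=True)

embeddings = out.last_hidden_state        # (1, seq_len, hidden_size): the usual features
attentions = torch.stack(out.attentions)  # (num_layers, 1, num_heads, seq_len, seq_len)

# One simple way to turn attention into a fixed-size feature vector: average over batch, heads, and query positions.
attn_features = attentions.mean(dim=(1, 2, 3)).flatten()
print(embeddings.shape, attn_features.shape)
```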