
All Questions

0 votes
0 answers
11 views

NER with custom tags and no training data: zero-shot approach help

I am building a "field tagger" for documents. Basically, a document, in my case something like a proposal or sales quote, would have a bunch of entities scattered throughout it, and we want ...
redbull_nowings
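A minimal zero-shot sketch for this kind of field tagging (not the asker's actual setup): score each candidate line of the document against the custom field labels with an NLI-based zero-shot classifier. The model name, labels, and threshold below are placeholder assumptions.

```python
# Zero-shot "field tagging": classify candidate spans against custom field
# labels with an NLI-based zero-shot classifier (no training data needed).
from transformers import pipeline

classifier = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")

# Hypothetical field labels for a proposal / sales quote.
field_labels = ["customer name", "quote total", "delivery date", "payment terms"]

document_lines = [
    "Total price: 12,400 USD, valid for 30 days.",
    "Delivery is expected within 6 weeks of the purchase order.",
]

for line in document_lines:
    result = classifier(line, candidate_labels=field_labels)
    best_label, best_score = result["labels"][0], result["scores"][0]
    if best_score > 0.5:  # arbitrary confidence cutoff
        print(f"{best_label!r:20} <- {line}")
```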
0 votes
0 answers
20 views

NLP: how to handle bad tokenization

I get nonsense when trying to translate the following German sentence into Swedish using google/madlad400-3b-mt: a. Natürliche Personen: BundID mit ELSTER-Zertifikat oder nPA/eID/eAT-Authentifizierung ... (roughly: "Natural persons: BundID with ELSTER certificate or nPA/eID/eAT authentication")
Mathermind
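For reference, a minimal MADLAD-400 usage sketch, assuming the model follows its documented convention of selecting the target language with a "<2xx>" prefix token (Swedish would be "<2sv>"); inspecting the tokenization first usually shows whether the abbreviations are being shredded into subwords.

```python
# German -> Swedish with google/madlad400-3b-mt; print the tokens before
# generating, since nonsense output often traces back to the tokenization.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "google/madlad400-3b-mt"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

src = "Natürliche Personen: BundID mit ELSTER-Zertifikat."
inputs = tokenizer(f"<2sv> {src}", return_tensors="pt")

print(tokenizer.convert_ids_to_tokens(inputs.input_ids[0].tolist()))  # check the split

outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```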
0 votes
0 answers
16 views

Latest Tree-based models

What are the latest tree-based models used in machine learning? Please list the newer models, leaving out the older ones such as decision trees, Random Forest, Gradient Boosting, LightGBM, XGBoost, and ...
Madhes Monnish
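Not an exhaustive answer, but as one example of a tree-based implementation newer than most of the classics named above, scikit-learn ships a histogram-based gradient boosting estimator (LightGBM-style binning):

```python
# HistGradientBoostingClassifier: a more recent tree-based estimator built
# into scikit-learn, trained here on synthetic data as a quick demo.
from sklearn.datasets import make_classification
from sklearn.ensemble import HistGradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = HistGradientBoostingClassifier(max_iter=200, learning_rate=0.1)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```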
1 vote
1 answer
62 views

Improving GPU Utilization in LLM Inference System

I'm trying to build a distributed LLM inference platform with Hugging Face support. The implementation uses Python for model processing and Java for interfacing with external systems. ...
Cardstdani
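The excerpt is cut off, but the usual lever for GPU utilization in inference is batching many requests per forward pass. A hedged sketch with vLLM (one common serving engine, not necessarily what this platform uses; the model name is a small placeholder):

```python
# Batched generation with vLLM: continuous batching packs many requests into
# each forward pass, which keeps the GPU busy instead of serving one prompt
# at a time.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")            # placeholder model
params = SamplingParams(temperature=0.8, max_tokens=64)

prompts = [f"Summarize ticket #{i} in one sentence:" for i in range(32)]
outputs = llm.generate(prompts, params)          # one call, many requests

for out in outputs:
    print(out.prompt, "->", out.outputs[0].text.strip())
```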
0 votes
0 answers
4 views

Leveraging Extra Data to Enhance Text Clustering

I want to cluster thousands of texts (called corpus A) and find a label for each cluster. Clustering accuracy is especially important, because I want to use the texts and their labels for ...
Mohammadreza Riahi
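A baseline sketch for this kind of clustering-plus-labeling, assuming sentence-transformers embeddings and k-means (model name, texts, and cluster count are placeholders):

```python
# Embed corpus A, cluster the embeddings, and label each cluster cheaply by
# its most central text.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.cluster import KMeans

texts = ["refund for broken screen", "cannot log into account", "screen arrived cracked"]
model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(texts, normalize_embeddings=True)

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(emb)

for c in range(km.n_clusters):
    idx = np.where(km.labels_ == c)[0]
    dists = np.linalg.norm(emb[idx] - km.cluster_centers_[c], axis=1)
    print(f"cluster {c}: representative ->", texts[idx[np.argmin(dists)]])
```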
0 votes
0 answers
11 views

JAR files downloading very slowly in a Jupyter notebook on a MacBook (M2 Pro)

The required JAR files downloaded from the Maven repository in a Jupyter notebook come down very slowly on a MacBook (M2 Pro). How can I increase the download speed?
Tovlk
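One generic thing to try, assuming the notebook's kernel resolves JARs through Maven's default settings: point Maven at a mirror that is geographically close by writing ~/.m2/settings.xml (the URL below is only a placeholder).

```python
# Write a Maven mirror configuration; replace the URL with a nearby mirror.
from pathlib import Path

settings = """<settings>
  <mirrors>
    <mirror>
      <id>nearby-mirror</id>
      <mirrorOf>central</mirrorOf>
      <url>https://repo1.maven.org/maven2</url>
    </mirror>
  </mirrors>
</settings>
"""

path = Path.home() / ".m2" / "settings.xml"
path.parent.mkdir(exist_ok=True)
path.write_text(settings)
print("wrote", path)
```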
0 votes
0 answers
31 views

Multilabel Classification - Flat Binary Classifiers vs Hierarchical Binary Classifiers

I was researching multi-label classification to solve the problem of tagging news articles with topics and countries, where tags follow the syntax <topic>-<country>, and would like to ...
curious-24-7
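A flat baseline sketch for comparison, assuming scikit-learn with one binary classifier per composite <topic>-<country> tag (a hierarchical alternative would instead train separate topic and country classifiers); the data below is a toy placeholder.

```python
# Flat one-vs-rest multilabel baseline: each composite tag gets its own
# binary classifier over TF-IDF features.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer

docs = ["central bank raises rates in brazil", "election results announced in france"]
tags = [["economy-brazil"], ["politics-france"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(tags)

clf = make_pipeline(TfidfVectorizer(),
                    OneVsRestClassifier(LogisticRegression(max_iter=1000)))
clf.fit(docs, Y)

pred = clf.predict(["brazil cuts interest rates again"])
print(mlb.inverse_transform(pred))
```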
0 votes
1 answer
22 views

Question about contextual embeddings?

How do BERT and RoBERTa generate contextual embeddings? The articles I've read keep saying that transformer encoders work bidirectionally. Because of self-attention, they can look at every token, ...
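A small demonstration of what "contextual" means in practice: the same surface word gets a different vector depending on the rest of the sentence, because every encoder layer attends over the whole input. The example word and sentences are arbitrary.

```python
# The token "bank" receives different embeddings in different contexts.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base")

def embedding_of(sentence: str, word: str) -> torch.Tensor:
    enc = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]           # (seq_len, hidden)
    tokens = tok.convert_ids_to_tokens(enc.input_ids[0].tolist())
    idx = next(i for i, t in enumerate(tokens) if word in t.lower())
    return hidden[idx]

a = embedding_of("He sat by the river bank.", "bank")
b = embedding_of("She deposited cash at the bank.", "bank")
print(torch.cosine_similarity(a, b, dim=0).item())           # noticeably below 1.0
```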
0 votes
0 answers
71 views

Stream response from custom RASA actions to the chatbot

I am using Rasa Pro with CALM. I was thinking of using the OpenAI API within a custom action and streaming the response coming from OpenAI to my chatbot. OpenAI gives me a streaming response and ...
Avatar
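A sketch of one way this is often handled, assuming rasa_sdk and the openai>=1.0 client: consume the OpenAI stream inside the custom action and send the accumulated text once, since (as far as I know) CollectingDispatcher has no token-by-token channel of its own. The model name and action name are placeholders.

```python
# Custom action: stream from OpenAI, accumulate chunks, reply once.
from rasa_sdk import Action, Tracker
from rasa_sdk.executor import CollectingDispatcher
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

class ActionStreamedAnswer(Action):
    def name(self) -> str:
        return "action_streamed_answer"

    def run(self, dispatcher: CollectingDispatcher, tracker: Tracker, domain: dict):
        user_text = tracker.latest_message.get("text", "")
        stream = client.chat.completions.create(
            model="gpt-4o-mini",                        # placeholder model
            messages=[{"role": "user", "content": user_text}],
            stream=True,
        )
        parts = [c.choices[0].delta.content or "" for c in stream]
        dispatcher.utter_message(text="".join(parts))
        return []
```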
0 votes
1 answer
37 views

What's the purpose of using MLM when pretraining?

If BERT is a stack of transformer encoders, and the encoder already operates bidirectionally, understanding both left and right contexts and generating contextual embeddings, what is the purpose of ...
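The short version: bidirectional attention is the architecture, while MLM is the training signal that teaches it to use both sides of the context. The fill-mask head below is that pretraining objective at work.

```python
# The MLM objective in action: predict a masked token from left and right
# context at once.
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")
for pred in fill("The capital of France is <mask>.")[:3]:
    print(f"{pred['token_str']!r:12} score={pred['score']:.3f}")
```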
0 votes
1 answer
37 views

How do transformer-based architectures generate contextual embeddings?

How do transformer-based architectures, such as RoBERTa, generate contextual embeddings? The issue is, I haven't found any articles that explain this process.
0 votes
1 answer
59 views

Fine-tuning, feature extraction, or both with RoBERTa?

I'm reading a program that uses the pre-trained RoBERTa model (roberta-base). The code first extracts word embeddings from each caption in the batch, using the last hidden state of the RoBERTa model. ...
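For orientation, the practical difference between the two regimes is just whether the RoBERTa weights receive gradients; a sketch (the task head and hyperparameters are placeholders):

```python
# Frozen feature extraction vs fine-tuning with roberta-base.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

FINE_TUNE = False                      # False: frozen extractor, True: fine-tune
for p in encoder.parameters():
    p.requires_grad = FINE_TUNE

head = torch.nn.Linear(encoder.config.hidden_size, 2)        # small task head
params = list(head.parameters()) + (list(encoder.parameters()) if FINE_TUNE else [])
optimizer = torch.optim.AdamW(params, lr=2e-5)

batch = tok(["a caption to embed"], return_tensors="pt", padding=True)
features = encoder(**batch).last_hidden_state[:, 0]          # first-token vector
logits = head(features)
print(logits.shape)
```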
0 votes
0 answers
9 views

Reducing language bias in text classification with a transformer model

I am working on a text classification model predicting classes for text. We have languages from many parts of the world and some of our classes are dominated by specific languages. The model we are ...
Carl Rynegardh
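One simple mitigation sketch (an assumption about the setup, not the asker's model): reweight training samples so that no single language dominates a class, then pass the weights to whatever trainer is in use as per-sample loss weights.

```python
# Inverse language-within-class frequency weights.
import pandas as pd

df = pd.DataFrame({
    "text":  ["...", "...", "...", "..."],
    "label": ["refund", "refund", "refund", "spam"],
    "lang":  ["en", "en", "de", "en"],
})

counts = df.groupby(["label", "lang"]).size()
df["weight"] = df.apply(lambda r: 1.0 / counts[(r["label"], r["lang"])], axis=1)
print(df[["label", "lang", "weight"]])
```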
0 votes
0 answers
128 views

RAG - how to deal with numerical data

I have data on car maker companies. I am creating chunks for different car models in LlamaIndex and using a vector store index, and it gives decent outputs when asked questions. It fails badly ...
Pulkit Mehta
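A framework-agnostic sketch of one common fix: keep the numeric specs in a structured table and answer quantitative questions from it, reserving the vector index for prose (the data, router rule, and column names are all made up for illustration).

```python
# Route numeric questions to a table instead of vector retrieval.
import pandas as pd

specs = pd.DataFrame({
    "maker": ["Acme", "Acme", "Zephyr"],
    "model": ["A1", "A2", "Z9"],
    "price_usd": [21000, 27000, 34000],
    "range_km": [450, 520, 610],
})

def answer_numeric(question: str) -> str:
    # Naive router; a real system would classify the query or use text-to-SQL.
    if "cheapest" in question.lower():
        row = specs.loc[specs["price_usd"].idxmin()]
        return f"{row['maker']} {row['model']} at ${row['price_usd']:,}"
    return "fall back to vector retrieval"

print(answer_numeric("Which model is the cheapest?"))
```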
1 vote
0 answers
39 views

Training Models Directly with Transformer Attention Weights: A Viable Strategy?

I'm currently using pre-trained transformers to extract embeddings for sequence analysis, which are then used in downstream tasks. My process involves using the extracted embeddings as features for ...
pparker
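If the idea is to feed attention weights to a downstream model alongside the embeddings, the raw material is available directly; a sketch with roberta-base (the pooling choice is just one arbitrary option):

```python
# output_attentions=True returns one (batch, heads, seq, seq) tensor per layer;
# here the last layer is pooled into per-head "attention received" features.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModel.from_pretrained("roberta-base", output_attentions=True)

enc = tok("attention weights as downstream features", return_tensors="pt")
with torch.no_grad():
    out = model(**enc)

last_attn = out.attentions[-1]                  # (batch, heads, seq_len, seq_len)
received = last_attn.mean(dim=2)                # average attention each token receives
features = torch.cat([out.last_hidden_state.mean(dim=1),    # pooled embeddings
                      received.flatten(start_dim=1)], dim=-1)
print(features.shape)
```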
