
All Questions

1 vote
0 answers
64 views

Why does latent Dirichlet allocation (LDA) fail when dealing with large and heavy-tailed vocabularies?

I'm reading the 2019 paper Topic Modeling in Embedding Spaces, which claims that the embedded topic model improves on these limitations of LDA. But why does LDA have these limitations? Why does it fail ...
seanmachinelearning
1 vote
1 answer
49 views

In Latent Dirichlet allocation, is the following formula the probability of observing a single document, or an entire corpus?

This is the formula in question [formula image not reproduced]. Source: https://en.wikipedia.org/wiki/Latent_Dirichlet_allocation
Bob Odenkirk
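
For reference, a likely candidate is the joint distribution of the smoothed model displayed in that Wikipedia article (an assumption, since the excerpt does not reproduce the formula). The outer product over all $M$ documents is what makes it a corpus-level rather than single-document probability:

$$P(\boldsymbol{W}, \boldsymbol{Z}, \boldsymbol{\theta}, \boldsymbol{\varphi}; \alpha, \beta) = \prod_{i=1}^{M} P(\theta_i; \alpha) \prod_{k=1}^{K} P(\varphi_k; \beta) \prod_{j=1}^{N_i} P(Z_{i,j} \mid \theta_i)\, P(W_{i,j} \mid \varphi_{Z_{i,j}})$$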
0 votes
1 answer
158 views

LDA alpha equivalent in structural topic model

I'm using the R implementation of the structural topic model (the stm package). I want to reduce the number of topics that are prevalent in ...
James • 25
12 votes
0 answers
2k views

Is sparsity of topics a necessary condition for latent Dirichlet allocation (LDA) to work?

I have been playing with the hyper-parameters of the latent Dirichlet allocation (LDA) model and am wondering how sparsity of the topic priors plays a role in inference. I have not performed these ...
kedarps • 3,592
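
A minimal sketch of the regime in question, using synthetic draws only: with a symmetric Dirichlet prior, $\alpha < 1$ concentrates each document's mass on a few topics (sparse), while $\alpha > 1$ spreads it out.

    import numpy as np

    rng = np.random.default_rng(0)
    K = 10  # number of topics
    for alpha in (0.1, 1.0, 10.0):
        theta = rng.dirichlet([alpha] * K, size=1000)    # 1000 doc-topic draws
        top_mass = np.sort(theta, axis=1)[:, -1].mean()  # avg mass on the largest topic
        print(f"alpha={alpha}: mean largest-topic mass = {top_mass:.2f}")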
4 votes
1 answer
1k views

Correlation of Dirichlet distribution in Latent Dirichlet Allocation

Latent Dirichlet Allocation uses a Dirichlet prior for the topic distribution. However, this model doesn't provide correlation between topics, and for this reason the Correlated ...
Gio_cor • 68
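
One worked fact makes the motivation concrete: for $\theta \sim \operatorname{Dirichlet}(\alpha_1, \dots, \alpha_K)$ with $\alpha_0 = \sum_k \alpha_k$, the off-diagonal covariance is

$$\operatorname{Cov}(\theta_i, \theta_j) = \frac{-\alpha_i \alpha_j}{\alpha_0^2 (\alpha_0 + 1)}, \qquad i \neq j,$$

which is always negative, so a Dirichlet prior cannot encode positively correlated topics; the Correlated Topic Model swaps it for a logistic-normal prior precisely to allow this.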
4 votes
2 answers
768 views

Topic Models: Latent Dirichlet Allocation

I am trying to figure out the details of LDA and have been stuck for a while now. While reading the paper by Blei, I came across this: Latent Dirichlet allocation (LDA) is a generative ...
Clock Slave • 1,087
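
A minimal sketch of the generative process that the quoted sentence refers to (the dimensions and hyperparameters here are arbitrary choices for illustration):

    import numpy as np

    rng = np.random.default_rng(0)
    K, V, N = 3, 20, 50                      # topics, vocabulary size, document length
    phi = rng.dirichlet([0.1] * V, size=K)   # a word distribution for each topic
    theta = rng.dirichlet([1.0] * K)         # topic proportions for one document
    z = rng.choice(K, size=N, p=theta)       # a topic assignment for each word slot
    w = np.array([rng.choice(V, p=phi[k]) for k in z])  # the observed words
    print(w)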
4 votes
1 answer
367 views

Which classifier to choose for probability histogram-like features

I have a population of 500 elements. Each element is represented by a 10-dimensional feature vector whose entries sum to 1 (you can think of it as a histogram of probabilities). In ...
gabboshow • 683
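
A minimal sketch of one common approach, on invented stand-in data: map the simplex-constrained features through a log transform and fit a standard classifier (scikit-learn's LogisticRegression here; the two classes and their Dirichlet parameters are hypothetical):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    # 500 elements, 10-dimensional probability vectors, two made-up classes
    X0 = rng.dirichlet([2, 1, 1, 1, 1, 1, 1, 1, 1, 1], size=250)
    X1 = rng.dirichlet([1, 1, 1, 1, 1, 1, 1, 1, 1, 2], size=250)
    X = np.vstack([X0, X1])
    y = np.repeat([0, 1], 250)

    # the log transform moves the features off the simplex into unconstrained space
    clf = LogisticRegression(max_iter=1000).fit(np.log(X + 1e-9), y)
    print(clf.score(np.log(X + 1e-9), y))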
1 vote
1 answer
96 views

About the LDA model, I need a true expert to tell me: what are the real benefits of the Dirichlet prior? [closed]

Well, you know, the only difference between pLSI and LDA is that the latter has a Dirichlet prior; thus the number of model parameters does not increase with the size of the corpus, and this avoids the ...
lynnjohn • 191
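
The parameter-count claim in the excerpt can be made precise (following Blei et al. 2003): a $k$-topic pLSI model fits a topic mixture per training document, so it has $kV + kM$ parameters for vocabulary size $V$ and $M$ documents, growing linearly with the corpus; LDA replaces the per-document mixtures with draws from a Dirichlet, leaving $k + kV$ parameters independent of $M$.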
4 votes
1 answer
2k views

Hierarchical Dirichlet Processes in topic modeling

I think I understand the main ideas of hierarchical Dirichlet processes, but I don't understand the specifics of their application in topic modeling. Basically, the idea is that we have the following ...
r_31415 • 3,351
1 vote
1 answer
5k views

How do you estimate the $\alpha$ parameter of a latent Dirichlet allocation model?

Blei has shown that it is possible to estimate $\alpha$ in an LDA model, but I have yet to find a library (any library; C, C++, Java, ...) to do so. Usually, implementations (including Blei's) treat $\...
Kang Min Yoo
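
For what it's worth, the standard recipe for this is Minka's fixed-point iteration for the Dirichlet MLE; a minimal sketch follows (a generic estimator applied to, e.g., estimated document-topic proportions, not Blei's exact implementation):

    import numpy as np
    from scipy.special import digamma, polygamma

    def inv_digamma(y, iters=5):
        # Newton inversion of the digamma function, with Minka's initialization
        x = np.where(y >= -2.22, np.exp(y) + 0.5, -1.0 / (y - digamma(1.0)))
        for _ in range(iters):
            x = x - (digamma(x) - y) / polygamma(1, x)
        return x

    def fit_dirichlet(theta, n_iter=100):
        # theta: (N, K) array of points on the simplex
        log_p_bar = np.log(theta).mean(axis=0)
        alpha = np.ones(theta.shape[1])
        for _ in range(n_iter):
            alpha = inv_digamma(digamma(alpha.sum()) + log_p_bar)
        return alpha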
0 votes
1 answer
209 views

Can dummy variables be used to represent space in latent Dirichlet allocation?

Can dummy variables be used to represent space in latent Dirichlet allocation? I have a set of geocoded textual documents. I would like to use LDA to generate a topic model for the documents. ...
mech • 3
5 votes
1 answer
2k views

Understanding the effect of $\alpha$ in the Dirichlet distribution

When reading the topic modeling tutorial written by Blei (KDD 2011 tutorial), I was confused by a set of diagrams that aim to show the effect of $\alpha$ in the Dirichlet distribution. For example, for ...
user3269 • 5,222
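
Those diagrams follow directly from the Dirichlet density; writing it out makes the effect of $\alpha$ visible:

$$p(\theta \mid \alpha) = \frac{\Gamma\!\left(\sum_{k} \alpha_k\right)}{\prod_{k} \Gamma(\alpha_k)} \prod_{k=1}^{K} \theta_k^{\alpha_k - 1}$$

For a symmetric $\alpha < 1$ the exponents $\alpha_k - 1$ are negative, so the density diverges at the corners and edges of the simplex (sparse draws); for $\alpha > 1$ it peaks at the uniform point; $\alpha = 1$ is uniform over the simplex.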
5 votes
0 answers
1k views

Gibbs sampling for LDA -- does a small Dirichlet concentration parameter make a difference?

I'm using a Gibbs sampler for Latent Dirichlet allocation as described by Griffiths and Steyvers (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC387300/). The sampling of a new topic $j$ for word $i$ is ...
Ben • 473
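
For context, the Griffiths and Steyvers full conditional being sampled is, in notation close to the paper's (with $V$ the vocabulary size and $T$ the number of topics):

$$P(z_i = j \mid \mathbf{z}_{-i}, \mathbf{w}) \propto \frac{n_{-i,j}^{(w_i)} + \beta}{n_{-i,j}^{(\cdot)} + V\beta} \cdot \frac{n_{-i,j}^{(d_i)} + \alpha}{n_{-i,\cdot}^{(d_i)} + T\alpha}$$

A small $\beta$ weakens the smoothing in the first factor, so topics whose counts for word $w_i$ differ are separated more sharply.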
2 votes
2 answers
1k views

Implementing Latent Dirichlet Allocation - notation confusion

I am trying to implement LDA using the collapsed Gibbs sampler from http://www.uoguelph.ca/~wdarling/research/papers/TM.pdf; the main algorithm is shown below. I'm a bit confused about the notation ...
user1893354 • 1,895
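
A minimal sketch of one sweep of the collapsed Gibbs sampler that such pseudocode describes; the count arrays ndk (document-topic), nkw (topic-word), and nk (topic totals) are names chosen here for illustration:

    import numpy as np

    def gibbs_sweep(docs, z, ndk, nkw, nk, alpha, beta, rng):
        # docs: list of word-id lists; z: matching list of topic-id lists
        K, V = nkw.shape
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # remove the current assignment from the counts
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # full conditional p(z = k | everything else), up to a constant
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                # add the new assignment back
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1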
0 votes
0 answers
179 views

Posterior in latent Dirichlet allocation

I have a question regarding LDA (latent Dirichlet allocation) - what is the correct formulation of the posterior? In http://www.cs.princeton.edu/~blei/papers/Blei2011.pdf it is $p(\beta_{1:K}, \theta_{...
user1315305 • 1,309
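
The truncated formula is presumably the posterior as Blei writes it, which in full reads (assuming the usual notation of that paper, with $K$ topics and $D$ documents):

$$p(\beta_{1:K}, \theta_{1:D}, z_{1:D} \mid w_{1:D}) = \frac{p(\beta_{1:K}, \theta_{1:D}, z_{1:D}, w_{1:D})}{p(w_{1:D})}$$

where the denominator, the marginal likelihood $p(w_{1:D})$, is the intractable part.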
