I was watching this video (https://www.youtube.com/watch?v=UBiaLq5V7mE), which discussed a Bayesian non-parametric approach for deciding the number of clusters in a dataset.
Essentially, the Dirichlet process can be described as "customers entering a restaurant and deciding whether to sit at an empty table or a non-empty table, depending on the current seating arrangement at the restaurant" (the Chinese Restaurant Process). In this analogy, individual data points are the "customers" and clusters are the "tables": each data point is probabilistically assigned to an existing cluster or to a new one, so that the total number of clusters does not need to be specified in advance.
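To make the seating rule above concrete, here is a minimal sketch of how I understand it, assuming the standard form of the Chinese Restaurant Process: a customer joins an existing table with probability proportional to the number of people already sitting there, and opens a new table with probability proportional to a concentration parameter. The names `alpha`, `chinese_restaurant_process` and `table_count_distribution` are my own and are not taken from the video. Note that this only simulates the prior over seating arrangements; it does not look at any data, so it is not a clustering algorithm on its own.

```python
import random
from collections import Counter


def chinese_restaurant_process(n_customers, alpha, seed=0):
    """Simulate one seating arrangement (assumed standard CRP, alpha > 0)."""
    rng = random.Random(seed)
    table_sizes = []   # table_sizes[k] = number of customers at table k
    assignments = []   # assignments[i] = table index chosen by customer i
    for _ in range(n_customers):
        # Existing table k has weight table_sizes[k]; a new table has weight alpha.
        weights = table_sizes + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(table_sizes):
            table_sizes.append(1)   # customer opens a new table
        else:
            table_sizes[k] += 1     # customer joins an occupied table
        assignments.append(k)
    return assignments, table_sizes


def table_count_distribution(n_customers, alpha, n_draws=2000):
    """Estimate the prior distribution of the number of tables by repeated simulation."""
    counts = Counter(
        len(chinese_restaurant_process(n_customers, alpha, seed=s)[1])
        for s in range(n_draws)
    )
    return {k: c / n_draws for k, c in sorted(counts.items())}


if __name__ == "__main__":
    assignments, sizes = chinese_restaurant_process(100, alpha=1.0, seed=1)
    print("table sizes:", sizes)
    print("P(number of tables):", table_count_distribution(100, alpha=1.0, n_draws=500))
```

The second function is there because the number of tables comes out as a distribution rather than a single value, which is the kind of "probabilistic uncertainty" I am asking about below. Larger values of `alpha` should tend to produce more tables.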
My Question: What are the advantages of doing this compared to the standard methods used to decide the number of clusters, such as an "elbow plot" or a "silhouette plot"?
Does anyone know why such complicated Bayesian non-parametric methods (the Dirichlet process via the Chinese Restaurant Process) would be used to infer the number of clusters in a dataset, rather than the more standard methods?
Are these Bayesian non-parametric methods more "powerful" on higher-dimensional data than the standard methods? Do they allow you to place "probabilistic uncertainty" on the number of clusters? Do they attempt to better account for the fact that new data might not belong to any of the existing clusters?
Thanks!