DP mixture concerns clustering data in a group/document;
HDP mixture concerns sharing clusters among multiple groups/documents;
A HDP mixture is built upon multiple DP mixtures, where each group has a DP mixture. The HDP needs a mechanism to share clusters among multiple DP mixtures. Since clusters are represented by their parameters, sharing clusters is equivalent to sharing cluster parameters .
Recall that the cluster parameters is drawn from a Dirichlet process. Can we let these Dirichlet process, , be generated from the common distribution ? If is a continuous distribution, this idea will not work: each drawing of from will be distinctive, thus this no way to force sharing among different . (The figure on the right is from Yeh's tutorial 2007 slides)
The solution is to force as a discrete function, then drawing of from will have chance to sharecommon .
On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture:
Making discrete forces sharing clusters between :
atoms in all 's are from . Different has different weights on these atoms.
(The above figure is from Yeh's tutorial 2007 slides)
This leads to the HDP mixture in stick-breaking representation:
(The figure on the right is from Yeh's technical report "Hierarchical Dirichlet Processes" 2004)
The sharing cluster can be visualized by hierarchical Polya urn scheme:
(The above figure is from Yeh's tutorial 2007 slides)
or a Chinese restaurant franchise scheme:
different restaurants share a common menu = {A, B, C, D, E, ...}
(The above figure is from Yeh's tutorial 2007 slides)
The note is very cute and pretty good, I am reading some papers about dirichlet process mixture models etc. But after reading your notes I have a lot of new understanding about the stochastic process, esp, you give the two-level explanation of HDP and the table of each variables with their metaphors. but in your post,you said:
ReplyDelete'The solution is to force as a discrete function, then drawing of from will have chance to sharecommon .
On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture.'
I think that you want to say that Gj is discrete not H. As we know the base measure H, is continous some times.
In the sentence:
ReplyDelete'The solution is to force H as a discrete function, then drawing of from will have chance to sharecommon .
On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture.'
the H refers to the one in the first figure, which is equivalent to Gj in the later models. So my original note is correct.