Friday, September 11, 2009

from DP mixture to hierarchical DP mixture

The slides by Yee Whye Teh "A Tutorial on Dirichlet Processes and Hierarchical Dirichlet Processes" 2007 explains the evolution from DP mixture to HDP mixture very well.

DP mixture concerns clustering data in a group/document;
HDP mixture concerns sharing clusters among multiple groups/documents;


A HDP mixture is built upon multiple DP mixtures, where each group has a DP mixture. The HDP needs a mechanism to share clusters among multiple DP mixtures. Since clusters are represented by their parameters, sharing clusters is equivalent to sharing cluster parameters .

Recall that the cluster parameters is drawn from a Dirichlet process. Can we let these Dirichlet process, , be generated from the common distribution ? If is a continuous distribution, this idea will not work: each drawing of from will be distinctive, thus this no way to force sharing among different .  (The figure on the right is from Yeh's tutorial 2007 slides)

The solution is to force
as a discrete function, then drawing of from will have chance to sharecommon .
On the other hand, we need the possibility of infinite number of
from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture:


(The figure on the right is from Yeh's technical report "Hierarchical Dirichlet Processes" 2004)

Making discrete forces sharing clusters between
:
atoms in all 's are from . Different has different weights on these atoms.



(The above figure is from Yeh's tutorial 2007 slides)

This leads to the HDP mixture in stick-breaking representation:



(The figure on the right is from Yeh's technical report "Hierarchical Dirichlet Processes" 2004
)

The sharing cluster can be visualized by hierarchical Polya urn scheme:

(The above figure is from Yeh's tutorial 2007 slides)

or a Chinese restaurant franchise scheme:

different restaurants share a common menu = {A, B, C, D, E, ...}

(The above figure is from Yeh's tutorial 2007 slides)

2 comments:

  1. The note is very cute and pretty good, I am reading some papers about dirichlet process mixture models etc. But after reading your notes I have a lot of new understanding about the stochastic process, esp, you give the two-level explanation of HDP and the table of each variables with their metaphors. but in your post,you said:
    'The solution is to force as a discrete function, then drawing of from will have chance to sharecommon .
    On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture.'
    I think that you want to say that Gj is discrete not H. As we know the base measure H, is continous some times.

    ReplyDelete
  2. In the sentence:
    'The solution is to force H as a discrete function, then drawing of from will have chance to sharecommon .
    On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture.'

    the H refers to the one in the first figure, which is equivalent to Gj in the later models. So my original note is correct.

    ReplyDelete