Xiaodong's tech notes on computer vision and machine learning: from DP mixture to hierarchical DP mixture

Friday, September 11, 2009

from DP mixture to hierarchical DP mixture

The slides by Yee Whye Teh "A Tutorial on Dirichlet Processes and Hierarchical Dirichlet Processes" 2007 explains the evolution from DP mixture to HDP mixture very well.

DP mixture concerns clustering data in a group/document;
HDP mixture concerns sharing clusters among multiple groups/documents;

A HDP mixture is built upon multiple DP mixtures, where each group has a DP mixture. The HDP needs a mechanism to share clusters among multiple DP mixtures. Since clusters are represented by their parameters, sharing clusters is equivalent to sharing cluster parameters

.

Recall that the cluster parameters

is drawn from a Dirichlet process. Can we let these Dirichlet process,

, be generated from the common distribution

? If

is a continuous distribution, this idea will not work: each drawing of

from

will be distinctive, thus this no way to force sharing

among different

. (The figure on the right is from Yeh's tutorial 2007 slides)

The solution is to force

as a discrete function, then drawing of

from

will have chance to sharecommon

On the other hand, we need the possibility of infinite number of

from

. Thus a natural solution is to let

be a Dirichlet process with its own prior distribution. This leads to HDP mixture:

(The figure on the right is from Yeh's technical report "Hierarchical Dirichlet Processes" 2004)

Making

discrete forces sharing clusters between

:
atoms

in all

's are from

. Different

has different weights on these atoms.

(The above figure is from Yeh's tutorial 2007 slides)

This leads to the HDP mixture in stick-breaking representation:

(The figure on the right is from Yeh's technical report "Hierarchical Dirichlet Processes" 2004)

The sharing cluster can be visualized by hierarchical Polya urn scheme:

(The above figure is from Yeh's tutorial 2007 slides)

or a Chinese restaurant franchise scheme:
different restaurants share a common menu = {A, B, C, D, E, ...}

(The above figure is from Yeh's tutorial 2007 slides)

2 comments:

jianyingzhouOctober 9, 2009 at 9:25 AM
The note is very cute and pretty good, I am reading some papers about dirichlet process mixture models etc. But after reading your notes I have a lot of new understanding about the stochastic process, esp, you give the two-level explanation of HDP and the table of each variables with their metaphors. but in your post,you said:
'The solution is to force as a discrete function, then drawing of from will have chance to sharecommon .
On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture.'
I think that you want to say that Gj is discrete not H. As we know the base measure H, is continous some times.
ReplyDelete
Replies
xiaodong-yuNovember 29, 2009 at 7:46 PM
In the sentence:
'The solution is to force H as a discrete function, then drawing of from will have chance to sharecommon .
On the other hand, we need the possibility of infinite number of from . Thus a natural solution is to let be a Dirichlet process with its own prior distribution. This leads to HDP mixture.'

the H refers to the one in the first figure, which is equivalent to Gj in the later models. So my original note is correct.
ReplyDelete
Replies

Add comment

Xiaodong's tech notes on computer vision and machine learning

Friday, September 11, 2009

from DP mixture to hierarchical DP mixture

2 comments:

Labels

Blog Archive

About Me

My Blog List