Xiaodong's tech notes on computer vision and machine learning: evolution of topic models for modeling documents and images

Wednesday, August 26, 2009

evolution of topic models for modeling documents and images

I summarized the evolution of topic models for modeling documents and images in the figure above. The notation is as follows:

NB-BoW: naive Bayes bag of words, i.e., mixture of unigrams
pLSA: probabilistic latent semantic analysis
LDA: latent Dirichlet allocation
FMM: finite mixture model
FHMM: finite hierarchical mixture model
DPMM: Dirichlet process mixture model
HDP (also written HDP-MM below): hierarchical Dirichlet process mixture model

The text on each arrow names the change needed to evolve from one model to the next:

w -> x: generalize the word w from a categorical variable to a
real-valued variable x, which can be either discrete or continuous
hierarchy: add a hierarchy to the original model
K topics: extend from one topic per document to multiple topics per document
K -> \infty: take the infinite limit of the original model, so the number of topics is no longer fixed
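
To make two of these arrows concrete, here is a minimal Python sketch of the generative processes involved. It is only an illustration under my own assumptions, not code from any of the original papers: the function names, the toy parameters, and the use of the Chinese restaurant process as the standard way to draw assignments from the K -> \infty limit of a mixture are all choices made for this example. The "K topics" arrow corresponds to going from `nb_bow_document` (one topic for the whole document) to `lda_document` (one topic per word); the "K -> \infty" arrow corresponds to `crp_assignments`, where the number of clusters grows with the data.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw from Dirichlet(alpha) by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def nb_bow_document(topic_prior, topic_word, n_words, rng):
    """Mixture of unigrams (NB-BoW): a single topic for the whole document."""
    z = sample_categorical(topic_prior, rng)
    words = [sample_categorical(topic_word[z], rng) for _ in range(n_words)]
    return [z] * n_words, words


def lda_document(alpha, topic_word, n_words, rng):
    """LDA: per-document topic proportions, then one topic per word."""
    theta = sample_dirichlet(alpha, rng)
    topics = [sample_categorical(theta, rng) for _ in range(n_words)]
    words = [sample_categorical(topic_word[z], rng) for z in topics]
    return topics, words


def crp_assignments(n_items, concentration, rng):
    """Chinese restaurant process: cluster labels without a fixed K."""
    counts = []  # counts[k] = number of items already in cluster k
    labels = []
    for _ in range(n_items):
        # Join an existing cluster in proportion to its size,
        # or open a new one in proportion to the concentration.
        weights = counts + [concentration]
        k = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        if k == len(counts):
            counts.append(1)
        else:
            counts[k] += 1
        labels.append(k)
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    # Toy setup (hypothetical): 2 topics over a 4-word vocabulary.
    topic_word = [[0.7, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.7]]
    print(nb_bow_document([0.5, 0.5], topic_word, 10, rng))
    print(lda_document([0.5, 0.5], topic_word, 10, rng))
    print(crp_assignments(20, 1.0, rng))
```

The "w -> x" arrow is not shown; in this sketch it would amount to replacing the categorical draw of each word with a draw from some other per-topic distribution, for example a Gaussian over image features.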

This is not a complete summary. For example, HMM is not included.

An interesting observation: there are three paths from the NB-BoW model to the
HDP-MM model, and whichever path we choose, we always need to apply all four
of the extensions above.
