Syntactic constraints are an important ingredient in NLP. Early topic models such as LDA assume a bag-of-words representation and therefore ignore syntax entirely. Later work adds syntactic structure to the topic model to improve its modeling power (a small generative sketch of this idea follows the list). Here are a few papers on this line of work:
- Integrating topics and syntax, NIPS 2005
- Style and Topic Language Model Adaptation Using HMM-LDA, EMNLP 2006
- Hidden Topic Markov Models, AISTATS 2007 (presentation video)
- Topic Modeling: Beyond Bag-of-Words, ICML 2006 (slides)
- Syntactic Topic Models, NIPS 2008 (supplementary materials)
- Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors, in Proceedings of the Workshop on Prior Knowledge for Text and Language (held in conjunction with ICML/UAI/COLT), 2008
- Topical N-grams: Phrase and Topic Discovery, with an Application to Information Retrieval, ICDM 2007 (technical report)
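
To make the idea concrete, here is a minimal generative sketch in the spirit of the composite syntax/topic model from "Integrating topics and syntax" (the first paper above): an HMM over word classes decides, position by position, whether a word is drawn from a document-specific LDA topic (the "semantic" class) or from a class-specific syntactic word distribution. All names and parameter values below (`vocab`, `phi`, `class_word`, `trans`, the class and topic counts) are toy choices I made up for illustration; this is not the authors' code or their exact parameterization.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy vocabulary mixing "content" and "function" words (invented for illustration).
vocab = ["network", "bayesian", "inference", "topic", "the", "of", "is", "a"]
V = len(vocab)

K = 2   # number of LDA topics (toy value)
C = 3   # number of HMM classes; class 0 is the special "semantic" class

# Topic-word distributions, one row per topic (rows sum to 1).
phi = rng.dirichlet(np.ones(V) * 0.5, size=K)

# Class-word distributions for the purely syntactic classes 1..C-1.
class_word = rng.dirichlet(np.ones(V) * 0.5, size=C)

# HMM transition matrix over classes: row c gives P(next class | current class c).
trans = rng.dirichlet(np.ones(C), size=C)

def generate_document(n_words, alpha=0.5):
    """Generate one document from the composite HMM + LDA model."""
    theta = rng.dirichlet(np.ones(K) * alpha)   # document-specific topic mixture
    c = 0                                       # start in the semantic class
    words = []
    for _ in range(n_words):
        c = rng.choice(C, p=trans[c])           # syntactic state follows the HMM
        if c == 0:
            z = rng.choice(K, p=theta)          # semantic class: word comes from a topic
            w = rng.choice(V, p=phi[z])
        else:
            w = rng.choice(V, p=class_word[c])  # other classes: word comes from the class
        words.append(vocab[w])
    return words

print(" ".join(generate_document(12)))
```

The point of the sketch is the division of labor: local word order is governed by the HMM transitions, while long-range, document-level consistency comes from the shared topic mixture, which is exactly the constraint plain bag-of-words LDA throws away.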