Number of documents in the corpus
Number of tokens in the corpus
The probability of a token is the product of the word's distribution over topics and the corresponding document's distribution over topics, summed over all topics: p(w) = \sum_{k} p(k|d) \, p(w|k) = \sum_{k} \frac{n_{kw}+\beta_{w}}{n_{k}+\bar{\beta}} \cdot \frac{n_{kd}+\alpha_{k}}{\sum{n_{k}}+\bar{\alpha}}
Expanding the numerator gives: p(w) = \sum_{k} \frac{\alpha_{k}\beta_{w} + n_{kw}\alpha_{k} + n_{kd}\beta_{w} + n_{kw}n_{kd}}{n_{k}+\bar{\beta}} \cdot \frac{1}{\sum{n_{k}}+\bar{\alpha}}
The perplexity is then \exp\left(-\frac{\sum{\log(p(w))}}{N}\right), where N is the number of tokens in the corpus.
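The formulas above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the original implementation: the function names, the nested-list layout of the count tables (`n_kw[k][w]`, `n_kd[k][d]`), and the representation of documents as lists of word ids are all hypothetical. It also assumes that the denominator term \sum{n_{k}} is meant as the number of tokens in document d (the sum of n_{kd} over topics), which is what makes p(k|d) sum to one.

```python
import math


def token_probability(w, d, n_kw, n_kd, alpha, beta):
    """p(w) for word id w in document d: sum over topics of p(k|d) * p(w|k).

    n_kw[k][w]: count of word w assigned to topic k (hypothetical layout)
    n_kd[k][d]: count of tokens in document d assigned to topic k
    alpha[k], beta[w]: Dirichlet priors; their sums play the role of
    alpha-bar and beta-bar in the formula.
    """
    num_topics = len(alpha)
    alpha_bar = sum(alpha)
    beta_bar = sum(beta)
    # Assumption: the document-side denominator is the length of document d.
    n_d = sum(n_kd[k][d] for k in range(num_topics))

    p = 0.0
    for k in range(num_topics):
        n_k = sum(n_kw[k])  # total tokens assigned to topic k
        p_w_given_k = (n_kw[k][w] + beta[w]) / (n_k + beta_bar)
        p_k_given_d = (n_kd[k][d] + alpha[k]) / (n_d + alpha_bar)
        p += p_k_given_d * p_w_given_k
    return p


def perplexity(docs, n_kw, n_kd, alpha, beta):
    """exp(-(sum of log p(w)) / N), with N the number of tokens in the corpus."""
    log_sum = 0.0
    n_tokens = 0
    for d, words in enumerate(docs):
        for w in words:
            log_sum += math.log(token_probability(w, d, n_kw, n_kd, alpha, beta))
            n_tokens += 1
    return math.exp(-log_sum / n_tokens)
```

A quick sanity check on this sketch: with all counts at zero and symmetric priors, every word gets probability 1/V, so the perplexity equals the vocabulary size V.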