# Dirichlet Multinomial Model

In probability and statistics, the Dirichlet-multinomial model is a probability distribution for a multivariate discrete random variable. It is also known as the Dirichlet compound multinomial distribution (DCM) or the multivariate Pólya distribution. It is a compound probability distribution: a probability vector p is first drawn from a Dirichlet distribution with parameter vector α, and a set of discrete samples is then drawn from the categorical distribution with probability vector p. This compounding corresponds to a Pólya urn scheme. In document classification, for instance, the distribution can represent the distribution of word counts for different types of documents.
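The two-stage compounding can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names, the zero-based category labels, and the example parameters are assumptions made here. The Dirichlet draw relies on the standard fact that independent Gamma(α_k, 1) draws, once normalized, follow a Dirichlet distribution.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw a probability vector from Dirichlet(alpha) via normalized Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]

def sample_dirichlet_multinomial(alpha, n_draws, rng):
    """Draw one count vector by compounding a Dirichlet with categorical draws."""
    # Stage 1: draw the probability vector from the Dirichlet prior.
    p = sample_dirichlet(alpha, rng)
    # Stage 2: draw n_draws categorical samples with probabilities p.
    draws = rng.choices(range(len(alpha)), weights=p, k=n_draws)
    # Summarize the draws as one count per category.
    counts = [0] * len(alpha)
    for k in draws:
        counts[k] += 1
    return p, counts

rng = random.Random(0)          # fixed seed so the sketch is repeatable
alpha = [1.0, 2.0, 3.0]         # hypothetical Dirichlet parameters
p, counts = sample_dirichlet_multinomial(alpha, 20, rng)
```

Repeated calls give count vectors that vary more than multinomial counts with a fixed probability vector would, because the probability vector itself is redrawn on every call.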

## When the Dirichlet-multinomial model is used

Although the model is simple, it causes confusion because the terminology is often used loosely and in an overloaded way. The model typically arises when a Dirichlet prior is placed on the parameters of categorical or multinomial variables, as in Bayesian mixture models. In many fields, such as natural language processing, categorical variables are often imprecisely called multinomial variables. This usage can cause confusion in the same way that conflating the Bernoulli and binomial distributions does. In hierarchical Bayesian models, inference is often carried out with Gibbs sampling. In such cases the Dirichlet prior is commonly marginalized out by integrating over the probability vector. The categorical variables that depend on that prior then jointly follow a Dirichlet-multinomial distribution.
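The marginalization can be written out explicitly. As a sketch, assume $N$ categorical draws $x_1, \dots, x_N$ over $K$ categories, a probability vector $p$ with a Dirichlet prior with parameters $\alpha_1, \dots, \alpha_K$, and $n_k$ the number of draws equal to $k$. The shorthand $A = \sum_{k} \alpha_k$ and the gamma function $\Gamma$ are notation introduced here:

$$
P(x_1, \dots, x_N \mid \alpha) = \int P(x_1, \dots, x_N \mid p)\, P(p \mid \alpha)\, dp = \frac{\Gamma(A)}{\Gamma(N + A)} \prod_{k=1}^{K} \frac{\Gamma(n_k + \alpha_k)}{\Gamma(\alpha_k)}
$$

This is the probability of one particular sequence. The probability of the count vector alone carries an additional multinomial coefficient $N! / \prod_{k} n_k!$.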

### How it is used

The Dirichlet-multinomial distribution typically appears when a Dirichlet prior is part of a larger Bayesian network. A Dirichlet prior can be collapsed (integrated out) provided that the only nodes depending on it are categorical distributions. Each Dirichlet node is collapsed separately from the others. The collapsing can be done regardless of whether other nodes depend on the categorical distributions, and regardless of whether the categorical distributions also depend on nodes other than the Dirichlet prior. After collapsing, all the categorical nodes that depended on a given Dirichlet node are tied together in a single Dirichlet-multinomial joint distribution.
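One practical consequence of collapsing is that each categorical node can be resampled given the others without ever drawing the probability vector. The sketch below assumes the simplest case, in which the categorical nodes have no dependent nodes of their own, so the conditional weight for category k is the number of other nodes currently in category k plus α_k. In a fuller model each weight would also be multiplied by the likelihood of any nodes that depend on the resampled variable. The function name `collapsed_sweep`, the zero-based category labels, and the example values are illustrative assumptions.

```python
import random

def collapsed_sweep(xs, alpha, rng):
    """Resample each categorical variable once with the Dirichlet prior integrated out.

    xs is modified in place; the returned list holds the final per-category counts.
    """
    K = len(alpha)
    counts = [0] * K
    for x in xs:
        counts[x] += 1
    for i in range(len(xs)):
        # Remove variable i so the counts describe only the other variables.
        counts[xs[i]] -= 1
        # Collapsed conditional: P(x_i = k | others) is proportional to count_k + alpha_k.
        weights = [counts[k] + alpha[k] for k in range(K)]
        xs[i] = rng.choices(range(K), weights=weights, k=1)[0]
        # Put variable i back with its new value.
        counts[xs[i]] += 1
    return counts

rng = random.Random(0)           # fixed seed so the sketch is repeatable
xs = [0, 0, 1, 2, 2, 2]          # hypothetical current assignments
alpha = [0.5, 0.5, 0.5]          # hypothetical Dirichlet parameters
counts = collapsed_sweep(xs, alpha, rng)
```

Because the weights depend on the other variables' current values, the variables are no longer independent once the prior is collapsed, which is why they have to be treated as one joint distribution.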

### Why the Dirichlet-multinomial model applies

Some hierarchical models contain several Dirichlet priors together with a set of categorical variables that depend on them. The situation is trickier when the relationship between the priors and the dependent variables is not fixed. Instead, the choice of which prior applies to a given variable depends on another random categorical variable. This occurs, for example, in topic models such as latent Dirichlet allocation.

#### What it means

Consider a sequence of observations $x_1, \dots, x_N$, where each $x_i$ is an integer between 1 and $K$. The sequence can be summarized by a count vector $(n_1, \dots, n_K)$ of length $K$, where $n_k = \sum_{i=1}^{N} I[x_i = k]$ and $I[\cdot]$ is the indicator function. Suppose we want to estimate the probability $P(k \mid x)$ that the next observation $x_{N+1}$ takes the value $k$. The maximum likelihood estimate is what one would expect: $P(k \mid x) = n_k / N$. This estimator assigns zero probability to every event that does not occur in the training data $x$. The Dirichlet-multinomial model provides a principled way of adding “smoothing” to this predictive distribution.

The Dirichlet distribution itself is a density over $K$ positive numbers $\theta_1, \dots, \theta_K$ that sum to one, so it can be used to draw the parameters of a multinomial distribution. Its parameters are positive real numbers $\alpha_1, \dots, \alpha_K$.
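The difference between the two estimators is easy to see numerically. The sketch below compares the maximum likelihood estimate with the standard Dirichlet-multinomial posterior predictive, (n_k + α_k) / (N + Σ_j α_j), which is not written out in the text above. The function names, the toy data, and the choice of α_k = 1 for every k are assumptions made for illustration.

```python
from collections import Counter

def count_vector(xs, K):
    """Summarize observations with values in 1..K as counts (n_1, ..., n_K)."""
    c = Counter(xs)
    return [c.get(k, 0) for k in range(1, K + 1)]

def mle_predictive(counts):
    """Maximum likelihood estimate: n_k / N."""
    N = sum(counts)
    return [n / N for n in counts]

def dirichlet_predictive(counts, alpha):
    """Smoothed estimate: (n_k + alpha_k) / (N + sum of alpha)."""
    N = sum(counts)
    A = sum(alpha)
    return [(n + a) / (N + A) for n, a in zip(counts, alpha)]

xs = [1, 1, 2, 1, 2, 1]        # toy training data; category 3 never occurs
K = 3
counts = count_vector(xs, K)   # n = (4, 2, 0)
mle = mle_predictive(counts)                       # category 3 gets probability 0
smoothed = dirichlet_predictive(counts, [1.0] * K) # category 3 gets 1/9
```

With α_k = 1 the parameters act like one pseudo-observation per category, so the unseen category receives a small nonzero probability while the estimates still sum to one.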