It seems the **prior** (or **prior probability**) is the measure of uncertainty of an event without taking any evidence (specific features) into account.

Apparently the **posterior** (or again, **posterior probability**) is the conditional probability assigned after relevant evidence is taken into account.

So, now to construct an example that illustrates what I currently believe about these concepts: If, in a given corpus, 50% of the tokens are determiners, then the chance of selecting a token you know nothing about and finding it to be a determiner is 50%. I believe that's the prior. However, if 70% of tokens occurring after verbs are determiners, then the posterior probability is the conditional probability P(determiner|verb) ["probability of a determiner given a verb"], so 70%.
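The corpus example above can be sketched in a few lines of Python. The tagged pairs below are invented purely for illustration (they are not from any real corpus), chosen so the two proportions come out cleanly:

```python
# Toy sketch of the prior/posterior distinction from the corpus example.
# Each pair is (tag of token, tag of the previous token); the data is
# invented for illustration only.
tagged = [
    ("DET", "VERB"), ("DET", "VERB"), ("DET", "VERB"), ("NOUN", "VERB"),
    ("NOUN", "DET"), ("VERB", "NOUN"),
]

# Prior: P(determiner), ignoring all context.
prior = sum(1 for tag, _ in tagged if tag == "DET") / len(tagged)

# Posterior: P(determiner | previous token is a verb).
after_verb = [tag for tag, prev in tagged if prev == "VERB"]
posterior = after_verb.count("DET") / len(after_verb)

print(prior)      # 0.5
print(posterior)  # 0.75
```

Knowing the context (the preceding token is a verb) shifts the probability from the unconditional 50% to the conditional 75% here.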

In Bayes' Theorem, which is the one part of Bayesian anything that I am, one might almost say, *too* familiar with, the prior is multiplied by the likelihood function and then normalized to obtain the posterior. So:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

or, equivalently:

$$\text{posterior} = \frac{\text{likelihood} \times \text{prior}}{\text{evidence}}$$

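Putting numbers to that update: the sketch below multiplies a prior by a likelihood and normalizes over both hypotheses. The likelihood values are assumptions, picked so the result matches the 70% figure from my earlier example:

```python
# Minimal numeric sketch of Bayes' theorem: prior times likelihood,
# normalized. All numbers are invented for illustration.

# Hypotheses: the next token is a determiner, or it is not.
prior_det = 0.5   # P(determiner)
prior_not = 0.5   # P(not determiner)

# Assumed likelihoods of the evidence ("previous token is a verb")
# under each hypothesis.
lik_det = 0.28    # P(after-verb | determiner)
lik_not = 0.12    # P(after-verb | not determiner)

# Normalizing constant: P(evidence), summed over both hypotheses.
evidence = prior_det * lik_det + prior_not * lik_not

posterior_det = prior_det * lik_det / evidence
print(posterior_det)  # 0.7
```

The denominator is just the total probability of the evidence, which is what "normalized" means here.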
However, one confusing segment of the Wikipedia entry for a prior is:

"of an uncertain quantity *p* (for example, suppose *p* is the proportion of voters who will vote for the politician named Smith in a future election) is the probability distribution that would express one's uncertainty about *p* before the "data" (for example, an opinion poll) is taken into account."

That seems to suggest that we can't take *any* data into account when choosing the prior. Don't we then just have to guess? Sounds like more reading may be in order...
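For what it's worth, the "guess" reading isn't far off: the prior can be pure guesswork (e.g. uniform), and the data then reshapes it into the posterior. Here is a sketch of the Wikipedia voting example over a discrete grid of candidate values for *p*, with invented poll numbers:

```python
# Discrete sketch of updating a prior over p (the proportion voting for
# Smith). The uniform prior is the "guess"; the poll data is invented.
from math import comb

grid = [i / 10 for i in range(11)]       # candidate values of p
prior = [1 / len(grid)] * len(grid)      # uniform prior: pure guesswork

# Hypothetical opinion poll: 6 of 10 respondents favor Smith.
n, k = 10, 6
likelihood = [comb(n, k) * p**k * (1 - p)**(n - k) for p in grid]

# Bayes: multiply prior by likelihood, then normalize.
unnorm = [pr * lk for pr, lk in zip(prior, likelihood)]
posterior = [u / sum(unnorm) for u in unnorm]

# The posterior now concentrates around the observed proportion.
best = grid[posterior.index(max(posterior))]
print(best)  # 0.6
```

Even starting from a flat guess, the poll pulls the distribution toward p = 0.6, which seems to be the point of the definition: the prior is what you believe *before* that pull.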
