Bayesian inference
Bayesian inference is statistical inference in which probabilities are interpreted not as frequencies or proportions, but as degrees of belief. The name comes from the frequent use of Bayes' theorem in this discipline.
Evidence and the scientific method
Bayesian statisticians claim that methods of Bayesian inference are a formalisation of the scientific method, which involves collecting evidence that points towards or away from a given hypothesis. There can never be certainty, but as evidence accumulates, the degree of belief in a hypothesis changes; with enough evidence it will often become very high (close to 1) or very low (close to 0).
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:As an example, this reasoning might run:
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
::The sun has risen and set for billions of years. The sun has set tonight. With very high probability, the sun will rise tomorrow.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Bayesian statisticians believe that Bayesian inference is the most suitable logical basis for discriminating between conflicting hypotheses. It uses an estimate of the degree of belief in a hypothesis before the advent of some evidence to give a numerical value to the degree of belief in the hypothesis after the advent of the evidence. Because it relies on subjective degrees of belief, however, it is not able to provide a completely objective account of induction. See scientific method.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Bayes' theorem also provides a method for adjusting degrees of belief in the light of new information. Bayes' theorem is
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:P(H_0|E) = \frac{P(E|H_0)\;P(H_0)}{P(E)}.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
For our purposes, H_0 can be taken to be a hypothesis that may have been developed ab initio or induced from some preceding set of observations, but in either case before the new observation or evidence E.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
- The term P(H_0) is called the prior probability of H_0.
- The term P(E|H_0) is the conditional probability of seeing the observation E given that the hypothesis H_0 is true; as a function of H_0 given E, it is called the likelihood function.
- The term P(E) is called the marginal probability of E; i.e., the probability of E given no other information. It is a normalizing constant and can be calculated as the sum, over all mutually exclusive hypotheses H_i, of P(E|H_i) P(H_i).
- The term P(H_0|E) is called the posterior probability of H_0 given E.
The scaling factor P(E|H_0) / P(E) gives a measure of the impact that the observation has on belief in the hypothesis. If it is unlikely that the observation will be made unless the particular hypothesis being considered is true, then this scaling factor will be large. Multiplying this scaling factor by the prior probability of the hypothesis being correct gives the posterior probability of the hypothesis being correct given the observation.
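As a minimal sketch of the quantities just defined, the Python below computes the marginal probability, the posterior and the scaling factor for a hypothesis and its negation. The function name `bayes_update` and all the numbers are assumptions chosen for illustration; none of them come from the article.

```python
def bayes_update(priors, likelihoods):
    """Return posteriors for mutually exclusive, exhaustive hypotheses.

    priors[h]      -- P(h), the prior probability of hypothesis h
    likelihoods[h] -- P(E | h), the probability of the evidence under h
    """
    # Marginal probability P(E): sum of P(E | h) * P(h) over all hypotheses.
    p_e = sum(likelihoods[h] * priors[h] for h in priors)
    if p_e == 0:
        raise ValueError("evidence has zero probability under every hypothesis")
    # Posterior P(h | E) = P(E | h) * P(h) / P(E).
    return {h: likelihoods[h] * priors[h] / p_e for h in priors}


# Illustrative numbers only (hypothetical, not taken from the article).
priors = {"H0": 0.2, "not H0": 0.8}
likelihoods = {"H0": 0.9, "not H0": 0.3}

post = bayes_update(priors, likelihoods)

# Scaling factor P(E | H0) / P(E) for the hypothesis of interest.
p_e = sum(likelihoods[h] * priors[h] for h in priors)
scaling_factor = likelihoods["H0"] / p_e

print(post["H0"], scaling_factor * priors["H0"])  # the two values agree
```

With these numbers the evidence is three times as probable under H0 as under its negation, so the scaling factor is greater than 1 and belief in H0 rises above its prior.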
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
It is fairly easy to prove that multiplying the prior probability P(H_0) by the scaling factor will never yield a probability that is greater than 1. P(E) is at least as great as P(E and H_0), which equals P(E|H_0) * P(H_0). Replacing P(E) with P(E and H_0) in the scaling factor yields a posterior probability of exactly 1, so the formula could yield a probability greater than 1 only if P(E) were less than P(E and H_0), which is impossible.
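The same argument can be written as a single chain of (in)equalities. This is only a restatement of the reasoning above, assuming P(E) > 0 so that the division is defined:

```latex
P(H_0 \mid E)
  = \frac{P(E \mid H_0)\,P(H_0)}{P(E)}
  = \frac{P(E \mbox{ and } H_0)}{P(E)}
  \le \frac{P(E)}{P(E)}
  = 1,
\qquad \mbox{since } P(E \mbox{ and } H_0) \le P(E).
```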
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
The keys to making the inference work are the assignment of prior probabilities to the hypothesis and its possible alternatives, and the calculation of the conditional probabilities of the observation under the different hypotheses.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Some Bayesian statisticians believe that if the prior probabilities can be given some objective value, then the theorem can be used to provide an objective measure of the probability of the hypothesis. Others see no clear way to assign objective probabilities. Indeed, doing so appears to require one to assign probabilities to all possible hypotheses.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Alternatively, and more often, the probabilities are taken as a measure of the participant's subjective degree of belief, and the potential hypotheses are restricted to a constrained set within a model. The theorem then provides a rational measure of the degree to which some observation should alter the subject's belief in the hypothesis. In this case, however, the resulting posterior probability remains subjective. So the theorem can be used to rationally justify belief in some hypothesis, but at the expense of rejecting objectivism.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
It is unlikely that two individuals will start with the same subjective degree of belief. Supporters of the Bayesian method argue that, even with very different assignments of prior probabilities, sufficient observations are likely to bring their posterior probabilities closer together. This assumes that they do not completely reject each other's initial hypotheses, and that they assign similar conditional probabilities. Thus Bayesian methods are useful only in situations in which there is already a high level of subjective agreement.
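The convergence claim can be illustrated with a small Python sketch. Two hypothetical observers start from very different priors but agree on the conditional probabilities, and each updates on the same ten observations. All the numbers are assumptions, and every observation is taken to favour the hypothesis by the same amount, which is a simplification.

```python
def update(prior, p_e_given_h, p_e_given_not_h):
    """One application of Bayes' theorem for a hypothesis and its negation."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1.0 - prior))


# Hypothetical observers: one sceptical, one confident.
sceptic, believer = 0.05, 0.95
# Both assign the same conditional probabilities to each observation.
p_e_given_h, p_e_given_not_h = 0.8, 0.4

# Track the gap between the two degrees of belief after each observation.
gaps = [abs(believer - sceptic)]
for _ in range(10):
    sceptic = update(sceptic, p_e_given_h, p_e_given_not_h)
    believer = update(believer, p_e_given_h, p_e_given_not_h)
    gaps.append(abs(believer - sceptic))

print("initial gap:", gaps[0])
print("gap after 10 observations:", gaps[-1])

# An observer who rejects the hypothesis outright (prior 0) never moves.
print("dogmatic observer:", update(0.0, p_e_given_h, p_e_given_not_h))
```

The last line shows why the argument needs the assumption that neither party completely rejects the other's hypothesis: a prior of exactly 0 is unchanged by any amount of evidence.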
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
In many cases, the impact of observations as evidence can be summarised in a likelihood ratio, as expressed in the law of likelihood. This can be combined with the prior probability to reflect the original degree of belief and any earlier evidence already taken into account. For example, if we have the likelihood ratio
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:\Lambda = \frac{L(H_0\mid E)}{L(\mbox{not } H_0\mid E)} = \frac{P(E \mid H_0)}{P(E \mid \mbox{not } H_0)}
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
then we can rewrite Bayes' theorem as
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:P(H_0|E) = \frac{\Lambda P(H_0)}{\Lambda P(H_0) + P(\mbox{not } H_0)} = \frac{P(H_0)}{P(H_0) +\left(1-P(H_0)\right)/\Lambda }.
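The following Python sketch checks numerically that both likelihood-ratio forms agree with the direct form of Bayes' theorem. The function names and the numbers are hypothetical, chosen only for illustration.

```python
def posterior_direct(prior, p_e_given_h, p_e_given_not_h):
    """P(H0 | E) from Bayes' theorem with P(E) expanded over H0 and not-H0."""
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    return p_e_given_h * prior / p_e


def posterior_from_ratio(prior, lam):
    """P(H0 | E) = lam * P(H0) / (lam * P(H0) + P(not H0))."""
    return lam * prior / (lam * prior + (1.0 - prior))


def posterior_from_ratio_alt(prior, lam):
    """Second form: P(H0) / (P(H0) + (1 - P(H0)) / lam)."""
    return prior / (prior + (1.0 - prior) / lam)


# Illustrative numbers only (hypothetical).
prior = 0.2
p_e_given_h, p_e_given_not_h = 0.9, 0.3
lam = p_e_given_h / p_e_given_not_h  # likelihood ratio

direct = posterior_direct(prior, p_e_given_h, p_e_given_not_h)
via_ratio = posterior_from_ratio(prior, lam)
via_ratio_alt = posterior_from_ratio_alt(prior, lam)

print(direct, via_ratio, via_ratio_alt)  # equal up to rounding error
```

The ratio form needs only the prior and a single number summarising the evidence, which is why it is convenient when the individual conditional probabilities are not reported separately.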
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
With two independent pieces of evidence E_1 and E_2, one possible approach is to move from the prior to the posterior probability on the first evidence and then use that posterior as a new prior and produce a second posterior with the second piece of evidence; an arithmetically equivalent alternative is to multiply the likelihood ratios. So
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:if P(E_1, E_2 | H_0) = P(E_1 | H_0) \times P(E_2 | H_0)
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:and P(E_1, E_2 | \mbox{not }H_0) = P(E_1 | \mbox{not }H_0) \times P(E_2 | \mbox{not }H_0)
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
:then P(H_0|E_1, E_2) = \frac{\Lambda_1 \Lambda_2 P(H_0)}{\Lambda_1 \Lambda_2 P(H_0) + P(\mbox{not } H_0)} ,
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
and this can be extended to more pieces of evidence.
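A short Python sketch can confirm that updating on one piece of evidence at a time gives the same result as multiplying the likelihood ratios first. It assumes the independence conditions stated above hold; the likelihood ratios and the prior are hypothetical values.

```python
def posterior_from_ratio(prior, lam):
    """P(H0 | E) = lam * P(H0) / (lam * P(H0) + P(not H0))."""
    return lam * prior / (lam * prior + (1.0 - prior))


# Illustrative values only: a prior and one likelihood ratio per piece of evidence.
prior = 0.2
ratios = [3.0, 0.5, 4.0]

# Approach 1: use each posterior as the prior for the next piece of evidence.
sequential = prior
for lam in ratios:
    sequential = posterior_from_ratio(sequential, lam)

# Approach 2: multiply the likelihood ratios, then update once.
combined_ratio = 1.0
for lam in ratios:
    combined_ratio *= lam
combined = posterior_from_ratio(prior, combined_ratio)

print(sequential, combined)  # equal up to rounding error
```

Because the combined ratio is a product, the order in which the pieces of evidence arrive does not affect the final posterior.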
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
Before a decision is made, the loss function also needs to be considered to reflect the consequences of making an erroneous decision.
~ ~ ~ ~ ~ ~ ~ ~ ~ ~
and are licensed under the GNU Free Documentation License.
Lexicon - Privacy Policy - Spiritus-Temporis.com ©2005.