Data Mining Bayesian Classifiers
- Bayesian classifiers are statistical classifiers based on the Bayesian interpretation of probability. Bayesian classification uses Bayes's theorem to predict the probability that an event occurs.
- Bayes's theorem is named after Thomas Bayes, who first used conditional probability to provide an algorithm that uses evidence to calculate limits on an unknown parameter.
Bayes's theorem
$$P(X/Y) = \frac{P(Y/X)\,P(X)}{P(Y)}$$
- Where X and Y are events and P(Y) ≠ 0.
- P(X/Y) is a conditional probability: the probability that event X occurs given that Y is true.
- P(Y/X) is a conditional probability: the probability that event Y occurs given that X is true.
- P(X) and P(Y) are the probabilities of observing X and Y independently of each other. These are known as the marginal probabilities.
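The relationship between these four quantities can be sketched in a few lines of Python. This is a minimal illustration, not a full classifier: the function name `posterior`, the spam scenario, and all numbers are hypothetical, and P(Y) is obtained here from the law of total probability.

```python
def posterior(p_y_given_x: float, p_x: float, p_y: float) -> float:
    """Return P(X/Y) computed from P(Y/X), P(X), and P(Y)."""
    if p_y == 0:
        raise ValueError("P(Y) must be non-zero")
    return p_y_given_x * p_x / p_y


# Hypothetical events: X = "message is spam", Y = "message contains 'offer'"
p_x = 0.2              # P(X), marginal probability of spam (made-up value)
p_y_given_x = 0.6      # P(Y/X), made-up value
p_y_given_not_x = 0.1  # P(Y/not X), made-up value

# Marginal P(Y) via the law of total probability
p_y = p_y_given_x * p_x + p_y_given_not_x * (1 - p_x)

p_x_given_y = posterior(p_y_given_x, p_x, p_y)
print(round(p_x_given_y, 3))
```

With these assumed numbers, observing Y raises the probability of X from the marginal value P(X) to the conditional value P(X/Y).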
Bayesian Interpretation
In the Bayesian interpretation, probability measures a "degree of belief."
For example, consider a coin. If we toss a fair coin, we get either heads or tails, each with a probability of 50%. If the coin is flipped a number of times and the outcomes are observed, our degree of belief about the coin (for instance, that it is fair) may rise, fall, or remain the same depending on the outcomes.
For a proposition X and evidence Y:
- P(X), the prior, is the initial degree of belief in X.
- P(X/Y), the posterior, is the degree of belief after accounting for Y.
- The quotient P(Y/X) / P(Y) represents the support that Y provides for X.
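The coin example can be sketched as a repeated prior-to-posterior update. The sketch below assumes just two competing hypotheses, a fair coin and a coin biased towards heads; the function name `update_belief` and the bias value 0.8 are illustrative assumptions, not part of the original example.

```python
def update_belief(prior_biased: float, flips: str,
                  p_heads_fair: float = 0.5,
                  p_heads_biased: float = 0.8) -> float:
    """Return the degree of belief that the coin is biased after seeing `flips`.

    `flips` is a string of 'H' and 'T'. The two head probabilities are
    assumed values for the two hypotheses.
    """
    belief = prior_biased  # prior P(X)
    for flip in flips:
        # Likelihood of this outcome under each hypothesis, P(Y/X)
        like_biased = p_heads_biased if flip == "H" else 1 - p_heads_biased
        like_fair = p_heads_fair if flip == "H" else 1 - p_heads_fair
        # Marginal probability of the outcome, P(Y)
        evidence = like_biased * belief + like_fair * (1 - belief)
        # Posterior P(X/Y) becomes the prior for the next flip
        belief = like_biased * belief / evidence
    return belief


print(update_belief(0.5, "HHHH"))  # belief in "biased" rises above 0.5
print(update_belief(0.5, "TTTT"))  # belief in "biased" falls below 0.5
print(update_belief(0.5, ""))      # no evidence, belief stays at the prior
```

Each pass through the loop multiplies the current belief by the quotient P(Y/X) / P(Y), which is the support that the observed flip provides for the hypothesis.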
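Bayes's theorem can be derived from the definition of conditional probability:
$$P(X/Y) = \frac{P(X \cap Y)}{P(Y)}, \quad P(Y) \neq 0$$
$$P(Y/X) = \frac{P(X \cap Y)}{P(X)}, \quad P(X) \neq 0$$
- Where P(X⋂Y) is the joint probability of both X and Y being true. Because both expressions contain the same joint probability,
$$P(X \cap Y) = P(X/Y)\,P(Y) = P(Y/X)\,P(X)$$
- Dividing both sides by P(Y) gives P(X/Y) = P(Y/X) P(X) / P(Y).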
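The link between the joint probability and the two conditional probabilities can be checked numerically. The joint distribution below is invented for illustration, and exact fractions are used so the equality holds without rounding error.

```python
from fractions import Fraction

# Hypothetical joint distribution over two binary events (X, Y); sums to 1
joint = {
    (True, True): Fraction(3, 20),    # P(X ⋂ Y)
    (True, False): Fraction(5, 20),
    (False, True): Fraction(2, 20),
    (False, False): Fraction(10, 20),
}

# Marginal probabilities, obtained by summing the joint over the other event
p_x = sum(p for (x, _), p in joint.items() if x)
p_y = sum(p for (_, y), p in joint.items() if y)
p_xy = joint[(True, True)]

# Conditional probabilities from their definitions
p_x_given_y = p_xy / p_y
p_y_given_x = p_xy / p_x

# Both conditionals recover the same joint probability ...
assert p_x_given_y * p_y == p_y_given_x * p_x == p_xy
# ... so Bayes's theorem follows
assert p_x_given_y == p_y_given_x * p_x / p_y
print(p_x_given_y, p_y_given_x)
```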
Bayesian network
- A Bayesian network is a type of Probabilistic Graphical Model (PGM) that uses probability to reason about uncertainty.
- A Bayesian network is represented by a Directed Acyclic Graph (DAG). Like other statistical graphs, a DAG consists of a set of nodes and links, where the links signify connections between nodes.
Directed Acyclic Graph
- Here, the nodes represent random variables, and the edges define the relationships between these variables.
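A small network of this kind can be sketched with plain dictionaries. Everything in the sketch is an assumption made for illustration: the three variables (Rain, Sprinkler, WetGrass), the edges, and the probabilities. It relies on the standard factorization of a Bayesian network, in which the joint probability is the product of each node's probability given its parents.

```python
from itertools import product

# Hypothetical DAG: Rain -> WetGrass <- Sprinkler (each node lists its parents)
parents = {"Rain": [], "Sprinkler": [], "WetGrass": ["Rain", "Sprinkler"]}

# P(node is True | parent values); all numbers are made up
cpt = {
    "Rain": {(): 0.2},
    "Sprinkler": {(): 0.3},
    "WetGrass": {
        (True, True): 0.99,
        (True, False): 0.9,
        (False, True): 0.8,
        (False, False): 0.05,
    },
}


def joint_probability(assignment: dict) -> float:
    """Multiply each node's probability given the values of its parents."""
    prob = 1.0
    for node, node_parents in parents.items():
        key = tuple(assignment[p] for p in node_parents)
        p_true = cpt[node][key]
        prob *= p_true if assignment[node] else 1 - p_true
    return prob


# Probability that it rains, the sprinkler is off, and the grass is wet
example = joint_probability({"Rain": True, "Sprinkler": False, "WetGrass": True})
print(example)

# Sanity check: the joint probabilities of all assignments sum to 1
total = sum(
    joint_probability(dict(zip(parents, values)))
    for values in product([True, False], repeat=len(parents))
)
print(total)
```

Storing only the parent lists and the per-node tables keeps the representation compact: a node with no incoming edges needs a single number, while a node with two binary parents needs four.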