Bayesian Network
Introduction
In this article, I will introduce the Bayesian Network and its basic concepts, with related equations and examples. A Bayesian network is a graphical model, structured as a Directed Acyclic Graph (DAG), that is based on probability definitions and captures the conditional dependencies between variables and events (the chance of an event happening). Before continuing with the Bayesian Network, let me define some basic terms.
Terminology
- Joint probability: the likelihood of two events occurring together at the same time.
- Conditional probability: the likelihood of an event (from a random variable) given that another event has occurred. To compute it, we need the joint probability of the two events and the probability of the conditioning event. A conditional distribution arises when there is some evidence about the related variables.
- Marginal probability: the probability distribution of a subset of variables from a collection of random variables. In contrast to conditional probability, it does not depend on the values of the other variables.
- Normalization: a scaling method that rescales a set of values to a specific range (for instance, between zero and one), for example so that a set of probabilities sums to one.
- Marginalization: summing a probability over the possible values of a variable in order to obtain the marginal contribution of the remaining events.
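These definitions can be sketched on a toy joint distribution of two binary variables; the table values below are made up purely for illustration:

```python
# Joint probability table P(A, B) for two binary variables A and B.
joint = {
    (0, 0): 0.30, (0, 1): 0.20,
    (1, 0): 0.10, (1, 1): 0.40,
}

# Marginalization: sum the joint over B to obtain the marginal P(A).
p_a = {a: sum(p for (ai, b), p in joint.items() if ai == a) for a in (0, 1)}

# Conditional probability: P(B | A=1) = P(A=1, B) / P(A=1).
# Dividing by P(A=1) is exactly the normalization step that makes the
# two entries sum to one.
p_b_given_a1 = {b: joint[(1, b)] / p_a[1] for b in (0, 1)}

print(p_a)           # {0: 0.5, 1: 0.5}
print(p_b_given_a1)  # {0: 0.2, 1: 0.8}
```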
Bayesian Network
A Bayesian Network is a network of connected nodes in which each node is a random variable and each directed edge encodes a conditional dependency between the variables. A simple Bayesian network is shown in the next figure.
The probability of Z given W follows from the definition of conditional probability:

P(Z | W) = P(Z, W) / P(W)
There are two important tasks when working with a Bayesian network: determining the joint distribution of the problem and computing queries. Answering a query amounts to computing a conditional distribution over the query variables given the evidence. There are several strategies to compute such a query, including variable elimination, particle filtering, and Gibbs sampling. Note that it is important to normalize the conditional probability during the calculation.
Variable elimination
Variable elimination is a useful exact-inference approach in probabilistic graphical models such as Markov chains and Bayesian networks. The idea is to apply factoring over the variables: you sum a specific variable out of the product of factors that mention it, converting part of the probability equation into a new, smaller function (a factor). However, on densely connected networks this method can be time- and resource-consuming.
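As a sketch of the idea, assume a tiny chain network A → B → C with made-up probability tables; eliminating A and then B yields the marginal of C:

```python
# Factors as tables for binary variables: P(A), P(B|A), P(C|B).
# All numbers are illustrative.
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}  # key: (a, b)
p_c_given_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}  # key: (b, c)

# Eliminate A: f1(b) = sum_a P(a) * P(b|a)
f1 = {b: sum(p_a[a] * p_b_given_a[(a, b)] for a in (0, 1)) for b in (0, 1)}

# Eliminate B: f2(c) = sum_b f1(b) * P(c|b)
f2 = {c: sum(f1[b] * p_c_given_b[(b, c)] for b in (0, 1)) for c in (0, 1)}

print(f2)  # the marginal distribution P(C)
```

Each elimination step replaces a variable with a smaller intermediate factor, which is what keeps the computation below the cost of enumerating the full joint table.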
Gibbs sampling
In Gibbs sampling, we repeatedly select a variable and sample a new value for it conditioned on all the other variables; by the Markov blanket property, this only requires conditioning on the variable's parents, its children, and its children's other parents. The sampling distribution is evaluated across all configurations of the Markov blanket, and repeating the procedure over all variables yields samples that approximate the target distribution.
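A minimal sketch of Gibbs sampling, assuming a hypothetical network with edges A → B and A → C and evidence C = 1 (all probabilities are illustrative):

```python
import random

random.seed(0)

P_A1 = 0.5
P_B1_given_A = {0: 0.2, 1: 0.8}
P_C1_given_A = {0: 0.1, 1: 0.9}

def p_a1_given_blanket(b):
    # P(A=1 | b, C=1) is proportional to P(A=1) * P(b|A=1) * P(C=1|A=1);
    # C = 1 is fixed evidence, so it enters every weight.
    w = {}
    for a in (0, 1):
        pa = P_A1 if a == 1 else 1 - P_A1
        pb = P_B1_given_A[a] if b == 1 else 1 - P_B1_given_A[a]
        w[a] = pa * pb * P_C1_given_A[a]
    return w[1] / (w[0] + w[1])  # normalization step

a, b = 0, 0        # arbitrary initial state
count_b1 = 0
n = 20000
for _ in range(n):
    # Resample A given its Markov blanket (B and the evidence C).
    a = 1 if random.random() < p_a1_given_blanket(b) else 0
    # B's Markov blanket is just its parent A.
    b = 1 if random.random() < P_B1_given_A[a] else 0
    count_b1 += b

print(count_b1 / n)  # estimate of P(B=1 | C=1); the exact value is 0.74
```

Note that each variable is resampled from its conditional given the current values of its Markov blanket, and the evidence variable is never resampled.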
As a summary:
The Bayesian Network is a Directed Acyclic Graph that reveals the conditional dependencies between different variables. The nodes represent the variables, and the edges represent the conditional relationships.
Implementation
To implement a causal model based on the DAG structure, the following steps should be considered:
- Fit the dataset into the DAG
- Return conditional probabilities
- Compute the query using one of the above methods
- Predict the causal effect
You can construct the DAG by specifying its edges, and there are many Python methods for doing so. One example is a DAG in which the dataset has four features and the graph has a single edge.
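As a library-free sketch, a DAG like the one described (four features, one edge) can be specified by its edge list and checked for acyclicity; the node names here are illustrative:

```python
nodes = ['0', '1', '2', '3']
edges = [('0', '1')]  # one directed edge: feature '0' points to feature '1'

# Adjacency list; features '2' and '3' are isolated nodes in this example.
adj = {n: [] for n in nodes}
for parent, child in edges:
    adj[parent].append(child)

def is_acyclic(adj):
    """Depth-first search for a back edge; a DAG must have none."""
    state = {n: 0 for n in adj}  # 0 = unvisited, 1 = on stack, 2 = done
    def visit(n):
        state[n] = 1
        for m in adj[n]:
            if state[m] == 1 or (state[m] == 0 and not visit(m)):
                return False  # found a cycle
        state[n] = 2
        return True
    return all(visit(n) for n in adj if state[n] == 0)

print(is_acyclic(adj))  # True
```

In practice, a graph library or a Bayesian-network package can build the same structure directly from the edge list, so you rarely need to write this by hand.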
Once you have the dependencies, you can write down the conditional probabilities and apply a Python or R library to compute the inference.
R Example
```r
# R code: learn the structure with hill-climbing, then fit the parameters.
library(bnlearn)

data <- read.csv()  # file path omitted in the original
net  <- hc(data)    # learn the DAG structure from the data
fit  <- bn.fit(net, data, method = "bayes")
fit
```
Python Example
```python
import pandas as pd
import bnlearn as bn

df = pd.read_csv()  # file path omitted in the original

# Learn a naive Bayes structure rooted at feature "0", then learn the
# conditional probability tables before querying.
model = bn.structure_learning.fit(df, methodtype='naivebayes', root_node="0")
model = bn.parameter_learning.fit(model, df)

# Query P('1', '2' | '0' = 1).
query = bn.inference.fit(model, variables=['1', '2'], evidence={'0': 1}, verbose=0)
```