Markov chain

AI Maverick
5 min read · Jan 25, 2023


A Markov chain is a mathematical model that describes a sequence of possible events in which the probability of each event depends only on the state attained in the previous event. It is a type of stochastic process and is often used to model random processes in physics, chemistry, and other fields. Markov chains also have many practical applications in computer science, operations research, and finance.

Introduction

The concept of a Markov chain was first introduced by the Russian mathematician Andrei Andreyevich Markov in his 1906 paper “Extension of the law of large numbers.” In the paper, Markov considered a sequence of trials where the probability of success on each trial depends only on the outcome of the previous trial. He used this concept to extend the law of large numbers, which was a fundamental result in probability theory at the time.

Markov’s work was primarily focused on the mathematical theory of the Markov chain, and it did not immediately find many practical applications. However, in the 1920s and 1930s, Markov chains began to be used in a variety of fields, including genetics, economics, and electrical engineering.

In the 1940s and 1950s, the study of Markov chains was further developed by mathematicians such as Norbert Wiener, John von Neumann, and Claude Shannon. Wiener used Markov chains to study the behavior of electrical noise in communication systems, while von Neumann and Shannon used them to analyze the behavior of computer memories.

In the 1960s and 1970s, Markov chains were applied in a wide range of fields including physics, chemistry, and engineering. In the field of computer science, Markov chains were used to model the behavior of computer systems, and in the field of operations research, they were used to model the behavior of manufacturing and transportation systems.

In recent years, Markov chains have found many new applications in fields such as natural language processing, finance, and image processing. With the advancement of technology, the ability to collect, store, and analyze large amounts of data has led to an increase in the use of Markov chains in various fields.

Overall, the Markov chain has evolved over the past century and found applications in many fields. It is considered one of the most important concepts in probability and stochastic processes.

Markov Chain

In a Markov chain, the states of the system change over time according to a set of probabilistic rules. The key property of a Markov chain is that the probability of transitioning to any particular state depends only on the current state, not on the sequence of states that preceded it; how the system arrived at its current state is irrelevant to its future behavior.

A Markov chain can be represented using a graph, with the states of the system represented as nodes and the transitions between states represented as edges, with probabilities of those edges. These probabilities can be represented using a transition probability matrix.
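As a minimal sketch of this representation, the toy two-state chain below (a hypothetical "sunny"/"rainy" weather model with made-up probabilities) stores the transition matrix as a nested list and simulates a short walk, where each step depends only on the current state:

```python
import random

# Hypothetical two-state weather chain; the probabilities are illustrative.
states = ["sunny", "rainy"]

# P[i][j] is the probability of moving from state i to state j in one step.
P = [
    [0.8, 0.2],  # from sunny: stay sunny with 0.8, turn rainy with 0.2
    [0.4, 0.6],  # from rainy: turn sunny with 0.4, stay rainy with 0.6
]

def simulate(P, start, steps, rng):
    """Walk the chain: each transition looks only at the current state."""
    state, path = start, [start]
    for _ in range(steps):
        state = rng.choices(range(len(P)), weights=P[state])[0]
        path.append(state)
    return path

path = simulate(P, 0, 10, random.Random(0))
```

Note that each row of the transition matrix must sum to 1, since from any state the chain must go *somewhere* in the next step.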

Markov chains come in two types, the discrete-time Markov chain (DTMC) and the continuous-time Markov chain (CTMC), depending on the time scale of the system. There are also several variations, such as regular, absorbing, and ergodic Markov chains.

Markov chains are widely used in fields such as natural language processing, image processing, queueing theory, and financial modeling. In natural language processing they are used for language modeling and speech recognition; in finance they are used to model stock prices and other financial time series.

Questions about a Markov chain can be answered with various methods, such as computing steady-state probabilities or first passage and hitting times.
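One of these quantities, the first passage time, can be estimated by straightforward Monte Carlo simulation. The sketch below uses an illustrative two-state chain (the matrix is an assumption, not taken from the text) and averages many simulated runs:

```python
import random

def first_passage_time(P, start, target, rng):
    """Count steps until the chain first reaches `target` from `start`."""
    state, steps = start, 0
    while state != target:
        state = rng.choices(range(len(P)), weights=P[state])[0]
        steps += 1
    return steps

P = [[0.8, 0.2], [0.4, 0.6]]  # illustrative two-state chain
rng = random.Random(42)
est = sum(first_passage_time(P, 0, 1, rng) for _ in range(10_000)) / 10_000
```

For this particular chain the first passage time from state 0 to state 1 is geometric with success probability 0.2, so the exact expectation is 1/0.2 = 5, and the Monte Carlo estimate should land close to that.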

Methodology

A Markov chain is a sequence of random variables $X_1, X_2, X_3, \dots$ in which the probability of transitioning from state $X_n$ to state $X_{n+1}$ depends only on the current state $X_n$. For a time-homogeneous chain, this is described by a set of transition probabilities $P(i, j)$, each giving the probability of moving from state $i$ to state $j$ in one step. The matrix of all transition probabilities over all states is called the transition probability matrix.

For a discrete-time Markov chain, the probability of being in state $j$ at time $n$, given that the chain started in state $i$ at time 0, is the $(i,j)$ entry of the $n$-step transition matrix $P^n$. It satisfies the recursion:

$P^n(i,j) = \sum_k P^{n-1}(i,k) \, P(k,j)$

where $P(k,j)$ is the one-step transition probability from state $k$ to state $j$, and the sum runs over all intermediate states $k$. This recursion, a special case of the Chapman–Kolmogorov equation, follows from the law of total probability.
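The n-step recursion above can be sketched in plain Python by building $P^n$ through repeated one-step multiplication; the two-state transition matrix used here is an illustrative assumption:

```python
def mat_mul(A, B):
    """Multiply two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def n_step(P, n):
    """n-step transition matrix P^n via repeated one-step multiplication,
    i.e. P^n = P^(n-1) · P, the recursion in the text."""
    result = P
    for _ in range(n - 1):
        result = mat_mul(result, P)
    return result

P = [[0.8, 0.2], [0.4, 0.6]]  # illustrative two-state chain
P2 = n_step(P, 2)  # P2[0][0] = 0.8*0.8 + 0.2*0.4 = 0.72
```

Each entry of `P2` sums the probabilities of every two-step path through an intermediate state, which is exactly the sum over $k$ in the formula.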

For a continuous-time Markov chain, the probability of being in state $j$ at time $t$, given that the chain started in state $i$ at time 0, is written $P_t(i,j)$. These transition probabilities satisfy the continuous-time Chapman–Kolmogorov equation:

$P_{s+t}(i,j) = \sum_k P_s(i,k) \, P_t(k,j)$

where the sum runs over all intermediate states $k$: to travel from $i$ to $j$ in time $s+t$, the chain must pass through some state $k$ at time $s$.

Both formulas above are instances of the Chapman–Kolmogorov equation, the key relation governing how a Markov chain evolves over time. Other quantities used to analyze Markov chains include steady-state probabilities, first passage times, and hitting times.
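As one example of such an analysis, a steady-state (stationary) distribution can be approximated by power iteration, repeatedly applying the update $\pi \leftarrow \pi P$ until it stops changing. This sketch assumes the chain is ergodic so the iteration converges; the example matrix is again illustrative:

```python
def steady_state(P, iters=500):
    """Approximate the stationary distribution pi (satisfying pi = pi · P)
    by power iteration. Assumes an ergodic chain so the iteration converges."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

P = [[0.8, 0.2], [0.4, 0.6]]  # illustrative two-state chain
pi = steady_state(P)
```

For this matrix the stationary distribution can be solved by hand from $\pi = \pi P$ and $\pi_0 + \pi_1 = 1$, giving $\pi = (2/3,\ 1/3)$, which the iteration recovers.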

Implementation in Python

There are several Python libraries that can be used to implement Markov chains, some of the most popular ones include:

  1. pymc3: PyMC3 is a powerful library for building probabilistic models and performing Bayesian inference. It includes functions for building Markov chains and other stochastic processes.
  2. markovify: Markovify is a simple, extensible Markov chain generator. It can be used to generate random text, such as poetry or song lyrics, based on an existing text corpus.
  3. networkx: NetworkX is a powerful library for working with graphs and networks. It includes functions for building Markov chains and other types of graphs, as well as for analyzing and visualizing graph data.
  4. transitions: Transitions is a lightweight, object-oriented Python library for creating finite-state machines, which can also be used as the backbone of a simple Markov chain.
  5. markov: A simple Python library for working with Markov chains. It provides a simple way to create, manipulate, and analyze Markov chains.

markovify implementation

Below is an example of how to use the markovify library to generate random text based on a given corpus of text:

import markovify

# Read in a corpus of text
with open("corpus.txt", "r") as f:
    text = f.read()

# Build the Markov chain
model = markovify.Text(text)

# Generate a random sentence from the Markov chain
print(model.make_sentence())

This code reads in the text from a file called “corpus.txt” and creates a Markov chain model using the markovify.Text class. The make_sentence() method is then used to generate a random sentence based on the model.

You can also set parameters such as state_size, which determines how many preceding words influence the next word’s probability.

model = markovify.Text(text, state_size=3)
