Chapter 3 Markov Chain: Definition and Basic Properties (Lecture on 01/12/2021)

The next few chapters will be mainly about discrete time, discrete state space stochastic processes, mainly in the context of Markov chains.

Let \(X=\{X(t,\omega):t\in T,\omega\in\Omega\}\) with \(X(t,\omega)\in S\), and consider the case where both \(T\) and \(S\) are discrete. Define the random variable \(X_t(\omega)=X(t,\omega)\) and consider the sequence of random variables \(\{X_0,X_1,\cdots\}\) taking values in some countable set \(S\), called the state space. Each \(X_n\) is a discrete random variable that takes one of \(N\) possible values, where \(N=|S|\); we allow \(N=\infty\). This is the setup for a discrete time, discrete state stochastic process.

Definition 3.1 (Markov Chain) The process \(X=\{X_0,X_1,\cdots\}\) is a Markov chain if it satisfies the Markov condition: \[\begin{equation} P(X_{n}=s|X_0=x_0,X_1=x_1,\cdots,X_{n-1}=x_{n-1})=P(X_{n}=s|X_{n-1}=x_{n-1}),\quad\forall s \tag{3.1} \end{equation}\] for all \(n\geq 1\) and all \(x_0,\cdots,x_{n-1},s\in S\).

The Markov property described in this way is equivalent to
\[\begin{equation} P(X_{n}=s|X_{n_1}=x_{n_1},\cdots,X_{n_k}=x_{n_k})=P(X_{n}=s|X_{n_k}=x_{n_k}),\quad\forall s \tag{3.2} \end{equation}\] for all \(n_1<n_2<\cdots<n_k\leq n-1\).

We have assumed that \(X\) takes values in some countable set \(S\). Since \(S\) is countable, it can be put in one-to-one correspondence with some subset \(S^{\prime}\) of the integers. Thus, without loss of generality, we can say the following: if \(X_n=i\), the chain is in the \(i\)th state at the \(n\)th time point.

The evolution of a chain is described by the transition probabilities, defined as \(P(X_{n+1}=j|X_n=i)\). This probability may depend on \(n,i,j\). We will restrict our attention to the case where the transition probabilities do not depend on \(n\).

Definition 3.2 (Homogeneous Chain) The chain \(X=\{X_0,\cdots\}\) is called homogeneous (time-homogeneous) if \(P(X_{n+1}=j|X_n=i)=P(X_1=j|X_0=i),\forall n,i,j\). For a homogeneous chain, we define the transition matrix \(P=\{p_{ij}\}_{i,j=1}^{|S|}\) as an \(|S|\times|S|\) matrix of transition probabilities \(p_{ij}=P(X_{n+1}=j|X_n=i)\).
The beauty of the homogeneity assumption is that we can specify the distribution of the whole stochastic process by specifying the transition matrix (together with an initial distribution). Suppose the stochastic process has an infinite index set \(T\) and a finite state space \(S\). Under homogeneity, the problem of specifying the infinite-dimensional distribution of \((x_0,x_1,\cdots)\) reduces to the problem of specifying a finite-dimensional matrix \(P\). This simplifies the problem considerably.

Theorem 3.1 (Properties of Transition Probability Matrix) If \(\mathbf{P}\) is a transition probability matrix, then

  1. \(0\leq p_{ij}\leq 1\), \(\forall i,j\).

  2. \(\sum_{j}p_{ij}=1\), \(\forall i\).
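The two properties in Theorem 3.1 are easy to verify numerically. A minimal sketch, using a small 3-state matrix whose entries are hypothetical (chosen for illustration, not taken from the text):

```python
import numpy as np

# A hypothetical 3-state transition matrix: rows index the current
# state i, columns index the next state j.
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Property 1: every entry is a probability, 0 <= p_ij <= 1.
assert np.all((P >= 0) & (P <= 1))

# Property 2: each row sums to 1 -- from state i the chain must
# move to *some* state j.
assert np.allclose(P.sum(axis=1), 1.0)
```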
Definition 3.3 (The n-step Transition Probability Matrix) The n-step transition probability matrix is defined as \(P(m,m+n)=\{p_{ij}(m,m+n)\}_{i,j=1}^{|S|}\), where \(p_{ij}(m,m+n)=P(X_{m+n}=j|X_m=i)\).

Theorem 3.2 (Chapman-Kolmogorov Equation) \[\begin{equation} p_{ij}(m,m+n+r)=\sum_{k}p_{ik}(m,m+n)p_{kj}(m+n,m+n+r),\quad \forall r \tag{3.3} \end{equation}\]

Intuitively, this means the \((n+r)\)-step transition probability matrix can be decomposed into an \(n\)-step and an \(r\)-step transition probability matrix.
Proof. \[\begin{equation} \begin{split} p_{ij}(m,m+n+r)&=P(X_{m+n+r}=j|X_m=i)=\sum_k P(X_{m+n+r}=j,X_{m+n}=k|X_m=i)\\ &=\sum_k P(X_{m+n+r}=j|X_{m+n}=k,X_m=i)P(X_{m+n}=k|X_m=i)\\ &=\sum_k P(X_{m+n+r}=j|X_{m+n}=k)P(X_{m+n}=k|X_m=i) \quad \text{(by the Markov property)}\\ &=\sum_k p_{kj}(m+n,m+n+r)p_{ik}(m,m+n) \end{split} \end{equation}\]

Since the transition probability matrix is \(P(m,m+n+r)=\{p_{ij}(m,m+n+r)\}_{i,j=1}^{|S|}\), the Chapman-Kolmogorov equation tells us \[\begin{equation} P(m,m+n+r)=P(m,m+n)P(m+n,m+n+r),\quad \forall n,m,r \tag{3.4} \end{equation}\] Specifically, we can take \(r=n=1\) to get \[\begin{equation} P(m,m+2)=P(m,m+1)P(m+1,m+2),\quad \forall m \tag{3.5} \end{equation}\] If we further assume time homogeneity, then \(P(m,m+1)=P(m+1,m+2)=P\) and (3.5) becomes \[\begin{equation} P(m,m+2)=P^2 \tag{3.6} \end{equation}\] In general, we have \[\begin{equation} P(m,m+n)=P^n,\quad \forall n \tag{3.7} \end{equation}\] That is, if the one-step transition probability matrix is \(P=\{p_{ij}\}_{i,j=1}^{|S|}\) where \(p_{ij}=P(X_{n+1}=j|X_n=i)\), then \(P(X_{m+n}=j|X_m=i)=(P^n)_{i,j}\), where \((P^n)_{i,j}\) denotes the \((i,j)\)th entry of \(P^n\).
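Equation (3.7) says that computing \(n\)-step transition probabilities for a homogeneous chain reduces to matrix powering. A minimal sketch, with a hypothetical 3-state matrix:

```python
import numpy as np

# Hypothetical one-step transition matrix (illustrative entries).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# For a homogeneous chain, P(m, m+n) = P^n for every starting time m.
P2 = np.linalg.matrix_power(P, 2)

# Chapman-Kolmogorov with n = r = 1: p_ij(m, m+2) = sum_k p_ik p_kj,
# which is exactly ordinary matrix multiplication.
assert np.allclose(P2, P @ P)

# (P^n)_{ij} = P(X_{m+n} = j | X_m = i); e.g. the probability of
# moving from state 0 to state 2 in two steps:
print(P2[0, 2])  # 0.5*0.2 + 0.3*0.3 + 0.2*0.6 = 0.31
```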

Lemma 3.1 Let \(\mu_i(n)=P(X_n=i)\), that is the marginal probability of \(X_n\) takes the \(i\)th state. Write \(\boldsymbol{\mu}(n)\) as the row vector \((\mu_i(n),i\in S)\), then \[\begin{equation} \boldsymbol{\mu}(m+n)=\boldsymbol{\mu}(m)P^n \tag{3.8} \end{equation}\]
This lemma gives the relationship between the marginal probability vector of \(X\) at time \(m\) and at time \(m+n\).
Proof. \[\begin{equation} \begin{split} \mu_j(m+n)&=P(X_{m+n}=j)=\sum_iP(X_{m+n}=j|X_m=i)P(X_m=i)\\ &=\sum_ip_{ij}(m,m+n)\mu_i(m)\\ &=\sum_i(P^n)_{i,j}\mu_i(m)=(\boldsymbol{\mu}(m)P^n)_j \end{split} \end{equation}\] Since this is true for all \(j\in S\), we have \(\boldsymbol{\mu}(m+n)=\boldsymbol{\mu}(m)P^n\).
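Lemma 3.1 can be checked numerically: given an initial marginal distribution, the marginal at a later time is obtained by right-multiplying by a power of \(P\). A sketch, where the 3-state matrix and the initial distribution are both hypothetical:

```python
import numpy as np

# Hypothetical one-step transition matrix (illustrative entries).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

# Row vector mu(0) = (P(X_0 = i), i in S); here we assume the chain
# starts in state 0 with certainty.
mu0 = np.array([1.0, 0.0, 0.0])

# Lemma 3.1: mu(m+n) = mu(m) P^n.  Propagate the marginal 5 steps.
mu5 = mu0 @ np.linalg.matrix_power(P, 5)

# The result is still a probability vector over the states.
assert np.isclose(mu5.sum(), 1.0)
assert np.all(mu5 >= 0)
```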
Example 3.1 (Simple Random Walk) Suppose \(X_1,X_2,\cdots\) are i.i.d. with \(X_n=\left\{\begin{aligned} & 1 & \text{w.p. } p \\ & -1 & \text{w.p. } 1-p \end{aligned}\right.\) for all \(n\in\mathbb{N}\). Consider the stochastic process given by \(S_n(\omega)=X_1(\omega)+\cdots+X_n(\omega)\). The state space of this stochastic process is \(S=\{0,\pm 1,\pm 2,\cdots\}\), and \(S_n\) is a Markov chain. The one-step transition probability is given by \(P(S_n=j|S_{n-1}=i)=\left\{\begin{aligned} & p & j=i+1 \\ & 1-p & j=i-1 \\ & 0 & \text{o.w.} \end{aligned}\right.\) For the n-step transition probability, we are interested in \(p_{ij}(n)=P(S_n=j|S_0=i)\). Suppose there are \(a\) upward moves and \(b\) downward moves; then \[\begin{equation} \left\{\begin{aligned} & a+b=n \\ & a-b=j-i \end{aligned}\right.\Longrightarrow \left\{\begin{aligned} & a=\frac{n+j-i}{2} \\ & b=\frac{n-j+i}{2} \end{aligned}\right. \tag{3.9} \end{equation}\] The n-step transition probability is therefore \[\begin{equation} p_{ij}(n)=P(S_n=j|S_0=i)=\left\{\begin{aligned} & {n \choose a}p^a(1-p)^b & n+j-i\ \text{even and}\ |j-i|\leq n \\ & 0 & \text{otherwise} \end{aligned}\right. \tag{3.10} \end{equation}\] where \(a\) and \(b\) are given by (3.9).
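Formula (3.10) can be implemented directly. A minimal sketch; the function name `p_ij` and the parameter values below are ours, chosen for illustration:

```python
import numpy as np
from math import comb

def p_ij(n, i, j, p):
    """n-step transition probability of the simple random walk, per (3.10)."""
    # The walk can only be at j after n steps if n + j - i is even
    # and j is within n steps of i.
    if (n + j - i) % 2 != 0 or abs(j - i) > n:
        return 0.0
    a = (n + j - i) // 2          # number of upward moves
    b = (n - j + i) // 2          # number of downward moves
    return comb(n, a) * p**a * (1 - p)**b

# Sanity check: starting from i, after n steps the walk must be
# *somewhere*, so the probabilities over all reachable j sum to 1.
n, i, p = 6, 0, 0.3
total = sum(p_ij(n, i, j, p) for j in range(i - n, i + n + 1))
assert np.isclose(total, 1.0)

# Wrong parity gives probability 0 (here n + j - i = 9 is odd).
assert p_ij(6, 0, 3, 0.3) == 0.0
```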
Example 3.2 (Ehrenfest Diffusion Model) Suppose there are a total of \(2A\) balls in two boxes, labeled \(b\) and \(B\). At each time step, we choose a ball at random and shift it from its box of origin to the other box. Let \(X_n\) be the number of balls in box \(b\) at time \(n\); then \(X_n\) is a Markov chain. We have \[\begin{equation} P(X_{n+1}=A+j|X_n=A+i)=\left\{\begin{aligned} & \frac{A-i}{2A} & j=i+1 \\ & \frac{A+i}{2A} & j=i-1 \end{aligned}\right. \tag{3.11} \end{equation}\] for all \(i=-A,\cdots,A\).
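The transition matrix (3.11) is easy to build explicitly for small \(A\). A sketch with a hypothetical \(A=3\) (so \(2A=6\) balls and \(2A+1=7\) states):

```python
import numpy as np

A = 3                     # hypothetical; 2A = 6 balls in total
size = 2 * A + 1          # states indexed by i = -A, ..., A,
                          # where box b contains A + i balls

P = np.zeros((size, size))
for idx, i in enumerate(range(-A, A + 1)):
    if i < A:             # box b not full: a ball can move into it
        P[idx, idx + 1] = (A - i) / (2 * A)
    if i > -A:            # box b not empty: a ball can leave it
        P[idx, idx - 1] = (A + i) / (2 * A)

# The two cases in (3.11) exhaust all possibilities, so each row sums to 1.
assert np.allclose(P.sum(axis=1), 1.0)

# At i = -A (box b empty) the chain moves up with probability 1.
assert P[0, 1] == 1.0
```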

Definition 3.4 (Persistent State) State \(i\) is called persistent (recurrent) if \(P(X_n=i\, \text{for some}\, n\geq 1|X_0=i)=1\). That is, the probability that the chain eventually returns to \(i\), having started from \(i\), is 1.

If this probability is strictly less than 1, state \(i\) is called transient.

We will be interested in the first passage time probabilities, defined as \(f_{ij}(n)=P(X_1\neq j,\cdots,X_{n-1}\neq j,X_n=j|X_0=i)\). This is the probability that state \(j\) is first visited from state \(i\) at time \(n\). Write \(f_{ij}=\sum_{n=1}^{\infty}f_{ij}(n)\); this is the probability that state \(j\) is ever visited from state \(i\). If \(f_{ij}=1\), we are interested in the constraints this implies on the transition probabilities.
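The quantities \(f_{ij}(n)\) satisfy the recursion \(f_{ij}(1)=p_{ij}\) and \(f_{ij}(n)=\sum_{k\neq j}p_{ik}\,f_{kj}(n-1)\) for \(n\geq 2\) (condition on the first step, which must avoid \(j\)). A sketch that sums the series numerically for a hypothetical 3-state chain; since that chain is finite and irreducible, every \(f_{i2}\) should come out (numerically) equal to 1:

```python
import numpy as np

# Hypothetical 3-state transition matrix (illustrative entries).
P = np.array([
    [0.5, 0.3, 0.2],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
])

def first_passage(P, j, n_max):
    """f_ij(n) for n = 1..n_max via the recursion
    f_ij(1) = p_ij,  f_ij(n) = sum_{k != j} p_ik f_kj(n-1)."""
    size = P.shape[0]
    f = np.zeros((n_max + 1, size))   # f[n, i] = f_ij(n)
    f[1] = P[:, j]
    for n in range(2, n_max + 1):
        for i in range(size):
            f[n, i] = sum(P[i, k] * f[n - 1, k]
                          for k in range(size) if k != j)
    return f[1:]

# f_ij = sum_n f_ij(n): probability that j is ever visited from i.
f = first_passage(P, j=2, n_max=200)
print(f.sum(axis=0))   # each entry is numerically 1 for this chain
```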