Recall the Gambler’s ruin problem from the last session. Suppose someone plays the following gambling game. They bet \(1\) pound on a fair flip of a coin, winning \(1\) pound if it comes up heads and losing \(1\) pound if it comes up tails. They start with \(1\) pound, and stop playing once they have reached \(5\) pounds or have no money left. Express this problem as a Markov chain; in particular, specify the state space and the transition matrix.
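For concreteness, here is one way to encode this chain in R, in the style of the examples below (a sketch; the name G and the choice to index fortunes \(0, \ldots, 5\) by rows \(1, \ldots, 6\) are ours):
G <- matrix(0, nrow = 6, ncol = 6)   # states 0,1,...,5: the current fortune
G[1, 1] <- 1                         # state 0 is absorbing: ruin
G[6, 6] <- 1                         # state 5 is absorbing: target reached
for (i in 2:5) {
  G[i, i - 1] <- 1/2                 # lose the bet: down one pound
  G[i, i + 1] <- 1/2                 # win the bet: up one pound
}
G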
Consider the transition matrix \(P\) on the states \(\{1,2,3,4,5\}\) given by:
P <- matrix(c(0, 1/2, 1/2, 0, 0,
              1/2, 0, 0, 0, 1/2,
              1/2, 0, 0, 1/2, 0,
              0, 0, 0, 1/2, 1/2,
              1/3, 1/3, 0, 0, 1/3), nrow = 5)
P <- t(P)  # the entries above are listed row by row, so transpose
P
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.0000000 0.5000000 0.5 0.0 0.0000000
## [2,] 0.5000000 0.0000000 0.0 0.0 0.5000000
## [3,] 0.5000000 0.0000000 0.0 0.5 0.0000000
## [4,] 0.0000000 0.0000000 0.0 0.5 0.5000000
## [5,] 0.3333333 0.3333333 0.0 0.0 0.3333333
Compute powers of \(P\) until you see that all the entries are positive.
What does it mean if all the entries are positive?
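One way to carry out this computation (a sketch; the while loop is our own shorthand and not part of the original notes):
Pn <- P
n <- 1
while (any(Pn == 0)) {   # keep multiplying until every entry is positive
  Pn <- Pn %*% P
  n <- n + 1
}
n    # the first power of P with all entries positive
Pn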
Recall that \(\lambda\) is a probability measure on \(\Omega\) if \(\lambda_i \geq 0\) for all \(i \in \Omega\) and \(\sum_{i \in \Omega} \lambda_i = 1\). Let \(\lambda\) and \(\mu\) be probability measures on \(\Omega\).
Show that \[\rho = (1-t)\lambda + t \mu\] is a probability measure on \(\Omega\) for all \(t \in [0,1]\).
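For reference, the verification is immediate: each \(\rho_i = (1-t)\lambda_i + t\mu_i\) is a sum of nonnegative terms, and \[ \sum_{i \in \Omega} \rho_i = (1-t)\sum_{i \in \Omega} \lambda_i + t \sum_{i \in \Omega} \mu_i = (1-t) + t = 1. \]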
Let \(P\) be a transition matrix on a state space \(S\). Show that if \(\lambda\) and \(\mu\) are stationary measures of \(P\), then any convex combination of \(\lambda\) and \(\mu\) is a stationary measure for \(P\).
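A one-line sketch of the argument: writing \(\rho = (1-t)\lambda + t\mu\) with \(t \in [0,1]\), the previous exercise shows that \(\rho\) is a probability measure, and by linearity \[ \rho P = (1-t)\lambda P + t \mu P = (1-t)\lambda + t\mu = \rho. \]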
Consider the transition matrix on four states given by \[ \left( \begin{array}{cccc} 1/2 & 1/2 & 0 & 0 \\ 1/2 & 1/2 & 0 & 0 \\ 0 & 0 & 1/3 & 2/3 \\ 0 & 0 & 1/2 & 1/2 \\ \end{array} \right) \]
Find three stationary measures for \(P\).
Does \(P\) have finitely many stationary measures?
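Mirroring the eigenvector computations below, one can look for stationary measures numerically (a sketch; the names P4 and E4 are introduced here):
P4 <- matrix(c(1/2, 1/2, 0, 0,
               1/2, 1/2, 0, 0,
               0, 0, 1/3, 2/3,
               0, 0, 1/2, 1/2), nrow = 4, byrow = TRUE)
E4 <- eigen(t(P4))
E4$values                 # the eigenvalue 1 appears twice
v1 <- E4$vectors[, 1]
v1 / sum(v1)              # normalise: one stationary distribution
v2 <- E4$vectors[, 2]
v2 / sum(v2)              # a second one; convex combinations give still more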
Consider the transition matrix \(P\) on the states \(\{1,2,3,4\}\) given by:
P <- matrix(c(0, 0, 1, 0,
              0, 0, 0, 1,
              1/2, 1/2, 0, 0,
              1, 0, 0, 0), nrow = 4)
P <- t(P)  # again the entries are listed row by row
P
## [,1] [,2] [,3] [,4]
## [1,] 0.0 0.0 1 0
## [2,] 0.0 0.0 0 1
## [3,] 0.5 0.5 0 0
## [4,] 1.0 0.0 0 0
Q <- P %*% P   # save the square of P for later
P %*% P %*% P
## [,1] [,2] [,3] [,4]
## [1,] 0.00 0.00 0.5 0.5
## [2,] 0.00 0.00 1.0 0.0
## [3,] 0.75 0.25 0.0 0.0
## [4,] 0.50 0.50 0.0 0.0
P %*% P %*% P %*% P
## [,1] [,2] [,3] [,4]
## [1,] 0.75 0.25 0.00 0.00
## [2,] 0.50 0.50 0.00 0.00
## [3,] 0.00 0.00 0.75 0.25
## [4,] 0.00 0.00 0.50 0.50
P %*% P %*% P %*% P %*% P
## [,1] [,2] [,3] [,4]
## [1,] 0.000 0.000 0.75 0.25
## [2,] 0.000 0.000 0.50 0.50
## [3,] 0.625 0.375 0.00 0.00
## [4,] 0.750 0.250 0.00 0.00
tP <- t(P)    # stationary measures are left eigenvectors of P
E <- eigen(tP)
E
## eigen() decomposition
## $values
## [1] 1+0.0000000i -1+0.0000000i 0+0.7071068i 0-0.7071068i
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] 0.6324555+0i 0.6324555+0i 0.0000000-0.4082483i 0.0000000+0.4082483i
## [2,] 0.3162278+0i 0.3162278+0i 0.0000000+0.4082483i 0.0000000-0.4082483i
## [3,] 0.6324555+0i -0.6324555+0i -0.5773503+0.0000000i -0.5773503+0.0000000i
## [4,] 0.3162278+0i -0.3162278+0i 0.5773503+0.0000000i 0.5773503+0.0000000i
EE <- E$vectors[, 1]   # the eigenvector for the eigenvalue 1
EE
## [1] 0.6324555+0i 0.3162278+0i 0.6324555+0i 0.3162278+0i
F <- EE / sum(EE)      # normalise so the entries sum to 1
F
## [1] 0.3333333+0i 0.1666667+0i 0.3333333+0i 0.1666667+0i
F %*% P    # check stationarity: we recover F
## [,1] [,2] [,3] [,4]
## [1,] 0.3333333+0i 0.1666667+0i 0.3333333+0i 0.1666667+0i
step <- function(i) {
  # Sample the next state from state i by inverting the CDF of row i of P.
  q <- P[i, ]
  u <- runif(1)
  cumq <- cumsum(q)
  j <- 1
  while (u > cumq[j]) {
    j <- j + 1
  }
  j
}
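As an aside, base R's sample can perform the same inverse-CDF draw in one call; an equivalent alternative (step2 is our name, not from the notes):
step2 <- function(i) {
  # draw one state from 1, ..., ncol(P) with probabilities from row i of P
  sample(ncol(P), size = 1, prob = P[i, ])
}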
steps <- function(n) {
  # Run the chain for n steps starting from state 1, recording the whole path.
  x <- 1
  for (i in 1:n) {
    x <- c(x, step(x[i]))
  }
  x
}
z <- steps(10000)
mean(z == 1)   # long-run fraction of time in state 1; compare with F[1] = 1/3
## [1] 0.3321668
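The empirical occupation frequencies of all four states can be compared with \(F\) in the same way (a quick check):
table(z) / length(z)   # compare with F = (1/3, 1/6, 1/3, 1/6)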
Q
## [,1] [,2] [,3] [,4]
## [1,] 0.5 0.5 0.0 0.0
## [2,] 1.0 0.0 0.0 0.0
## [3,] 0.0 0.0 0.5 0.5
## [4,] 0.0 0.0 1.0 0.0
Q %*% Q
## [,1] [,2] [,3] [,4]
## [1,] 0.75 0.25 0.00 0.00
## [2,] 0.50 0.50 0.00 0.00
## [3,] 0.00 0.00 0.75 0.25
## [4,] 0.00 0.00 0.50 0.50
Q %*% Q %*% Q
## [,1] [,2] [,3] [,4]
## [1,] 0.625 0.375 0.000 0.000
## [2,] 0.750 0.250 0.000 0.000
## [3,] 0.000 0.000 0.625 0.375
## [4,] 0.000 0.000 0.750 0.250
tQ <- t(Q)
E <- eigen(tQ)
E
## eigen() decomposition
## $values
## [1] 1.0 1.0 -0.5 -0.5
##
## $vectors
## [,1] [,2] [,3] [,4]
## [1,] 0.8944272 0.0000000 -0.7071068 0.0000000
## [2,] 0.4472136 0.0000000 0.7071068 0.0000000
## [3,] 0.0000000 0.8944272 0.0000000 -0.7071068
## [4,] 0.0000000 0.4472136 0.0000000 0.7071068
EE <- E$vectors[, 1]
EE / sum(EE)
## [1] 0.6666667 0.3333333 0.0000000 0.0000000
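Since the eigenvalue \(1\) appears twice, \(Q\) has a second, linearly independent stationary distribution, which can be read off the same way (continuing the code above):
EE2 <- E$vectors[, 2]
EE2 / sum(EE2)   # a second stationary distribution, supported on states 3 and 4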
Let \(X\) and \(Y\) be discrete random variables taking values on the space \(S\), with probability mass functions \(p\) and \(q\). Show that there exists a coupling \((X', Y')\) of \(X\) and \(Y\) such that \[d_{TV}(X, Y) = d_{TV}(X', Y') = 2 \mathbb{P}(X' \not = Y'),\] so that equality is attained in the usual coupling inequality.
Hint: consider a joint distribution for \(X'\) and \(Y'\) that maximizes the probability that \(X'=Y'\) and spreads the remaining probability as evenly as possible.
Consider the joint distribution (a possibly infinite matrix) \(r\) defined as follows. We set \[r_{ii}= t_i:= \min(p_i,q_i).\] Let \[ \theta = \sum_{i\in S} t_i.\] If \(\theta = 1\), then we set all off-diagonal entries of \(r\) to zero. Otherwise, for \(i \not =j\), we set \[r_{ij} = \frac{(p_i- t_i) ( q_j - t_j)}{1-\theta}.\]
It is easy to verify that \(r_{ij} \geq 0\) and that \(r\) is a joint distribution for \(X\) and \(Y\). For example, we check that the first marginal is \(p\), the distribution of \(X\):
\[ \begin{eqnarray*} \sum_{j \in S} r_{ij} &=& t_i + \sum_{j \in S \setminus \{i\} } r_{ij} \\ &=& t_i + \frac{(p_i- t_i)}{1-\theta}\sum_{j \in S \setminus \{i\} } (q_j - t_j) \\ &=& t_i + \frac{(p_i- t_i)}{1-\theta}(1-q_i - \theta + t_i ). \end{eqnarray*} \]
If \(p_i = t_i\), then we obtain that
\[\sum_{j \in S} r_{ij} = t_i = p_i= \mathbb{P}(X=i);\] otherwise, \(q_i = t_i\), and we still have
\[\sum_{j \in S} r_{ij} = t_i + p_i - t_i = p_i = \mathbb{P}(X=i).\] Recall that \[2\min(a,b) = a+b - |a-b|.\] Also note that if \(X'\) and \(Y'\) have joint distribution \(r\), then \(\mathbb{P}(X' = Y') = \theta\). Thus
\[ \begin{eqnarray*} d_{TV}(X',Y') &=& \sum_{i \in S}|p_i-q_i| \\ &=& \sum_{i\in S} [p_i+q_i - 2 \min(p_i, q_i)] \\ &=& 2 - 2\theta \\ &=& 2\mathbb{P}(X' \not = Y'). \end{eqnarray*} \]
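As an illustration, this construction is easy to test numerically; below is a sketch in R (the pmfs p and q and the helper maximal_coupling are invented for the example):
maximal_coupling <- function(p, q) {
  # Build the joint matrix r for two pmfs p, q on the same finite state space.
  tmin <- pmin(p, q)
  theta <- sum(tmin)
  if (theta == 1) return(diag(tmin))            # p = q: all mass on the diagonal
  r <- outer(p - tmin, q - tmin) / (1 - theta)  # off-diagonal mass
  diag(r) <- tmin                               # diagonal mass t_i
  r
}
p <- c(0.5, 0.3, 0.2)
q <- c(0.2, 0.3, 0.5)
r <- maximal_coupling(p, q)
rowSums(r)               # first marginal: recovers p
colSums(r)               # second marginal: recovers q
sum(abs(p - q))          # d_TV(X, Y), in the convention used here
2 * (1 - sum(diag(r)))   # 2 P(X' != Y'): equals the line above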
Let \(X_1, \ldots, X_n\) be independent integer-valued random variables. Also let \(Y_1, \ldots, Y_n\) be independent integer-valued random variables. Set \(S=X_1 + \cdots + X_n\) and \(W = Y_1 + \cdots + Y_n\). Show that
\[d_{TV}(S, W) \leq \sum_{i=1}^n d_{TV}(X_i, Y_i).\]
Hint: Use a maximal coupling, along with independence.
Let \(Z_i'=(X_i', Y_i')\) be a maximal coupling of \(X_i\) and \(Y_i\), which exists by the previous exercise. Note that \[ d_{TV}(X_i,Y_i) = 2 \mathbb{P}(X_i' \not = Y_i').\] Furthermore, we require \(Z_1', \ldots, Z_n'\) to be independent; this ensures that \(X_1', \ldots, X_n'\) are independent and \(Y_1', \ldots, Y_n'\) are independent. Hence, since \(X_1, \ldots, X_n\) are independent and \(Y_1, \ldots, Y_n\) are independent by assumption, if \[ S' = X_1' + \cdots + X_n'\] and
\[ W' = Y_1' + \cdots + Y_n',\]
then \(S'\) has the same distribution as \(S\) and \(W'\) has the same distribution as \(W\); in other words, \((S',W')\) is a coupling of \(S\) and \(W\). Hence starting with the coupling inequality, we have
\[ \begin{eqnarray*} d_{TV}(S, W) &=& d_{TV}(S', W') \\ &\leq& 2 \mathbb{P}(S' \not = W') \\ &\leq & 2 \mathbb{P}\Big( \bigcup_{i=1}^n \{ X_i' \not = Y_i'\} \Big) \\ &\leq& \sum_{i=1} ^n 2\mathbb{P}(X_i' \not = Y_i') \\ &=& \sum_{i=1} ^n d_{TV}(X_i, Y_i), \end{eqnarray*} \] as desired.
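To see the bound in action, here is a small numerical check with Bernoulli summands (a sketch; the success probabilities and the helper convolve_pmf are invented for the example):
convolve_pmf <- function(a, b) {
  # pmf of the sum of two independent random variables on 0, 1, 2, ...
  out <- numeric(length(a) + length(b) - 1)
  for (i in seq_along(a))
    for (j in seq_along(b))
      out[i + j - 1] <- out[i + j - 1] + a[i] * b[j]
  out
}
p <- c(0.3, 0.5, 0.7)   # X_i ~ Bernoulli(p[i])
q <- c(0.4, 0.4, 0.6)   # Y_i ~ Bernoulli(q[i])
S <- 1; W <- 1          # pmf of the empty sum: a point mass at 0
for (k in 1:3) {
  S <- convolve_pmf(S, c(1 - p[k], p[k]))
  W <- convolve_pmf(W, c(1 - q[k], q[k]))
}
sum(abs(S - W))         # d_TV(S, W)
sum(2 * abs(p - q))     # sum of the d_TV(X_i, Y_i); at least the line above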