Consider again the a Markov chain corresponding to the card shuffling procedure given at the end of the previous homework.

Consider the case \(n=4\), so that there are four cards labeled: \(1,2,3,4\).

- How many elements are in the state space?
- What is the probability of going from the order \((1,2,3,4)\) to the order \((1,2,4,3)\), in one step?
- What is the probability of going from the order \((1,2,3,4)\) to the order \((2,1, 4,3)\), in one step?
- Is the corresponding transition matrix irreducible?
- Is the corresponding transition matrix aperiodic? Hint: the answer is no.

- Augment the procedure, and consider the
*lazy shuffle*by allowing with probability \(1/2\) that we simply leave the deck alone. - Start the Markov chain \(X\) with the cards in order \(X_0 = (1,2,3,4)\) and simulate the lazy shuffling.
- Consider the Markov chain \(Y\) started with in one of the \(24\) orders, uniformly at random. Simulate the Doeblin coupling of \(X\) and \(Y\).
- How long on average does it take for the coupling to succeed?

Let \(P\) be a transition matrix. Show that if \(P^n\) has all positive entries for some \(n\), then \(P^m\) has positive entries for all \(m \geq n\).

Consider the Markov chain started on five states \(\{1,2,3,4,5\}\) started at \(1\) with transition matrix given by

```
P <- matrix(c(1/4, 1/4, 1/2, 0,0,
1/4, 1/4,0,0,1/2,
1/4,0,0, 1/2, 1/4,
0,0,0,1/2, 1/2, 1/3, 1/3, 0, 0, 1/3), nrow =5)
P <-t(P)
P
```

```
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.2500000 0.2500000 0.5 0.0 0.0000000
## [2,] 0.2500000 0.2500000 0.0 0.0 0.5000000
## [3,] 0.2500000 0.0000000 0.0 0.5 0.2500000
## [4,] 0.0000000 0.0000000 0.0 0.5 0.5000000
## [5,] 0.3333333 0.3333333 0.0 0.0 0.3333333
```

- Show that \(P\) is irreducible and aperiodic.
- Simulate this Markov chain \(X_0, X_1, \ldots, X_n\) for large values of \(n\)
- Find the average number of times the Markov chain is in state \(3\).
- Is your answer consistent with theory? Discuss.
- Find the average number of times the chain goes for state \(1\) state \(3\).
- Guess a version of the large of numbers based on your previous observation.

Often in Bayesian statistics one needs to able to sample from the joint density
\(j\) of two random variables \((W,Z)\) but only has access to their conditional densities
\(f(w|z)\) and \(g(z|w)\). One then considers the Markov chain defined in the following way.
We start with an initial value \(X_0 = x_0\) and *pretend* that we have sampled from \(W\).
Now we define \(Y_0=y_0\) to be a random variable with density \(g(\cdot|w)\) which we can simulate in R.
Next, we simulate \(X_1\) using the density \(f(\cdot|y_0)\), and repeat.
We obtain a Markov chain of pairs

\[ \Bigg( (X_0, Y_0), (X_1, Y_1), \ldots, (X_n, Y_n) \Bigg).\] A stationary distribution is \(j\), and we hope that when \(n\) is large \((X_n, Y_n)\) has a joint distribution that is close to \(j\).

In what follows we will play with a toy example. Suppose \(W\) and \(Z\) are given in the following way: we flip a fair coin \(W\), and if we get a heads, we flip a fair coin for \(Z\), otherwise, we flip a \(1/4\) coin for \(Z\).

- Compute the joint distribution \((W, Z)\)
- Compute the conditional distribution of \(W\) given \(Z\)
- Compute the conditional distribution of \(Z\) given \(W\)
- Try the Gibbs sampler.
- Check the joint distribution of the output from the Gibbs sampler.

- There are \(4! = 24\) elements in the state space
- There are \({4 \choose 2} = 6\) transpositions available, so \(1/6\)
- Yes, since any permutation can be obtained as a product of transpositions.
- No! Some permutations can only be written as an
*even*number of transpositions. - The code below simulates the
*lazy*shuffling.

```
shuffle <- function(x){
if(rbinom(1,1,0.5) ==1 ){
t=sample(4, 2, replace=FALSE)
a=t[1]
b=t[2]
da = x[a]
db = x[b]
x[a] <- db
x[b] <- da
}
x
}
coupled <- function(x){
deckx <- x
decky <- sample(4,4,replace=FALSE)
n =0
hisx <- deckx
hisy <- decky
while( all(deckx== decky)==FALSE ) {
deckx <- shuffle(deckx)
decky <- shuffle(decky)
n <-n+1
hisx <- rbind(hisx, deckx)
hisy <- rbind(hisy, decky)
}
outlist = list('x' =hisx, 'y'=hisy, 'n'=n)
outlist
}
coupled(c(1,2,3,4))
```

```
## $x
## [,1] [,2] [,3] [,4]
## hisx 1 2 3 4
## deckx 1 2 3 4
## deckx 3 2 1 4
## deckx 3 2 4 1
## deckx 1 2 4 3
## deckx 1 2 4 3
## deckx 1 2 4 3
## deckx 3 2 4 1
## deckx 2 3 4 1
##
## $y
## [,1] [,2] [,3] [,4]
## hisy 3 2 1 4
## decky 3 4 1 2
## decky 3 4 1 2
## decky 1 4 3 2
## decky 4 1 3 2
## decky 4 2 3 1
## decky 4 2 3 1
## decky 4 3 2 1
## decky 2 3 4 1
##
## $n
## [1] 8
```

`mean(replicate(1000, coupled(c(1,2,3,4))$n))`

`## [1] 32.5`

Observe that

\[p_{ij}(n+1) = \sum_{k} p_{ik}(n) p_{kj}.\]

Since all the entries \(p_{ik}(n)\) are positive, we just need one positive \(p_{kj}\): of course they cannot all be zero, otherwise every power \(P\) would have a zero column.

- We check that

`P %*% P %*% P %*% P`

```
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.2282986 0.1970486 0.1041667 0.1562500 0.3142361
## [2,] 0.2572338 0.2207755 0.1371528 0.1041667 0.2806713
## [3,] 0.2314815 0.2132523 0.1076389 0.1302083 0.3174190
## [4,] 0.2476852 0.2268519 0.1111111 0.1041667 0.3101852
## [5,] 0.2519290 0.2172068 0.1365741 0.1111111 0.2831790
```

has all positive entries.

The following code simulates this Markov chain. We find that average number of times that Markov chain is in state \(i\) is approximated by \(\pi(i)\) where \(\pi\) is the stationary distribution; this is expected by the law of large numbers, for Markov chains.

We guess that \[ \frac{1}{n} \sum_{i=0} ^{n-1} \mathbf{1}[X_i=1] \mathbf{1}[X_{i+1}=3] \to \mathbb{P}(X_1 = 3| X_0 = 1) \pi(1) = p_{13}\pi(1)\] since the Markov chain will eventually behave like one started from the stationary distribution \(\pi\).

```
step <- function(i){
q = P[i,]
x=-1
u = runif(1)
j=0
cumq = cumsum(q)
while(x==-1){
j<-j+1
if(u <= cumq[j]){x <-j}
}
x
}
steps <- function(n){
x = 1
for (i in 1:n){
x <- c(x, step(x[i]))
}
x
}
mc=steps(10000)
stat = c(mean(mc==1), mean(mc ==2),mean(mc ==3), mean(mc ==4), mean(mc ==5) )
stat
```

`## [1] 0.2419758 0.2124788 0.1210879 0.1188881 0.3055694`

`stat %*% P`

```
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.2457421 0.2154701 0.1209879 0.119988 0.2978119
```

```
onethree=0
for(i in 1:9999){
if (mc[i]==1 && mc[i+1]==3 ){
onethree <- onethree +1
}
}
onethree/10000
```

`## [1] 0.121`

Let \(C_1\) or \(C_0\) be the second coin that is flipped.

We have \[\mathbb{P}(W =1, Z=1) = \mathbb{P}(W =1, C_1=1) = 1/4\] \[\mathbb{P}(W =1, Z=0) = \mathbb{P}(W =1, C_1=0) = 1/4\] \[\mathbb{P}(W =0, Z=1) = \mathbb{P}(W =0, C_0=1) = 1/8\] \[\mathbb{P}(W =0, Z=0) = \mathbb{P}(W =0, C_0=0) = 3/8\]

We have

\[\mathbb{P}(W =1|Z=1) = \frac{\mathbb{P}(W =1, C_1=1)}{\mathbb{P}(Z=1)} = 2/3\] \[\mathbb{P}(W =1| Z=0) = \frac{\mathbb{P}(W =1, C_1=0)}{\mathbb{P}(Z=0)} = 2/5\] \[\mathbb{P}(W =0|Z=1) = 1/3\] \[\mathbb{P}(W =0| Z=0) = 3/5\] * These conditional probabilites are given to us by the set-up: \[\mathbb{P}(Z=1 | W=1) = 1/2\] and \[ \mathbb{P}(Z=1 | W=0) = 1/4\]

- The code below gives the output for the Gibbs sampler when \(n=100\). We repeat this Gibbs sampler many times and check the resulting joint distribution.

```
fwz <- function(z){
w = rbinom(1,1, 2/3)
if (z ==0){
w <- rbinom(1,1, 2/5)
}
w
}
fzw <- function(w){
z = rbinom(1,1, 1/2)
if (w==0){
z <- rbinom(1,1, 1/4)
}
z
}
gibbs <-function(){
w=1
x=w
for(i in 1:100){
y <- fzw(x)
x <- fwz(y)
}
c(x,y)
}
zz=replicate(10000, gibbs())
a=zz[1,]
b=zz[2,]
num00=0
for(i in 1:10000){
if( a[i]==0 && b[i]==0)
{num00 <- num00 +1}
}
num00/10000
```

`## [1] 0.3747`

```
num10=0
for(i in 1:10000){
if( a[i]==1 && b[i]==0)
{num10 <- num10 +1}
}
num10/10000
```

`## [1] 0.2491`