Symbolic computing inside Python (inside R)

Will we explore basic symbolic computing inside Python. In symbolic computing, the software is able to compute \(2x = x+x\) or \(\int_0^x x^2 dx = \tfrac{x^3}{3}\) without having a numerical value for \(x\). Popular software includes, Maple, Mathematica, and Sage; Python’s sympy is a basic in-house substitute, which we will explore.

import sympy as sym

x = sym.pi   # the digit pi
print(x)

## pi

print("#############")

## #############

print(x.evalf())

## 3.14159265358979

print("#############")

## #############

print(x.evalf(100))

## 3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117068

print("#############")

## #############

x = sym.Symbol('x')   # you got to tell Python that x is symbol
print(sym.simplify((x + x)))

## 2*x

y = sym.Symbol('y')
y=sym.integrate(3 * x ** 2, x)  # integrate 3x^2
print(y)

## x**3

print("#############")

## #############

We can also do linear algebra

a = sym.Symbol('a')
b = sym.Symbol('b')
c = sym.Symbol('c')
d = sym.Symbol('d')
M =  sym.Symbol('M')
M=sym.Matrix([[a, b], [c, d]])
print(M)

## Matrix([[a, b], [c, d]])

print("#############")

## #############

print(M.det())

## a*d - b*c

print("#############")

## #############

print(M.T)

## Matrix([[a, c], [b, d]])

print("#############")

## #############

print(M.eigenvects())

## [(a/2 + d/2 - sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2, 1, [Matrix([
## [-d/c + (a/2 + d/2 - sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2)/c],
## [                                                         1]])]), (a/2 + d/2 + sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2, 1, [Matrix([
## [-d/c + (a/2 + d/2 + sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2)/c],
## [                                                         1]])])]

print("#############")

## #############

Msub = M.subs(a,1).subs(b,1).subs(c,1).subs(d,0)
values = Msub.eigenvects()
print("#############")

## #############

print(values)

## [(1/2 - sqrt(5)/2, 1, [Matrix([
## [1/2 - sqrt(5)/2],
## [              1]])]), (1/2 + sqrt(5)/2, 1, [Matrix([
## [1/2 + sqrt(5)/2],
## [              1]])])]

print("#############")

## #############

The output may be a bit hard to read, we have the eigenvalue, its multiplicity, and the eigenvector(s): here we substitute in the identity matrix in two dimensions, so that the eigenvalue \(1\) has en eigenbasis that is spanned by the usual coordinate basis vectors.

M=sym.Matrix([[a, b], [c, d]])
Msub = M.subs(a,1).subs(b,0).subs(c,0).subs(d,1)
values = Msub.eigenvects()
print(values)

## [(1, 2, [Matrix([
## [1],
## [0]]), Matrix([
## [0],
## [1]])])]

A Markov chain

Consider the three state Markov chain on \(\{1,2,3\}\) where \[p_{11} = p = 1-p_{12}\] \[p_{22} = q=1- p_{23}\] and \[p_{31}=1.\]

We will consider the case where \(q \in (0,1)\) is fixed, but will think of \(p \to 1\).

Write down the transition matrix.
Suppose the Markov chain \(X\) with these transition probabilities starts in state \(1\). Let \(T = \inf\{n \geq 1: X_n = 1\}\). Compute (by brute force) \(\mathbb{E}(T)\).
Find the stationary distribution of the transition matrix.
Simulate the chain, and leave \(p\) and \(q\) as variables that you can choose.
See that all your findings and answers are consistent.

Partial solutions

In order to compute \(\mathbb{E}(T)\), we can save a little time, by employing the strong Markov property. Let \(G\) be geometric random variable with parameter \(1-q\) that is independent of the chain \(X\); note that \(1-q\) is the probability that we can exit state \(2\), given we are there. Then \(T\) has the same law as:

\[T' = 1 \cdot \mathbf{1}[X_1=1] + \mathbf{1}[X_1=2](2+G),\]

here recall that \(\mathbf{1}[X_1=1]\) is one if \(X_1=1\) and zero if \(X_1=0\).
Then by the linearity of expectation and the independence of \(G\) and \(X\), we have

\[\mathbb{E}(T') = p + (1-p)(2 + \frac{1}{1-q});\]

notice as \(p \to 1\), we have that \(\mathbb{E}(T) \to 1\), as expected.

We can compute the eigenvalues of the transition matrix in Python. We know from theory, that if \(\pi\) is the stationary distribution, then \(\mathbb{E}(T) = (\pi(1))^{-1}\), which we can confirm here algebraically.

p = sym.Symbol('p')
q = sym.Symbol('q')
P =  sym.Symbol('P')
P=sym.Matrix([[p, 1-p,0], [0,q, 1-q], [1,0,0]])
Pt =  sym.Symbol('Pt')
Pt= P.T
print("#############")

## #############

print(Pt.eigenvects())

## [(1, 1, [Matrix([
## [-1/(p - 1)],
## [-1/(q - 1)],
## [         1]])]), (p/2 + q/2 - sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2, 1, [Matrix([
## [-1 + (p/2 + q/2 - sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [    -(p/2 + q/2 - sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [                                                                           1]])]), (p/2 + q/2 + sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2, 1, [Matrix([
## [-1 + (p/2 + q/2 + sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [    -(p/2 + q/2 + sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [                                                                           1]])])]

print("#############")

## #############

print(sym.simplify( ( 1/(1-p)  + 1/(1-q) +1  ) / (1/(1-p))         )        )

## (p + q - (p - 1)*(q - 1) - 2)/(q - 1)

print("#############")

## #############

print(sym.simplify(2-p + (1-p)/(1-q)                )     )

## (p + (2 - p)*(q - 1) - 1)/(q - 1)

print("#############")

## #############

print(sym.simplify(    ( 1/(1-p)  + 1/(1-q) +1  ) / (1/(1-p)) -  (2-p + (1-p)/(1-q)  )        )        )

## 0

We will simulate the chain in R, for the values \(p=0.75\) and \(q=0.25\). We will find the average number of times the chain is in state \(1\). We will also run code to estimate \(\mathbb{E}(T)\).

step <-function(i,p,q){
P <- matrix(c(p,1-p,0, 
              0,q,1-q, 
              1,0,0), nrow =3)
P <-t(P)
q = P[i,]
  x=-1
  u = runif(1)
  j=0
  cumq = cumsum(q)
  while(x==-1){
    j<-j+1
    if(u <= cumq[j]){x <-j}
  }
  x
}

steps <- function(n,p,q){
  x = 1
  for (i in 1:n){
    x <- c(x, step(x[i],p,q))
  }
  x
}
mc=steps(5000,0.75,0.25)
stat = c(mean(mc==1), mean(mc ==2),mean(mc ==3))
stat

## [1] 0.6208758 0.2187562 0.1603679

time <- function(i,p,q){
  n =0
  x=i
  n <- n+1
  x <-step(x,p,q)
  while(x !=i){
    n <-n+1
    x <- step(x,p,q)
  }
  n
}

m=mean(replicate(5000, time(1,0.75,0.25)))

1/m

## [1] 0.6205014

x=((p + (2 - p)*(q - 1) - 1)/(q - 1)).subs(p,0.75).subs(q,0.25)
y=1/x
print (y.evalf())

## 0.631578947368421

Endnotes

Version: 22 November 2021
Rmd Source

In class exercise: Return times and symbolic computing

Symbolic computing inside Python (inside R)

A Markov chain

Partial solutions

Endnotes