Symbolic computing inside Python (inside R)

Will we explore basic symbolic computing inside Python. In symbolic computing, the software is able to compute \(2x = x+x\) or \(\int_0^x x^2 dx = \tfrac{x^3}{3}\) without having a numerical value for \(x\). Popular software includes, Maple, Mathematica, and Sage; Python’s sympy is a basic in-house substitute, which we will explore.

import sympy as sym

x = sym.pi   # the digit pi
print(x)
## pi
print("#############")
## #############
print(x.evalf())
## 3.14159265358979
print("#############")
## #############
print(x.evalf(100))
## 3.141592653589793238462643383279502884197169399375105820974944592307816406286208998628034825342117068
print("#############")
## #############
x = sym.Symbol('x')   # you got to tell Python that x is symbol
print(sym.simplify((x + x)))
## 2*x
y = sym.Symbol('y')
y=sym.integrate(3 * x ** 2, x)  # integrate 3x^2
print(y)
## x**3
print("#############")
## #############

We can also do linear algebra

a = sym.Symbol('a')
b = sym.Symbol('b')
c = sym.Symbol('c')
d = sym.Symbol('d')
M =  sym.Symbol('M')
M=sym.Matrix([[a, b], [c, d]])
print(M)
## Matrix([[a, b], [c, d]])
print("#############")
## #############
print(M.det())
## a*d - b*c
print("#############")
## #############
print(M.T)
## Matrix([[a, c], [b, d]])
print("#############")
## #############
print(M.eigenvects())
## [(a/2 + d/2 - sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2, 1, [Matrix([
## [-d/c + (a/2 + d/2 - sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2)/c],
## [                                                         1]])]), (a/2 + d/2 + sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2, 1, [Matrix([
## [-d/c + (a/2 + d/2 + sqrt(a**2 - 2*a*d + 4*b*c + d**2)/2)/c],
## [                                                         1]])])]
print("#############")
## #############
Msub = M.subs(a,1).subs(b,1).subs(c,1).subs(d,0)
values = Msub.eigenvects()
print("#############")
## #############
print(values)
## [(1/2 - sqrt(5)/2, 1, [Matrix([
## [1/2 - sqrt(5)/2],
## [              1]])]), (1/2 + sqrt(5)/2, 1, [Matrix([
## [1/2 + sqrt(5)/2],
## [              1]])])]
print("#############")
## #############

The output may be a bit hard to read, we have the eigenvalue, its multiplicity, and the eigenvector(s): here we substitute in the identity matrix in two dimensions, so that the eigenvalue \(1\) has en eigenbasis that is spanned by the usual coordinate basis vectors.

M=sym.Matrix([[a, b], [c, d]])
Msub = M.subs(a,1).subs(b,0).subs(c,0).subs(d,1)
values = Msub.eigenvects()
print(values)
## [(1, 2, [Matrix([
## [1],
## [0]]), Matrix([
## [0],
## [1]])])]

A Markov chain

Consider the three state Markov chain on \(\{1,2,3\}\) where \[p_{11} = p = 1-p_{12}\] \[p_{22} = q=1- p_{23}\] and \[p_{31}=1.\]

We will consider the case where \(q \in (0,1)\) is fixed, but will think of \(p \to 1\).

Partial solutions

In order to compute \(\mathbb{E}(T)\), we can save a little time, by employing the strong Markov property. Let \(G\) be geometric random variable with parameter \(1-q\) that is independent of the chain \(X\); note that \(1-q\) is the probability that we can exit state \(2\), given we are there. Then \(T\) has the same law as:

\[T' = 1 \cdot \mathbf{1}[X_1=1] + \mathbf{1}[X_1=2](2+G),\]

here recall that \(\mathbf{1}[X_1=1]\) is one if \(X_1=1\) and zero if \(X_1=0\).
Then by the linearity of expectation and the independence of \(G\) and \(X\), we have

\[\mathbb{E}(T') = p + (1-p)(2 + \frac{1}{1-q});\]

notice as \(p \to 1\), we have that \(\mathbb{E}(T) \to 1\), as expected.

We can compute the eigenvalues of the transition matrix in Python. We know from theory, that if \(\pi\) is the stationary distribution, then \(\mathbb{E}(T) = (\pi(1))^{-1}\), which we can confirm here algebraically.

p = sym.Symbol('p')
q = sym.Symbol('q')
P =  sym.Symbol('P')
P=sym.Matrix([[p, 1-p,0], [0,q, 1-q], [1,0,0]])
Pt =  sym.Symbol('Pt')
Pt= P.T
print("#############")
## #############
print(Pt.eigenvects())
## [(1, 1, [Matrix([
## [-1/(p - 1)],
## [-1/(q - 1)],
## [         1]])]), (p/2 + q/2 - sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2, 1, [Matrix([
## [-1 + (p/2 + q/2 - sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [    -(p/2 + q/2 - sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [                                                                           1]])]), (p/2 + q/2 + sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2, 1, [Matrix([
## [-1 + (p/2 + q/2 + sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [    -(p/2 + q/2 + sqrt(p**2 - 2*p*q + 2*p + q**2 + 2*q - 3)/2 - 1/2)/(q - 1)],
## [                                                                           1]])])]
print("#############")
## #############
print(sym.simplify( ( 1/(1-p)  + 1/(1-q) +1  ) / (1/(1-p))         )        )
## (p + q - (p - 1)*(q - 1) - 2)/(q - 1)
print("#############")
## #############
print(sym.simplify(2-p + (1-p)/(1-q)                )     )
## (p + (2 - p)*(q - 1) - 1)/(q - 1)
print("#############")
## #############
print(sym.simplify(    ( 1/(1-p)  + 1/(1-q) +1  ) / (1/(1-p)) -  (2-p + (1-p)/(1-q)  )        )        )
## 0

We will simulate the chain in R, for the values \(p=0.75\) and \(q=0.25\). We will find the average number of times the chain is in state \(1\). We will also run code to estimate \(\mathbb{E}(T)\).

step <-function(i,p,q){
P <- matrix(c(p,1-p,0, 
              0,q,1-q, 
              1,0,0), nrow =3)
P <-t(P)
q = P[i,]
  x=-1
  u = runif(1)
  j=0
  cumq = cumsum(q)
  while(x==-1){
    j<-j+1
    if(u <= cumq[j]){x <-j}
  }
  x
}

steps <- function(n,p,q){
  x = 1
  for (i in 1:n){
    x <- c(x, step(x[i],p,q))
  }
  x
}
mc=steps(5000,0.75,0.25)
stat = c(mean(mc==1), mean(mc ==2),mean(mc ==3))
stat
## [1] 0.6208758 0.2187562 0.1603679
time <- function(i,p,q){
  n =0
  x=i
  n <- n+1
  x <-step(x,p,q)
  while(x !=i){
    n <-n+1
    x <- step(x,p,q)
  }
  n
}

m=mean(replicate(5000, time(1,0.75,0.25)))

1/m
## [1] 0.6205014
x=((p + (2 - p)*(q - 1) - 1)/(q - 1)).subs(p,0.75).subs(q,0.25)
y=1/x
print (y.evalf())
## 0.631578947368421

Endnotes