Please familarize yourself with the following excerpt on plagiarism and collusion from the student handbook
By ticking the submission declaration box in Moodle you are agreeing to the following declaration:
Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism. I hereby affirm that the work I am submitting for this in-course assessment is entirely my own.
Please do not write your name anywhere on the submission. Please include only your student number as the proxy identifier.
Let \(X_1, X_2, \ldots\) be iid exponential random variables with unknown rate \(\lambda>0\). Suppose we only get to keep track of whether \(X_i\) is in the unit interval or not; that is, we observe the random variables \(Y_i = \mathbf{1}[ 0 \leq X_i \leq 1]\). Find a reasonable estimator for \(\lambda\) and provide justification that your estimator is good.
Demonstrate by simulations that if three points are sampled independently and uniformly at random inside a disc of radius \(1\), then the expected area of the corresponding triangle is \(35/48\pi\).
Suppose \(X\) is irreducible and aperiodic finite state Markov chain on a state space \(S\), containing the states \(a,b\) and \(c\). Consider the associated sums given by \[T_n:= \frac{1}{n+1} \sum_{k=0} ^n \mathbf{I}[X_{i+2} =c, X_{i+1}=b =, X_{i} =a].\] Guess the limit for \(T_n\) as \(n\to \infty\). Provide evidence (not necessarily a proof) for your guess. [5 points]
Demonstrate your guess, via simulations, using the Markov chain with transition matrix \[ \begin{equation*} P = \begin{pmatrix} 2/6 & 3/6 & 1/6 \\ 1/4 & 1/4 & 1/2 \\ 1/8 & 1/4 & 5/8 \end{pmatrix}. \end{equation*} \] [5 points]
Consider the point process on the unit interval \([0,1]\) that is a sampled by placing \(N=10\) points independently and uniformly on the interval. Let \(X\) and \(Y\) be the number of points on the two disjoint halves of the interval.
We have that \(p:=\mathbb{E}Y_i = 1- e^{-\lambda}\); thus \(\lambda = -\log(1-p)\). Note that the \(Y_i\) are an iid sequence of Bernoulli random variables with parameter \(p\), and the law of large numbers gives that their sample average converges to \(p\), which can be used to give an estimate for \(\lambda\).
We modify the code from the random triangles worksheet and our code on sampling on a disc.
To sample on a disc we have:
point <- function(){
z=2
while(length(z)==1){
x = 2*runif(1) -1
y = 2*runif(1) -1
if (x^2 + y^2 <1) { z<-c(x,y)}
}
z
}
We compute the area of a random sample using the determinant formulation
area <-function(){
p1 = point()
p2 = point()
p3 = point()
M = matrix( c( c(p1,1), c(p2,1),c(p3,1)), nrow=3)
A = 0.5*abs(det(M))
A
}
Then of course we replicate:
x = replicate(5000, area())
abs(mean(x) - (35 /(48 * pi)) )
## [1] 0.004330415
From our previous experience and convergence theorems, without loss of generality, we consider the chain started from stationarity. Let \(\pi\) be the stationary distribtion. Taking the expectation we obtain \[\mathbb{E}T_n = \mathbb{P}(X_2=c,X_1=b, X_0=a ) = \pi_a p_{ab}p_{bc}.\] Thus our experience with the law of large numbers for Markov chains suggest this is the limit.
With the given chain, we can simulate it in R, with the code from the course notes, and compute this average over a sample, and compare. We will start than chain at state \(1\), and take \(a=1\), \(b=2\), and \(c=3\).
P <- matrix(c(2/6, 3/6, 1/6, 1/4, 1/4,1/2, 1/8, 1/4, 5/8), nrow =3)
P <- t(P)
P
## [,1] [,2] [,3]
## [1,] 0.3333333 0.50 0.1666667
## [2,] 0.2500000 0.25 0.5000000
## [3,] 0.1250000 0.25 0.6250000
tP = t(P)
E = eigen(tP)
EE = E$vectors[,1]
F= EE /sum(EE)
F # stationary distribution
## [1] 0.2054795 0.3013699 0.4931507
F[1]
## [1] 0.2054795
step <- function(i){
q = P[i,]
x=-1
u = runif(1)
j=0
cumq = cumsum(q)
while(x==-1){
j<-j+1
if(u <= cumq[j]){x <-j}
}
x
}
steps <- function(n){
x = 1
for (i in 1:n){
x <- c(x, step(x[i]))
}
x
}
mc=steps(10000) # sample path of alot of steps
L = length(mc)
L
## [1] 10001
onetwothree=0
for(i in 1:(L-2)){
if (mc[i]==1 && mc[i+1]==2 && mc[i+2]==3 ){
onetwothree <- onetwothree +1
}
}
Tn = onetwothree/L
Tn
## [1] 0.05339466
abs( Tn - ( F[1] * (3/6) * (1/2) ) )
## [1] 0.002024798