Please familarize yourself with the following excerpt on plagiarism and collusion from the student handbook
By ticking the submission declaration box in Moodle you are agreeing to the following declaration:
Declaration: We are aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. We have read the guidelines in the student handbook and understand what constitutes plagiarism. We hereby affirm that the work we are submitting for this in-course assessment is entirely our own.
Please do not write your names anywhere on the submission. Please include only your student numbers as the proxy identifier.
You might recall a proof that when there are only twenty-three people in a room, then with probability almost \(1/2\), the room contains two people with the same birthday, under the assumptions that there are \(365\) days in a year, and birthdays throughout the year are equally distributed, and the birthdays of the occupants of the room are independent.
Under these similar assumptions, let \(p(n)\) be the probability that in a room of \(n\) people that there are three people people that share the same birthday. By running simulations:
Plot an approximate graph of \(p(n)\). [8 points]
Find \(n\) such that \(p(n) \approx 1/2\). [2 points]
Find or generate a million or so digits of \(\pi\). See for example you may use this link. Here it is okay to use the internet to find basic importation functions or use fancy formulas for the digits of \(\pi\). [2 points]
Process your digits so that they are in groups of \(10\), so that you can pretend each group of \(10\) is realization of a number that is uniformly distributed in \([0,1]\). You might need to take special care of groups beginning with \(0\). [2 points]
Put everything into a single vector, where the first entry contains the number \(0.1415926535\). [2 points]
Plot of histogram to check if they appear uniformly distributed. [2 points]
Find the mean and variance of the data to make sure the data has been imported properly. [2 points]
Consider the continuous-time Markov chain \(X\) with three state \(\{1,2,3\}\) with \(Q\) matrix given by
Q <- matrix(c(-5,2,3, 2,-3,1, 2,6,-8), nrow =3)
Q = t(Q)
Q
## [,1] [,2] [,3]
## [1,] -5 2 3
## [2,] 2 -3 1
## [3,] 2 6 -8
Use linear algebra and R to find the stationary distribution \(\lambda\). You may need to import a package. [2 points]
Write code, which uses only the randomization provided by the digits of \(\pi\) so that you obtain the state of the chain at time \(t\), starting at a state \(i\). [14 points]
For \(t=35\), and starting at state \(3\), by running a large number of simulation, find the average number of times the Markov chain is in each of the states at time \(t\). You may find that you may need to be more economical, given the number of digits of \(\pi\) that were imported. [2 points]
Discuss your results, as it relates to the theory you learned, and the randomization obtained from \(\pi\). [2 points]
In your coding, you may need a mechanism to keep track of which bits of the randomness from \(\pi\) have already been used; in order to facilitate this, you may need a function to be able to change the value of a variable that exists globally outside definitions of functions; the following example illustrates how to accomplish this:
counterr =0
example <- function(t){
counterr <<- counterr +1
t+5
}
example(1)
## [1] 6
example(10)
## [1] 15
counterr
## [1] 2
Consider a \(M(\lambda) / M(\mu) /1\) queue, where \(\lambda < \mu\) and the queue is started at stationarity so that the number of items in the system at any time \(t\), has the same distribution. Consider an arrival at some time \(t\).
Show that the total time it spends in the system is exponential with rate \(\mu - \lambda\), by using the brute force exercise in HW8. [8 points]
Show that this is consistent with Little’s law. [2 points]
Suppose that customers arrive at a queue with independent exponential service times of rate \(2\), and when they depart, they enter another queue also with independent exponential service times, but at rate \(5\). Assume that the customers enter the first queue arrive as a Poisson process of rate \(1\). Let \(Q_1(t)\) and \(Q_2(t)\) be the number of customers in the respective queues at time \(t\). Verify by simulations that the stationary distribution for \((Q_1, Q_2)\) is given by a product of geometric distributions.
Let \(X\) be an irreducible and aperiodic matrix on a finite number of states \(S\). For each \(a, s \in S\), let
\[N_{as}^n = \sum_{i=0} ^n \mathbf{1}[X_i=a, X_{i+1} =s].\]
\[T_{n} = \frac{N_{ab} ^n}{ \sum_{s \in S} N_{as}^n }\] gives a sequence of consistent estimators for \(p_{ab}\). [5 points]
Is \(T_n\) unbiased? If not, what additional assumptions are needed to ensure that it is? [2 points]
Suppose that you know all the values of the transition matrix, except the values \(p_{ab}\) and \(p_{ac}\). Also assume you know that the Markov chain has a starting distribution \(\rho\). Find the maximum likelihood estimate of the values \(p_{ab}\) and \(p_{ac}\), given a realization \[X=x=(x_0, x_1, \ldots, x_n).\]
It is permissible to review your previous module notes on mle estimation. [3 points]