Please familiarize yourself with the following excerpt on plagiarism and collusion from the student handbook.
By ticking the submission declaration box in Moodle you are agreeing to the following declaration:
Declaration: I am aware of the UCL Statistical Science Department’s regulations on plagiarism for assessed coursework. I have read the guidelines in the student handbook and understand what constitutes plagiarism. I hereby affirm that the work I am submitting for this in-course assessment is entirely my own.
Please do not write your name anywhere on the submission. Please include only your student number as the proxy identifier.
\[ \frac{1}{n} \sum_{i=1}^n \mathbf{1}[ X_i \leq x] \to F(x). \ \text{[5 points]}\]
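The convergence above can be checked numerically before it is proved. The following is a minimal sketch, assuming the \(X_i\) are i.i.d. with cdf \(F\); the rate \(1\) exponential distribution, the evaluation grid, and the function name `empirical_cdf` are illustrative choices that do not come from the question.

```python
import numpy as np

def empirical_cdf(samples, x):
    """Fraction of samples that are <= x, i.e. (1/n) * sum of 1[X_i <= x]."""
    samples = np.sort(np.asarray(samples))
    # With side="right", searchsorted counts the sorted entries that are <= x.
    return np.searchsorted(samples, x, side="right") / samples.size

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 5.0, 51)
true_cdf = 1.0 - np.exp(-grid)  # cdf of the rate-1 exponential (assumed example)

for n in (100, 10_000, 1_000_000):
    x_samples = rng.exponential(scale=1.0, size=n)
    gap = np.max(np.abs(empirical_cdf(x_samples, grid) - true_cdf))
    print(f"n = {n:>9}: largest gap on the grid = {gap:.4f}")
```

The printed gap should shrink as \(n\) grows, which is what the limit predicts at each fixed \(x\).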
Show that if \(X\) is a continuous random variable taking values on \(D\) with a cdf that is strictly increasing on \(D\), then the random variable \(F(X)\) is uniformly distributed on the unit interval \([0,1]\). [3 points]
Show that if \(U\) is uniformly distributed in \([0, \tfrac{\pi}{2}]\), then \(\sin^2(U)\) has the beta distribution with parameters \((\tfrac{1}{2}, \tfrac{1}{2})\). [3 points]
Suppose that you have access to a true source of randomization given by, say, radioactive decay; that is, you have access to independent random variables that are exponentially distributed with rate \(1\). Show that you can use them to generate random variables with the beta distribution with parameters \((\tfrac{1}{2}, \tfrac{1}{2})\). [2 points]
Demonstrate your procedure from the previous question by computer simulation, and plot a histogram of the results against the pdf of the beta \((\tfrac{1}{2}, \tfrac{1}{2})\) distribution. [2 points]
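One possible way to set up such a simulation is sketched below; it is an illustration under stated assumptions, not a model answer. It assumes the route suggested by the earlier parts: if \(E\) is exponential with rate \(1\), then \(1 - e^{-E}\) is uniform on \([0,1]\), and \(\sin^2\) of a uniform variable on \([0, \tfrac{\pi}{2}]\) is beta \((\tfrac{1}{2}, \tfrac{1}{2})\). The function names are hypothetical, and the plotting helper assumes Matplotlib is installed.

```python
import numpy as np

def beta_half_half_from_exponentials(e):
    """Map rate-1 exponential draws to candidate beta(1/2, 1/2) draws."""
    e = np.asarray(e, dtype=float)
    u = 1.0 - np.exp(-e)                 # F(E) with F(x) = 1 - exp(-x)
    return np.sin(0.5 * np.pi * u) ** 2  # (pi/2) * u lies in [0, pi/2]

def arcsine_pdf(x):
    """pdf of beta(1/2, 1/2): 1 / (pi * sqrt(x * (1 - x))) on (0, 1)."""
    return 1.0 / (np.pi * np.sqrt(x * (1.0 - x)))

def plot_against_pdf(samples, bins=50):
    """Histogram of the samples with the beta(1/2, 1/2) pdf overlaid."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for the plot
    xs = np.linspace(0.005, 0.995, 400)  # stay away from the poles at 0 and 1
    plt.hist(samples, bins=bins, density=True, alpha=0.5, label="simulated")
    plt.plot(xs, arcsine_pdf(xs), label="beta(1/2, 1/2) pdf")
    plt.legend()
    plt.show()

rng = np.random.default_rng(0)
samples = beta_half_half_from_exponentials(rng.exponential(scale=1.0, size=100_000))
# The beta(1/2, 1/2) distribution has mean 1/2 and variance 1/8.
print("sample mean:", samples.mean(), "sample variance:", samples.var())
```

Calling `plot_against_pdf(samples)` draws the histogram; the sample mean and variance give a quick sanity check without a plot.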
Let \(S_n = X_1 + \cdots + X_n\), where \(X_i\) are i.i.d. random variables, with \[\mathbb{P}(X_1 = 1) = \tfrac{1}{2} = \mathbb{P}(X_1 = -1).\]
Let \[L_n = \# \{ 1 \leq k \leq n : S_k >0\}.\] Demonstrate, by simulations, that \(L_n/n\) converges in distribution to the beta \((\tfrac{1}{2}, \tfrac{1}{2})\) distribution.
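A simulation along these lines might look like the sketch below. It is only one possible design: the walk length, the number of walks, and the function names are arbitrary illustrative choices, and the comparison uses the cdf of the beta \((\tfrac{1}{2}, \tfrac{1}{2})\) distribution, which is \(\tfrac{2}{\pi} \arcsin(\sqrt{x})\).

```python
import numpy as np

def positive_time_fraction(n_steps, n_walks, rng):
    """Simulate n_walks simple symmetric random walks; return L_n / n for each."""
    steps = rng.choice(np.array([-1, 1], dtype=np.int8), size=(n_walks, n_steps))
    paths = np.cumsum(steps, axis=1, dtype=np.int32)  # row j holds S_1, ..., S_n
    return np.mean(paths > 0, axis=1)                 # fraction of k with S_k > 0

def arcsine_cdf(x):
    """cdf of the beta(1/2, 1/2) distribution on [0, 1]."""
    return (2.0 / np.pi) * np.arcsin(np.sqrt(x))

rng = np.random.default_rng(0)
fractions = positive_time_fraction(n_steps=2_000, n_walks=5_000, rng=rng)

for x in (0.1, 0.25, 0.5, 0.75, 0.9):
    print(f"x = {x:.2f}: empirical {np.mean(fractions <= x):.3f}, "
          f"limit {arcsine_cdf(x):.3f}")
```

For finite \(n\) the empirical values differ slightly from the limit, partly because steps with \(S_k = 0\) are not counted as positive; the discrepancy should shrink as `n_steps` increases.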
Suppose you are given the output of \(100000\) steps of an irreducible and aperiodic finite-state Markov chain. Carefully explain how you could estimate the stationary distribution of this Markov chain, and why your estimator is reasonable. [5 points]
Import the data from the file markovchain.txt and use this data and your method above to estimate the stationary distribution. [5 points]
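As an illustration of one natural estimator, the sketch below uses the fraction of time the chain spends in each state. The layout of markovchain.txt is not described here, so the sketch runs on a simulated path from a hypothetical three-state transition matrix `P` instead of the file; `P` and all function names are assumptions made for the example.

```python
import numpy as np

def simulate_chain(P, n_steps, rng, start=0):
    """Simulate n_steps of a finite-state chain with transition matrix P."""
    cum = np.cumsum(np.asarray(P, dtype=float), axis=1)
    uniforms = rng.random(n_steps)
    path = np.empty(n_steps, dtype=np.int64)
    state = start
    for t in range(n_steps):
        path[t] = state
        # Inverse-cdf step on the current row; clamp to guard against rounding.
        nxt = int(np.searchsorted(cum[state], uniforms[t], side="right"))
        state = min(nxt, cum.shape[1] - 1)
    return path

def occupation_frequencies(path, n_states):
    """Fraction of time steps spent in each state 0, ..., n_states - 1."""
    return np.bincount(path, minlength=n_states) / len(path)

# Hypothetical transition matrix, used only to generate example data.
P = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.1, 0.3, 0.6]])

rng = np.random.default_rng(0)
path = simulate_chain(P, n_steps=100_000, rng=rng)
estimate = occupation_frequencies(path, n_states=3)

# Exact stationary distribution of P (left eigenvector for eigenvalue 1).
eigvals, eigvecs = np.linalg.eig(P.T)
exact = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
exact = exact / exact.sum()

print("estimated:", estimate)
print("exact:    ", exact)
```

With real data, only `occupation_frequencies` would be needed once the observed states have been loaded and relabelled as \(0, \ldots, k-1\).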
Suppose that a shop operates daily in the time interval \([a,b]\). Customers arrive according to a Poisson process of intensity \(3\) in the time interval \([a, c)\), and a Poisson process of intensity \(5\) in the time interval \([c,b)\); here \(a\) and \(b\) are known, but \(c\) is unknown. You can imagine that the shop keeper notices that at some point in the day, the shop seems to get busier. The shop keeper has a log of all the arrival times, for each of \(n\) days of operation, where \(n\) is large.
Given an open interval \((r,s) \subset [a,b]\), explain how you can use the shop keeper’s log to make a good guess at whether or not \((r,s)\) contains the unknown time \(c\); show that as \(n \to \infty\) you will know with certainty whether \(c \in (r,s)\). Carefully explain your answer. [5 points]
Demonstrate your answer by running simulations; for example, choose \(a=0\), \(b=8\), and \(c=4\), and simulate the arrivals to generate the shop keeper’s log. Now apply your method with the intervals \((2.7, 4.3)\) and \((5,6)\). [5 points]
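A possible simulation setup is sketched below, as an illustration rather than a model answer. It assumes one particular approach: compare the average number of arrivals per unit time in \((r,s)\) with the two known intensities, since this average should approach \(3\) or \(5\) when \(c \notin (r,s)\) and a value strictly between them when \(c \in (r,s)\). The function names, the number of days, and the way the log is stored (a day index and a time for each arrival) are hypothetical choices.

```python
import numpy as np

def simulate_log(n_days, a, b, c, rng, rate_early=3.0, rate_late=5.0):
    """Return (days, times): one entry per arrival across all simulated days."""
    # Given its count, a Poisson process on an interval has i.i.d. uniform points.
    n_early = rng.poisson(rate_early * (c - a), size=n_days)
    n_late = rng.poisson(rate_late * (b - c), size=n_days)
    day_index = np.arange(n_days)
    days = np.concatenate([np.repeat(day_index, n_early),
                           np.repeat(day_index, n_late)])
    times = np.concatenate([rng.uniform(a, c, size=n_early.sum()),
                            rng.uniform(c, b, size=n_late.sum())])
    return days, times

def empirical_rate(days, times, n_days, r, s):
    """Average arrivals per day in (r, s), divided by the interval length."""
    inside = (times > r) & (times < s)
    counts = np.bincount(days[inside], minlength=n_days)
    return counts.mean() / (s - r)

rng = np.random.default_rng(0)
n_days = 20_000
days, times = simulate_log(n_days, a=0.0, b=8.0, c=4.0, rng=rng)

rate_a = empirical_rate(days, times, n_days, 2.7, 4.3)
rate_b = empirical_rate(days, times, n_days, 5.0, 6.0)
print(f"(2.7, 4.3): estimated rate {rate_a:.3f}")
print(f"(5, 6):     estimated rate {rate_b:.3f}")
```

With \(c = 4\), the expected rate on \((2.7, 4.3)\) is \((3 \times 1.3 + 5 \times 0.3)/1.6 = 3.375\), strictly between \(3\) and \(5\), while on \((5,6)\) it is \(5\). For finite \(n\), some tolerance is needed when deciding whether an estimate equals \(3\) or \(5\).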