This document contains a running list of recommended homework exercises.
Using the law of large numbers, carefully justify why a probability histogram of a large sample from a pdf \(f\) ends up looking like \(f\).
Let \(T \geq 1\) be a positive integer-valued random variable. Let \(X_1, X_2, \ldots\) be iid random variables, with finite expectation. Let \(S_n = X_1 + \cdots +X_n\). Show that it may not be true that \[\mathbb{E}(S_T) = \mathbb{E}(T) \cdot \mathbb{E}(X_1).\]
Show by example that the pairwise independence of random variables is a weaker property than the (full) independence of random variables.
Using the law of large numbers and simulation, compute the following integral: \[\int_0 ^ 5 \sin(x) e^x dx.\] Hint: identify a random variable that has this integral as its expectation.
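One way to set this up (a minimal sketch in Python; the sample size is an arbitrary choice): if \(U\) is uniform on \([0,5]\), then \(5\sin(U)e^U\) has the integral as its expectation, so a sample average should converge to it by the law of large numbers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10**6                                        # sample size chosen arbitrarily
u = rng.uniform(0, 5, size=n)                    # U ~ Uniform(0, 5)
estimate = np.mean(5 * np.sin(u) * np.exp(u))    # E[5 sin(U) e^U] equals the integral

exact = (np.exp(5) * (np.sin(5) - np.cos(5)) + 1) / 2   # from the antiderivative e^x (sin x - cos x)/2
print(estimate, exact)
```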
Let \(X\) be a continuous random variable, with cdf \(F\). Show that \(F(X)\) is uniformly distributed on \([0,1]\); for simplicity, you may assume that \(F\) is invertible.
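A quick simulation check of the claim (a sketch; the exponential distribution and sample size are arbitrary choices): if \(X\) is exponential with rate one, then \(F(X) = 1 - e^{-X}\) should look uniform on \([0,1]\).

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
x = rng.exponential(scale=1.0, size=10**5)        # X with cdf F(t) = 1 - e^{-t}
plt.hist(1 - np.exp(-x), bins=50, density=True)   # histogram of F(X); should be flat on [0, 1]
plt.show()
```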
Assuming that Python or R only has uniform random variables available, code (in two ways) a geometric random variable with support on the positive integers. Simulate your geometric random variable, and demonstrate by simulations that you have correctly coded it.
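A sketch of two such constructions (the success probability \(p = 0.3\) is an arbitrary choice): the first inverts the cdf, the second counts uniform draws until one falls below \(p\); comparing the empirical pmfs against \((1-p)^{k-1}p\) serves as the check.

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 0.3, 10**5                       # p chosen arbitrarily for illustration

# Way 1: inverse transform; support is {1, 2, 3, ...}
u = rng.uniform(size=n)
geom_inv = np.ceil(np.log(u) / np.log(1 - p)).astype(int)

# Way 2: count uniform draws until the first "success" (uniform < p)
def geom_by_trials():
    k = 1
    while rng.uniform() >= p:
        k += 1
    return k

geom_trials = np.array([geom_by_trials() for _ in range(n)])

# both empirical pmfs should match (1 - p)^(k - 1) * p
for k in range(1, 6):
    print(k, (geom_inv == k).mean(), (geom_trials == k).mean(), (1 - p) ** (k - 1) * p)
```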
Demonstrate, by simulations, that the sum of two independent Poisson random variables is again a Poisson random variable; in your simulations you may restrict to the case of means \(2.5\) and \(3.7\).
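A minimal simulation sketch (sample size arbitrary): compare the empirical pmf of the sum against that of a direct Poisson sample with mean \(2.5 + 3.7 = 6.2\).

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10**6
s = rng.poisson(2.5, n) + rng.poisson(3.7, n)   # sum of two independent Poissons
z = rng.poisson(6.2, n)                         # candidate distribution: Poisson(6.2)

for k in range(15):                             # compare empirical pmfs value by value
    print(k, round(np.mean(s == k), 4), round(np.mean(z == k), 4))
```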
How would you code a uniform point in a disc without throwing away randomness, as we do in rejection sampling?
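A sketch of one rejection-free construction on the unit disc, using polar coordinates; the key point to justify is why the radius needs the square root.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10**4
r = np.sqrt(rng.uniform(size=n))              # sqrt makes the area, not the radius, uniform
theta = 2 * np.pi * rng.uniform(size=n)       # independent uniform angle
x, y = r * np.cos(theta), r * np.sin(theta)   # scatter-plot (x, y) to check uniformity
```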
How would you code a uniform point on the surface of a sphere?
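One standard approach (a sketch): normalize a vector of three independent standard normals; rotation invariance of the Gaussian does the work.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 10**4
g = rng.standard_normal((n, 3))                          # rotation-invariant Gaussian vectors
points = g / np.linalg.norm(g, axis=1, keepdims=True)    # project onto the unit sphere
```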
By simulation, reproduce Figure 12.
Show that it is indeed enough to consider the Weierstrass theorem on the unit interval.
What about a higher dimensional version of Weierstrass?
Do Question 2
If Markov chains have one-step memory, what happens if we study three-step memory processes? Is the theory the same? Why or why not?
Suppose \(X\) and \(Y\) are jointly distributed random variables. Find a coupling \((X', Y')\) of \(X\) and \(Y\) and a function \(\phi\) such that \(Y' = \phi(X', U)\), where \(U\) is independent of \(X'\) and uniformly distributed on the unit interval.
Do Question 2
You may recall the central limit theorem for iid random variables; explore what happens with Markov chains. In particular, see Question 3
Suppose you are given only the data (the realization) of a Markov chain, say for one hundred steps; how would you go about generating another hundred steps of this Markov chain?
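One natural approach is the plug-in estimator (a sketch, assuming the observed states make up the whole state space): estimate the transition matrix from the observed one-step counts, then simulate forward from the last observed state.

```python
import numpy as np

rng = np.random.default_rng(6)

def continue_chain(path, extra_steps):
    """Estimate a transition matrix from an observed path and simulate further steps."""
    states = sorted(set(path))
    index = {s: i for i, s in enumerate(states)}
    counts = np.zeros((len(states), len(states)))
    for a, b in zip(path, path[1:]):                 # tally observed one-step transitions
        counts[index[a], index[b]] += 1
    rows = counts.sum(axis=1, keepdims=True)
    # empirical transition probabilities; rows with no observations default to uniform
    P = np.where(rows > 0, counts / np.maximum(rows, 1), 1 / len(states))
    current = index[path[-1]]
    out = []
    for _ in range(extra_steps):
        current = rng.choice(len(states), p=P[current])
        out.append(states[current])
    return out
```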
So we have all this theory for Markov chains, where you have to look back one step; what happens if you allow two-step memory?
Suppose I prove a law of large numbers type result for Markov chains started at stationarity. How would I extend this proof to any starting distribution?
Find an example where the sum of two dependent Poisson random variables is again a Poisson. Hint: perturb the joint mass function of two independent Poisson random variables.
Explore what happens if, instead of starting with a Poisson number of iid uniforms, you use other discrete distributions.
Consider iid perturbations of the lattice; that is, for each \(n \in \mathbb{Z}^d\), we perturb it to obtain the perturbed lattice \[\Pi := \{n + X_n: n \in \mathbb{Z}^d\},\] where \((X_n)_{n \in \mathbb{Z}^d}\) are iid \(\mathbb{R}^d\)-valued random variables. Thus \(\Pi\) is a point process: a random scattering of points in \(\mathbb{R}^d\). Show that mass is conserved, so that the expected number of points in the set \([0,1)^d\) is (still) one. (Optional)
From the modelling assumptions of Poisson processes, show that if the unit interval contains exactly one point, then that point is uniformly distributed.
How would you simulate a uniform on the surface of an ellipse?
More Poisson questions here
Examine Exercise 2 by hand and by simulation.
Examine Exercise Pen and Paper by hand and by simulation.
Let \(X \geq 0\) be a continuous random variable with finite first moment. Prove that \[\mathbb{E} X = \int_0 ^{\infty} \mathbb{P}(X >t) dt.\] Hint: use a double integral.
Let \(X\) and \(Y\) be nonnegative independent continuous random variables. Prove that for \(t >0\), we have \[\mathbb{P}(XY > t) = \int_0 ^{\infty} \mathbb{P}(X >\tfrac{t}{y}) f_Y(y) dy,\] where \(f_Y\) is the probability density function for \(Y\).
Using the previous results prove that \[\mathbb{E}( X Y) = (\mathbb{E} X )(\mathbb{E} Y),\] assuming all the expectations are finite.
Recall that \(X\) has the gamma distribution with parameters \(\alpha, \beta >0\) if it has probability density function given by \[
f(x; \alpha, \beta)=
\frac{ \mathbf{1}_{(0, \infty)}(x)}{\beta^{\alpha} \Gamma(\alpha)} x^{\alpha-1} e^{-x /\beta}.
\] Let \(g: [0, \infty) \to \mathbb{R}\) be a bounded continuous function and let \(n \in \mathbb{Z}^{+}\). Prove that
\[ \lim_{n \to \infty} \frac{1}{\beta^n \Gamma(n)} \int_ 0^{\infty} g(x/n) x^{n-1} e^{-x /\beta} dx =g(\beta).\]
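A numerical sanity check (a sketch; the choices \(g = \cos\) and \(\beta = 2\) are arbitrary): the integral is \(\mathbb{E}\, g(X/n)\) for \(X \sim \mathrm{Gamma}(n, \beta)\), which can be estimated by Monte Carlo.

```python
import numpy as np

rng = np.random.default_rng(7)
beta, g = 2.0, np.cos                  # arbitrary illustrative choices of beta and a bounded continuous g
for n in (5, 50, 500):
    x = rng.gamma(shape=n, scale=beta, size=10**5)   # the integral equals E[g(X/n)] for X ~ Gamma(n, beta)
    print(n, g(x / n).mean(), "target:", g(beta))
```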
Let \(Z\) be a real-valued random variable. Recall that if \(m\) is the unique point such that \(\mathbb{P}(Z \leq m) = \tfrac{1}{2}\), then it is the median of \(Z\). We say that \(Z\) is symmetric about \(m\) if for all \(c \geq 0\), we have \(\mathbb{P}(Z -m \geq c) = \mathbb{P}(Z -m \leq -c)\). Let \(X = (X_1, \ldots, X_{2n+1})\) be a random sample from a symmetric distribution with unique median zero and order statistics given by \(Y_1 \leq Y_2 \leq \cdots \leq Y_{n+1} \leq \cdots \leq Y_{2n+1}\). The sample median is given by \(M(X) = Y_{n+1}\).
Show that \(-Y_{n+1}\) has the same distribution as \(Y_{n+1}\). Hint: you can do this without any fancy knowledge of order statistics.
Assuming the expectations exist, show that \(\mathbb{E} Y_{n+1} =0\).
Let \(\epsilon >0\). Let \(B \sim Bin(2n+1, p)\), where \(p = \mathbb{P}(X_1 > \epsilon)\). Show that \(\mathbb{P}( Y_{n+1} > \epsilon) = \mathbb{P}(B \geq n+1)\).
Show that \(\mathbb{P}(B \geq n+1) \to 0\) as \(n \to \infty\). Deduce that \(M(X) \to 0\) in probability.
Let \(X= (X_1, \ldots,X_{2n+1})\) be a random sample from the normal distribution with unknown mean \(\mu \in \mathbb{R}\) and known unit variance. Use the previous exercises to show that the sample median is an unbiased and consistent estimator for \(\mu\). Recall that a sequence of estimators is consistent if it converges in probability to the parameter being estimated.
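An optional simulation check of both claims (a sketch; \(\mu = 1.5\), \(\epsilon = 0.1\) and the replication count are arbitrary choices): the average of many sample medians should sit near \(\mu\), and the chance of a deviation larger than \(\epsilon\) should shrink as \(n\) grows.

```python
import numpy as np

rng = np.random.default_rng(8)
mu, eps, reps = 1.5, 0.1, 2000           # arbitrary illustrative choices
for n in (10, 100, 1000):
    samples = rng.normal(mu, 1.0, size=(reps, 2 * n + 1))
    medians = np.median(samples, axis=1)
    # unbiasedness: mean of medians near mu; consistency: deviation probability shrinks with n
    print(n, medians.mean(), np.mean(np.abs(medians - mu) > eps))
```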
Prove that there exists a deterministic set of positive integers \(S\) such that for every positive integer \(a\), we have \[\frac{ \big| S \cap \{ a, 2a, \ldots, na\} \big| }{n} \to \frac{1}{2}.\] Hint: choose a random subset, and show that there is an event of probability one for which it will satisfy the above requirement. Your final answer should be a deterministic set.
Let \(Z\) be a random variable with the standard normal distribution.
Show that for \(t >0\), we have \[\mathbb{P}(Z >t)\leq \frac{1}{t\sqrt{2\pi}} e^{-\tfrac{t^2}{2}}.\]
Show that for \(t >0\), we have \[\mathbb{P}(Z >t) \geq \frac{1}{\sqrt{2\pi}} \big(\frac{1}{t} - \frac{1}{t^3}\big) e^{-\tfrac{t^2}{2}}.\] Hint: take the derivative of the lower bound and see what you get.
Let \((X_i)_{i=3} ^{\infty}\) be an i.i.d. sequence of standard normal variables. Prove that almost surely \[\limsup_{n \to \infty} \frac{X_n}{\sqrt{2 \log n}} =1.\]
Let \(X\) and \(Y\) be real-valued random variables. We say that \(X\) stochastically dominates \(Y\) if for all \(z \in \mathbb{R}\), we have \(\mathbb{P}(X \leq z) \leq \mathbb{P}(Y \leq z)\). Let us say that a coupling \((X', Y')\) of \(X\) and \(Y\) is monotone if \(\mathbb{P}(X' \geq Y') =1\).
Let \(0 < q < p <1\). Let \(X \sim Bin(1, p)\) and \(Y \sim Bin(1, q)\) be Bernoulli random variables. Find a coupling \((X', Y')\) of \(X\) and \(Y\) so that \(X' \geq Y'\) almost surely.
Let \(f:[0,1] \to \mathbb{R}\) be a continuous function and let \(n\geq 1\). The Bernstein polynomial for \(f\) is defined by \[p(x) = \sum_{k=0}^n f(k/n){n \choose k} x^k(1-x)^{n-k}.\] Show that if \(f\) is an increasing function, then \(p\) is an increasing function.
Let \(X\) and \(Y\) be discrete random variables taking values in the space \(S\), with probability mass functions \(p\) and \(q\). Show that there exists a (maximal) coupling \((X', Y')\) of \(X\) and \(Y\) such that equality is achieved in the coupling inequality: \[d_{TV}(X, Y) = d_{TV}(X', Y') = 2 \mathbb{P}(X' \not = Y').\]
Let \(X_1, \ldots, X_n\) be independent integer-valued random variables. Also let \(Y_1, \ldots, Y_n\) be independent integer-valued random variables. Set \(S=X_1 + \cdots + X_n\) and \(W = Y_1 + \cdots + Y_n\). Show that
\[d_{TV}(S, W) \leq \sum_{i=1}^n d_{TV}(X_i, Y_i).\]
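A quick numerical check of this inequality (a sketch, using the convention \(d_{TV} = \sum_k |p_k - q_k|\) consistent with the coupling inequality above; the Bernoulli parameters are arbitrary): the pmf of a sum of independent variables is the convolution of the individual pmfs.

```python
import numpy as np

def tv(p, q):
    """Total variation distance, convention d_TV = sum_k |p_k - q_k|, padding with zeros."""
    m = max(len(p), len(q))
    p = np.pad(p, (0, m - len(p)))
    q = np.pad(q, (0, m - len(q)))
    return np.abs(p - q).sum()

# illustrative example: X_i ~ Bin(1, p_i) and Y_i ~ Bin(1, q_i), parameters chosen arbitrarily
ps = [0.1, 0.3, 0.5, 0.7]
qs = [0.2, 0.25, 0.55, 0.6]

pmf_S, pmf_W, rhs = np.array([1.0]), np.array([1.0]), 0.0
for p_i, q_i in zip(ps, qs):
    x, y = np.array([1 - p_i, p_i]), np.array([1 - q_i, q_i])
    pmf_S = np.convolve(pmf_S, x)        # pmf of a sum of independent variables is a convolution
    pmf_W = np.convolve(pmf_W, y)
    rhs += tv(x, y)

print(tv(pmf_S, pmf_W), "<=", rhs)       # left side should not exceed the right side
```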
Recall that if \(V_n\) denotes the number of visits to a state \(s\) in the first \(n\) steps of an irreducible Markov chain on a finite state space, started at stationarity, then \(V_n/n \to \pi(s)\) in the mean-square sense, where \(\pi\) is the stationary distribution. Show that the assumption that the chain is started at stationarity can be removed.
Prove that for an irreducible Markov chain on a finite state space \(S\), we have that for each \(s \in S\), the return time \[T = \inf \{n\geq 1: X_n=s\}\] has finite expectation, regardless of the starting distribution of the chain.
A measure-preserving system is a probability space \((\Omega, \mathcal{F}, \mu)\) endowed with a self-map \(T: \Omega \to \Omega\), where \(\mu \circ T^{-1} = \mu\). Verify that a Markov chain started at a stationary distribution corresponds to a measure-preserving system. Hint: consider shifting the coordinates of your Markov chain.
We say that a measure-preserving system is ergodic if every invariant set has measure zero or one; that is, if \(\mu(A \triangle T^{-1}(A)) = 0\), then \(\mu(A) \in \{0,1\}\). Find an example of a stationary Markov chain with non-trivial invariant sets.
Show that the strong mixing condition given by
\[\mu(A \cap T^{-n}B) \to \mu(A) \mu(B) \text{ for all } A, B \in \mathcal{F}\]
implies ergodicity. Note that, by the usual measure theory arguments, verifying the above condition for a large enough class of sets \(A, B\) is equivalent to verifying the strong mixing condition.
Check that aperiodic irreducible finite state Markov chains, started at the stationary distribution, are strongly mixing. This allows us to obtain limit theorems from the general statements in ergodic theory. Hint: first consider the case corresponding to the events \(A = \{ X_1=s \}\) and \(B =\{ X_3=t\}\).
The von Neumann ergodic theorem gives that for any stationary ergodic process \(X = (X_k)_{k=0} ^{\infty}\), endowed with the left-shift \(T\), where \((T X)_i = X_{i+1}\), we have the following convergence in the mean-square sense
\[ \frac{1}{n} \sum_{k=0} ^{n-1} f( T^k X) \to \mathbb{E} f(X)\]
for all \(f\) such that \(\mathbb{E}\big[ f(X)^2 \big] < \infty\). Consider the function \(f\) given by
\[f(x) = \mathbf{1}[x_{2} = c, x_1 = b, x_0 =a].\] What happens for a Markov chain? Run simulations and check that everything is consistent for a particular Markov chain of your choice.
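A sketch of such a check for one particular (assumed) two-state chain: with \(f(x) = \mathbf{1}[x_0 = a, x_1 = b, x_2 = c]\), the time average should approach \(\pi(a)P(a,b)P(b,c)\) when the chain is started at stationarity.

```python
import numpy as np

rng = np.random.default_rng(9)
P = np.array([[0.9, 0.1],                 # an assumed two-state transition matrix
              [0.4, 0.6]])
pi = np.array([0.8, 0.2])                 # its stationary distribution: pi P = pi
a, b, c = 0, 0, 1

n = 10**5
x = np.empty(n, dtype=int)
x[0] = rng.choice(2, p=pi)                # start at stationarity
for k in range(1, n):
    x[k] = rng.choice(2, p=P[x[k - 1]])

# time average of f(T^k X) = 1[x_k = a, x_{k+1} = b, x_{k+2} = c]
time_avg = np.mean((x[:-2] == a) & (x[1:-1] == b) & (x[2:] == c))
print(time_avg, "vs", pi[a] * P[a, b] * P[b, c])
```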