- We will give a quick review of the law of large numbers
- Illustrate applications of this theorem via R
Recall that a countable collection of random variables is independent if every finite subset of them is independent, and identically distributed if they all have the same distribution.
We say that a sequence of random variables \((X_n)_{n \in \mathbb{Z}^+}\) converges almost-surely to a random variable \(X\) if there exists an event \(\Omega'\) with \(\mathbb{P}(\Omega') =1\) such that for every \(\omega \in \Omega'\), we have convergence in the usual pointwise sense: \(X_n(\omega) \to X(\omega)\), as \(n \to \infty\).
Let \((X_n)_{n \in \mathbb{Z}^{+}}\) be a sequence of independent and identically distributed (i.i.d.) random variables. If \(\mathbb{E} |X_1| < \infty\), then \[ n^{-1}(X_1 + \cdots + X_n) \to \mathbb{E} X_1\] almost-surely.
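To see the theorem in action before we discuss proofs, here is a minimal R sketch; the Exponential(1) distribution and the sample size are arbitrary illustrative choices. The running averages of an i.i.d. sample settle down near the mean.

```r
# Running averages of i.i.d. Exponential(1) samples, whose mean is 1.
# The distribution and the sample size are arbitrary illustrative choices.
set.seed(1)
n <- 10000
x <- rexp(n, rate = 1)
running_mean <- cumsum(x) / (1:n)
plot(running_mean, type = "l", xlab = "n", ylab = "running average")
abline(h = 1, lty = 2)  # the true mean, E(X_1) = 1
```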
The almost-sure version of the law of large numbers has a somewhat difficult proof.
By contrast, convergence in mean-square and in probability are easy to establish, provided we assume a finite second moment.
We say that \(X_n\) converges to \(X\) in mean-square if \[ \mathbb{E} | X_n - X|^{2} \to 0, \quad \text{as } n \to \infty.\]
We say that \(X_n\) converges to \(X\) in probability if for every \(\epsilon >0\), we have \[\mathbb{P} ( | X_n - X| > \epsilon) \to 0, \quad \text{as } n \to \infty.\]
Let \((X_n)_{n \in \mathbb{Z}^{+}}\) be a sequence of i.i.d. random variables. If \(\mathbb{E} |X_1|^2 < \infty\), then \[ n^{-1}(X_1 + \cdots + X_n) \to \mathbb{E} X_1\] in mean-square.
Let \(S_n = X_1 + \cdots + X_n\). Since the \(X_i\) are independent with common variance, \[\mathbb{E} | n^{-1}S_n - \mathbb{E} X_1 |^2 = n^{-2}\mathbb{E} | S_n - n\mathbb{E} X_1 |^2 \] \[= n^{-2}\mathrm{var}(S_n) \] \[ =n^{-2} \sum_{i=1} ^n \mathrm{var}(X_i) \] \[ =n^{-1} \mathrm{var}(X_1) \to 0.\]
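The computation shows that the mean-squared error decays like \(\mathrm{var}(X_1)/n\). We can check this rate numerically; in the sketch below, Uniform(0,1) samples (with mean \(1/2\) and variance \(1/12\)) are an arbitrary choice.

```r
# Monte Carlo estimate of E|n^{-1} S_n - E(X_1)|^2 for Uniform(0,1) samples,
# compared with the exact value var(X_1)/n = (1/12)/n.
set.seed(1)
mse <- function(n, reps = 5000) {
  sample_means <- replicate(reps, mean(runif(n)))
  mean((sample_means - 0.5)^2)
}
sapply(c(10, 100, 1000), mse)  # estimated mean-squared errors
(1/12) / c(10, 100, 1000)      # theoretical values
```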
Recall Markov’s inequality: if \(Y \geq 0\) is a random variable and \(a > 0\), then \(\mathbb{P}(Y \geq a) \leq a^{-1} \mathbb{E} Y\).
If \(X_n\) converges to \(X\) in mean-square, then it converges to \(X\) in probability.
Let \(\epsilon >0\). Applying Markov’s inequality to \(|X_n - X|^2\) gives \[\mathbb{P} (| X_n - X| > \epsilon) = \mathbb{P} (| X_n - X|^2 > \epsilon^2) \leq \epsilon^{-2} \mathbb{E} | X_n -X|^2 \to 0.\]
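As a quick numerical illustration of Markov’s inequality, we can compare an empirical tail probability with the bound; the Exponential(1) distribution and the threshold \(a = 3\) below are arbitrary choices.

```r
# Markov's inequality: P(Y >= a) <= E(Y)/a for Y >= 0.
# Here Y ~ Exponential(1) and a = 3, both arbitrary choices.
set.seed(1)
y <- rexp(100000, rate = 1)
a <- 3
mean(y >= a)  # empirical P(Y >= a); the exact value is exp(-3), about 0.05
mean(y)/a     # the Markov bound, about 1/3
```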
Recall that if \(T_n\) is a sequence of estimators for a parameter \(\theta\), then we say that \(T_n\) is consistent if \(T_n \to \theta\) in probability.
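For example, by the law of large numbers the sample mean is a consistent estimator of the population mean. The sketch below estimates \(\mathbb{P}(|T_n - \theta| > \epsilon)\) for increasing \(n\); the Poisson(2) data and \(\epsilon = 0.1\) are arbitrary choices.

```r
# Consistency of the sample mean for Poisson(2) data, where theta = 2.
# Estimate P(|T_n - theta| > eps) for several n; it should shrink towards 0.
set.seed(1)
eps <- 0.1
tail_prob <- function(n, reps = 5000) {
  t_n <- replicate(reps, mean(rpois(n, lambda = 2)))
  mean(abs(t_n - 2) > eps)
}
sapply(c(10, 100, 1000, 10000), tail_prob)
```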
```r
# Matching problem: shuffle the numbers 1, ..., 20 and check whether at
# least one number lands in its original position.
inorder <- seq(1, 20, 1)
Hand <- function() {
  x <- sample(20, 20, replace = FALSE)
  is.element(TRUE, x == inorder)
}
r <- replicate(10000, Hand())
# Proportion of hands with no match at all
sum(r == FALSE) / 10000
## [1] 0.3672
```
```r
# Compare with 1/e, the limiting probability of no match.
1/exp(1)
## [1] 0.3678794
```
First, we code one iteration of this procedure, which reports the outcomes of the first and second balls drawn.
We will use Bernoulli random variables, where zero means a blue ball has been drawn and one means a red ball has been drawn.
Next, we play the game a large number of (independent) times.
We also store the outcomes of the first and second picks as two separate vectors.
We want to count the number of times the second pick resulted in a red ball and, out of those times, the number of times the first pick was a blue ball; the ratio of these counts estimates the conditional probability that the first ball was blue given that the second ball was red.
We write a simple loop to accomplish this.
```r
# One play of the game: x records the first pick and y the second pick,
# with 1 = red ball and 0 = blue ball.
game <- function() {
  x <- rbinom(1, 1, 4/12)
  y <- rbinom(1, 1, 2/10)              # second pick, used when the first pick is red
  if (x == 0) y <- rbinom(1, 1, 4/13)  # redraw the second pick when the first pick is blue
  c(x, y)
}
```
```r
z <- replicate(10000, game())
first <- z[1, ]   # outcomes of the first picks
second <- z[2, ]  # outcomes of the second picks

# Count the plays where the second pick was red and the first pick was blue
br <- 0
for (i in 1:10000) if (second[i] == 1 && first[i] == 0) br <- br + 1
br
## [1] 2035
```
```r
# Count the plays where the second pick was red
r <- sum(second == 1)
r
## [1] 2731
```
```r
# Estimated conditional probability that the first ball was blue,
# given that the second ball was red
br/r
## [1] 0.7451483
```
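As a check on the simulation, we can compute the exact conditional probability by Bayes’ rule directly from the three probabilities hard-coded in game(); this sketch assumes those numbers fully specify the intended model.

```r
# Exact value via Bayes' rule, using the probabilities appearing in game():
# P(first red) = 4/12, P(second red | first red) = 2/10,
# P(second red | first blue) = 4/13.
p_blue_first <- 1 - 4/12
p_second_red <- (4/12) * (2/10) + p_blue_first * (4/13)
p_blue_first * (4/13) / p_second_red
```

This exact value is \(40/53 \approx 0.755\), in agreement with the simulated value above.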
Carefully explain why plotting a (probability) histogram of some sample data may give an approximation of the probability density function of some random variable.
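The sketch below may be a useful starting point: it overlays the true density on a probability histogram of a large sample; the standard normal distribution is an arbitrary choice.

```r
# Probability histogram of a large N(0,1) sample, with the true density
# curve overlaid for comparison (the choice of distribution is arbitrary).
set.seed(1)
x <- rnorm(10000)
hist(x, breaks = 50, freq = FALSE, main = "", xlab = "x")
curve(dnorm(x), add = TRUE, lwd = 2)
```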