1. Continuous random variables $X$ and $Y$ have joint probability density function
    (a)$f_{X, Y}(x, y)=C_{1}\left(x^{2}+\frac{1}{3} x y\right), x \in(0,1), y \in(0,2)$
    (b)$f_{X, Y}(x, y)=C_{2} e^{-x-y}, 0<x<y<\infty$
    Find the values of the constants $C_1$ and $C_2$. For each of the joint densities above
    • are $X$ and $Y$ independent?
    • find the marginal probability density functions of $X$ and of $Y$.
    • find $\mathbb P(X ≤ 1/2, Y ≤ 1)$.
    In case (b), if the region had been $0<x,y<∞$, how would this affect your answer to the question about independence?
    Solution.
    (a)$\int _0^1\int_0^2\left(x^2+\frac{x y}{3}\right)\mathrm dy\mathrm dx=1⇒C_1=1$.
    $f_{X,Y}(x,y)=x^2+\frac{xy}3$ cannot be factorised as $g(x)h(y)$ (equivalently, the marginals below satisfy $f_X(x)f_Y(y)\ne f_{X,Y}(x,y)$), so $X,Y$ are not independent.
    $f_X(x)=\int_0^2\left(x^2+\frac{x y}{3}\right)\mathrm dy=2 x^2+\frac{2 x}{3}$, $x\in(0,1)$.
    $f_Y(y)=\int_0^1\left(x^2+\frac{x y}{3}\right)\mathrm dx=\frac y6+\frac13$, $y\in(0,2)$.
    $\mathbb P(X ≤ 1/2, Y ≤ 1)=\int_0^{1/2}\int_0^1 \left(x^2+\frac{xy}3\right)\mathrm dy\mathrm dx=\frac1{16}$.
    (b)$\int_0^\infty\int_0^ye^{-x-y}\mathrm dx\mathrm dy=\frac12⇒C_2=2$. $X$ and $Y$ are not independent, since the support $\{0<x<y\}$ is not a product of a set of $x$-values with a set of $y$-values.
    $f_X(x)=\int_x^∞2e^{-x-y}\mathrm dy=2e^{-2x}$, $x>0$. $f_Y(y)=\int_0^y2e^{-x-y}\mathrm dx=2e^{-y}\left(1-e^{-y}\right)$, $y>0$.
    $\mathbb P(X ≤ 1/2, Y ≤ 1)=\int_0^{1/2}\int_x^1 2 e^{-x-y}\mathrm dy\mathrm dx=\int_0^{1/2}\left(2e^{-2x}-2e^{-1}e^{-x}\right)\mathrm dx=1-3e^{-1}+2e^{-3/2}$. (Note the inner limits: the density vanishes unless $x<y$, so the $y$-integral runs from $x$ to $1$.)
    In case (b), if the region had been $0<x,y<∞$, then the constant would be $C_2=1$ and $X$ and $Y$ would be independent, since $f_{X,Y}(x,y)=e^{-x-y}=e^{-x}\cdot e^{-y}$ factorises.
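    As a sanity check on both normalising constants and on the probability in part (a), here is a small numerical sketch (the grid sizes are arbitrary choices; only numpy is assumed):

```python
import numpy as np

# Part (a): f(x, y) = x^2 + x*y/3 on (0,1) x (0,2), i.e. with C1 = 1.
nx, ny = 1000, 1000
dx, dy = 1 / nx, 2 / ny
x = (np.arange(nx) + 0.5) * dx          # midpoints of (0, 1)
y = (np.arange(ny) + 0.5) * dy          # midpoints of (0, 2)
X, Y = np.meshgrid(x, y, indexing="ij")
f = X**2 + X * Y / 3

print(round(f.sum() * dx * dy, 4))      # total mass ≈ 1.0, confirming C1 = 1

mask = (X <= 0.5) & (Y <= 1)
print(round((f * mask).sum() * dx * dy, 4))   # ≈ 0.0625 = 1/16

# Part (b): the mass of e^{-x-y} on 0 < x < y is ∫_0^∞ e^{-y}(1 - e^{-y}) dy.
ny2, dy2 = 4000, 0.01                    # grid on (0, 40); the tail beyond is negligible
yb = (np.arange(ny2) + 0.5) * dy2
inner = np.exp(-yb) * (1 - np.exp(-yb))  # inner x-integral done exactly
print(round(inner.sum() * dy2, 4))       # ≈ 0.5, confirming C2 = 2
```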
  2. In the game of Oxémon Ko, you wander the streets of an old university town in search of a set of $n$ different small furry creatures.
    Let $T_i$ be the time (in hours) at which you first see a creature of type $i$, for $1≤i≤n$.
    Suppose that $(T_i, 1 ≤ i ≤ n)$ are independent, and that $T_i$ has exponential distribution with parameter $λ_i$.
    (a) Let $X=\min\{T_1,T_2,...,T_n\}$ be the time at which you see your first creature. Show that $X$ has an exponential distribution and give its parameter. [Hint: consider $\mathbb P(X>t)$ and use independence.]
    (b) What is the expected number of types of creature that you have not met by time 1?
    (c) Let $M=\max\{T_1,T_2,...,T_n\}$ be the time until you have met all $n$ different types of creature. Suppose now they are all equally common, with $λ_i=1$ for all $i$. Find the median of the distribution of $M$. (As well as giving an exact expression, try to describe how quickly it grows as $n$ becomes large.) [Here you may wish to consider instead $\mathbb P(M≤t)$. You may find useful an estimate like $α^{1/n}−1=e^{\frac1n\logα}−1≈\frac1n\logα$ for large $n$.]
    Solution.
    (a) $\mathbb P(X>t)=\prod_{i=1}^n\mathbb P(T_i>t)=\prod_{i=1}^n\exp\left(-λ_it\right)=\exp\left(-\sum_{i=1}^nλ_it\right)$⇒$X\sim$exponential$\left(\sum_{i=1}^nλ_i\right)$.
    (b) $\sum_{i=1}^n\mathbb P(T_i>1)=\sum_{i=1}^ne^{-λ_i}$.
    (c) $\mathbb{P}\left(M\le t\right)=\prod_{i=1}^n\mathbb{P}\left(T_i\le t\right)=\left(1-e^{-t}\right)^n$. Setting this equal to $\frac12$ gives the median $t_n=-\log\left(1-2^{-1/n}\right)$. Since $2^{-1/n}=e^{-(\log2)/n}\approx1-\frac{\log2}{n}$ for large $n$, we get $t_n\approx-\log\frac{\log2}{n}=\log n-\log\log2$, so the median grows like $\log n$.
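    A quick numerical look at how the exact median $t_n=-\log(1-2^{-1/n})$ compares with $\log n$ (the particular values of $n$ are arbitrary):

```python
import math

# Exact median of M = max of n i.i.d. Exp(1) variables: t_n = -log(1 - 2^(-1/n)).
for n in (10, 100, 1000, 10000):
    t = -math.log(1 - 2 ** (-1 / n))
    # t_n - log n should approach -log(log 2) ≈ 0.3665 as n grows
    print(n, round(t, 4), round(t - math.log(n), 4))
```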
  3. Let $U$ and $V$ be independent random variables, both uniformly distributed on [0,1]. Find the probability that the quadratic equation $x^2+2Ux+V=0$ has two real solutions.
    Solution. $x^2+2Ux+V=0$ has two real solutions ⇔ the discriminant $(2U)^2-4V$ is positive ⇔ $U^2>V$. Hence $\mathbb P(V<U^2)=\int_0^1\int_0^{u^2}\mathrm dv\,\mathrm du=\int_0^1u^2\,\mathrm du=\frac13$.
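    A Monte Carlo check of this probability (the sample size and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000
u = rng.random(n)   # U ~ Uniform[0, 1]
v = rng.random(n)   # V ~ Uniform[0, 1], independent of U
# Two real roots of x^2 + 2Ux + V = 0 iff the discriminant 4U^2 - 4V > 0.
p_hat = np.mean(u**2 > v)
print(round(p_hat, 3))  # ≈ 1/3
```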
  4. A fair die is thrown $n$ times. Using Chebyshev’s inequality, show that with probability at least 31/36, the number of sixes obtained is between $n/6-\sqrt n$ and $n/6+\sqrt n$.
    Proof. The number of sixes obtained $Z\sim\text{Binomial}\left(n,\frac16\right)$, so $\mathbb E(Z)=\frac n6$, Var$(Z)=n\cdot\frac16\cdot\frac56=\frac{5n}{36}$. By Chebyshev’s inequality, $\mathbb P\left(|Z-\frac n6|≥\sqrt n\right)≤\frac{5n/36}{(\sqrt n)^2}=\frac5{36}$$⇒\mathbb P\left(|Z-\frac n6|\lt\sqrt n\right)≥\frac{31}{36}$.
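    A simulation confirming that the coverage probability comfortably exceeds the Chebyshev bound $31/36$ (the choices of $n$, trial count and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 100, 200_000
# Number of sixes in n throws of a fair die: Binomial(n, 1/6).
sixes = rng.binomial(n, 1 / 6, size=trials)
inside = np.abs(sixes - n / 6) < np.sqrt(n)
# Empirical coverage vs the Chebyshev lower bound 31/36 ≈ 0.861.
print(round(inside.mean(), 3), round(31 / 36, 3))
```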
  5. Suppose that you take a random sample of size $n$ from a distribution with mean $\mu$ and variance $\sigma^2$. Using Chebyshev’s inequality, determine how large $n$ needs to be to ensure that the difference between the sample mean and $\mu$ is less than two standard deviations with probability exceeding 0.99.
    Solution. The sample mean $\bar X$ has $\mathbb E(\bar X)=\mu$ and Var$(\bar X)=\frac{σ^2}n$, so Chebyshev’s inequality gives $\mathbb P(|\bar X-\mu|\ge kσ)\le\frac{σ^2/n}{(kσ)^2}=\frac1{k^2n}$. Setting $k=2$, we require $\frac1{4n}<0.01$, i.e. $n>25$.
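    An illustrative simulation, assuming (arbitrarily) that the underlying distribution is Exp(1), for which $\mu=σ=1$; since Chebyshev is conservative, the empirical probability is far above 0.99:

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 26, 100_000
mu, sigma = 1.0, 1.0                 # Exp(1) has mean 1 and variance 1
samples = rng.exponential(1.0, size=(trials, n))
xbar = samples.mean(axis=1)
hit = np.abs(xbar - mu) < 2 * sigma  # |sample mean - mu| < 2 sigma
print(hit.mean())                    # well above 0.99 (Chebyshev is loose)
```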
  6. A fair coin is tossed $n+1$ times. For $1≤i≤n$, let $A_i$ be 1 if the $i$th and $(i+1)$st outcomes are both H, and 0 otherwise.
    (a) Find the mean and the variance of $A_i$.
    (b) Find the covariance of $A_i$ and $A_j$ for $i≠j$. (Consider the cases $|i−j|=1$ and $|i−j|>1$.)
    (c) Define $M=A_1+···+A_n$, the number of occurrences of the motif HH in the sequence. Find the mean and variance of $M$. [Recall the formula for the variance of a sum of random variables, in terms of their variances and pairwise covariances.]
    (d) Use a similar method to find the mean and variance of the number of occurrences of the motif TH in the sequence.
    Solution.
    (a) $\mathbb P(A_i=1)=\frac14,\mathbb P(A_i=0)=\frac34$⇒$\mathbb E(A_i)=\frac14,\mathbb E(A_i^2)=\frac14$⇒Var$(A_i)=\frac3{16}$.
    (b) For $|i-j|=1$ we may suppose $j=i+1$. Then $A_iA_j=1$ exactly when the $i$th, $(i+1)$st and $(i+2)$nd outcomes are all H, so $\mathbb E(A_iA_j)=\frac18$, and Cov$(A_i,A_j)=\mathbb E(A_iA_j)-\mathbb E(A_i)\mathbb E(A_j)=\frac18-\frac1{16}=\frac1{16}$.
    For $|i-j|>1$, $A_i,A_j$ are independent, so Cov$(A_i,A_j)$=0.
    (c) $\mathbb E(M)=\sum\mathbb E(A_i)=\frac n4$. Var$(M)=\sum$Var$(A_i)+2\sum_{i<j}$Cov$(A_i,A_j)=\frac3{16}n+2(n-1)·\frac1{16}=\frac{5n-2}{16}$.
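    The formulas for $\mathbb E(M)$ and Var$(M)$ can be checked by simulation; a sketch (the trial count and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 10, 200_000
# n+1 fair coin tosses per trial; encode H as 1, T as 0.
tosses = rng.integers(0, 2, size=(trials, n + 1))
# HH at position i: tosses i and i+1 are both 1.
hh = (tosses[:, :-1] & tosses[:, 1:]).sum(axis=1)
print(round(hh.mean(), 3), n / 4)            # mean should be n/4 = 2.5
print(round(hh.var(), 3), (5 * n - 2) / 16)  # variance should be (5n-2)/16 = 3.0
```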
    (d) Let $B_i$ be 1 if the $i$th outcome is T and the $(i+1)$st is H, and 0 otherwise. Define $N=B_1+⋯+B_n$.
    $\mathbb P(B_i=1)=\frac14,\mathbb P(B_i=0)=\frac34$⇒$\mathbb E(B_i)=\frac14,\mathbb E(B_i^2)=\frac14$⇒Var$(B_i)=\frac3{16}$.
    For $|i-j|=1$ we may suppose $j=i+1$. Then $B_iB_j=0$ always: $B_i=1$ requires the $(i+1)$st outcome to be H, while $B_j=1$ requires it to be T. Hence $\mathbb E(B_iB_j)=0$ and Cov$(B_i,B_j)=0-\mathbb E(B_i)\mathbb E(B_j)=-\frac1{16}$. For $|i-j|>1$, $B_i,B_j$ are independent, so Cov$(B_i,B_j)=0$.
    $\mathbb E(N)=\sum\mathbb E(B_i)=\frac n4$. Var$(N)=\sum$Var$(B_i)+2\sum_{i<j}$Cov$(B_i,B_j)=\frac3{16}n+2(n-1)·\left(-\frac1{16}\right)=\frac{n+2}{16}$.
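    A matching empirical check of the mean and variance of the TH count $N$ (again with arbitrary trial count and seed):

```python
import numpy as np

rng = np.random.default_rng(4)
n, trials = 10, 200_000
tosses = rng.integers(0, 2, size=(trials, n + 1))  # 1 = H, 0 = T
# TH at position i: toss i is T (0) and toss i+1 is H (1).
th = ((1 - tosses[:, :-1]) & tosses[:, 1:]).sum(axis=1)
print(round(th.mean(), 3), n / 4)  # mean should be n/4 = 2.5
print(round(th.var(), 3))          # empirical variance of the TH count
```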
  7. Let $a,b,p∈(0,1)$. What is the distribution of the sum of $n$ independent Bernoulli random variables with parameter $p$? By considering this sum and applying the weak law of large
    numbers, identify the limit$$\lim _{n \rightarrow \infty} \sum_{r \in \mathbb{N}: \atop a n<r<b n}\left(\begin{array}{c}n \\ r\end{array}\right) p^{r}(1-p)^{n-r}$$in the cases (i) $p<a$; (ii) $a<p<b$; (iii) $b<p$.
    Solution.
    Let $X_i\sim$Bernoulli$(p)$ be independent and let $X=\sum_{i=1}^nX_i$; then $X\sim$Binomial$(n,p)$, with $\mathbb E[X]=pn$, and the sum in question is $\mathbb P(an<X<bn)$. By the weak law of large numbers, $X/n\to p$ in probability.
    (i) If $p<a$: $\mathbb P(an<X<bn)\le\mathbb P(X>an)\le\mathbb P\left(\left|\frac Xn-p\right|\ge a-p\right)\to0$, so the limit is $0$.
    (ii) If $a<p<b$: let $c=\min\{p-a,b-p\}>0$; then $\mathbb P(an<X<bn)\ge\mathbb P\left(\left|\frac Xn-p\right|<c\right)\to1$, so the limit is $1$.
    (iii) If $b<p$: $\mathbb P(an<X<bn)\le\mathbb P(X<bn)\le\mathbb P\left(\left|\frac Xn-p\right|\ge p-b\right)\to0$, so the limit is $0$.
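    The three limits can be illustrated numerically; a sketch with arbitrary choices $a=0.3$, $b=0.6$, $n=2000$, computing the binomial pmf in log space to avoid overflow:

```python
from math import exp, lgamma, log

def binom_pmf(n, r, p):
    """C(n, r) p^r (1-p)^(n-r), evaluated via log-gamma for numerical stability."""
    logc = lgamma(n + 1) - lgamma(r + 1) - lgamma(n - r + 1)
    return exp(logc + r * log(p) + (n - r) * log(1 - p))

def window_sum(n, p, a, b):
    """Sum of the binomial pmf over integers r with a*n < r < b*n."""
    return sum(binom_pmf(n, r, p) for r in range(n + 1) if a * n < r < b * n)

a, b, n = 0.3, 0.6, 2000
print(round(window_sum(n, 0.2, a, b), 4))  # case (i):   p < a,     tends to 0
print(round(window_sum(n, 0.5, a, b), 4))  # case (ii):  a < p < b, tends to 1
print(round(window_sum(n, 0.8, a, b), 4))  # case (iii): b < p,     tends to 0
```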