Post

Collecting Verification Codes (2)

Collecting Verification Codes (2)

Introduction

In the previous post, we discussed the expected number of steps to collect all verification codes. In this post, we will calculate the variance of the number of log-in tries to collect all verification codes.

Problem Statement

(See the previous post.) Let S be a finite set of size N. At the beginning, a set F is initialized to be an empty set. For each step, a random element r is chosen at uniformly random from S and put to F, i.e., FF{r}. Let X be the random variable that represents the number of steps to reach S=F. Then, what is the variance of X?

Solution

Fix N and let Xi be the random variable for the number of steps to choose r which has not been chosen when there are i out of N elements are not in F. Then, we have X=i=1NXi where each Xi are independent. Hence, we have Var(X)=i=1NVar(Xi) where Var() represents the variance.

Now, note that Xi is the geometric distribution with parameter pi=i/N. The probability mass function of the geometric distribution Y with parameter p is given by

Pr[Y=k]=(1p)k1p.

Variance of Geometric Distribution

Well, one may calculate the variance by calculating

Var(Y)=E[X2](E[X])2=k=1k2(1p)k1p(k=1k(1p)k1p)2

but I don’t want to do that. Instead, I will use the following formula:

E[Xn]=dndtnϕ(t)|t=0

where ϕ(t)=E[etY]=k=1etk(1p)k1p is the moment generating function. Calculating the moment generating function of Y, we get

ϕ(t)=k=1etk(1p)k1p=pet1qet

where q=1p. Then, we have

ϕ(t)=pet(1qet)2 and ϕ(t)=pet[1(qet)2](1qet)4.

(Yes, I calculated for you.) Now, we finally have

Var(Y)=E[Y2]E[Y]2=ϕ(0)ϕ(0)2=2pp2(1p)2=1pp2.

Finally Calculating the Variance

Now, we have

Var(Xi)=1pipi2=N2i2Ni.

Hence, the variance we want to calculate is

Var(X)=N2i=1N1i2Ni=1N1i.

In particular, the growth rate is Var(X)ζ(2)N2 where ζ is the Riemann zeta function.

Conclusion

We have, Var(X)15831.10 and Var(X)125.82. Hence, in a high probability, I shall try 518.74±125.82 times to collect all verification codes!

This post is licensed under CC BY 4.0 by the author.