Simulation of a continuous random variable. We'll consider a `dartboard' question, where the outcome of the experiment is a point chosen uniformly from a circle of radius 1, and the random variable $X$ gives the distance of the point from the center.

We saw in class what the cumulative distribution function for this random variable looks like.

In [2]:
from pylab import *
x=arange(0,1,0.01)

xsquare=[t*t for t in x]
x=concatenate(([-1],x,[1,2]))
xsquare=concatenate(([0],xsquare,[1,1]))
plot(x,xsquare,color='k')
title('Cumulative distribution function of distance of dart from center of the board.')
xlim(-1,2)
show()
<Figure size 640x480 with 1 Axes>

We differentiate to obtain the probability density function.

In [3]:
plot([-1,0,1],[0,0,2],color='k')
plot([1,2],[0,0],color='k')
plot([1,1],[2,0],linestyle='dotted',color='r')
xlim(-1,2)
title('Probability density function of distance of dart from center of the board.')
show()

We'll do the simulation by generating N points uniformly at random in the 2x2 square centered at the origin, retaining only those that are within the circle of radius 1, and returning a list of their distances from the center.

In [4]:
 def gensamples1(N):
    v=[(2*random()-1,2*random()-1) for j in range(N)]
    w=[sqrt(t[0]**2+t[1]**2) for t in v if t[0]**2+t[1]**2<=1]
    return (w)

Here is a graphic depiction of the points generated. We only use the red ones!

In [8]:
t=arange(0,2*pi+0.005,0.01)
plot(cos(t),sin(t),color='k')

plot([-1,-1,1,1,-1],[-1,1,1,-1,-1],color='k')
axis('scaled')
v=[(2*random()-1,2*random()-1) for j in range(800)]
v1=[z for z in v if z[0]**2+z[1]**2<=1]
v2=[z for z in v if not z in v1]
scatter([z[0] for z in v1],[z[1] for z in v1],s=3,color='r')
scatter([z[0] for z in v2],[z[1] for z in v2],s=3,color='b')
show()

Now let's plot a histogram of the samples and compute their mean value. The shape should resemble the density function, and the mean should be approximately $2\over 3,$ which is the value we computed for $E(X).$ (The half-sized column at the right is an artifact of my sloppy code.)

In [9]:
w=gensamples1( 50000)
b =arange(-0.01,1.02,0.02)
hist(w,bins=b,edgecolor='k')
avg=sum(w)/len(w)
title('Simulation of X. Mean value='+str(avg))
Out[9]:
Text(0.5,1,'Simulation of X. Mean value=0.6671228225872385')

We can do this a different way, by simply using the inverse of the CDF of $X$.

In [7]:
w=[sqrt(rand()) for j in range(50000)]
b =arange(-0.01,1.02,0.02)
hist(w,bins=b,edgecolor='k')
avg=sum(w)/len(w)
title('Simulation of X. Mean value='+str(avg))
Out[7]:
Text(0.5,1,'Simulation of X. Mean value=0.6663905684296646')