Simulation of a continuous random variable. We'll consider a `dartboard' question, where the outcome of the experiment is a point chosen uniformly from a circle of radius 1, and the random variable $X$ gives the distance of the point from the center.
We saw in class what the cumulative distribution function for this random variable looks like.
from pylab import *
x=arange(0,1,0.01)
xsquare=[t*t for t in x]
x=concatenate(([-1],x,[1,2]))
xsquare=concatenate(([0],xsquare,[1,1]))
plot(x,xsquare,color='k')
title('Cumulative distribution function of distance of dart from center of the board.')
xlim(-1,2)
show()
We differentiate to obtain the probability density function.
plot([-1,0,1],[0,0,2],color='k')
plot([1,2],[0,0],color='k')
plot([1,1],[2,0],linestyle='dotted',color='r')
xlim(-1,2)
title('Probability density function of distance of dart from center of the board.')
show()
We'll do the simulation by generating N points uniformly at random in the 2x2 square centered at the origin, retaining only those that are within the circle of radius 1, and returning a list of their distances from the center.
def gensamples1(N):
v=[(2*random()-1,2*random()-1) for j in range(N)]
w=[sqrt(t[0]**2+t[1]**2) for t in v if t[0]**2+t[1]**2<=1]
return (w)
Here is a graphic depiction of the points generated. We only use the red ones!
t=arange(0,2*pi+0.005,0.01)
plot(cos(t),sin(t),color='k')
plot([-1,-1,1,1,-1],[-1,1,1,-1,-1],color='k')
axis('scaled')
v=[(2*random()-1,2*random()-1) for j in range(800)]
v1=[z for z in v if z[0]**2+z[1]**2<=1]
v2=[z for z in v if not z in v1]
scatter([z[0] for z in v1],[z[1] for z in v1],s=3,color='r')
scatter([z[0] for z in v2],[z[1] for z in v2],s=3,color='b')
show()
Now let's plot a histogram of the samples and compute their mean value. The shape should resemble the density function, and the mean should be approximately $2\over 3,$ which is the value we computed for $E(X).$ (The half-sized column at the right is an artifact of my sloppy code.)
w=gensamples1( 50000)
b =arange(-0.01,1.02,0.02)
hist(w,bins=b,edgecolor='k')
avg=sum(w)/len(w)
title('Simulation of X. Mean value='+str(avg))
We can do this a different way, by simply using the inverse of the CDF of $X$.
w=[sqrt(rand()) for j in range(50000)]
b =arange(-0.01,1.02,0.02)
hist(w,bins=b,edgecolor='k')
avg=sum(w)/len(w)
title('Simulation of X. Mean value='+str(avg))