Thursday, August 16, 2012

Kernel Density Estimation with scipy

This post continues the last one, where we saw how to fit two types of distribution functions (Normal and Rayleigh). This time we will see how to use Kernel Density Estimation (KDE) to estimate the probability density function. KDE is a nonparametric technique in which a known density function (the kernel) is averaged across the observed data points to create a smooth approximation. Being nonparametric, the estimator has no fixed functional form: it depends only on the data points used to compute the estimate, and the result has no meaningful associated parameters. Let's see the snippet:
from scipy.stats import gaussian_kde
from scipy.stats import norm
from numpy import linspace,hstack
from pylab import plot,show,hist

# creating a bimodal sample with two peaks
sampD1 = norm.rvs(loc=-1.0,scale=1,size=300)
sampD2 = norm.rvs(loc=2.0,scale=0.5,size=300)
samp = hstack([sampD1,sampD2])

# obtaining the pdf (my_pdf is a function!)
my_pdf = gaussian_kde(samp)

# plotting the result
x = linspace(-5,5,100)
plot(x,my_pdf(x),'r') # distribution function
hist(samp,density=True,alpha=.3) # normalized histogram
show()
The result should be the estimated density (the red curve) overlaid on the normalized histogram of the bimodal sample.
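
To make the averaging idea concrete, here is a minimal hand-rolled version of the same estimator. This is a sketch for illustration only: the bandwidth h is fixed by hand instead of being chosen automatically as gaussian_kde does, and the helper kde_by_hand is just a name introduced here:

from numpy import linspace,exp,pi,sqrt,subtract
from pylab import plot,show

# the estimate at each x is the average of a Gaussian kernel
# centered on every observed data point, rescaled by the bandwidth h
def kde_by_hand(data,x,h=0.3):
    u = subtract.outer(x,data)/h # pairwise scaled differences
    k = exp(-0.5*u**2)/sqrt(2*pi) # standard Gaussian kernel
    return k.mean(axis=1)/h

x = linspace(-5,5,100)
plot(x,kde_by_hand(samp,x),'b') # compare with my_pdf(x) above
show()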
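
The smoothness of the curve is controlled by the kernel bandwidth. By default gaussian_kde selects it automatically with Scott's rule of thumb, but it can be overridden; note that the bw_method argument used below is an assumption about your installation, since it is only available in recent versions of scipy:

from scipy.stats import gaussian_kde
from numpy import linspace
from pylab import plot,legend,show

x = linspace(-5,5,100)
# a small bandwidth follows the sample closely, a large one oversmooths
for bw in [0.1,0.3,1.0]:
    kde = gaussian_kde(samp,bw_method=bw) # bw_method needs scipy >= 0.11
    plot(x,kde(x),label='bw=%.1f' % bw)
legend()
show()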
