where cov(X,Y) is the covariance between X and Y, while σX and σY are their standard deviations. If N is the number of variables, then R is an N-by-N matrix. When the number of variables is large, we need a way to visualize R. The following snippet uses a pseudocolor plot to visualize R:
    from numpy import corrcoef, sum, log, arange
    from numpy.random import rand
    from pylab import pcolor, show, colorbar, xticks, yticks

    # generating some uncorrelated data
    data = rand(10,100) # each row of data represents a variable

    # creating correlation between the variables
    # variable 2 is correlated with all the other variables
    data[2,:] = sum(data,0)
    # variable 4 is correlated with variable 8
    data[4,:] = log(data[8,:])*0.5

    # plotting the correlation matrix
    R = corrcoef(data)
    pcolor(R)
    colorbar()
    yticks(arange(0.5,10.5),range(0,10))
    xticks(arange(0.5,10.5),range(0,10))
    show()

The result should be as follows:
As expected, the correlation coefficients for variable 2 are higher than the others, and we observe a strong correlation between variables 4 and 8.
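To make the definition concrete, here is a minimal sketch that builds R directly from cov(X,Y) / (σX σY) in plain Python. The names `pearson` and `correlation_matrix` are illustrative helpers that do not appear in the post, and the three toy variables are made up for the example; `corrcoef` computes the same quantity for every pair of rows.

```python
from math import fsum, sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences: cov(X,Y) / (sigma_X * sigma_Y)."""
    n = len(x)
    mean_x = fsum(x) / n
    mean_y = fsum(y) / n
    # The 1/n factors of the covariance and of the two variances cancel,
    # so plain sums of products are enough here.
    cov = fsum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = fsum((a - mean_x) ** 2 for a in x)
    var_y = fsum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(rows):
    """N-by-N matrix R for N variables, one variable per row (illustrative helper)."""
    return [[pearson(a, b) for b in rows] for a in rows]

# Toy data, invented for this example.
rows = [
    [1.0, 2.0, 3.0, 4.0],  # variable 0
    [2.0, 4.0, 6.0, 8.0],  # variable 1: an exact multiple of variable 0
    [4.0, 3.0, 2.0, 1.0],  # variable 2: decreases as variable 0 increases
]
R = correlation_matrix(rows)
```

With three variables R is 3-by-3, symmetric, and has ones on the diagonal; variables 0 and 1 are perfectly correlated, while variables 0 and 2 are perfectly anti-correlated.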

Don't use the jet colormap!
http://www.jwave.vt.edu/~rkriz/Projects/create_color_table/color_07.pdf
https://abandonmatlab.wordpress.com/2011/05/07/lets-talk-colormaps/
http://cresspahl.blogspot.com/2012/03/expanded-control-of-octaves-colormap.html
I think the hot colormap would be a better choice here
Agreed.
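For anyone who wants to try the colormap suggestions above, a minimal sketch: `pcolor` accepts a `cmap` argument, so switching colormaps is a one-argument change. `plot_correlation` is a hypothetical helper, not part of pylab, and pinning the color scale to [-1, 1] with `vmin`/`vmax` is an addition made here (correlation coefficients always lie in that range), not something the post does.

```python
from numpy import corrcoef
from numpy.random import rand
from pylab import pcolor, colorbar

def plot_correlation(R, cmap="hot"):
    """Draw R as a pseudocolor plot using the given colormap (hypothetical helper)."""
    # vmin/vmax fix the color scale to the full range of a correlation coefficient
    mesh = pcolor(R, cmap=cmap, vmin=-1, vmax=1)
    colorbar(mesh)
    return mesh

data = rand(10, 100)  # same shape as the data in the post
R = corrcoef(data)
mesh = plot_correlation(R, cmap="hot")
```

Import `show` from pylab and call `show()` afterwards to display the figure; any other colormap name known to matplotlib can be passed in place of `"hot"`.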
In some cases, Hinton diagrams can be far more useful. See http://www.scipy.org/Cookbook/Matplotlib/HintonDiagrams
hey,
i get a strange error when running the script:
/Users/xxx/src/matplotlib/lib/matplotlib/backends/backend_macosx.pyc in draw_quad_mesh(self, gc, master_transform, meshWidth, meshHeight, coordinates, offsets, offsetTrans, facecolors, antialiased, showedges)
98 facecolors,
99 antialiased,
--> 100 showedges)
101
102 def new_gc(self):
"only length-1 arrays can be converted to Python scalars"
also, the colorbar is not visible
what to do?
which version of matplotlib/python are you using?
hey,
i'm using Python 2.7.3 and matplotlib '1.2.x' on os x.
btw: if i leave out the colorbar command the error doesn't show up.
I use matplotlib 1.1.1rc.
hello again.
actually, i don't know why i had this unstable version installed.
i used pip to install the stable 1.1.1 version and now it works like a charm.
thanks for the fast reply and keep up the good work here :)
I like the correlation example and will try that later on some of my data. It is also cool that we use the same theme on blogger. /Magnus
Thanks Magnus. I like this theme because it's simple. If you're interested in matrix visualization, don't forget to try Hinton diagrams as well.