Sunday, July 8, 2012

Color quantization

The aim of color clustering is to produce a small set of representative colors that captures the color properties of an image. Using the small set of colors found by the clustering, a quantization process can be applied to the image: every pixel is replaced by its nearest representative color, giving a new version of the image that has been "simplified," both in colors and shapes.
In this post we will see how to use the K-Means algorithm to perform color clustering and how to apply the quantization. Let's see the code:
from pylab import imread,imshow,figure,show,subplot
from numpy import reshape,uint8,flipud
from scipy.cluster.vq import kmeans,vq

img = imread('clearsky.jpg')

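# reshaping the pixels matrix: one row per pixel, one column per channel
# (cast to float because kmeans works on floating-point observations)
pixel = reshape(img,(img.shape[0]*img.shape[1],3)).astype(float)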

# performing the clustering
centroids,_ = kmeans(pixel,6) # six colors will be found
# quantization
qnt,_ = vq(pixel,centroids)

# reshaping the result of the quantization
centers_idx = reshape(qnt,(img.shape[0],img.shape[1]))
clustered = centroids[centers_idx]
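
The steps above can also be wrapped in a small helper. What follows is a minimal sketch, not the code used for the figures in this post: the function name `quantize` and the random test image are made up for illustration, and it assumes an 8-bit RGB image (values from 0 to 255), which is what imread returns for a JPEG file.

```python
import numpy as np
from scipy.cluster.vq import kmeans, vq

def quantize(img, n_colors):
    """Return a uint8 copy of img (H x W x 3) that uses at most n_colors colors.

    Hypothetical helper: it repeats the reshape / kmeans / vq steps of the post.
    """
    h, w, channels = img.shape
    # one row per pixel, cast to float for kmeans
    pixels = img.reshape(h * w, channels).astype(float)
    # kmeans may return fewer than n_colors centroids if a cluster ends up empty
    centroids, _ = kmeans(pixels, n_colors)
    # index of the nearest centroid for every pixel
    labels, _ = vq(pixels, centroids)
    # replace every pixel with its centroid and restore the image shape
    result = centroids[labels].reshape(h, w, channels)
    # round and cast back to 8-bit values so the array can be displayed
    return np.clip(np.rint(result), 0, 255).astype(np.uint8)

# synthetic stand-in for a real photo (assumption: any H x W x 3 uint8 array works)
img = np.random.default_rng(0).integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
quantized = quantize(img, 6)
```

Since the returned array is uint8, it can be passed straight to imshow next to the original image.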

The result should be as follows:

We have the original image on the top and the quantized version on the bottom. We can see that the image on the bottom has only six colors. Now, we can plot the colors found with the clustering in the RGB space with the following code:
# visualizing the centroids into the RGB space
from mpl_toolkits.mplot3d import Axes3D
fig = figure(2)
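# add_subplot is used because recent Matplotlib releases no longer accept gca(projection='3d')
ax = fig.add_subplot(projection='3d')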
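
The snippet above stops after creating the 3D axes. As a rough sketch of how the centroids might then be drawn (an assumption on my part, not necessarily how the plots in this post were made), each centroid can be placed at its (R, G, B) coordinates and painted with its own color. The function name `plot_centroids_rgb` and the example centroid values are invented for illustration.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to open a window
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older Matplotlib

def plot_centroids_rgb(centroids):
    """Scatter the centroids in RGB space, coloring each point with its own color."""
    centroids = np.asarray(centroids, dtype=float)
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    # scatter expects colors in the range [0, 1], hence the division by 255
    ax.scatter(centroids[:, 0], centroids[:, 1], centroids[:, 2],
               c=centroids / 255.0, s=200)
    ax.set_xlabel("R")
    ax.set_ylabel("G")
    ax.set_zlabel("B")
    return fig, ax

# made-up centroids standing in for the output of kmeans
example_centroids = np.array([[30, 60, 200],
                              [250, 250, 250],
                              [20, 120, 40],
                              [200, 40, 40]])
fig, ax = plot_centroids_rgb(example_centroids)
```

Calling `fig.savefig(...)` on the returned figure, or `plt.show()` with an interactive backend, then produces the picture.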

And this is the result:

The result of the same script on another image follows:

In this case I used four colors. Here's the plot of the colors in the RGB space:


  1. I recently packaged this functionality into SimpleCV. You may want to take a look.

  2. Thank you Katherine, I never used SimpleCV.

  3. Here is another implementation using the scikit-learn K-Means:

  4. Do you normally compose for this blog or maybe for other Internet networks?

  5. Hello Miss Teegans, I usually write only for this blog.

  6. Can we get similar results using Agglomerative Clustering instead of K-means? If yes, how can we proceed?

    1. Hi Mehmood, the approach I followed is very simple. I treated each pixel as a sample. This way, every pixel is a point in the 3D space that you can cluster using any algorithm.

      The SciPy functions for agglomerative clustering are listed here:
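
As a rough sketch of that idea, assuming the functions in scipy.cluster.hierarchy are used: agglomerative clustering needs far more memory than K-Means as the number of samples grows, so this example clusters only a random subset of the pixels, takes the mean color of each cluster as the palette, and then assigns every pixel to its nearest palette color with vq. The helper names, the sample size, and the choice of Ward linkage are assumptions made for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.cluster.vq import vq

def agglomerative_palette(img, n_colors, sample_size=1000, seed=0):
    """Build a palette of at most n_colors colors from a random subset of pixels."""
    pixels = img.reshape(-1, img.shape[-1]).astype(float)
    rng = np.random.default_rng(seed)
    n = min(sample_size, len(pixels))
    sample = pixels[rng.choice(len(pixels), size=n, replace=False)]
    # build the hierarchy with Ward linkage, then cut it into at most n_colors clusters
    tree = linkage(sample, method="ward")
    labels = fcluster(tree, t=n_colors, criterion="maxclust")
    # the representative color of each cluster is the mean of its members
    return np.array([sample[labels == k].mean(axis=0) for k in np.unique(labels)])

def quantize_with_palette(img, palette):
    """Replace every pixel of img with its nearest palette color."""
    pixels = img.reshape(-1, img.shape[-1]).astype(float)
    idx, _ = vq(pixels, palette)
    result = palette[idx].reshape(img.shape)
    return np.clip(np.rint(result), 0, 255).astype(np.uint8)

# synthetic stand-in for a real photo
img = np.random.default_rng(1).integers(0, 256, size=(40, 40, 3), dtype=np.uint8)
palette = agglomerative_palette(img, 4)
quantized = quantize_with_palette(img, palette)
```

The palette plays the same role as the K-Means centroids, so the rest of the pipeline stays unchanged.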